Skip to content

PDX-509: validity bridge, tri-state status + quality threshold#211

Merged
mrdailey99 merged 4 commits into
developfrom
feature/PDX-509-validity-bridge
Jun 4, 2026
Merged

PDX-509: validity bridge, tri-state status + quality threshold#211
mrdailey99 merged 4 commits into
developfrom
feature/PDX-509-validity-bridge

Conversation

@mrdailey99
Copy link
Copy Markdown
Collaborator

PDX-509 — Validity-surfacing bridge + quality threshold

Re-aligns the local MCP validator with Provar's test-case-validity model. Previously the validator ran a two-layer split the Quality Hub API does not have: critical best-practice violations only moved quality_score and never gated is_valid, so a test that won't load in Provar (e.g. an unknown apiId) could still come back is_valid: true.

What changed

  • Validity bridge — every un-suppressed critical best-practice violation is surfaced into issues[] as an ERROR and now gates is_valid. quality_score is computed from the untouched BP list, so QH Lambda score parity is preserved (a bridged critical appears in both issues[] and best_practices_violations[]).
  • Concept-ownership dedup — schema rules (TC_*) are authoritative for the concepts they own (root, identifier, steps presence, testItemId, comparisonType); the matching BP criticals are never double-reported. An empty-but-present <steps> still loads.
  • Tri-state status (invalid / needs_improvement / valid) + meets_quality_threshold + effective quality_threshold, on provar_testcase_validate and the hierarchy tools (testsuite/testplan/project).
  • Quality threshold default 80 → 90, new PROVAR_MCP_QUALITY_THRESHOLD env var, shared resolveQualityThreshold (precedence: per-call arg → env → 90).
  • varStringLiteral — dropped the incorrect sfUiTargetObjectId/sfUiTargetResultName exemption (a bare {Var} there lands in the URL as %7B…%7D and hard-fails). Stays major.
  • Generator date emission fixed — it emitted valueClass="date"/"datetime" with raw ISO strings. The real Provar corpus (AllPOCProjects, 651 cases) stores date/dateTime exclusively as epoch milliseconds with camelCase dateTime, and RENDER-DATE-VALUECLASS-001 confirms an ISO string fails to load. The generator now converts ISO → epoch ms (bare date = UTC midnight, tz-less datetime = UTC) with canonical casing, and RENDER-CASE-001 (validateValueClassCasing) was corrected to accept camelCase dateTime.

Severity audit

All 29 implemented critical rules are genuinely load-blocking except VAR-NAMING-001, whose own description says "RUNTIME FAILURES" (taxonomy → major). Flagged, not changed, to preserve QH score parity — needs a product decision.

Quality Hub backend follow-ups (to coordinate separately)

  1. VarStringLiteral: drop the sfUiTargetObjectId/sfUiTargetResultName exemption.
  2. ValueClassCasing: treat dateTime as camelCase-canonical (don't flag valueClass="dateTime").
  3. VAR-NAMING-001: confirm severity criticalmajor.

Docs

docs/mcp.md updated (new input/output fields, the bridge note, threshold-90, PROVAR_MCP_QUALITY_THRESHOLD). Customer-facing docs (provar-mcp-public-docs.md, university course) reference the validate contract — they need the new status/threshold-90 added by the Provar team (not touched here).

Tests

Adds PDX-509 bridge / tri-state / threshold tests + a resolveQualityThreshold suite, and updates fixtures to use fully-qualified apiIds (matching real Provar files). 1427 passing, smoke 60/60, lint + compile clean.

🤖 Generated with Claude Code

RCA: The local MCP validator runs a two-layer model the Quality Hub API does
not have. Critical best-practice violations (e.g. unknown apiId) only moved
quality_score and never gated is_valid, so a test that will not load in Provar
could still be reported as valid. The quality threshold also defaulted to 80
with no env override and collapsed sub-threshold cases to a bare "invalid".
Fix: Bridge critical best-practice violations into Layer-1 issues so they gate
is_valid (deduped against hand-coded rules; score parity preserved). Add a
shared resolveQualityThreshold (arg -> PROVAR_MCP_QUALITY_THRESHOLD -> 90) and a
tri-state status (invalid / needs_improvement / valid) with meets_quality_threshold
across the testcase and hierarchy validators.
…dit severities

RCA: With the validity bridge live, critical best-practice violations now gate
is_valid. This surfaced two latent, previously score-only defects: (1) the
generator emitted valueClass="date"/"datetime" with raw ISO strings, but the
real Provar corpus (651 cases) stores date/dateTime exclusively as epoch
milliseconds with camelCase dateTime — an ISO string fails to load (RENDER-DATE-
VALUECLASS-001), and the casing rule wrongly lowercase-enforced dateTime; (2)
the varStringLiteral back-end exemption for UI-target args is wrong (a bare
{Var} there lands in the URL as %7B...%7D and hard-fails).
Fix: Suppress Layer-1-owned concepts from the bridge so the authoritative schema
rule reports them once (empty <steps> still loads). Convert ISO date/datetime to
epoch ms (UTC) with canonical dateTime casing in the generator and accept that
casing in RENDER-CASE-001. Drop the varStringLiteral UI-target exemption. Add
PDX-509 bridge/threshold/tri-state tests and a resolveQualityThreshold suite;
update docs/mcp.md. Severity audit: all implemented criticals are load-blocking
except VAR-NAMING-001 (its own text says "RUNTIME FAILURES" = major) — flagged
for review, not changed, to preserve QH score parity.
Copilot AI review requested due to automatic review settings June 4, 2026 19:08
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Jun 4, 2026

Quality Orchestrator

🟢 LOW · 33 / 100 · Touches: utils. 1 file(s) with no test coverage detected.


🧪 Tests to Run · Running 9 of 55 tests

  • unit/mcp/bestPracticesEngine.test.ts
  • unit/mcp/hierarchyValidate.test.ts
  • unit/mcp/projectValidateFromPath.test.ts
  • unit/mcp/testCaseGenerate.test.ts
  • unit/mcp/testCaseValidate.test.ts
  • unit/mcp/testPlanValidate.test.ts
  • unit/mcp/testSuiteValidate.test.ts
  • unit/mcp/qualityThreshold.test.ts
  • unit/services/projectValidation.test.ts
▶ Run command
npx vitest run \
  unit/mcp/bestPracticesEngine.test.ts \
  unit/mcp/hierarchyValidate.test.ts \
  unit/mcp/projectValidateFromPath.test.ts \
  unit/mcp/testCaseGenerate.test.ts \
  unit/mcp/testCaseValidate.test.ts \
  unit/mcp/testPlanValidate.test.ts \
  unit/mcp/testSuiteValidate.test.ts \
  unit/mcp/qualityThreshold.test.ts \
  unit/services/projectValidation.test.ts

⚠️ Missing Coverage · 1 file

These source files have no mapped test — consider adding coverage before merge.

  • src/commands/provar/automation/project/validate.ts

💡 Run locally: qo stub src/commands/provar/automation/project/validate.ts to generate a test scaffold.


⚡ quality-orchestrator  ·  /qo stub <file>  ·  qo analyze-local

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR re-aligns local Provar MCP validation with Provar’s load-validity model by bridging load-blocking (critical) best-practice violations into issues[] as ERRORs (so they gate is_valid), and by introducing a tri-state per-test verdict (invalid / needs_improvement / valid) driven by an explicit quality threshold that now defaults to 90 (with env override support).

Changes:

  • Bridge unsuppressed critical best-practice violations into issues[] to gate is_valid, while preserving Quality Hub score parity.
  • Introduce tri-state status, meets_quality_threshold, and effective quality_threshold across validation tools; add shared resolveQualityThreshold (arg → env → 90).
  • Fix generator emission for valueClass="date"/"dateTime" by converting ISO inputs to epoch milliseconds; update valueClass casing handling and VAR string-literal rule behavior/tests/docs.

Reviewed changes

Copilot reviewed 19 out of 19 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
test/unit/mcp/testSuiteValidate.test.ts Updates fixtures and adds suite-level tri-state/threshold behavior assertions.
test/unit/mcp/testPlanValidate.test.ts Updates apiId fixtures to fully-qualified forms.
test/unit/mcp/testCaseValidate.test.ts Adds validity-bridge and tri-state/threshold handler tests.
test/unit/mcp/testCaseGenerate.test.ts Updates generator expectations for epoch-ms date/dateTime emission and escaping coverage.
test/unit/mcp/qualityThreshold.test.ts Adds dedicated tests for threshold precedence and bounds.
test/unit/mcp/bestPracticesEngine.test.ts Updates VAR-STRING-LITERAL behavior tests and clarifies severity expectations.
src/services/projectValidation.ts Uses shared threshold resolution instead of hardcoded default.
src/mcp/utils/qualityThreshold.ts Introduces shared threshold resolution logic and default=90.
src/mcp/tools/testSuiteValidate.ts Applies shared threshold resolution and updates schema description text.
src/mcp/tools/testPlanValidate.ts Applies shared threshold resolution and updates schema description text.
src/mcp/tools/testCaseValidate.ts Adds tri-state verdict fields; implements critical-BP bridging and baseline diff de-dupe logic.
src/mcp/tools/testCaseGenerate.ts Converts ISO date/datetime inputs to epoch ms and emits canonical dateTime casing.
src/mcp/tools/projectValidateFromPath.ts Removes hardcoded default and documents new threshold precedence/default.
src/mcp/tools/hierarchyValidate.ts Adds needs_improvement status for test cases and updates summary counts.
src/mcp/tools/bestPracticesEngine.ts Accepts canonical dateTime casing and removes UI-target arg exemption for VAR literals.
src/mcp/rules/provar_best_practices_rules.json Updates VAR-STRING-LITERAL rule description to reflect runtime failure behavior.
src/commands/provar/automation/project/validate.ts Updates CLI default threshold to 90 and minor formatting adjustments.
messages/sf.provar.automation.project.validate.md Updates CLI flag help text to reflect default threshold 90.
docs/mcp.md Updates MCP docs for new fields, threshold precedence, and bridge semantics.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +572 to +580
if (inferred === 'date' || inferred === 'datetime') {
// PDX-509: Provar stores date/dateTime as epoch MILLISECONDS, never an ISO string —
// an ISO string fails to load in the IDE (RENDER-DATE-VALUECLASS-001), and the real
// corpus uses epoch ms exclusively with camelCase `dateTime`. Convert here.
const ms = isoToEpochMs(val, inferred);
const valueClass = inferred === 'datetime' ? 'dateTime' : 'date';
const emitted = ms !== null ? String(ms) : escapeXmlContent(val);
return `${indent}<value class="value" valueClass="${valueClass}">${emitted}</value>`;
}
Comment thread src/mcp/tools/testCaseValidate.ts Outdated
Comment on lines +245 to +247
// A critical BP violation bridged into issues[] shares its rule_id with the
// surfaced issue (the issue/BP rule_id namespaces are disjoint), so drop it
// from the BP list here to avoid double-counting it in the baseline diff.
…ment

RCA: Copilot flagged two issues. (1) When a value matches the ISO date/datetime
shape but is not a real date (e.g. "2026-99-99"), isoToEpochMs returns null and
the generator still emitted valueClass="date"/"dateTime" with the raw text — a
load-breaking value. (2) The diff-dedup comment was internally inconsistent: it
said the issue/BP rule_id namespaces are disjoint yet that a bridged critical
shares its rule_id with the surfaced issue.
Fix: Fall back to valueClass="string" when a date/datetime value cannot be
converted to epoch ms, with a regression test. Reword the diff-dedup comment to
state plainly that hand-coded issue rule_ids never collide with BP rule_ids, so a
BP rule_id in issues[] can only be a bridged critical.
RCA: VAR-NAMING-001 was tagged critical/weight 8, but a variable name with a
space is a RUNTIME failure (the test still loads and fails at execution), which
the severity taxonomy classifies as major, not load-blocking critical. As a
critical it was being surfaced by the PDX-509 bridge and wrongly flipping
is_valid for a runtime defect. The Quality Hub backend is making the same change
(major / weight 5) so score parity is preserved.
Fix: Set VAR-NAMING-001 severity=major, weight=5 in the local rules JSON, and
note in the description that it is a runtime defect, not a load-blocking one. It
is no longer bridged, so a bad variable name no longer flips is_valid (it docks
quality_score and surfaces as a major best-practice violation).
@mrdailey99 mrdailey99 merged commit cf4f38a into develop Jun 4, 2026
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants