Release 1.6.0 — merge develop → main#215
Merged
Merged
Conversation
RCA: buildArgumentValue had no `interaction` dispatch, so the argument fell through to class="value" valueClass="string". The Provar IDE step editor binds its Action widget only from a typed uiInteraction value, so the Action field rendered blank/raw URI even though the CLI ran the step green. Fix: add an interaction -> uiInteraction dispatch in buildArgumentValue gated on the shared UI_ACTION_API_IDS set, add validator rule UI-INTERACTION-001 (ERROR) via a checkUiInteraction helper, ship reproduce/cleared/generator regression tests, document the code in docs/mcp.md, and add a smoke entry.
…-typed PDX-506: fix(mcp): serialise UiDoAction interaction as uiInteraction
RCA: provar_testcase_generate emitted UiAssert field assertions as flat top-level arguments (fieldLocator/attributeName/comparisonType/expectedValue). That runs green from the CLI but renders the Provar IDE Result Assertions tab blank, because the editor binds only from the nested fieldAssertions/ uiFieldAssertion structure. Confirmed against the real corpus (AllPOCProjects): 0 of 3,778 UiAssert steps use a flat fieldLocator; ~99% use the nested form with a bare <fieldLocator uri/> element (never class="uiLocator"). Fix: add buildUiAssertXml to assemble flat attributes into the nested fieldAssertions/uiFieldAssertion structure (bare fieldLocator + attributeAssertions + empty column/page containers); add local validator rule UI-ASSERT-STRUCTURE-001 (ERROR) for the flat shape via checkUiAssertStructure (extracted checkSetValuesStructure to stay within complexity budget); correct the Contact_Lead_nested fixture to the nested form; ship reproduce/cleared/generator tests, docs, and a smoke entry.
RCA: Copilot flagged that buildUiAssertXml treated a UiAssert `locator` argument as a fieldLocator fallback, which would override the documented locator->uiLocator contract for non-field-assert UiAssert steps. Fix: only `fieldLocator` triggers the nested fieldAssertions structure; a plain `locator` now falls through to the existing dispatch. Documented that captureAfter is intentionally valueClass="string" (corpus-confirmed, not boolean) and that the self-closing uiAttributeAssertion is a real corpus form. Reworded the reproduce test comment to state the test (not the old validator) fails on main.
…-assertions PDX-507: fix(mcp): nest UiAssert field assertions for IDE rendering
RCA: 21 required-argument rules (control-flow conditions, assertion comparison/expected values, BDD descriptions, SOQL/SQL/DB/REST connection+result args) shipped in provar_best_practices_rules.json with check.type "mustContainArgument", but the local engine never registered that check type — every one hit the silent-pass fallback and never fired, so external MCP users got none of these load/exec-blocking checks offline. Fix: implement validateMustContainArgument in bestPracticesEngine.ts and register it; it flags steps whose apiId matches check.apiId when the required check.argument is absent or an empty self-closing <argument/>, reusing the existing argumentHasValue/getCallArguments helpers for parity with the NitroX required-arg validators. Exact apiId match (plus :variant suffix), disabled steps skipped, nested control-flow steps checked, aggregated one-violation-per-rule with count. Adds 7 reproduce/cleared unit tests and documents the newly-enforced rule class in docs/mcp.md.
RCA: Validated the local validator against quality-hub-agents MustContainArgumentValidator (best_practices_engine.py) and found four divergences: (1) empty <value/> or bare <value class="variable"/> was accepted but the backend treats present-but-empty as missing; (2) disabled steps were skipped but the backend checks them (a missing required arg is load-blocking regardless); (3) the legacy If/DoWhile condition-in-title format was not honored, risking false positives; (4) aggregating with count=N diverged the weighted-deduction score from the Lambda, which returns one violation per rule. Fix: add argumentHasMeaningfulValue mirroring the backend content checks (variable needs path/text, funcCall and gt/lt/eq/ne/ge/le/and/or/not operator classes count, compound needs parts, simple needs text); stop skipping disabled steps; honor the condition-in-title legacy exception for If/DoWhile; emit a single violation per rule with no count so the score stays in parity (message still names every offender); use exact apiId match. Extract callSatisfiesRequiredArg/stepLabel helpers to keep complexity under 20. Updates tests (12 cases) and docs/mcp.md to the backend-faithful semantics.
RCA: Copilot review flagged a nested ternary in validateUiActionNestingStructure where both non-UiWithRow branches return 'UiWithScreen', so the inner verdict.includes('UiWithScreen') test is dead and the expression is harder to read. Prettier reformatting in this branch surfaced the pre-existing line in the diff.
Fix: collapse to verdict.includes('UiWithRow') ? 'UiWithRow' : 'UiWithScreen' — behavior-preserving (the eliminated branch returned the same value) and clearer. No test impact; the nesting message tests still pass.
…in-argument PDX-508: enforce mustContainArgument best-practice rules in the local validator
…lidator RCA: Eight load-blocking check types (valueClassCasing, booleanCasing, invalidValueClass, dateValueClassFormat, the three apexConnect* checks, and setValuesInvalidElements) were referenced by rules but unregistered, so the local validator silently skipped them and diverged from the Quality Hub score. Fix: Add backend-faithful validators for all eight (whole-tree casing scans, per-apiCall value-class checks with nested double-counting, ApexConnect arg whitelist + connectionId valueClass, SetValues invalid-element detection), register them, and add 22 parity tests plus docs. One violation per rule, no count inflation, preserving weighted-deduction score parity with the Lambda.
…ing-checks PDX-508: Port render/load-blocking check types to local validator
…MCP resource RCA: The canonical Provar test-step JSON Schema (the structured contract the local validator's API-ID and value-class checks derive from) lived only in the Quality Hub back-end, so MCP clients had no machine-readable source of truth for test-step XML structure — only the prose step-reference doc. Fix: Vendor provar_test_step_schema.json into src/mcp/rules/ (auto-bundled to lib/mcp/rules/ by the existing compile copy step) and register a new MCP resource provar://schema/test-step that serves it, with resolveRulesDir + readTestStepSchema helpers (graceful schema_not_found fallback). Adds unit tests and docs.
… on read RCA: Copilot review flagged two issues — (1) describing the vendored document as a "JSON Schema (draft-07)" is misleading since it is domain-keyed (testCase/apiCalls), not a constraint schema, so clients might run it through a JSON-Schema validator; (2) readTestStepSchema served the file verbatim without verifying it parses, so a corrupted bundle could advertise application/json yet return invalid JSON. Fix: Reword the resource description and docs to call it a Provar-specific schema reference (not a constraint JSON Schema) and add a clarifying $comment to the file; parse the file once in readTestStepSchema and fall back to schema_not_found on parse failure. Adds a corrupted-file unit test.
…schema-resource PDX-508: Expose vendored Provar test-step schema as an MCP resource
… validator RCA: Seven rules existed only in the Quality Hub back-end rule set, so the local validator never evaluated them and its score could diverge from the API for tests that hit these runtime anti-patterns (variable refs stored as string literals, SetValues funcCall/zero-index string templates, DbConnect/dbConnectionName mismatch, and wrong-cased SF flow-button / record-type locators). Fix: Add the 7 rule definitions to the local rules JSON and implement backend-faithful validators (varStringLiteral as one-violation-per-value to match the back-end's list semantics; the other six as single-violation first-offender, with log-scaled count only on the two UI-locator checks). comparisonTypeEnum is intentionally excluded (ComparisonType is step-scoped). Adds 22 parity tests and docs. Preserves weighted-deduction score parity with the Lambda.
…ions RCA: Copilot review flagged two user-facing copy errors carried over verbatim from the back-end rule text: UI-LOCATOR-BUTTON-CASING-001 called 'name=cancel' "camelCase" when it is lowercase, and both new UI-locator rules spelled the product "ProVar" instead of "Provar". Fix: Correct "camelCase" to "lowercase" for the Cancel locator and normalise "ProVar" to "Provar" in both rule descriptions. Description text only — no validator logic or score change.
…ly-rules PDX-508: Port 7 back-end-only best-practice rules to local validator
RCA: The local MCP validator runs a two-layer model the Quality Hub API does not have. Critical best-practice violations (e.g. unknown apiId) only moved quality_score and never gated is_valid, so a test that will not load in Provar could still be reported as valid. The quality threshold also defaulted to 80 with no env override and collapsed sub-threshold cases to a bare "invalid". Fix: Bridge critical best-practice violations into Layer-1 issues so they gate is_valid (deduped against hand-coded rules; score parity preserved). Add a shared resolveQualityThreshold (arg -> PROVAR_MCP_QUALITY_THRESHOLD -> 90) and a tri-state status (invalid / needs_improvement / valid) with meets_quality_threshold across the testcase and hierarchy validators.
…dit severities
RCA: With the validity bridge live, critical best-practice violations now gate
is_valid. This surfaced two latent, previously score-only defects: (1) the
generator emitted valueClass="date"/"datetime" with raw ISO strings, but the
real Provar corpus (651 cases) stores date/dateTime exclusively as epoch
milliseconds with camelCase dateTime — an ISO string fails to load (RENDER-DATE-
VALUECLASS-001), and the casing rule wrongly lowercase-enforced dateTime; (2)
the varStringLiteral back-end exemption for UI-target args is wrong (a bare
{Var} there lands in the URL as %7B...%7D and hard-fails).
Fix: Suppress Layer-1-owned concepts from the bridge so the authoritative schema
rule reports them once (empty <steps> still loads). Convert ISO date/datetime to
epoch ms (UTC) with canonical dateTime casing in the generator and accept that
casing in RENDER-CASE-001. Drop the varStringLiteral UI-target exemption. Add
PDX-509 bridge/threshold/tri-state tests and a resolveQualityThreshold suite;
update docs/mcp.md. Severity audit: all implemented criticals are load-blocking
except VAR-NAMING-001 (its own text says "RUNTIME FAILURES" = major) — flagged
for review, not changed, to preserve QH score parity.
…ment RCA: Copilot flagged two issues. (1) When a value matches the ISO date/datetime shape but is not a real date (e.g. "2026-99-99"), isoToEpochMs returns null and the generator still emitted valueClass="date"/"dateTime" with the raw text — a load-breaking value. (2) The diff-dedup comment was internally inconsistent: it said the issue/BP rule_id namespaces are disjoint yet that a bridged critical shares its rule_id with the surfaced issue. Fix: Fall back to valueClass="string" when a date/datetime value cannot be converted to epoch ms, with a regression test. Reword the diff-dedup comment to state plainly that hand-coded issue rule_ids never collide with BP rule_ids, so a BP rule_id in issues[] can only be a bridged critical.
RCA: VAR-NAMING-001 was tagged critical/weight 8, but a variable name with a space is a RUNTIME failure (the test still loads and fails at execution), which the severity taxonomy classifies as major, not load-blocking critical. As a critical it was being surfaced by the PDX-509 bridge and wrongly flipping is_valid for a runtime defect. The Quality Hub backend is making the same change (major / weight 5) so score parity is preserved. Fix: Set VAR-NAMING-001 severity=major, weight=5 in the local rules JSON, and note in the description that it is a runtime defect, not a load-blocking one. It is no longer bridged, so a bad variable name no longer flips is_valid (it docks quality_score and surfaces as a major best-practice violation).
…ridge PDX-509: validity bridge, tri-state status + quality threshold
…alues RCA: fast-xml-parser parses a numeric element body (e.g. <value>123456</value>) to a JS number, not a string. Two best-practices validators (detectDuplicatesLiterals and the mustContainArgument value check) read the body as ((node['#text'] ?? '').trim()), which throws "(...).trim is not a function" on a number. The exception propagates out of validateTestCase, so the MCP tool returns VALIDATE_ERROR instead of a result — observed on ~25% of a real 60-file test-case corpus. A safe nodeText() helper that coerces with String() already exists; these two sites simply did not use it. Fix: Route both sites through nodeText() so numeric bodies coerce to string before trim. Add a regression test that a numeric <value> body does not crash runBestPractices. Verified against the corpus: 60/60 files now validate without throwing (previously ~15 crashed).
RCA: validateValueClassCasing (RENDER-CASE-001) only inspects the valueClass attribute, but its valid set carried 14 class="..." tokens (variable, compound, funcCall, value, valueList, and the gt/lt/eq/... operators) plus integer — none of which ever appear as a valueClass, so they were dead entries — while omitting id, a real corpus valueClass (46 uses). A full AllPOCProjects scan confirms exactly six valueClass values exist: string, boolean, decimal, id, date, dateTime. The QH back-end made the identical correction to keep score parity. Fix: Reduce VALUE_CLASS_CASING_VALID to those six (drops the out-of-scope class tokens and integer, adds the missing id) and keep the datetime -> dateTime canonical-casing entry. valueClass="ID"/"Id" now normalizes to id; camelCase dateTime stays canonical. Adds id coverage tests.
…optional
RCA: TC_010 required the testCase id attribute to be the literal "1" ("Provar
requires id=1"), but that belief is wrong. Test case ids are project-unique
integers numbered sequentially, and the guid — not id — is the cross-project
unique identifier. A full AllPOCProjects scan shows real, loadable test cases
with ids 0..51 and many with no id attribute at all. The strict rule therefore
fired TC_010 (an is_valid-gating ERROR) on ~28 of 29 real files — a false
positive that wrongly marked valid test cases invalid.
Fix: Make id optional (missing id is no longer an error — guid identifies the
test case) and, when present, require only a non-negative integer; a present-but
-non-numeric id (e.g. a UUID) is still rejected. Verified against the corpus:
TC_010 now fires on 0 of 123 files (previously ~28/29). Tests updated to cover
missing id, id="0", id="42", and a UUID-as-id.
fix(mcp): best-practices engine corpus-driven fixes (numeric crash + valueClass scope)
…alidator RCA: Nine structural check types referenced by existing rules were unregistered, so the local validator silently skipped them and its score could diverge from the Quality Hub API for tests with malformed SetValues/UiAssert structure, bad funcCall ids, wrong binding-parameter order, variable uiConnectionName, or unknown root attributes. Fix: Implement backend-faithful validators for validFuncCallId, rootAttributes, setValuesStructure, namedValueName, namedValueValue, uiAssertHallucinatedGeneratedParameters, uiAssertMissingArguments, bindingParameterOrder, and uiConnectionNameLiteral; register in VALIDATOR_REGISTRY. Six are single first-offender violations; validFuncCallId / the two UiAssert checks / bindingParameterOrder emit one violation with count (capped at 5 for funcCall) to preserve weighted-deduction score parity. Adds 19 parity tests and docs.
…ot load) RCA: Tier 5 ported UI-BINDING-ORDER-001 as critical/weight 10, but a wrong UI binding parameter order produces an 'Unknown control' error at RUNTIME — the test case still loads. The severity taxonomy classifies a runtime failure as major, not load-blocking critical. With the PDX-509 validity bridge live, a critical is surfaced into issues[] and flips is_valid; a corpus scan caught this firing on a real test case, wrongly marking it invalid. Fix: Set UI-BINDING-ORDER-001 severity=major, weight=5 (matching the recorded taxonomy call) so it scores/surfaces as a runtime defect without gating is_valid. Update the firing test to assert major. The Quality Hub backend should make the same change to preserve score parity.
…lid files RCA: A full AllPOCProjects corpus sweep (with the PDX-509 bridge live) showed two Tier 5 criticals flipping is_valid on real, loadable test cases. SETVALUES-VALUE -001 flagged an empty <namedValue name="value"/> — which intentionally sets the field to blank — and empty structural slots in an unused QEditor row. SETVALUES- STRUCTURE-001 flagged a data-driven SetValues that reads from Excel/CSV (declared in <parameterValueSources>) and therefore carries an empty <value class= "valueList"/> with no inline <namedValues>. Both patterns load fine, so the rules were false-positives over-gating validity. Fix: SETVALUES-VALUE-001 now treats the QEditor structural slots (valuePath, value, valueScope) as blankable and only flags a non-structural namedValue with no value. SETVALUES-STRUCTURE-001 exempts a SetValues that declares <parameterValueSources>. Re-swept all 651 corpus files: zero Tier 5 criticals fire. Added regression tests for empty value, blank row, and data-driven SetValues.
fix(mcp): TC_010 — accept any integer testCase id, make id optional
…-checks-rebased PDX-508 Tier 5: structural/load-affecting validators (bridge-verified)
…urce Req: PDX-508 Tier 6 — provide a single canonical registry of every validation rule across both layers (Layer-1 structural validity + Layer-2 best practices), and expose it to AI clients as an MCP resource, so a client can see what each rule checks, its severity/weight, and — post-PDX-509 — whether it gates is_valid. Fix: Add scripts/build-validation-rule-registry.cjs which generates docs/VALIDATION_RULE_REGISTRY.md from provar_best_practices_rules.json plus the hand-coded Layer-1 rules (23 Layer-1 + 178 Layer-2 = 201 rules), computing the "Gates is_valid?" column from the bridge model (critical bridged unless Layer-1 owns the concept; major/minor/info never gate). Register the provar://docs/validation-rules MCP resource serving the generated doc, wire the doc into the compile copy + bundle, document it in docs/mcp.md, and add a unit test that guards the registry against drift and verifies the gating column.
…-rule-registry PDX-508 Tier 6: canonical Validation Rule Registry + MCP resource
Req: provar_testcase_generate hard-coded id="1" on every generated test case, but a 651-file corpus sweep proved id is a human label not a uniqueness key (guid is) and duplicate ids occur freely within a real project, so id="1" landing in an existing project is misleading hygiene. Fix: add allocateTestCaseId(outputPath, allowedPaths) which, on the write path, finds the enclosing .testproject root (bounded by the allowed roots) and emits highest-in-use + 1, preserves an existing numeric id on overwrite, and defaults to 1 for preview/no-project; surfaced as test_case_id with docs and unit tests. Closes PDX-510. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…id-allocation PDX-510: feat(mcp): allocate testCase id from project context
…alog Req: Extract the Layer-1 structural-validity rule catalog into a shared JSON so the validator, registry generator, and drift test stop hand-duplicating the rule metadata. Fix: New provar_layer1_rules.json is the single source for Layer-1 id/severity/applies_to/description and the best-practice ownership map; the registry generator and testCaseValidate.ts both read it, and validationRuleRegistry.test.ts guards against drift. Behavior-preserving: regenerated registry byte-identical; 0 diffs across the 651-file corpus; 1466 passing; smoke 60/60. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Req: Adversarial review found the Layer-1 drift guard asserted only rule_id and severity, leaving the validator's per-issue applies_to free to drift from the catalog (and the published registry column) with CI still green. Fix: The source-scan guard now also captures and asserts each detection site's applies_to against the catalog, and its failure message names the field-order assumption so a benign push reorder is not misread as a stale catalog. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Req: Adversarial review of the PDX-510 id allocator found maxProjectTestCaseId read every .testcase under the project root without an allowed-path check, so a symlinked .testcase could pull an id from a file outside the configured --allowed-paths, breaking the containment the allocator documents. Fix: maxProjectTestCaseId now reads only real files (isFile and not a symlink) so an out-of-root symlinked .testcase is skipped, and readRootTestCaseId ignores ids beyond Number.isSafeInteger so the generator never emits a scientific-notation id like id="1e+22". Adds unit tests for both; the symlink test skips on hosts without symlink privileges. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
PDX-511: Single-source the Layer-1 validation-rule catalog (registry drift hardening)
…-path-safety PDX-510: keep testCase id allocation inside the allowed roots (review follow-up)
Req: cut the 1.6.0 release bundling PDX-501..511 (validator overhaul: structural + backend-only rules ported local, validity bridge, tri-state status, quality threshold 80->90, project-aware test_case_id, two new MCP resources) and sweep the user-facing docs so the release ships accurate. Fix: bump package.json + server.json 1.5.4 -> 1.6.0; correct docs/mcp.md tool count 38 -> 42 and document the provar://docs/tool-guide resource; drop two internal ticket refs from mcp.md prose; expand mcp-pilot-guide Scenario 2 with status / quality_threshold / validity-bridge; add the missing provar_org_describe tool to the mcp-start message; refresh README counts and replace the stale dotted-name tool list with a pointer to docs/mcp.md; add the v1.6.0 CHANGELOG entry. Gates: compile clean, 1479 unit tests, smoke 60/60, lint clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
chore(release): bump version to 1.6.0 + release doc sweep
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Release 1.6.0
Promotes
develop(now at 1.6.0) tomain. Bundles PDX-501 → PDX-511 plus the release bump (#220).Headline
criticalbest-practice violations now gateis_valid(PDX-509).status(valid/needs_improvement/invalid) +quality_thresholdinput +PROVAR_MCP_QUALITY_THRESHOLDenv; default quality bar 80 → 90 (PDX-509).provar://schema/test-step,provar://docs/validation-rules(PDX-508).comparisonTypevalidation (PDX-501/502); project-awaretest_case_idallocation (PDX-510); single-source Layer-1 rule catalog (PDX-511);PROVAR_PLUGIN_NOT_FOUND(PDX-505); UiDoAction/UiAssert generator fidelity (PDX-506/507).Version
package.json+server.jsonat 1.6.0 (synced).Publish
CHANGELOG.mdcarries the v1.6.0 entry for the release notes.