feat(provenance): content-addressed verdict cache + report attestation #237
Merged
- canonicalJson: strict stable stringify (sorted keys; throws on undefined/function/symbol/NaN/Map/Set; ambiguity is an error, not a coercion) + contentHash (sync node:crypto sha-256). Strict/sync counterpart to pre-registration's permissive async canonicalize/hashJson.
- cachedJudge wraps any JudgeConfig generically: cache key covers artifact content, scenarioId, judge name, full rubric dimensions, and a REQUIRED judgeVersion (silent judge upgrades must never serve stale verdicts). Hit path never invokes score(); stats() exposes hits/misses. Caches judge verdicts ONLY, never agent rollouts (judging is pure; rollout caching destroys best-of-N diversity).
- VerdictCacheStore with inMemoryVerdictCache + fileVerdictCache (JSONL append + in-memory index; corrupt line throws with file:line).
- attest/verifyAttestation: reproducibility attestation for any serializable report (reportHash + modelVersions/seeds/priceTableHash/codeSha/inputsHash provenance, algorithm 'sha256/canonical-json'). Content-addressing is the substrate's layer; cryptographic signing is the consumer's.
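A minimal sketch of the strict stringify + hash idea, for readers skimming the commit message. This is not the code from this PR: the function bodies are a hypothetical re-implementation written only from the behaviour described here (sorted keys, throw on ambiguous values, `toJSON` honoured, sha-256 via `node:crypto`), and the error messages are invented.

```typescript
import { createHash } from "node:crypto";

// Hypothetical re-implementation for illustration; not the PR's source.
export function canonicalJson(value: unknown): string {
  if (value === null) return "null";
  switch (typeof value) {
    case "string":
    case "boolean":
      return JSON.stringify(value);
    case "number":
      // NaN and ±Infinity have no JSON representation: refuse instead of coercing to null.
      if (!Number.isFinite(value)) throw new TypeError(`non-finite number: ${value}`);
      return JSON.stringify(value);
    case "undefined":
    case "function":
    case "symbol":
    case "bigint":
      throw new TypeError(`cannot canonicalize ${typeof value}`);
  }
  // Only objects reach this point.
  if (value instanceof Map || value instanceof Set) {
    throw new TypeError("cannot canonicalize Map/Set");
  }
  // Honour toJSON so a Date becomes its ISO string rather than "{}".
  const toJson = (value as { toJSON?: unknown }).toJSON;
  if (typeof toJson === "function") return canonicalJson(toJson.call(value));
  if (Array.isArray(value)) {
    // Array.from visits holes as undefined, so sparse arrays throw too.
    return `[${Array.from(value as unknown[], (v) => canonicalJson(v)).join(",")}]`;
  }
  const obj = value as Record<string, unknown>;
  const body = Object.keys(obj)
    .sort()
    .map((k) => `${JSON.stringify(k)}:${canonicalJson(obj[k])}`)
    .join(",");
  return `{${body}}`;
}

// Hex sha-256 over the canonical form: equal content gives equal hashes.
export function contentHash(value: unknown): string {
  return createHash("sha256").update(canonicalJson(value), "utf8").digest("hex");
}

// Key order does not affect the output.
const sample = canonicalJson({ b: 1, a: [2, { d: 3, c: 4 }] });
// sample === '{"a":[2,{"c":4,"d":3}],"b":1}'
```

Throwing on `undefined` rather than dropping the key (as `JSON.stringify` does) is what keeps two semantically different inputs from silently sharing a hash.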
tangletools
previously approved these changes
Jun 10, 2026
tangletools
left a comment
Contributor
✅ Auto-approved PR — 9a68fe40
Blanket team auto-approval is enabled for this reviewer service.
The full PR reviewer audit still runs separately and will publish findings if it detects issues.
tangletools · auto-approval · reason: blanket_auto_approve · 2026-06-10T10:39:38Z
tangletools
approved these changes
Jun 10, 2026
tangletools
left a comment
Contributor
✅ Auto-approved PR — a0ea326e
Blanket team auto-approval is enabled for this reviewer service.
The full PR reviewer audit still runs separately and will publish findings if it detects issues.
tangletools · auto-approval · reason: blanket_auto_approve · 2026-06-10T10:55:07Z
What
Track T6 of the 6-track program: judge-verdict caching (perf) + reproducibility attestation (provenance). Two new root modules, exports appended as one contiguous block at the end of `src/index.ts` (rebase-friendly).

`src/verdict-cache.ts`

- `canonicalJson(value)`: stable stringify (recursively sorted keys). Throws on `undefined` / function / symbol / NaN / ±Infinity / bigint / Map / Set: ambiguity is an error, not a coercion. Honors `toJSON` so Dates canonicalize to ISO strings instead of `{}`. Strict/sync counterpart to `pre-registration.ts`'s permissive async `canonicalize`/`hashJson` (cross-referenced in the doc).
- `contentHash(value)`: hex sha-256 over the canonical JSON (`node:crypto`).
- `VerdictCacheStore` (sync-or-async `get`/`set`) with `inMemoryVerdictCache()` and `fileVerdictCache(path)`: JSONL append + in-memory index; a corrupt or wrong-shape line throws at load with `file:line`, never skip-and-continue.
- `cachedJudge(judge, store, { judgeVersion })`: wraps any `JudgeConfig` generically (preserves `TArtifact`/`TScenario` + `appliesTo`; single `JudgeScore` import point so the spine PR's type dedupe rebases as a one-line change). Key = `contentHash({ artifact: canonicalJson, scenarioId, judgeName, dimensions, judgeVersion })`. `judgeVersion` is REQUIRED: silent judge upgrades must never serve stale verdicts. Hit path never invokes `score()`; `stats()` exposes `{ hits, misses }`. Thrown judges are NOT cached (a cached failure would pin a transient outage forever).

LAW (stated verbatim in the module doc): cache JUDGE VERDICTS only; judging the same artifact with the same judge+rubric is pure. NEVER cache agent rollouts.
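The wrapper's hit/miss behaviour can be sketched roughly as follows. This is an illustration under assumptions, not the code in `src/verdict-cache.ts`: the `Judge`, `JudgeScore` and `VerdictStore` shapes are hypothetical simplifications, `score()` is synchronous for brevity (the PR's store is sync-or-async), and `stable()` is a small stand-in for `canonicalJson` that handles plain JSON data only.

```typescript
import { createHash } from "node:crypto";

// Hypothetical, simplified shapes; the real JudgeConfig / JudgeScore types live in the repo.
interface JudgeScore { score: number }
interface Judge<A> {
  name: string;
  dimensions: string[];
  score(artifact: A, scenarioId: string): JudgeScore; // sync here for brevity
}
interface VerdictStore {
  get(key: string): JudgeScore | undefined;
  set(key: string, verdict: JudgeScore): void;
}

// Stand-in for canonicalJson: sorted-key stringify, plain JSON data only.
function stable(v: unknown): string {
  if (Array.isArray(v)) return `[${v.map((x) => stable(x)).join(",")}]`;
  if (v !== null && typeof v === "object") {
    const o = v as Record<string, unknown>;
    return `{${Object.keys(o).sort().map((k) => `${JSON.stringify(k)}:${stable(o[k])}`).join(",")}}`;
  }
  return JSON.stringify(v);
}

function inMemoryStore(): VerdictStore {
  const verdicts = new Map<string, JudgeScore>();
  return {
    get: (key) => verdicts.get(key),
    set: (key, verdict) => { verdicts.set(key, verdict); },
  };
}

function cachedJudge<A>(judge: Judge<A>, store: VerdictStore, opts: { judgeVersion: string }) {
  let hits = 0;
  let misses = 0;
  return {
    ...judge,
    score(artifact: A, scenarioId: string): JudgeScore {
      // Everything that can change the verdict goes into the key, including judgeVersion.
      const key = createHash("sha256")
        .update(stable({
          artifact: stable(artifact),
          scenarioId,
          judgeName: judge.name,
          dimensions: judge.dimensions,
          judgeVersion: opts.judgeVersion,
        }))
        .digest("hex");
      const cached = store.get(key);
      if (cached !== undefined) {
        hits++;
        return cached; // hit path: the wrapped judge is never called
      }
      misses++;
      // If the judge throws, the error propagates and nothing is stored.
      const verdict = judge.score(artifact, scenarioId);
      store.set(key, verdict);
      return verdict;
    },
    stats: () => ({ hits, misses }),
  };
}
```

Because `judgeVersion` is part of the key, bumping it turns every lookup into a miss without touching the store, which is the property the REQUIRED option is meant to guarantee.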
`src/attestation.ts`

- `attest(report, provenance)` → `AttestedReport { reportHash, provenance, algorithm: 'sha256/canonical-json' }` for ANY serializable report (campaign results, fuzz capsules, scorecards); deliberately not coupled to any report schema. Provenance carries `modelVersions`, `seeds?`, `priceTableHash?`, `codeSha`, `inputsHash?`, and a caller-supplied `createdAt` (substrate stays clock-free).
- `verifyAttestation(report, attested)` → `{ valid, reason? }` typed outcome: names the exact mismatch (unknown algorithm / hash mismatch / non-canonicalizable report).

Tests
28 new deterministic tests (no LLM, no RNG): canonicalJson key-order stability + every ambiguity throw; cache hit/miss across artifact / judgeVersion / rubric-dimension / scenario changes; fake-judge call-count proves the hit path never invokes `score()`; file-store roundtrip across instances + corrupt-line and wrong-shape loud failures; attest→verify roundtrip + single-field tamper → `{ valid: false, reason }`.

`pnpm typecheck` / `pnpm test` (209 files, 2010 passed) / `pnpm build` all green. No version bump (release sequenced by program lead).
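For reviewers who want a feel for the attest → verify roundtrip and the single-field tamper case, here is a minimal sketch. It is not the code in `src/attestation.ts`: the field types on `Provenance`, the `canon()` helper (a compact strict stringify that omits `toJSON` handling) and the exact `reason` strings are assumptions made for illustration.

```typescript
import { createHash } from "node:crypto";

const ALGORITHM = "sha256/canonical-json";

// Field names follow the PR description; the field types are assumed.
interface Provenance {
  modelVersions: Record<string, string>;
  codeSha: string;
  createdAt: string; // caller-supplied, so this module never reads a clock
  seeds?: number[];
  priceTableHash?: string;
  inputsHash?: string;
}
interface AttestedReport { reportHash: string; provenance: Provenance; algorithm: string }

// Compact strict canonicalizer (hypothetical; no toJSON support in this sketch).
function canon(v: unknown): string {
  if (v === null || typeof v === "string" || typeof v === "boolean") return JSON.stringify(v);
  if (typeof v === "number" && Number.isFinite(v)) return JSON.stringify(v);
  if (Array.isArray(v)) return `[${Array.from(v as unknown[], (x) => canon(x)).join(",")}]`;
  if (typeof v === "object" && !(v instanceof Map) && !(v instanceof Set)) {
    const o = v as Record<string, unknown>;
    return `{${Object.keys(o).sort().map((k) => `${JSON.stringify(k)}:${canon(o[k])}`).join(",")}}`;
  }
  throw new TypeError("not canonicalizable");
}

const sha256Hex = (s: string): string => createHash("sha256").update(s, "utf8").digest("hex");

function attest(report: unknown, provenance: Provenance): AttestedReport {
  return { reportHash: sha256Hex(canon(report)), provenance, algorithm: ALGORITHM };
}

function verifyAttestation(
  report: unknown,
  attested: AttestedReport,
): { valid: boolean; reason?: string } {
  if (attested.algorithm !== ALGORITHM) {
    return { valid: false, reason: `unknown algorithm: ${attested.algorithm}` };
  }
  let actual: string;
  try {
    actual = sha256Hex(canon(report));
  } catch {
    return { valid: false, reason: "non-canonicalizable report" };
  }
  return actual === attested.reportHash
    ? { valid: true }
    : { valid: false, reason: "hash mismatch" };
}

// Usage: attest a report, then verify the original and a tampered copy.
const report = { campaign: "demo", passRate: 0.75 };
const attested = attest(report, {
  modelVersions: { judge: "judge-v1" }, // illustrative values only
  codeSha: "0000000",
  createdAt: "2026-06-10T00:00:00Z",
});
const ok = verifyAttestation(report, attested);
const tampered = verifyAttestation({ ...report, passRate: 0.99 }, attested);
```

Returning a typed `{ valid, reason }` instead of throwing lets a caller log or display why verification failed. Signing the resulting hash is left to the consumer, as the commit message notes.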