feat(fingerprint): implement encoder and tag set for fingerprint [1/3] #246
Open
dmcilvaney wants to merge 6 commits into
Pull request overview
This PR introduces the Phase 1 “projection substrate” primitives for the planned lock-file fingerprint reset: a canonical byte encoder, a version-set fingerprint tag parser, and a SHA256-based combiner step. It also adds a substantial set of planning/reporting docs (RFC + phased implementation plan + Phase 1 completion report) to document and gate the multi-PR cutover.
Changes:
- Added canonical encoder (`canonicalBuf`) and a version-set tag parser (`parseVersionSet`) under `internal/fingerprint/`, with unit tests.
- Added `combineProjection` to fold projection bytes + non-config inputs into a SHA256 digest (not yet wired into `ComputeIdentity`).
- Added/updated RFC + plan/report documentation for Phases 1–3 of the cutover workstream.
Reviewed changes
Copilot reviewed 18 out of 19 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| report/schema-version-parts/report-phase1.md | Phase 1 completion report capturing decisions, testing evidence, and scope boundaries. |
| report/schema-version-parts/.gitkeep | Documents expected phase-report files and their purpose. |
| plan/schema-version-parts/phase3-reset-cutover.md | Phase 3 plan for the destructive cutover (switch-over + token format), including safety gates. |
| plan/schema-version-parts/phase2-projection-golden-vectors.md | Phase 2 plan for projectV1, canonicalizer, and golden vectors (freeze mechanism). |
| plan/schema-version-parts/phase1-encoder-tag-parser.md | Phase 1 plan marked completed; defines scope/acceptance for encoder + tag parser + combiner. |
| plan/schema-version-parts/overview.md | Workstream overview tying phases to the RFC and defining settled decisions/guardrails. |
| plan/schema-version-parts/handoff-prompt.md | Handoff prompt for implementing phases consistently with repo conventions and safety constraints. |
| internal/fingerprint/canonical.go | Implements the canonical `<len>:<key>=<len>:<value>` encoder and scalar/map/composite emit helpers. |
| internal/fingerprint/canonical_internal_test.go | Unit tests for canonical encoding semantics and default-fail behavior. |
| internal/fingerprint/versiontag.go | Implements fingerprint version-set tag parsing and emit-key resolution. |
| internal/fingerprint/versiontag_internal_test.go | Unit tests for version-set parsing, membership, and rejection cases. |
| internal/fingerprint/combine.go | Adds combineProjection to hash projection bytes + other inputs with domain separation. |
| internal/fingerprint/combine_internal_test.go | Unit tests for combineProjection determinism and input sensitivity. |
| docs/developer/schema-migration/README.md | Executive summary for the schema migration / fingerprint reset proposal. |
| docs/developer/schema-migration/problem-and-motivation.md | Plain-language background explaining the churn problem and why replay is unsound today. |
| docs/developer/schema-migration/part-1-the-reset.md | Summary of the reset (projection substrate + token format) and why it’s safe. |
| docs/developer/schema-migration/part-2-lazy-migration.md | Summary of the deferred post-reset lazy migration mechanism and its contracts. |
| docs/developer/schema-migration/delivery-plan.md | Rollout plan showing which PRs land at cutover vs. which are gated/deferred. |
| docs/developer/rfc/lazy-schema-migration.md | Full RFC specifying the design, invariants, encoding table, and phased delivery plan. |
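To illustrate the `<len>:<key>=<len>:<value>` shape named in the table, here is a minimal sketch of a length-prefixed encoder. The names `encodePair` and `encodeMap` are hypothetical, and the details (byte lengths, sorted keys, no separator between entries) are assumptions for illustration; the PR's `canonicalBuf` may differ.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// encodePair writes one entry as <len>:<key>=<len>:<value>, where each
// length is the byte length of the string that follows it.
// Hypothetical helper, not the PR's canonicalBuf.
func encodePair(sb *strings.Builder, key, value string) {
	sb.WriteString(strconv.Itoa(len(key)))
	sb.WriteByte(':')
	sb.WriteString(key)
	sb.WriteByte('=')
	sb.WriteString(strconv.Itoa(len(value)))
	sb.WriteByte(':')
	sb.WriteString(value)
}

// encodeMap emits entries in sorted key order so the output does not
// depend on Go's randomized map iteration order.
func encodeMap(m map[string]string) string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	var sb strings.Builder
	for _, k := range keys {
		encodePair(&sb, k, m[k])
	}
	return sb.String()
}

func main() {
	fmt.Println(encodeMap(map[string]string{"name": "foo", "arch": "x86_64"}))

	// Without length prefixes, both inputs below would flatten to "a=b=c".
	// With them, the two encodings stay distinct.
	fmt.Println(encodeMap(map[string]string{"a": "b=c"}))
	fmt.Println(encodeMap(map[string]string{"a=b": "c"}))
}
```

The length prefix is what makes the encoding unambiguous: a decoder never has to guess whether a `=` belongs to a key, a value, or the separator.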
3a3544b to 38416c6
38416c6 to fa6efa4
…256 combiner

Phase 1 (PR A1) of the schema-version-parts cutover. Adds the pure projection-substrate primitives in internal/fingerprint, beside the existing hashstructure path: the canonicalBuf length-prefixed encoder with the split omit-predicate, the fingerprint version-set tag parser, and the sha256 combiner step. Nothing is wired into ComputeIdentity and hashstructure is untouched, so no lock byte or scenario snapshot changes. Includes in-package unit tests and the phase 1 report; updates plan status.
…er, sha256 combiner
fa6efa4 to 24d6ef3
Summary
First of three stacked PRs introducing a new content-fingerprint substrate for component identity. This PR adds the standalone encoding toolkit in `internal/fingerprint`: a canonical-JSON projector, an RFC 8785 digest combiner, and a version-set struct-tag parser. It is not yet wired into `ComputeIdentity` (still on the legacy `hashstructure` path), so there is no runtime behavior change; wiring and cutover land in parts 2/3.

Stack: [1/3] this PR · [2/3] `projectV1` projection + golden vectors (#247) · [3/3] reset cutover, drop `hashstructure` (#243)

RFC: #234
What's included
- `canonical.go`: canonical projector. `treeBuilder` projects a resolved config struct into a JSON-able `map[string]any` with a deferred, first-error-wins discipline. `scalarToJSON` pins values by underlying kind, rejecting integers beyond ±2^53 (can't survive RFC 8785's number model) and `[]byte` (must be a string).
- `combine.go`: digest combiner. `canonicalDigest` serializes the projection to RFC 8785 (JCS) canonical JSON via `github.com/gowebpki/jcs` and sha256s it → `sha256:<hex>`. JCS pins key order and number/string formatting, so digests are stable across runs/Go versions and reproducible cross-language (the reason for the ±2^53 ceiling).
- `versiontag.go`: version-set tag parser. Parses the `fingerprint` struct-tag grammar `[key=<ident>,] [!]v<low>[..(v<high>|*)] [, ...]` (optional emit-key override, inclusive ranges, `*` open-ended, `!` always-emit), rejecting malformed, inverted, future-referencing, overlapping, and duplicate-`key=` tags.
- `github.com/gowebpki/jcs v1.0.1`.

Notes
- `ComputeIdentity` is unchanged. Inert until parts 2/3 wire it.
- `[]byte` boundaries and frozen known-answer digest vectors; `mage check` + `mage unit` pass.
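The tag grammar `[key=<ident>,] [!]v<low>[..(v<high>|*)] [, ...]` can be sketched as a small parser. This is a minimal illustration, not the code in `versiontag.go`: `parseTag`, `versionSet`, `versionRange`, and the `current` parameter are hypothetical names. Treating "future-referencing" as "names a version above the current one", and recording `!` as a plain flag, are both assumptions.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// versionRange is inclusive; open means "vN..*" (no upper bound).
type versionRange struct {
	low, high  int
	open       bool
	alwaysEmit bool // set by a leading "!"
}

type versionSet struct {
	key    string // emit-key override; empty if absent
	ranges []versionRange
}

func (vs versionSet) contains(v int) bool {
	for _, r := range vs.ranges {
		if v >= r.low && (r.open || v <= r.high) {
			return true
		}
	}
	return false
}

func parseVersion(s string) (int, error) {
	if !strings.HasPrefix(s, "v") {
		return 0, fmt.Errorf("version %q must start with v", s)
	}
	n, err := strconv.ParseUint(s[1:], 10, 31)
	if err != nil {
		return 0, fmt.Errorf("bad version %q", s)
	}
	return int(n), nil
}

func parseRange(s string, current int) (versionRange, error) {
	var r versionRange
	if rest, ok := strings.CutPrefix(s, "!"); ok {
		r.alwaysEmit, s = true, rest
	}
	lowStr, highStr, hasHigh := strings.Cut(s, "..")
	low, err := parseVersion(lowStr)
	if err != nil {
		return r, err
	}
	r.low, r.high = low, low
	switch {
	case !hasHigh: // single version, high == low
	case highStr == "*":
		r.open = true
	default:
		if r.high, err = parseVersion(highStr); err != nil {
			return r, err
		}
	}
	if r.high < r.low {
		return r, fmt.Errorf("inverted range %q", s)
	}
	// ASSUMPTION: "future-referencing" means naming a version above current.
	if r.low > current || (!r.open && r.high > current) {
		return r, fmt.Errorf("range %q references a version after v%d", s, current)
	}
	return r, nil
}

func overlaps(a, b versionRange) bool {
	aEndsBefore := !a.open && a.high < b.low
	bEndsBefore := !b.open && b.high < a.low
	return !aEndsBefore && !bEndsBefore
}

// parseTag parses a tag value such as "key=name, v1..v2, !v4..*".
func parseTag(tag string, current int) (versionSet, error) {
	var vs versionSet
	for i, part := range strings.Split(tag, ",") {
		part = strings.TrimSpace(part)
		if rest, ok := strings.CutPrefix(part, "key="); ok {
			if i != 0 || rest == "" {
				return vs, errors.New("key= must appear once, first, and be non-empty")
			}
			vs.key = rest
			continue
		}
		r, err := parseRange(part, current)
		if err != nil {
			return vs, err
		}
		for _, prev := range vs.ranges {
			if overlaps(prev, r) {
				return vs, fmt.Errorf("range %q overlaps an earlier range", part)
			}
		}
		vs.ranges = append(vs.ranges, r)
	}
	if len(vs.ranges) == 0 {
		return vs, errors.New("tag has no version range")
	}
	return vs, nil
}

func main() {
	vs, err := parseTag("key=name, v1..v2, !v4..*", 5)
	fmt.Println(vs.key, len(vs.ranges), vs.contains(3), err)

	_, err = parseTag("v3..v1", 5)
	fmt.Println(err)
}
```

Rejecting overlapping ranges at parse time means membership checks never have to decide which of two matching ranges wins.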