feat(fingerprint): implement encoder and tag set for fingerprint [1/3] #246

Open
dmcilvaney wants to merge 6 commits into microsoft:main from dmcilvaney:damcilva/schema_version_parts/encoder-tag-parser

@dmcilvaney dmcilvaney commented Jun 20, 2026

Summary

First of three stacked PRs introducing a new content-fingerprint substrate for
component identity. This PR adds the standalone encoding toolkit in
internal/fingerprint — a canonical-JSON projector, an RFC 8785 digest combiner,
and a version-set struct-tag parser. It is not yet wired into ComputeIdentity
(still on the legacy hashstructure path), so there is no runtime behavior
change; wiring and cutover land in parts 2/3.

Stack: [1/3] this PR · [2/3] projectV1 projection + golden vectors (#247) ·
[3/3] reset cutover, drop hashstructure (#243)

RFC: #234

What's included

  • canonical.go — canonical projector. treeBuilder projects a resolved
    config struct into a JSON-able map[string]any with a deferred, first-error-wins
    discipline. scalarToJSON pins values by underlying kind, rejecting integers
    beyond ±2^53 (not exactly representable as the IEEE 754 doubles that RFC 8785's
    number model relies on) and []byte (must be a string).
  • combine.go — digest combiner. canonicalDigest serializes the projection to
    RFC 8785 (JCS) canonical JSON via github.com/gowebpki/jcs and hashes it with SHA-256,
    yielding sha256:<hex>. JCS pins key order and number/string formatting, so digests are
    stable across runs and Go versions and reproducible across languages (the reason for the
    ±2^53 ceiling).
  • versiontag.go — version-set tag parser. Parses the fingerprint struct-tag
    grammar [key=<ident>,] [!]v<low>[..(v<high>|*)] [, ...] (optional emit-key
    override, inclusive ranges, * open-ended, ! always-emit), rejecting malformed,
    inverted, future-referencing, overlapping, and duplicate-key= tags.
  • Adds dependency github.com/gowebpki/jcs v1.0.1.
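
To make the tag grammar above concrete, here is a minimal, self-contained sketch of a parser for `[key=<ident>,] [!]v<low>[..(v<high>|*)] [, ...]`. It is not the code in versiontag.go: the names `parseTag`, `parsedTag`, `versionRange`, and `openEnded` are hypothetical, identifier validation is reduced to a non-empty check, and "future-referencing" is assumed to mean a bound above a caller-supplied current version.

```go
package main

import (
	"errors"
	"fmt"
	"math"
	"strconv"
	"strings"
)

// openEnded marks a range written as "vN..*" (hypothetical sentinel).
const openEnded = math.MaxInt

// versionRange is one inclusive version range from the tag (hypothetical type).
type versionRange struct {
	Low, High  int
	AlwaysEmit bool // set by a leading "!"
}

// parsedTag is the hypothetical result of parsing one struct tag.
type parsedTag struct {
	Key    string // optional emit-key override
	Ranges []versionRange
}

// parseVersion parses "v<N>" with N >= 1; signs and empty digits are rejected.
func parseVersion(s string) (int, error) {
	digits, ok := strings.CutPrefix(s, "v")
	if !ok {
		return 0, fmt.Errorf("version %q must start with 'v'", s)
	}
	n, err := strconv.ParseUint(digits, 10, 31)
	if err != nil || n == 0 {
		return 0, fmt.Errorf("invalid version %q", s)
	}
	return int(n), nil
}

// parseTag parses the grammar. current is the newest known version; bounds
// above it are treated as future-referencing (an assumption of this sketch).
func parseTag(tag string, current int) (parsedTag, error) {
	var out parsedTag
	for i, raw := range strings.Split(tag, ",") {
		item := strings.TrimSpace(raw)
		if key, ok := strings.CutPrefix(item, "key="); ok {
			if i != 0 || key == "" {
				return out, errors.New("key= must be first, unique, and non-empty")
			}
			out.Key = key
			continue
		}
		var r versionRange
		item, r.AlwaysEmit = strings.CutPrefix(item, "!")
		lowStr, highStr, isRange := strings.Cut(item, "..")
		low, err := parseVersion(lowStr)
		if err != nil {
			return out, err
		}
		r.Low, r.High = low, low // a bare "vN" is the single-version range [N, N]
		if isRange && highStr == "*" {
			r.High = openEnded
		} else if isRange {
			if r.High, err = parseVersion(highStr); err != nil {
				return out, err
			}
		}
		switch {
		case r.High < r.Low:
			return out, fmt.Errorf("inverted range %q", item)
		case r.Low > current || (r.High != openEnded && r.High > current):
			return out, fmt.Errorf("range %q references a future version", item)
		}
		for _, p := range out.Ranges {
			if r.Low <= p.High && p.Low <= r.High {
				return out, fmt.Errorf("range %q overlaps an earlier range", item)
			}
		}
		out.Ranges = append(out.Ranges, r)
	}
	if len(out.Ranges) == 0 {
		return out, errors.New("tag needs at least one version range")
	}
	return out, nil
}

func main() {
	tag, err := parseTag("key=name, v1..v2, !v4..*", 5)
	fmt.Println(tag.Key, len(tag.Ranges), err)
	_, err = parseTag("v3..v1", 5)
	fmt.Println(err)
}
```

Because `key=` is only accepted as the first item, the sketch rejects a duplicate `key=` with the same check that enforces its position.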

Notes

  • Package-internal and test-only for now; ComputeIdentity is unchanged. Inert until
    parts 2/3 wire it.
  • Covered by table-driven unit tests, including the ±2^53 and []byte boundaries and
    frozen known-answer digest vectors; mage check and mage unit pass.
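
As an illustration of the ±2^53 boundary those tests exercise, the following sketch shows a range check in a table-driven style. It is not the PR's scalarToJSON: `checkSafeInt`, `maxSafeInt`, and `errUnsafeInt` are hypothetical names, and treating exactly ±2^53 as accepted is an assumption based on the wording "beyond ±2^53".

```go
package main

import (
	"errors"
	"fmt"
)

// maxSafeInt is 2^53. Above it, not every integer has an exact float64 value.
const maxSafeInt int64 = 1 << 53

// errUnsafeInt is a hypothetical sentinel error for out-of-range integers.
var errUnsafeInt = errors.New("integer outside ±2^53")

// checkSafeInt rejects integers whose magnitude exceeds 2^53.
// Accepting exactly ±2^53 is an assumption of this sketch.
func checkSafeInt(v int64) error {
	if v > maxSafeInt || v < -maxSafeInt {
		return fmt.Errorf("%d: %w", v, errUnsafeInt)
	}
	return nil
}

func main() {
	cases := []struct {
		name string
		in   int64
		ok   bool
	}{
		{"zero", 0, true},
		{"at +2^53", maxSafeInt, true},
		{"above +2^53", maxSafeInt + 1, false},
		{"at -2^53", -maxSafeInt, true},
		{"below -2^53", -maxSafeInt - 1, false},
	}
	for _, c := range cases {
		err := checkSafeInt(c.in)
		fmt.Printf("%-12s accepted=%t want=%t\n", c.name, err == nil, c.ok)
	}

	// Why the ceiling exists: 2^53+1 rounds to 2^53 when stored as a float64,
	// so two distinct integers would serialize to the same JSON number.
	above := maxSafeInt + 1
	fmt.Println(float64(above) == float64(maxSafeInt))
}
```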

Copilot AI review requested due to automatic review settings June 20, 2026 02:14
@dmcilvaney dmcilvaney marked this pull request as draft June 20, 2026 02:16
@dmcilvaney dmcilvaney changed the title from "feat(fingerprint): implement encoder and tag set for fingerprint" to "feat(fingerprint): implement encoder and tag set for fingerprint [1/3]" Jun 20, 2026

Copilot AI left a comment

Pull request overview

This PR introduces the Phase 1 “projection substrate” primitives for the planned lock-file fingerprint reset: a canonical byte encoder, a version-set fingerprint tag parser, and a SHA256-based combiner step. It also adds a substantial set of planning/reporting docs (RFC + phased implementation plan + Phase 1 completion report) to document and gate the multi-PR cutover.

Changes:

  • Added canonical encoder (canonicalBuf) and a version-set tag parser (parseVersionSet) under internal/fingerprint/, with unit tests.
  • Added combineProjection to fold projection bytes + non-config inputs into a SHA256 digest (not yet wired into ComputeIdentity).
  • Added/updated RFC + plan/report documentation for Phases 1–3 of the cutover workstream.
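
The combiner idea summarized above (folding projection bytes and other inputs into one SHA-256 digest with domain separation) can be sketched as follows. This is only an illustration under stated assumptions: the real combineProjection signature is not shown in this conversation, so `combine`, `domainTag`, and the 8-byte big-endian length framing are hypothetical choices.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"encoding/hex"
	"fmt"
)

// domainTag is a hypothetical domain-separation label. Hashing it first keeps
// these digests distinct from any other SHA-256 use over the same bytes.
const domainTag = "example/fingerprint/v1"

// combine hashes the domain tag, the projection, and each extra input.
// Every part is framed with its length so that ("ab", "c") and ("a", "bc")
// cannot collide through plain concatenation.
func combine(projection []byte, extras ...[]byte) string {
	h := sha256.New()
	writeFramed := func(b []byte) {
		var n [8]byte
		binary.BigEndian.PutUint64(n[:], uint64(len(b)))
		h.Write(n[:]) // hash.Hash.Write never returns an error
		h.Write(b)
	}
	writeFramed([]byte(domainTag))
	writeFramed(projection)
	for _, e := range extras {
		writeFramed(e)
	}
	return "sha256:" + hex.EncodeToString(h.Sum(nil))
}

func main() {
	a := combine([]byte("ab"), []byte("c"))
	b := combine([]byte("a"), []byte("bc"))
	fmt.Println(a)
	fmt.Println(a != b) // framing keeps the two inputs apart
}
```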

Reviewed changes

Copilot reviewed 18 out of 19 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| report/schema-version-parts/report-phase1.md | Phase 1 completion report capturing decisions, testing evidence, and scope boundaries. |
| report/schema-version-parts/.gitkeep | Documents expected phase-report files and their purpose. |
| plan/schema-version-parts/phase3-reset-cutover.md | Phase 3 plan for the destructive cutover (switch-over + token format), including safety gates. |
| plan/schema-version-parts/phase2-projection-golden-vectors.md | Phase 2 plan for projectV1, canonicalizer, and golden vectors (freeze mechanism). |
| plan/schema-version-parts/phase1-encoder-tag-parser.md | Phase 1 plan marked completed; defines scope/acceptance for encoder + tag parser + combiner. |
| plan/schema-version-parts/overview.md | Workstream overview tying phases to the RFC and defining settled decisions/guardrails. |
| plan/schema-version-parts/handoff-prompt.md | Handoff prompt for implementing phases consistently with repo conventions and safety constraints. |
| internal/fingerprint/canonical.go | Implements the canonical <len>:<key>=<len>:<value> encoder and scalar/map/composite emit helpers. |
| internal/fingerprint/canonical_internal_test.go | Unit tests for canonical encoding semantics and default-fail behavior. |
| internal/fingerprint/versiontag.go | Implements fingerprint version-set tag parsing and emit-key resolution. |
| internal/fingerprint/versiontag_internal_test.go | Unit tests for version-set parsing, membership, and rejection cases. |
| internal/fingerprint/combine.go | Adds combineProjection to hash projection bytes + other inputs with domain separation. |
| internal/fingerprint/combine_internal_test.go | Unit tests for combineProjection determinism and input sensitivity. |
| docs/developer/schema-migration/README.md | Executive summary for the schema migration / fingerprint reset proposal. |
| docs/developer/schema-migration/problem-and-motivation.md | Plain-language background explaining the churn problem and why replay is unsound today. |
| docs/developer/schema-migration/part-1-the-reset.md | Summary of the reset (projection substrate + token format) and why it’s safe. |
| docs/developer/schema-migration/part-2-lazy-migration.md | Summary of the deferred post-reset lazy migration mechanism and its contracts. |
| docs/developer/schema-migration/delivery-plan.md | Rollout plan showing which PRs land at cutover vs. which are gated/deferred. |
| docs/developer/rfc/lazy-schema-migration.md | Full RFC specifying the design, invariants, encoding table, and phased delivery plan. |
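
For readers unfamiliar with the `<len>:<key>=<len>:<value>` layout mentioned in the canonical.go entry, here is a minimal sketch of such a length-prefixed encoding. It assumes the lengths are decimal byte counts and uses a hypothetical helper named `appendField`; the actual canonicalBuf encoder may differ in both respects.

```go
package main

import (
	"fmt"
	"strconv"
)

// appendField appends one "<len>:<key>=<len>:<value>" record to buf.
// Lengths are decimal byte counts (an assumption of this sketch), so a
// decoder never has to scan for delimiters inside the key or value.
func appendField(buf []byte, key, value string) []byte {
	buf = strconv.AppendInt(buf, int64(len(key)), 10)
	buf = append(buf, ':')
	buf = append(buf, key...)
	buf = append(buf, '=')
	buf = strconv.AppendInt(buf, int64(len(value)), 10)
	buf = append(buf, ':')
	buf = append(buf, value...)
	return buf
}

func main() {
	fmt.Println(string(appendField(nil, "name", "curl")))

	// Without length prefixes both pairs would read "a=b=c"; with them the
	// two encodings differ, so distinct inputs cannot collide.
	fmt.Println(string(appendField(nil, "a=b", "c")))
	fmt.Println(string(appendField(nil, "a", "b=c")))
}
```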

Comment thread internal/fingerprint/canonical.go Outdated
Comment thread internal/fingerprint/versiontag.go
@dmcilvaney dmcilvaney force-pushed the damcilva/schema_version_parts/encoder-tag-parser branch 2 times, most recently from 3a3544b to 38416c6 June 24, 2026 23:09
Copilot AI review requested due to automatic review settings June 24, 2026 23:09

Copilot AI left a comment

Pull request overview

Copilot reviewed 15 out of 17 changed files in this pull request and generated 5 comments.

Comment thread internal/fingerprint/versiontag.go Outdated
Comment thread internal/fingerprint/versiontag.go
Comment thread internal/fingerprint/versiontag.go
Comment thread internal/fingerprint/versiontag.go
Comment thread internal/fingerprint/versiontag.go
@dmcilvaney dmcilvaney force-pushed the damcilva/schema_version_parts/encoder-tag-parser branch from 38416c6 to fa6efa4 June 25, 2026 01:03
@dmcilvaney dmcilvaney requested a review from Copilot June 25, 2026 01:06

Copilot AI left a comment

Pull request overview

Copilot reviewed 8 out of 9 changed files in this pull request and generated 4 comments.

Comment thread internal/fingerprint/canonical.go
Comment thread internal/fingerprint/versiontag_internal_test.go
Comment thread internal/fingerprint/combine_internal_test.go
Comment thread internal/fingerprint/canonical_internal_test.go
…256 combiner

Phase 1 (PR A1) of the schema-version-parts cutover. Adds the pure projection-substrate primitives in internal/fingerprint, beside the existing hashstructure path: the canonicalBuf length-prefixed encoder with the split omit-predicate, the fingerprint version-set tag parser, and the sha256 combiner step. Nothing is wired into ComputeIdentity and hashstructure is untouched, so no lock byte or scenario snapshot changes. Includes in-package unit tests and the phase 1 report; updates plan status.
@dmcilvaney dmcilvaney force-pushed the damcilva/schema_version_parts/encoder-tag-parser branch from fa6efa4 to 24d6ef3 June 25, 2026 15:33
Copilot AI review requested due to automatic review settings June 25, 2026 15:53

Copilot AI left a comment

Pull request overview

Copilot reviewed 8 out of 9 changed files in this pull request and generated 2 comments.

Comment thread internal/fingerprint/canonical.go Outdated
Comment thread internal/fingerprint/canonical.go
Copilot AI review requested due to automatic review settings June 25, 2026 17:34

Copilot AI left a comment

Pull request overview

Copilot reviewed 9 out of 10 changed files in this pull request and generated 2 comments.

Comment thread internal/fingerprint/canonical.go
Comment thread internal/projectconfig/fingerprint_test.go
@dmcilvaney dmcilvaney marked this pull request as ready for review June 25, 2026 18:52
Copilot AI review requested due to automatic review settings June 25, 2026 18:52

Copilot AI left a comment

Pull request overview

Copilot reviewed 9 out of 10 changed files in this pull request and generated 1 comment.

Comment thread internal/fingerprint/canonical.go