feat(fingerprint): Add handwritten v1 hashing algorithm [2/3]#247
Draft
dmcilvaney wants to merge 7 commits into
Draft
feat(fingerprint): Add handwritten v1 hashing algorithm [2/3]#247dmcilvaney wants to merge 7 commits into
dmcilvaney wants to merge 7 commits into
Conversation
Contributor
There was a problem hiding this comment.
Pull request overview
This PR advances the “schema-version-parts” workstream by introducing the v1 projection-based fingerprinting substrate (encoder + tag parser + v1 projector + golden vectors) alongside the existing hashstructure path, and adds accompanying documentation/plans/reports for the cutover phases.
Changes:
- Add
internal/fingerprint/projection primitives (canonical encoder, version-set tag parser, projection combiner) plus a hand-writtenprojectV1and golden-vector freeze tests. - Add
fingerprint:"v1..*"/"-"tags across fingerprintedinternal/projectconfig/structs and tightenTestAllFingerprintedFieldsHaveDecisionto require explicit decisions validated viafingerprint.ValidateFieldTag. - Add schema-migration docs plus workstream plan/report artifacts for phases 1–3.
Reviewed changes
Copilot reviewed 32 out of 33 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| report/schema-version-parts/report-phase2.md | Phase 2 completion report for the v1 projector + golden vectors work. |
| report/schema-version-parts/report-phase1.md | Phase 1 completion report for encoder/tag-parser/combiner primitives. |
| report/schema-version-parts/.gitkeep | Documents expected phase report outputs in this directory. |
| plan/schema-version-parts/phase3-reset-cutover.md | Planned Phase 3 cutover steps (switch-over + token format). |
| plan/schema-version-parts/phase2-projection-golden-vectors.md | Phase 2 implementation plan and acceptance criteria. |
| plan/schema-version-parts/phase1-encoder-tag-parser.md | Phase 1 implementation plan and acceptance criteria. |
| plan/schema-version-parts/overview.md | Workstream overview tying phases together and describing validation. |
| plan/schema-version-parts/handoff-prompt.md | Session handoff prompt for implementing phases in order. |
| internal/projectconfig/specsource.go | Adds fingerprint version-set tags to spec source fields. |
| internal/projectconfig/render.go | Adds fingerprint tag to render config field. |
| internal/projectconfig/package.go | Adds hazard comment for a fingerprint:"-" pruned subtree. |
| internal/projectconfig/overlay.go | Adds fingerprint tags to overlay fields (and keeps description excluded). |
| internal/projectconfig/fingerprint_test.go | Enforces mandatory fingerprint tag decisions via ValidateFieldTag. |
| internal/projectconfig/distro.go | Adds fingerprint tags to distro reference fields. |
| internal/projectconfig/component.go | Adds fingerprint tags + hazard comments; keeps Packages measured. |
| internal/projectconfig/build.go | Adds fingerprint tags to build config leaves; hazard comments for pruned composites. |
| internal/fingerprint/versiontag.go | Implements parsing/validation for fingerprint version-set tags. |
| internal/fingerprint/versiontag_internal_test.go | Unit tests for version-set tag parsing and emit-key resolution. |
| internal/fingerprint/testdata/golden_v1.json | Frozen v1 projection digest table (append-only contract). |
| internal/fingerprint/project.go | Hand-written projectV1 projector + ValidateFieldTag and helpers. |
| internal/fingerprint/project_internal_test.go | Projector tests, emission probe, and placeholder/completeness gates. |
| internal/fingerprint/golden_internal_test.go | Golden vector management contract + append-only enforcement test. |
| internal/fingerprint/combine.go | sha256 folding of projection bytes + non-config inputs. |
| internal/fingerprint/combine_internal_test.go | Unit tests for combiner determinism and sensitivity. |
| internal/fingerprint/canonical.go | Canonical <len>:<key>=<len>:<value> encoder and omit/encode rules. |
| internal/fingerprint/canonical_internal_test.go | Unit tests for canonical encoding, omit rules, and failure cases. |
| docs/developer/schema-migration/README.md | Executive summary of the RFC and reset/lazy-migration split. |
| docs/developer/schema-migration/problem-and-motivation.md | Plain-language problem statement and “substrate problem” rationale. |
| docs/developer/schema-migration/part-2-lazy-migration.md | Summary of deferred/provisional Part 2 lazy migration approach. |
| docs/developer/schema-migration/part-1-the-reset.md | Summary of the reset’s goals, token format, and constraints. |
| docs/developer/schema-migration/delivery-plan.md | PR sequencing and cutover bundling plan. |
| .github/instructions/go.instructions.md | Updates contributor guidance for fingerprint tags and projectV1 workflow. |
Comment on lines
+27
to
+29
| // - Scalar leaves (string, bool, sized ints, and slices of those) use plain | ||
| // [reflect.Value.IsZero]: emit omits a zero, emitAlways emits it anyway (the | ||
| // '!' always-emit case for a build-meaningful zero). |
Comment on lines
+60
to
+63
| order. **Empty map / nil-slice handling.** `emit` applies plain `IsZero`, so a | ||
| nil scalar slice is omitted while a non-nil empty slice emits an empty value; | ||
| the nil-vs-`[]` collapse is `canonicalizeForFingerprint`'s job at the hash | ||
| boundary (Phase 2/3), not the primitive's. |
Comment on lines
+18
to
+21
| 1. A scalar-slice **canonicalizer** (`canonicalizeForFingerprint`) - a generic | ||
| reflective normalizer that collapses every nil-or-empty scalar slice to nil at | ||
| the hash boundary, with no hand-maintained field inventory. Its canonical-form | ||
| test was written **first**, before any golden vector (the ordering gate). |
Comment on lines
+36
to
+39
| - `project.go` - `currentContentVersion = 1`; `projectV1` + nested sub-projectors; | ||
| the `proj` accumulator (defers the first emit error, bufio-style); | ||
| `canonicalizeForFingerprint` / `canonicalizeInto` / `canonicalizeSlice`; | ||
| `ValidateFieldTag` (mandatory-decision + composite-`!` gate) and `tomlKeyOf`. |
Comment on lines
+106
to
+107
| | `internal/fingerprint/` | New canonical encoder, tag parser, `projectV1`, `canonicalizeForFingerprint`, sha256 combiner; `ComputeIdentity` switch-over; remove `hashstructure`. (`fingerprint.go`) | | ||
| | `internal/projectconfig/` | `fingerprint:"vN..*"` tags on `ComponentConfig` and nested structs (`component.go`, `build.go`); extend `TestAllFingerprintedFieldsHaveDecision` (`fingerprint_test.go`); `Packages` mandatory-tag correction; resolver scalar-slice canonicalization (`component.go`). | |
Comment on lines
+64
to
+67
| Replace the `hashstructure.Hash(component, ...)` config-hash step with | ||
| `canonicalizeForFingerprint(cfg)` immediately followed by `projectV1` + `sha256`, invoked | ||
| **inside the hash boundary** so every path into the hasher is canonicalized regardless of how | ||
| the caller obtained the config. Retire the `uint64` `ConfigHash` field on `ComponentInputs`; |
6f44065 to
4499493
Compare
4499493 to
7ad6237
Compare
Comment on lines
+37
to
+43
| func (b *treeBuilder) set(key string, value any) { | ||
| if b.out == nil { | ||
| b.out = map[string]any{} | ||
| } | ||
|
|
||
| b.out[key] = value | ||
| } |
7ad6237 to
b814e33
Compare
…256 combiner Phase 1 (PR A1) of the schema-version-parts cutover. Adds the pure projection-substrate primitives in internal/fingerprint, beside the existing hashstructure path: the canonicalBuf length-prefixed encoder with the split omit-predicate, the fingerprint version-set tag parser, and the sha256 combiner step. Nothing is wired into ComputeIdentity and hashstructure is untouched, so no lock byte or scenario snapshot changes. Includes in-package unit tests and the phase 1 report; updates plan status.
…er, sha256 combiner
b814e33 to
272e74c
Compare
Contributor
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 22 out of 23 changed files in this pull request and generated 1 comment.
Comments suppressed due to low confidence (1)
internal/projectconfig/fingerprint_test.go:35
fingerprintedStructsis duplicated here even thoughfingerprint.FingerprintedStructTypes()is intended to be the single source of truth (seeinternal/fingerprint/project.go). Using the exported list avoids the two reflective walks drifting if a struct is added/removed in only one place.
// All struct types whose fields participate in component fingerprinting.
// When adding a new struct that feeds into the fingerprint, add it here.
fingerprintedStructs := []reflect.Type{
reflect.TypeFor[projectconfig.ComponentConfig](),
reflect.TypeFor[projectconfig.ComponentBuildConfig](),
| ## Component Fingerprinting: version-set tags + `projectV1` | ||
|
|
||
| Structs in `internal/projectconfig/` are hashed by `hashstructure.Hash()` to detect component changes. Fields **included by default** (safe: false positive > false negative). | ||
| Structs in `internal/projectconfig/` feed the component fingerprint, and **every field must carry an explicit `fingerprint` tag** - an absent tag fails `TestAllFingerprintedFieldsHaveDecision`. (The live hash is still `hashstructure.Hash()`; the hand-written `projectV1` in `internal/fingerprint/project.go` is wired in parallel and replaces it at the reset.) |
…er, sha256 combiner
272e74c to
973f011
Compare
…er, sha256 combiner
973f011 to
3599227
Compare
Comment on lines
+105
to
+107
| When adding a field to a fingerprinted struct, ask: **"Does changing this field change the build output?"** | ||
| - **No** (human docs, scheduling hints, CI policy, metadata, back-references) → tag it `fingerprint:"-"` and register the exclusion in `expectedExclusions` in `internal/projectconfig/fingerprint_test.go`. | ||
| - **Yes** (build flags, spec source, defines, etc.) → tag it `fingerprint:"v1..*"` (omit-if-zero) and wire it into `projectV1`: add its `emit`/sub-projector line, set it in `emissionProbeConfig`, then append a `<toml-key>-set` golden vector with `go test ./internal/fingerprint -run TestGoldenVectors -update-golden`. The golden table is append-only - never edit an existing digest. The full contract lives in `goldenConfigs`'s doc comment. |
Comment on lines
+19
to
+24
| // FingerprintedStructTypes returns every struct type whose fields participate in | ||
| // the component fingerprint. It is the single source of truth that both the | ||
| // mandatory-tag decision test (`projectconfig.TestAllFingerprintedFieldsHaveDecision`) | ||
| // and the emission probe (`measuredLeafEmitKeys`) walk - keeping it in one | ||
| // exported place is what stops those two walks from drifting silently (a struct | ||
| // dropped from one but not the other would weaken coverage with no failure). |
…er, sha256 combiner
…er, sha256 combiner
Add the v1 projection layer beside the existing hashstructure path, additive and not yet wired into ComputeIdentity: - projectV1 + frozen nested sub-projectors, emitting measured fields by literal Go path under their frozen TOML emit-key - canonicalizeForFingerprint: reflective nil-or-empty scalar-slice normalizer at the hash boundary, pruning at fingerprint:"-" edges (cycle-safe, no field inventory) - fingerprint:"v1..*" tags on every measured field across the 10 fingerprinted structs; hazard comments on each '-'-pruned composite; Packages kept measured - golden vectors with an append-only -update-golden guard; emission probe; composite-'!' placeholder gate; mandatory-tag decision test - frozen maximalConfig vs growing emissionProbeConfig split + documented additive-field workflow and naming convention No lock byte or scenario snapshot moves; hashstructure untouched.
3599227 to
7ffefaa
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
No description provided.