feat(frost/roast): RFC-21 Phase 3.1 -- coordinator skeleton + seed bridge#3968
Merged
mswilkison merged 2 commits intoMay 23, 2026
Conversation
…idge
Introduces the ROAST coordinator state machine surface specified in
RFC-21 Layer B, plus the sterile seed-folding adapter that bridges the
new [32]byte AttemptSeed into the legacy int64 seed accepted by the
existing SelectCoordinator helper.
* pkg/frost/roast/coordinator_state.go
- AttemptState enum (Pending / Collecting / Aggregating /
Succeeded / Transitioned) with String() for log/test
readability.
- AttemptHandle opaque per-attempt identity; ContextHash()
accessor.
- Coordinator interface: BeginAttempt, State, SelectedCoordinator.
Later Phase-3 PRs (3.2, 3.3, 3.4) extend the interface with
TransitionMessage / LocalEvidenceSnapshot types, AggregateBundle,
VerifyBundle, and the NextAttempt policy function.
- NewInMemoryCoordinator factory; concurrent-safe via sync.Mutex
and atomic next-id counter.
- ErrUnknownAttempt sentinel for handle/instance mismatch.
* pkg/frost/roast/seed_bridge.go
- foldAttemptSeed: first 8 bytes BE -> int64 reinterpretation.
Documented as non-cryptographic, deterministic adapter only.
Used by BeginAttempt to call SelectCoordinator without
modifying the legacy helper.
No production code path uses the new Coordinator yet -- consistent
with RFC-21 Phase 3's "ships unused" requirement. Phase 4 wires the
state machine into receivers behind the frost_roast_retry build tag.
Tests:
* coordinator_state_test.go (9 cases)
- handle.ContextHash matches input AttemptContext.Hash
- distinct handles across BeginAttempt calls
- empty included set rejected
- State after Begin is Collecting
- State for unknown handle returns sentinel
- SelectedCoordinator returns member from included set
- SelectedCoordinator deterministic across independent Coordinator
instances given identical context
- Attempt-number rotation: 16 attempts produce >=2 distinct
coordinators (defends ROAST's leader-rotation property)
- Concurrent BeginAttempt (16 goroutines * 50 calls each)
produces unique handles under sync.Mutex
- AttemptState.String for every defined value plus unknown sentinel
* seed_bridge_test.go (5 cases)
- determinism on identical input
- first-8-bytes-BE decode verified against a specific byte pattern
- bytes 8..31 do not influence output (contract documentation)
- 256-value sweep of the high byte produces 256 distinct int64
- golden fixture locks the wire-format reduction; literal
drift is caught at code review
All pass under: go test ./pkg/frost/roast/..., go test -race
./pkg/frost/roast/..., go test -tags 'frost_native frost_tbtc_signer'
./pkg/frost/..., staticcheck -checks '-SA1019' ./pkg/frost/roast/...,
go vet ./pkg/frost/roast/...
The map-literal column alignment in the CI's gofmt -s pass differs from my local alignment; gofmt prefers a single space between the longest key and the value. Apply the formatter's preference so client-format passes. Pure formatting change. No behaviour change.
2f1fd39
into
feat/frost-schnorr-migration-scaffold
15 checks passed
mswilkison
added a commit
that referenced
this pull request
May 23, 2026
…nceSnapshot (#3969) ## Summary Second Phase-3 implementation PR. Adds the wire types the coordinator-aggregation flow (RFC-21 Resolved Decisions) needs: - **\`LocalEvidenceSnapshot\`** — per-signer signed evidence carrying \`SenderID\`, \`AttemptContextHash\`, the canonical sorted \`OverflowEntry\` slice, and an \`OperatorSignature\` placeholder. - **\`TransitionMessage\`** — coordinator-aggregated bundle carrying the attempt context hash, the elected coordinator's member index, a \`SenderID\`-ascending \`Bundle\` of snapshots, and a \`CoordinatorSignature\` placeholder. - **\`OverflowEntry\`** — JSON-friendly canonical key/value pair for one \`attempt.Evidence.Overflows\` entry. **No transport, aggregation, or signature handling yet.** Phase 3.2 is the wire-type slice only. Phase 3.3 introduces the canonical-encoding contract those signatures cover plus the aggregate / verify routines. Stacked on #3968 (Phase 3.1). ## Why this scope The RFC's Phase 3 work was always going to mix \"new types\" with \"new behaviour over those types.\" Landing the types first means Phase 3.3 reviews can focus on the canonical-encoding / signature-verification logic without rereading the JSON shapes. Types are small, mechanical, and review-easy. ## What's in the bundle | Surface | Notes | |---|---| | \`LocalEvidenceSnapshot\` | \`SenderIDValue\`, \`AttemptContextHash\` (32 bytes exact), \`Overflows\` (sorted asc by Sender), \`OperatorSignature\` (≤ 256 bytes). | | \`TransitionMessage\` | \`AttemptContextHash\` (32 bytes exact), \`CoordinatorIDValue\`, \`Bundle\` (sorted asc by Sender, ≤ 256 entries, each binding to the bundle's hash), \`CoordinatorSignature\` (≤ 256 bytes). | | \`OverflowEntry\` | \`{Sender, Count}\`. | | Type prefix | \`frost_signing/roast/\` -- distinct from \`native_frost\` / \`native_tbtc_signer\`. | | Validation in Unmarshal | zero-sender, 32-byte hash, max-snapshots cap, max-signature caps, ordering, intra-bundle hash consistency, snapshot validity. | ## Why JSON Per RFC-21 Resolved Decision 6: matches Phase 1B's FROST/tbtc-signer protocol messages and inherits envelope-level operator-key signing from \`pkg/net\`. Raw JSON never lands on the wire; the messages implement \`net.TaggedUnmarshaler\` and ride the existing \`pkg/net/gen/pb\` envelope when sent. ## Test coverage 22 cases across two test surfaces. Highlights: - Type strings are stable and under the \`frost_signing/roast/\` prefix. - \`NewLocalEvidenceSnapshot\` sorts \`Overflows\` ascending by Sender; empty evidence omits the field via \`omitempty\`. - JSON round-trips preserve all fields. - Validation rejects every malformed shape: zero sender, wrong hash length, oversize signature, unsorted overflows, duplicate overflow sender, empty bundle, oversize bundle, zero coordinator, oversize coordinator signature, invalid contained snapshot, duplicate bundle sender, mismatched bundle-vs-snapshot hash. - Identical inputs produce byte-identical JSON (deterministic canonical form). ## Phase 3 status | PR | Scope | State | |---|---|---| | 3.1 (#3968) | Coordinator skeleton + seed bridge | open | | **3.2 (this)** | **TransitionMessage + LocalEvidenceSnapshot types** | **open** | | 3.3 | Aggregation + bundle verification | next | | 3.4 | NextAttempt policy + thresholds | after 3.3 | ## Verification | Command | Result | |---|---| | \`go build ./...\` | clean | | \`go test ./pkg/frost/roast/...\` | pass (29 cases total in this package now) | | \`go test -race ./pkg/frost/roast/...\` | pass | | \`go test -tags 'frost_native frost_tbtc_signer' ./pkg/frost/...\` | pass (5 packages) | | \`staticcheck -checks '-SA1019' ./pkg/frost/roast/...\` | silent | | \`go vet ./pkg/frost/roast/...\` | clean | | \`gofmt -l ./pkg/frost/roast/\` | silent | ## Test plan - [ ] CI green. - [ ] Reviewer confirms the size caps (\`MaxSnapshotsPerBundle = 256\`, \`MaxOperatorSignatureBytes = 256\`, \`MaxCoordinatorSignatureBytes = 256\`) are appropriate ceilings for production signing groups. - [ ] Reviewer confirms the deferred-signature contract is acceptable for Phase 3.2 -- signature computation/verification lands in Phase 3.3.
mswilkison
added a commit
that referenced
this pull request
May 23, 2026
…ion (#3970) ## Summary Third Phase-3 implementation PR. Adds the methods that drive the ROAST coordinator-aggregation flow defined in RFC-21's **Resolved Decisions** section: | Method | Role | |---|---| | \`RecordEvidence(handle, snap)\` | Accept a peer's signed \`LocalEvidenceSnapshot\`. Validates structure, verifies the operator signature, checks the snapshot's \`AttemptContextHash\` matches the handle's bound context, applies first-write-wins / equal-or-reject. | | \`AggregateBundle(handle)\` | Called by the elected coordinator. Sorts accumulated snapshots ascending by \`SenderID\`, builds the \`TransitionMessage\`, signs the canonical bundle bytes, transitions state to \`Transitioned\`. | | \`VerifyBundle(handle, msg)\` | Called by every receiver. Verifies coordinator signature, every snapshot's operator signature, and -- if the receiver has submitted its own snapshot -- presence of that snapshot in the bundle (censorship detection). | **Stacked on #3969 (Phase 3.2).** ## What's new ### \`pkg/frost/roast/signature.go\` - \`Signer\` / \`SignatureVerifier\` interfaces (Phase 4 wires them to \`pkg/net\`'s operator-key + member-keys surfaces). - \`NoOpSigner\` / \`NoOpSignatureVerifier\` for tests that don't exercise the crypto pipeline. - \`CanonicalSnapshotBytes\` -- JSON of snapshot fields *excluding* \`OperatorSignature\`. - \`CanonicalBundleBytes\` -- JSON of bundle fields *excluding* \`CoordinatorSignature\` but *including* every snapshot's \`OperatorSignature\`, so the coordinator's signature attests to the exact assembled set. - \`verifySnapshotSignature\` / \`verifyBundleSignature\` / \`verifyOwnObservationsPresent\` -- the receiver-side checks, each testable in isolation. - Sentinels: \`ErrSignatureInvalid\`, \`ErrSignatureMissing\`, \`ErrCensorshipDetected\`. ### \`pkg/frost/roast/coordinator_state.go\` (extended) - \`Coordinator\` interface gains \`RecordEvidence\`, \`AggregateBundle\`, \`VerifyBundle\`. - \`NewInMemoryCoordinatorWithSigning(selfMember, signer, verifier)\` -- production constructor (Phase 4 callers). - \`NewInMemoryCoordinator()\` preserved as a Phase-3.1-compatible convenience that uses no-op signing. - New sentinels: \`ErrNotAggregator\`, \`ErrAttemptStateInvalid\`, \`ErrAttemptContextMismatch\`, \`ErrSnapshotConflict\`. ### \`pkg/frost/roast/transition_message.go\` (touched) - \`validate()\` methods promoted to public \`Validate()\` on both \`LocalEvidenceSnapshot\` and \`TransitionMessage\` so callers that construct messages in memory can validate without a marshal/unmarshal round-trip. ## What's tested ### \`signature_test.go\` (13 cases) Signature interfaces, canonical encodings, signature verification round-trips (via a deterministic SHA-256 fake signer/verifier), tampered-payload rejection, coordinator-mismatch rejection, censorship-detection helper (missing snapshot, mutated signature, skip semantics). ### \`bundle_aggregation_test.go\` (11 cases) - \`RecordEvidence\`: nil rejection, unknown handle, context hash mismatch, bad signature, valid-and-idempotent re-submission, conflict rejection, self-submission tracking. - \`AggregateBundle\`: non-aggregator rejection, signed bundle build (size, ordering, signature, terminal state), **deterministic bundle JSON across different record orderings**. - \`VerifyBundle\`: valid acceptance, **censorship detection**, coordinator-signature forgery, snapshot-signature forgery, attempt-context mismatch, nil message, unknown attempt, concurrent record-and-aggregate safety. ### Verification | Command | Result | |---|---| | \`go build ./...\` | clean | | \`go test ./pkg/frost/roast/...\` | pass | | \`go test -race ./pkg/frost/roast/...\` | pass | | \`go test -tags 'frost_native frost_tbtc_signer' ./pkg/frost/...\` | pass (5 packages) | | \`staticcheck -checks '-SA1019' ./pkg/frost/roast/...\` | silent | | \`go vet ./pkg/frost/roast/...\` | clean | | \`gofmt -l ./pkg/frost/roast/\` | silent | ## Why the censorship-detection check is what it is A receiver that has submitted its own snapshot but is missing from the bundle has two possible explanations: (1) the elected coordinator maliciously dropped the snapshot, or (2) the bundle was assembled before the receiver's submission arrived. In either case, feeding the bundle into \`NextAttempt\` would penalise the receiver (via silence-parking), so the bundle must be rejected pending re-broadcast on the next attempt. \`ErrCensorshipDetected\` is the unambiguous signal. When the receiver has not yet submitted (selfMember == 0 or selfSubmission == nil), the check is skipped: there is no submitted snapshot whose presence to verify. ## Phase 3 status | PR | Scope | State | |---|---|---| | 3.1 (#3968) | Coordinator skeleton + seed bridge | open | | 3.2 (#3969) | TransitionMessage + LocalEvidenceSnapshot | open | | **3.3 (this)** | **Aggregation + bundle verification** | **open** | | 3.4 | NextAttempt policy + thresholds | next | ## Test plan - [ ] CI green. - [ ] Reviewer confirms the censorship-detection semantics are acceptable (specifically that we *reject* the bundle on missing self-snapshot rather than accept it and let silence-parking catch it). - [ ] Reviewer confirms the canonical-encoding contract (snapshot omits its own signature; bundle includes snapshot signatures but omits its own).
mswilkison
added a commit
that referenced
this pull request
May 23, 2026
#3971) ## Summary **Closes Phase 3 of RFC-21.** Adds the deterministic \`(AttemptContext, TransitionMessage) -> AttemptContext\` policy that makes ROAST-aware retry possible. Two honest signers fed the same previous context and the same verified bundle compute byte-identical next contexts -- the foundational invariant the coordinator-aggregation model exists to enforce. Stacked on #3970 (Phase 3.3). ## What lands ### Policy (RFC-21 Layer B) | Step | Logic | |---|---| | 1. Permanent exclusion (transport) | overflow count summed across bundle ≥ \`OverflowExclusionThreshold\` (constant = 4). | | 2. Permanent exclusion (validation) | reject events; no-op hook (reject category lands in a later phase). | | 3. Silence parking (strictly transient) | senders in prev IncludedSet not present in bundle, not now permanently excluded → moved to TransientlyParked for **one attempt only**. | | 4. Reinstatement | prev TransientlyParked members rejoin IncludedSet automatically. | | 5. Infeasibility | if next IncludedSet < threshold → \`ErrAttemptInfeasible\`. | ### AttemptContext: \`TransientlyParked\` field Parking metadata must flow between attempts because two honest coordinators computing the next attempt must agree on who is parked. Including \`TransientlyParked\` in the canonical hash binds the contract. The Phase 1A pinned-fixture test's inline reference encoder is updated in lockstep, so any future drift between the production encoder and the reference is still caught at code review. ### Why "strictly transient" This was the formal mitigation Gemini's Phase-3 design review asked for. A peer falsely labelled silent (network blip, coordinator censorship caught at \`VerifyBundle\`) is reinstated by the very next attempt without intervention. Permanent exclusion only follows from overflow or validation reject, neither of which can fire on a slow-but-honest peer. ### \`NextAttempt(handle, bundle, threshold, dkgGroupPublicKey)\` | Parameter | Why | |---|---| | \`handle\` | identifies the previous attempt (its IncludedSet + ExcludedSet + TransientlyParked) | | \`bundle\` | verified TransitionMessage (caller must call \`VerifyBundle\` first) | | \`threshold\` | FROST signing threshold \`t\` for the key group; constant across a session | | \`dkgGroupPublicKey\` | per RFC-21 Decision 2: extract from FFI signer material at attempt construction | Threshold = 0 disables the infeasibility check (test seam; never production). ## Test coverage 15 new cases in \`next_attempt_test.go\` covering: - No-evidence baseline (IncludedSet unchanged, AttemptNumber++) - Overflow at exact threshold triggers permanent exclusion - Overflow below threshold does not exclude - Silent member → \`TransientlyParked\`, not \`ExcludedSet\` - Previously parked → reinstated to \`IncludedSet\` - **Full park/reinstate cycle across N → N+1 → N+2** -- the defining test for "strictly transient" parking - Original signer set size preserved across transitions - Determinism: same inputs → same next-context hash - Infeasibility error when below threshold - Threshold = 0 disables check - Overflow summed across observers (not maxed) - Nil bundle rejected - Unknown handle returns \`ErrUnknownAttempt\` - \`OverflowExclusionThreshold\` constant matches RFC-21 spec ## Phase 3 status | PR | Scope | State | |---|---|---| | 3.1 (#3968) | Coordinator skeleton + seed bridge | open, Gemini-approved | | 3.2 (#3969) | TransitionMessage + LocalEvidenceSnapshot | open, Gemini-approved | | 3.3 (#3970) | Aggregation + bundle verification | open, Gemini-approved | | **3.4 (this)** | **NextAttempt policy + thresholds** | **open** | Phase 3 is now feature-complete. **Phase 4** wires receivers to a real coordinator instance behind the \`frost_roast_retry\` build tag. ## Verification | Command | Result | |---|---| | \`go build ./...\` | clean | | \`go test ./pkg/frost/roast/...\` | pass | | \`go test -race ./pkg/frost/roast/...\` | pass | | \`go test -tags 'frost_native frost_tbtc_signer' ./pkg/frost/...\` | pass (5 packages) | | \`staticcheck -checks '-SA1019' ./pkg/frost/roast/...\` | silent | | \`go vet ./pkg/frost/roast/...\` | clean | | \`gofmt -l ./pkg/frost/roast/\` | silent | ## Test plan - [ ] CI green. - [ ] Reviewer confirms the canonical-hash extension is acceptable (Phase 1B field is optional during Phases 1-5, so live peers running older code can still interop without consuming the hash). - [ ] Reviewer confirms the parking discipline as documented matches what RFC-21's Resolved Decisions section specifies.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
First Phase-3 implementation PR for RFC-21. Introduces the ROAST
coordinator state-machine surface (
Coordinatorinterface, in-memoryimplementation, attempt-handle identity, state enum) plus the sterile
seed-folding adapter that lets the new
[32]byteAttemptSeeddrivethe legacy
SelectCoordinatorhelper without modifying it.No production code path uses the new `Coordinator` yet. Phase 3
"ships unused" per the RFC. Phase 4 wires it into receivers behind the
`frost_roast_retry` build tag.
What lands
`pkg/frost/roast/coordinator_state.go`
`pkg/frost/roast/seed_bridge.go`
`BeginAttempt` calls `foldAttemptSeed` and forwards to the existing
`SelectCoordinator` to elect the attempt's coordinator. The legacy
helper itself is not modified -- the bridge is the only thing
between RFC-21 contexts and the legacy seed format.
Why the seed bridge
The legacy `SelectCoordinator` takes `(seed int64, attemptNumber uint)`
and is correct in isolation. RFC-21 widens `AttemptSeed` to
`[32]byte` for the canonical-hash binding. We could rewrite the
shuffle, but rewriting cryptographic-consensus logic that already
agrees across the network is the wrong trade-off; the audit and
behaviour are settled.
The bridge satisfies the resolved decision in RFC-21:
Test coverage
`coordinator_state_test.go` (9 tests)
independent `Coordinator` instances agree on the elected member
-- 16 attempts produce ≥2 distinct leaders, defending the ROAST
leader-rotation property
16 goroutines × 50 calls each, all handles unique
`seed_bridge_test.go` (5 tests)
byte pattern verified
contract: bytes 8..31 don't influence output (still bound at
the `AttemptContext.Hash()` layer)
high byte produces 256 distinct outputs (no collisions)
locks the wire-format reduction; literal drift caught at
code review
Verification
Test plan
acceptable. (Bytes 8..31 still appear in `AttemptContext.Hash()`,
so any mutation is detected at the protocol-message layer in
Phase 1B; the bridge merely reduces 256-bit input to the 64-bit
width `SelectCoordinator` needs.)
appropriate for Phase 3.1 (state surface only). Phase 3.2 will
extend with `TransitionMessage` types.
Refs RFC-21 Phase 3 (`docs/rfc/rfc-21-*`). Stacked at the integration
tip after #3967 merged.