feat(tbtc/signer): Phase 7.2b-1 InteractiveAggregate completion marker#4055
Merged
Conversation
Adds a durable per-attempt "aggregated" marker so re-aggregating a completed interactive attempt is rejected rather than recomputed (frozen Phase 7 spec; Phase 7.2b design section 6). Engine-side only: no blame and no envelopes, preserving the crypto-only engine boundary (the Q1 correction) - all envelope verification and authoritative blame stay Go-side. - New EngineError::InteractiveAttemptAlreadyAggregated (code interactive_attempt_already_aggregated, recovery_class recoverable). - aggregated_interactive_attempt_markers: HashSet<String> on SessionState, mirrored as Vec<String> on PersistedSessionState with serde(default) for backward-compat with pre-7.2b state, bounded via the existing consumed-registry helpers and wired through both persistence conversions. - interactive_aggregate pre-checks the marker before recomputing, keeps the aggregation lock-free, then re-acquires the lock, re-checks (a concurrent-duplicate race guard), inserts the marker, and persists with rollback-on-failure - mirroring the Round2 consume-before-release pattern - before reporting success. The design's section 9 durable-wallet-pubkey-package-retention question (needed by the 7.2b-3 verify-share FFI) is confirmed already satisfied: the DKG public key package persists on the session and survives the interactive-attempt TTL sweep, so no new persistence is added here. Tests: repeat-aggregate rejected; completion marker survives restart+reload; error code/recovery/message pinned. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Plus Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Addresses Codex's review of #4055: the completion marker made a successfully-computed aggregate unrecoverable from the engine if the host lost the FFI response after the marker persisted - a liveness/idempotency regression and a restart-divergence. Per section 6's own 'deterministic over public data, not security-load-bearing' rationale, rejecting a retry bought nothing, so the engine now re-emits instead of erroring. The per-attempt completion marker becomes a completion record: aggregated_interactive_attempt_signatures maps attempt_id -> the public aggregate signature hex (BTreeMap on SessionState + PersistedSessionState, serde(default) backward-compat, bounded as before via a shared length-based capacity helper). A repeat InteractiveAggregate returns the stored signature; a concurrent duplicate that raced past the pre-check returns the identical signature without double-storing; success telemetry counts only a freshly recorded aggregation. The persisted signature is public (it goes on-chain), so this respects the never-persist-secrets freeze. EngineError::InteractiveAttemptAlreadyAggregated is removed (re-aggregation no longer errors). Tests updated: a repeat aggregate (immediate and across restart+reload) returns the same signature. fmt + clippy --all-targets --all-features -D + full suite (273) + Phase-5 chaos suite green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
mswilkison
added a commit
that referenced
this pull request
Jun 13, 2026
Match the 7.2b-1 implementation change (PR #4055): the InteractiveAggregate completion marker becomes a completion record (attempt_id -> aggregate signature hex), and a repeat aggregate returns the stored signature (idempotent) rather than an 'already aggregated' error. Addresses a Codex review finding that the reject-based marker made a successfully-computed signature unrecoverable from the engine if the host lost the FFI response. §11 acceptance updated accordingly. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…/Gemini P2) Both bots' re-review converged on a race in the idempotency follow-up: when two InteractiveAggregate calls for the same attempt race past the empty pre-check, the loser found the record already present, skipped the store, but still returned the signature IT recomputed. Different responsive subsets can produce different VALID signatures for the same attempt, so the value a caller received could diverge from the persisted one that restarts and later re-emissions return - breaking the idempotent completion-record contract. The post-lock path now returns the canonical PERSISTED signature when the attempt is already recorded (early return, no re-store, no success count), matching the pre-check; dropped the recorded_new flag. Strengthened the idempotency test to overwrite the stored record with a sentinel and assert the repeat returns exactly that (proving return-of-recorded-value, which a plain equality check could not distinguish from a deterministic recompute). fmt + clippy --all-targets --all-features -D + full suite (273) + Phase-5 chaos green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…it (Codex P2) Codex's 3rd-round re-review: the completion record is keyed by attempt_id, which does NOT bind taproot_merkle_root, and the coordinator is in the threat model. A reused attempt_id carrying a different root/message would re-emit the stored signature, which would not verify under the caller's requested tweaked key. Re-emission (both the pre-check and the post-race collision path) now returns the recorded signature only when it verifies under THIS request's tweaked group key and message; a reused attempt_id with mismatched aggregate inputs is rejected with a validation error rather than handed a non-verifying signature. Added recorded_aggregate_matches_request (decode + tweak + verify, mirroring the aggregate self-verify). Test rework: dropped the garbage sentinel (now correctly rejected by the verify), kept the same-inputs idempotent + restart re-emission tests, added a mismatched-root rejection assertion. fmt + clippy -D + full suite (273) + Phase-5 chaos green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
mswilkison
added a commit
that referenced
this pull request
Jun 13, 2026
Match the 7.2b-1 fix (PR #4055): re-emission of the completion record is validated against the request's tweaked key/message - attempt_id does not bind the taproot root, so a reused attempt_id with different aggregate inputs is rejected rather than handed a non-verifying signature (Codex re-review finding). Drops the now-inaccurate 'or erroring' on the same-inputs path. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…dex P2) Codex 4th-round re-review: a completed aggregate persists a completion record keyed by attempt_id, and the reload path rejects an empty key, so an empty attempt_id (malformed or malicious) could write durable state that fails to reload - bricking the engine on restart. interactive_aggregate now rejects an empty attempt_id up front (before any decode/aggregate/store), matching the loader's non-empty invariant. Not idempotency-specific: the write path simply didn't enforce what the read path requires. Test: aggregate with an empty attempt_id returns a validation error and persists nothing. fmt + clippy -D + full suite (274) + Phase-5 chaos green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
mswilkison
added a commit
that referenced
this pull request
Jun 13, 2026
The open-questions doc's closing paragraph still said owner sign-off was pending and 7.2b-1 could only start afterward, contradicting the signed-off status header + Decision Log entry 9 added in the same PR. Updated to reflect the sign-off and that 7.2b-1 is implemented (PR #4055). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Per design decision after 5 review rounds on the re-emission path: the InteractiveAggregate completion record reverts from idempotent signature re-emission to a simple completion MARKER that rejects re-aggregation of a completed attempt (InteractiveAttemptAlreadyAggregated). Matches frozen spec section 6 ('mark the session complete'); a lost signature is recovered with a fresh ROAST attempt, not re-aggregate.
Eliminates the re-emission surface that accumulated findings: the concurrent-race signature divergence, the request package/root binding on re-emit, and the lost-shares recovery gap - none apply when a completed attempt is simply rejected. The empty-attempt_id guard + restart-safety rationale are kept (the marker set's loader still rejects empty keys).
- aggregated_interactive_attempt_markers: HashSet<String> on SessionState / Vec<String> on PersistedSessionState (serde default, bounded, sorted), replacing the attempt_id->signature map. - interactive_aggregate rejects a completed attempt at the pre-check and the post-aggregation re-check; no signature stored or re-emitted. - Re-added EngineError::InteractiveAttemptAlreadyAggregated.
Tests: re-aggregation rejected (immediate + across restart+reload); empty attempt_id rejected. fmt + clippy -D + full suite (275) + chaos green.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
mswilkison
added a commit
that referenced
this pull request
Jun 13, 2026
Match the implementation decision (PR #4055): section 6 reverts from idempotent signature re-emission to a completion MARKER that rejects re-aggregation; recovery of a lost signature is via a fresh ROAST attempt. Records why the re-emit variant was dropped (concurrent-race divergence, request-input binding, lost-shares recovery gap). section 11 acceptance + 'completion record'->'completion marker' terminology updated across the design + open-questions docs. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
d57237d
into
extraction/frost-signer-mirror-2026-05-26
19 checks passed
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Phase 7.2b-1: InteractiveAggregate completion marker
First implementation PR of Phase 7.2b (design signed off in #4054 — gates-doc Decision Log entry 9). Engine-side only — no blame, no envelopes — preserving the crypto-only engine boundary (the Q1 correction): all envelope verification and authoritative blame remain Go-side. Targets the mirror branch directly and is independent of #4054 (the code touches no docs; #4054 touches no source).
What it does
Adds a durable per-attempt "aggregated" marker so re-aggregating a completed interactive attempt is rejected (
InteractiveAttemptAlreadyAggregated) rather than recomputed (frozen Phase 7 spec; design §6). This is the only engine-side state 7.2b adds.Changes
errors.rs: newEngineError::InteractiveAttemptAlreadyAggregated { session_id, attempt_id }— codeinteractive_attempt_already_aggregated, recovery classrecoverable(benign duplicate; the engine is healthy).engine/state.rs:aggregated_interactive_attempt_markers: HashSet<String>onSessionState.engine/persistence.rs: mirroredVec<String>onPersistedSessionStatewith#[serde(default, skip_serializing_if = "Vec::is_empty")](loads pre-7.2b state; writes nothing when empty), bounded via the existing consumed-registry helpers, wired through bothTryFromconversions.engine/interactive.rs:interactive_aggregatepre-checks the marker before recomputing; keeps the aggregation lock-free; then re-acquires the lock, re-checks (a concurrent-duplicate race guard), inserts the marker, and persists with rollback-on-failure — mirroring the Round2 consume-before-release pattern — before reporting success.Durable-retention confirmation (design §9)
The §9 question — is the group public key package durably retained beyond the signing-session TTL (needed by the later 7.2b-3 verify-share FFI)? — is already satisfied:
dkg_public_key_packagelives on the persisted session andsweep_expired_interactive_stateclears only the live attempt's nonces. No new persistence is added here.Tests
simulate_process_restart_for_tests()+ reload (persistence round-trip).Verification
cargo fmt,cargo clippy --all-targets --all-features -- -D warningsclean, full suite 274 lib pass / 0 fail / 1 ignored, Phase-5 chaos suite all scenarios pass.🤖 Generated with Claude Code