feat(diagnose): causal sweep, responsibility scoring, replay-validated repair #240

Merged
drewstone merged 2 commits into main from feat/diagnose-causal-chain
Jun 10, 2026
@drewstone

What

New ./diagnose subpath (tsup entry + package export, mirroring ./rl) that turns the dormant counterfactual primitives into a complete remediation chain:

fuzz finds → sweep blames → repair prescribes (validated) → findings / corpus / invariant remediate → gates verify

WHY a run failed — causalSweep

  • Orchestrates reps × steps × mutations within a hard replay budget, composed entirely over the existing runCounterfactual seam (CounterfactualRunner.executeFrom stays the execution boundary).
  • Per-step responsibility = mean of per-rep score deltas + bootstrap CI (confidenceInterval, seeded), ranked by |meanEffect|. reps is REQUIRED and >= 2 — a single intervention delta is one stochastic draw, not a measurement.
  • Kind-level aggregate reuses the existing attributeCounterfactuals (exposed as byMutationKind).
  • Budget exhaustion halts mid-cell rather than emitting weakened CIs; every unprobed step is named in uncovered — never silent.
  • Default probes are the payload-free existing mutation kinds: swap-tool-result knockout (newResult: null) for tool spans, truncate-after re-roll for llm spans. swap-model / inject-system-message are opt-in via mutationsPerStep since they need consumer payloads.
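The per-step scoring described above can be sketched roughly as follows. This is a minimal illustration, not the package's code: `mulberry32`, `bootstrapMeanCI`, `scoreSteps`, and the `StepResponsibility` shape are hypothetical stand-ins, and a plain percentile bootstrap is assumed where the PR uses the existing `confidenceInterval` helper (whose signature is not shown here).

```typescript
// Hypothetical sketch: mean of per-rep deltas + seeded bootstrap CI, ranked by |meanEffect|.

/** mulberry32: a small seeded PRNG returning floats in [0, 1). */
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

interface StepResponsibility {
  stepId: string;
  meanEffect: number;
  ci: { lower: number; upper: number };
  reps: number;
}

function mean(xs: readonly number[]): number {
  return xs.reduce((sum, x) => sum + x, 0) / xs.length;
}

/** Percentile bootstrap CI of the mean (assumed method; seeded so results are reproducible). */
function bootstrapMeanCI(
  deltas: readonly number[],
  seed: number,
  resamples = 1000,
  level = 0.95,
): { lower: number; upper: number } {
  const rand = mulberry32(seed);
  const means: number[] = [];
  for (let i = 0; i < resamples; i++) {
    let sum = 0;
    for (let j = 0; j < deltas.length; j++) {
      sum += deltas[Math.floor(rand() * deltas.length)];
    }
    means.push(sum / deltas.length);
  }
  means.sort((x, y) => x - y);
  const alpha = (1 - level) / 2;
  return {
    lower: means[Math.floor(alpha * (resamples - 1))],
    upper: means[Math.ceil((1 - alpha) * (resamples - 1))],
  };
}

/** Scores every step from its per-rep score deltas and ranks by absolute mean effect. */
function scoreSteps(deltasByStep: Record<string, number[]>, seed: number): StepResponsibility[] {
  return Object.entries(deltasByStep)
    .map(([stepId, deltas], i) => {
      // A single delta is one stochastic draw, so fewer than 2 reps is rejected outright.
      if (deltas.length < 2) throw new Error(`step ${stepId}: reps must be >= 2`);
      return {
        stepId,
        meanEffect: mean(deltas),
        ci: bootstrapMeanCI(deltas, seed + i),
        reps: deltas.length,
      };
    })
    .sort((x, y) => Math.abs(y.meanEffect) - Math.abs(x.meanEffect));
}

// Illustrative data only: step-2 is the fault, step-1 is noise around zero.
const ranked = scoreSteps(
  {
    "step-1": [0.02, -0.03, 0.01, -0.01],
    "step-2": [-0.42, -0.38, -0.45, -0.4],
  },
  42,
);
```

Ranking on the absolute mean puts harmful and helpful interventions on the same scale, while the interval, rather than the point estimate, indicates whether an effect is distinguishable from noise.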

WHAT should have happened — prescribeRepair

  • Consumer-supplied proposeFix(step, context) (LLM-backed in live use) proposes candidate mutations for the blamed steps.
  • A candidate becomes a repair ONLY when EVERY validation replay crosses flipThreshold — machine-verified, never speculated. Non-flippers land in rejected with reason: 'did-not-flip' + the observed delta; replay errors land with reason: 'error' + the message.
  • First validated repair per step is the prescription; remaining candidates stay untried (no fabricated verdicts).
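A rough sketch of the every-rep validation rule, under stated assumptions: `validate`, `prescribe`, and the result shapes are hypothetical names, "crossing" `flipThreshold` is taken to mean `delta >= flipThreshold`, and the replay callback is synchronous for brevity where a real replay would be async.

```typescript
// Hypothetical sketch of replay-validated repair selection for one blamed step.

interface Validated<M> {
  mutation: M;
  deltas: number[];
}

type Rejected<M> =
  | { mutation: M; reason: "did-not-flip"; observedDelta: number }
  | { mutation: M; reason: "error"; message: string };

interface Prescription<M> {
  repair?: Validated<M>;
  rejected: Rejected<M>[];
  untried: M[];
}

/** A candidate is validated only if EVERY rep crosses the threshold (assumed: delta >= threshold). */
function validate<M>(
  mutation: M,
  replay: (m: M, rep: number) => number,
  reps: number,
  flipThreshold: number,
): Validated<M> | Rejected<M> {
  const deltas: number[] = [];
  for (let rep = 0; rep < reps; rep++) {
    try {
      const delta = replay(mutation, rep);
      if (delta < flipThreshold) {
        return { mutation, reason: "did-not-flip", observedDelta: delta };
      }
      deltas.push(delta);
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      return { mutation, reason: "error", message };
    }
  }
  return { mutation, deltas };
}

/** The first validated candidate is the prescription; later candidates are left untried. */
function prescribe<M>(
  candidates: M[],
  replay: (m: M, rep: number) => number,
  reps: number,
  flipThreshold: number,
): Prescription<M> {
  const rejected: Rejected<M>[] = [];
  for (let i = 0; i < candidates.length; i++) {
    const verdict = validate(candidates[i], replay, reps, flipThreshold);
    if ("reason" in verdict) {
      rejected.push(verdict);
      continue;
    }
    return { repair: verdict, rejected, untried: candidates.slice(i + 1) };
  }
  return { rejected, untried: [] };
}

// Illustrative fake replay: "flaky" flips on average (mean delta ~0.63) but not on every rep.
const fakeReplay = (m: string, rep: number): number => {
  if (m === "boom") throw new Error("replay crashed");
  if (m === "flaky") return rep === 2 ? 0.1 : 0.9;
  return 0.6 + rep * 0.01;
};

const prescription = prescribe(["flaky", "boom", "fix-a", "fix-b"], fakeReplay, 3, 0.5);
```

In this run `flaky` is rejected even though its mean delta exceeds the threshold, `boom` is rejected with its error message, `fix-a` becomes the repair, and `fix-b` is reported as untried rather than given a verdict.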

HOW to make it happen — remediation adapters into existing machinery

  • toAnalystFindings(report, repairs?) → AnalystFinding[] via the real makeFinding; severity scales with |meanEffect| but is CI-gated (an effect whose CI includes zero is info and must not steer priority); evidence carries stepRefs + raw deltas + CI + replay run ids; validated repairs set recommended_action + validation_plan.
  • toCorpusRecord(run, repair) → CorpusRecord pinning the failure as a permanent scenario (fresh runId so corpus dedup keeps both; validateRunRecord at the boundary).
  • suggestInvariant(repair) → plain-data { description, never?, without? } hint in the shape the trace-contracts track consumes.
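The CI gate on severity can be illustrated with a small sketch. Only the `info` level and the gating rule come from the description above; `severityFor`, the other severity labels, and the numeric cut-offs are hypothetical choices made for the example.

```typescript
// Hypothetical sketch: severity scales with |meanEffect| but only once the CI excludes zero.

type Severity = "info" | "low" | "medium" | "high"; // labels other than "info" are assumed

interface Interval {
  lower: number;
  upper: number;
}

function severityFor(meanEffect: number, ci: Interval): Severity {
  // Gate first: an interval containing zero is not a measured effect, however large the mean.
  if (ci.lower <= 0 && ci.upper >= 0) return "info";

  const size = Math.abs(meanEffect);
  if (size >= 0.5) return "high"; // assumed cut-off
  if (size >= 0.2) return "medium"; // assumed cut-off
  return "low";
}

const confident = severityFor(-0.6, { lower: -0.7, upper: -0.5 });
const noisy = severityFor(-0.6, { lower: -1.3, upper: 0.1 });
```

Both calls use the same mean effect; only the first yields an elevated severity, because the second interval straddles zero and so falls back to `info`.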

Grounding (recon-first)

Read before building: src/counterfactual.ts, src/causal-attribution.ts, src/replay.ts, src/bisector.ts, src/statistics.ts, src/analyst/types.ts, src/rl/corpus.ts, plus the runCounterfactual tests in tests/tier2.test.ts. Nothing duplicated — the sweep is pure orchestration over runCounterfactual + attributeCounterfactuals + confidenceInterval.

One spec deviation: mutationsPerStep is (step) => CounterfactualMutation[] rather than a flat CounterfactualMutation[], because mutations carry a step-bound at field and applicability is span-kind-dependent; returned mutations are validated to target the step they were asked for (fail-loud).
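A sketch of the function-valued `mutationsPerStep` and the fail-loud target check follows. The `Step` and `Mutation` shapes here are simplified stand-ins for the real `CounterfactualMutation` type, `at` is assumed to hold a step id string, and `defaultProbes` and `mutationsFor` are hypothetical names.

```typescript
// Hypothetical sketch: per-step mutation factory plus validation that each mutation targets its step.

interface Step {
  id: string;
  kind: "tool" | "llm";
}

// Simplified stand-in for CounterfactualMutation; only the payload-free default kinds are modeled.
type Mutation =
  | { kind: "swap-tool-result"; at: string; newResult: unknown }
  | { kind: "truncate-after"; at: string };

type MutationsPerStep = (step: Step) => Mutation[];

/** Default probes as described: knockout for tool spans, re-roll for llm spans. */
const defaultProbes: MutationsPerStep = (step) =>
  step.kind === "tool"
    ? [{ kind: "swap-tool-result", at: step.id, newResult: null }]
    : [{ kind: "truncate-after", at: step.id }];

/** Fails loudly if a returned mutation is bound to a different step than the one requested. */
function mutationsFor(step: Step, perStep: MutationsPerStep): Mutation[] {
  const mutations = perStep(step);
  for (const m of mutations) {
    if (m.at !== step.id) {
      throw new Error(`mutation ${m.kind} targets ${m.at}, expected ${step.id}`);
    }
  }
  return mutations;
}

const toolProbes = mutationsFor({ id: "s1", kind: "tool" }, defaultProbes);
const llmProbes = mutationsFor({ id: "s2", kind: "llm" }, defaultProbes);
```

A flat array could not express this, since each mutation's `at` depends on the step being probed and the applicable kinds depend on the span kind.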

Tests

16 deterministic tests in tests/diagnose.test.ts faking the execution seam the same way tier2.test.ts does (seeded mulberry32 noise, no LLM calls): fault step ranked #1 with CI excluding zero vs no-effect step CI including zero; uncovered named under tight budgets; repairs emit only flipping mutations with non-flippers/errors in rejected; every-rep (not on-average) flip enforcement; adapters produce schema-valid outputs.
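For readers unfamiliar with the testing approach, a seeded fake seam and a budget-bounded sweep might look roughly like this. None of it is taken from tests/diagnose.test.ts: `makeFakeReplay`, `sweep`, and `SweepResult` are hypothetical, the noise scale is arbitrary, and "halts mid-cell" is interpreted here as stopping before any cell that cannot complete all of its reps.

```typescript
// Hypothetical sketch: deterministic fake replay + hard replay budget with named uncovered steps.

/** mulberry32: a small seeded PRNG returning floats in [0, 1). */
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

/** Fake execution seam: perturbing the fault step shifts the score; other steps show only noise. */
function makeFakeReplay(faultStep: string, seed: number): (stepId: string) => number {
  const rand = mulberry32(seed);
  return (stepId) => {
    const noise = (rand() - 0.5) * 0.1; // uniform in [-0.05, 0.05)
    return (stepId === faultStep ? -0.4 : 0) + noise;
  };
}

interface SweepResult {
  deltas: Record<string, number[]>;
  uncovered: string[];
  replaysUsed: number;
}

/** Probes steps in order; stops when a full cell of reps no longer fits in the budget. */
function sweep(
  steps: string[],
  reps: number,
  budget: number,
  replay: (stepId: string) => number,
): SweepResult {
  const deltas: Record<string, number[]> = {};
  let used = 0;
  let i = 0;
  for (; i < steps.length; i++) {
    if (used + reps > budget) break; // never emit a cell with fewer reps than requested
    const cell: number[] = [];
    for (let r = 0; r < reps; r++) cell.push(replay(steps[i]));
    used += reps;
    deltas[steps[i]] = cell;
  }
  // Every step not probed is named explicitly instead of being dropped silently.
  return { deltas, uncovered: steps.slice(i), replaysUsed: used };
}

const runA = sweep(["s1", "s2", "s3"], 3, 7, makeFakeReplay("s2", 7));
const runB = sweep(["s1", "s2", "s3"], 3, 7, makeFakeReplay("s2", 7));
```

With a budget of 7 and 3 reps per step, two steps are probed using 6 replays and `s3` is reported in `uncovered`; because the noise is seeded, `runA` and `runB` are identical.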

pnpm typecheck ✓ · pnpm test 208 files / 1998 passed ✓ · pnpm build ✓ (dist/diagnose.js + .d.ts verified importable)

No version bump (release sequenced by the program lead). No root src/index.ts changes — subpath-only, rebase-friendly.

…d repair

New ./diagnose subpath orchestrating the dormant counterfactual
primitives into a three-stage remediation chain:

- causalSweep — reps x steps x mutations within a hard replay budget,
  composed over runCounterfactual; per-step mean effect + bootstrap CI
  (confidenceInterval) ranked by |meanEffect|; kind-level aggregate via
  attributeCounterfactuals; budget exhaustion names uncovered steps.
- prescribeRepair — consumer-supplied proposeFix candidates are
  machine-verified by replaying WITH the mutation; a repair counts only
  when every validation rep crosses flipThreshold; non-flippers and
  replay errors land in rejected with typed reasons.
- Remediation adapters into existing machinery: toAnalystFindings
  (makeFinding, severity from effect size, CI-gated), toCorpusRecord
  (pins the failure as a permanent corpus scenario, validateRunRecord
  at the boundary), suggestInvariant (never/without hint shape for
  trace contracts).

Deterministic tests fake the CounterfactualRunner seam with seeded
mulberry32 noise; no LLM calls.
tangletools previously approved these changes Jun 10, 2026

@tangletools tangletools left a comment

✅ Auto-approved PR — a17dca38

Blanket team auto-approval is enabled for this reviewer service.
The full PR reviewer audit still runs separately and will publish findings if it detects issues.

tangletools · auto-approval · reason: blanket_auto_approve · 2026-06-10T10:43:52Z

@tangletools tangletools left a comment

✅ Auto-approved PR — f98d73ae

Blanket team auto-approval is enabled for this reviewer service.
The full PR reviewer audit still runs separately and will publish findings if it detects issues.

tangletools · auto-approval · reason: blanket_auto_approve · 2026-06-10T10:55:18Z

@drewstone drewstone merged commit ea03b8c into main Jun 10, 2026
1 check passed