fix(loops): funnel alignment — the search tie-band matches the gate under the cost objective#254

Merged: drewstone merged 1 commit into main from fix/funnel-alignment on Jun 11, 2026
@drewstone

The cost-objective run returned HOLD identical-champion, and recomputing on the discriminating set showed why. It is a clean design lesson:

| gen1 (discriminating n=14) | score | $/task |
| --- | --- | --- |
| sampleThenRefine | 73.3% | $0.0249 |
| scoutThenAct | 72.1% (gap 1.2pp) | $0.0235 |
| researchThenExecute | 69.6% (gap 3.7pp) | $0.0141 (−43%) |

researchThenExecute is exactly the candidate the non-inferiority gate (tolerance 5pp + significant savings) was built to promote, yet it never reached the gate, because search-side displacement required a tie within 1pp. The funnel was stricter than the promotion criterion, so cost-frontier candidates died in search and the gate saw identical-champion every time. This also retroactively explains why the prior run's dualPersonaRefine (2.1pp gap at half cost) was filtered out.

The principle: the search filter must be no stricter than the promotion criterion. Under objective: 'cost', championEpsilon now defaults to scoreTolerance (5pp); set the CHAMPION_EPSILON env var for explicit control. Adds one test pinning the displacement behavior at both bands (1pp and 5pp) with the run's real numbers. Suite: 789 ✓. The cost run relaunches on this fix with a fresh holdout offset.
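A minimal sketch of the alignment, not the actual patch: only `objective`, `championEpsilon`, and `scoreTolerance` come from this PR. The types, the function names `resolveChampionEpsilon` and `displaces`, the 1pp fallback constant, and the simplified "within the band and cheaper" displacement rule are assumptions made for illustration.

```typescript
// Hypothetical types and helpers; the real search code may differ.
type Objective = "score" | "cost";

interface Candidate {
  name: string;
  score: number;       // percent on the discriminating set
  costPerTask: number; // USD per task
}

interface SearchConfig {
  objective: Objective;
  scoreTolerance: number;   // gate's non-inferiority tolerance, in pp
  championEpsilon?: number; // explicit tie-band override, in pp
}

// Assumed fallback: the 1pp tie the search used before this fix.
const DEFAULT_TIE_BAND_PP = 1;

// Under the cost objective the tie-band defaults to the gate's tolerance,
// so the search filter is never stricter than the promotion criterion.
function resolveChampionEpsilon(cfg: SearchConfig): number {
  if (cfg.championEpsilon !== undefined) return cfg.championEpsilon;
  return cfg.objective === "cost" ? cfg.scoreTolerance : DEFAULT_TIE_BAND_PP;
}

// Simplified rule (assumption): a challenger displaces the champion when its
// score is within `epsilon` pp of the champion's and it is cheaper per task.
function displaces(challenger: Candidate, champion: Candidate, epsilon: number): boolean {
  const gapPp = champion.score - challenger.score;
  return gapPp <= epsilon && challenger.costPerTask < champion.costPerTask;
}

// The gen1 numbers reported in this PR.
const sampleThenRefine: Candidate = { name: "sampleThenRefine", score: 73.3, costPerTask: 0.0249 };
const researchThenExecute: Candidate = { name: "researchThenExecute", score: 69.6, costPerTask: 0.0141 };

const oldBand = DEFAULT_TIE_BAND_PP;
const newBand = resolveChampionEpsilon({ objective: "cost", scoreTolerance: 5 });

console.log(displaces(researchThenExecute, sampleThenRefine, oldBand)); // false: the 3.7pp gap exceeds 1pp
console.log(displaces(researchThenExecute, sampleThenRefine, newBand)); // true: the 3.7pp gap fits within 5pp
```

With the 1pp band the cheaper candidate is dropped before the gate can judge it; with the band set to the gate's tolerance it survives search and the gate decides.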

…nder the cost objective

The cost run's HOLD was a funnel misalignment, not an author failure: the
gate accepts score within −scoreTolerance (5pp) + significant savings, but
search-side champion displacement required a 1pp tie — so the author's
researchThenExecute (−3.7pp at 43% cheaper on the gen1 discriminating set,
exactly the candidate the gate was built to judge) died in search and the
gate saw identical-champion. The principle: the search filter must be no
stricter than the promotion criterion. Under objective='cost',
championEpsilon now defaults to scoreTolerance; CHAMPION_EPSILON env on the
bench runner for explicit control.

@tangletools tangletools left a comment

✅ Auto-approved PR — 25c00a43

Blanket team auto-approval is enabled for this reviewer service.
The full PR reviewer audit still runs separately and will publish findings if it detects issues.

tangletools · auto-approval · reason: blanket_auto_approve · 2026-06-11T00:20:47Z

@drewstone drewstone merged commit 2fb86be into main Jun 11, 2026
1 check passed
@drewstone drewstone deleted the fix/funnel-alignment branch June 11, 2026 00:22