Bench-runner LLM transport fix: HTTP/1.1 + no pool + 15s header timeout (LlmPolicy) by adithyn7 · Pull Request #135 · TransformerOptimus/SuperCoder

adithyn7 · 2026-06-12T11:36:27Z

Summary

Bench-runner-only LLM transport fix for the chronic "error decoding response body" decode deaths on long Kimi/Fireworks runs. Introduces an LlmPolicy on LlmClientConfig mirroring the existing ToolPolicy pattern: default = byte-identical to today's desktop-app behavior (pooled HTTP/2 via shared client); bench-runner opts into LlmPolicy::bench() (HTTP/1.1, fresh TCP per request, 15s header-arrival timeout). Matches opencode's working transport shape.

Evidence (off-grid validation, chronic dead cell)

teleport-1b08 × kimi × ON — 6 consecutive deterministic decode deaths across eras with v0.1.5:

Attempt	Binary	Survived to	Outcome
g6/a0	old	20 turns	decode death
g6/a1	old	11 turns	decode death
g6/a2	old	6 turns	decode death
g6/a3	old	9 turns	decode death
g6/a4	this PR	45 turns	done: real 3.1 KB Go patch, 12.2 min, $0.50

stderr was completely empty on the survivor — no [bench] llm-error retrying=… lines. The truncations didn't happen; they were prevented at the transport layer. Matches the hypothesis: opencode (HTTP/1.1) had zero decode deaths on the same router/account/model across ~16 heavy Kimi cells.

The three changes (all bench-only)

http1_only — pin the LLM client to HTTP/1.1 (no h2 ALPN negotiation).
no_pool — pool_max_idle_per_host(0) → fresh TCP+TLS connection per request, kills connection-reuse / pool-poisoning failure modes.
header_timeout_ms: Some(15_000) — abort the request if response headers don't arrive within 15 s. New retryable AgentError::HeaderTimeout joins the existing HttpError/ChunkTimeout retry arm.
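A minimal sketch of how the three knobs might be grouped, assuming a plain struct whose default is all-off. The field names and `LlmPolicy::bench()` come from the description above; the struct layout, the derives, and the `needs_dedicated_client` helper are hypothetical and not taken from the actual diff.

```rust
/// Hypothetical shape of the policy; only the field names and `bench()`
/// appear in the PR text, everything else here is an assumption.
#[derive(Debug, Clone, PartialEq, Eq, Default)]
pub struct LlmPolicy {
    /// Pin the LLM client to HTTP/1.1 (no h2 negotiation).
    pub http1_only: bool,
    /// Keep no idle connections, so every request opens a fresh TCP+TLS connection.
    pub no_pool: bool,
    /// Abort the request if response headers have not arrived within this many ms.
    pub header_timeout_ms: Option<u64>,
}

impl LlmPolicy {
    /// Strict profile for headless eval: all three knobs on, 15 s header timeout.
    pub fn bench() -> Self {
        Self {
            http1_only: true,
            no_pool: true,
            header_timeout_ms: Some(15_000),
        }
    }

    /// Assumed helper: a dedicated HTTP client is only needed when the
    /// policy differs from the permissive default.
    pub fn needs_dedicated_client(&self) -> bool {
        *self != Self::default()
    }
}
```

With this shape, `LlmPolicy::default()` leaves every knob off, which is one way to keep the desktop-app path unchanged, while `LlmPolicy::bench().needs_dedicated_client()` is true. In a reqwest-based client the first two knobs would presumably map to `ClientBuilder::http1_only()` and `ClientBuilder::pool_max_idle_per_host(0)`.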

Plus: AgentEvent::Error { retrying, message } mirrored to stderr so the grid's stderr_tail captures LLM retry attempts directly.

Ships as v0.1.6 (patch).

Introduce an LlmPolicy carried on LlmClientConfig so the LLM HTTP client can run stricter under headless eval without changing the desktop app. Default policy is permissive (byte-identical to current app behavior: pooled HTTP/2 via the shared client); LlmPolicy::bench() enables:

- http1_only: pin the LLM client to HTTP/1.1 (no h2 negotiation)
- no_pool: fresh TCP+TLS per request (pool_max_idle_per_host(0))
- header_timeout_ms: 15s abort if response headers don't arrive

LlmClient::new builds a dedicated reqwest::Client when the policy deviates from default; otherwise it still clones the shared client. A new retryable AgentError variant HeaderTimeout joins HttpError/ChunkTimeout in the existing turn-level retry path.

Matches opencode's working transport shape (HTTP/1.1, fresh sockets, header timeout + SDK retries), which has zero decode deaths on the same router that gave bench-runner deterministic mid-stream resets on long Kimi runs.
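The header-timeout and retry behavior can be illustrated with a standard-library-only sketch. It simulates header arrival with a channel instead of a real HTTP call. The variant names `HttpError`, `ChunkTimeout`, and `HeaderTimeout` come from the text above; their payloads, the `Other` variant, and the `is_retryable`, `wait_for_headers`, and `run_with_retry` helpers are assumptions made for illustration, not the project's actual code.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Variant names follow the PR text; payloads and `Other` are assumptions.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum AgentError {
    HttpError(String),
    ChunkTimeout,
    HeaderTimeout,
    Other(String),
}

impl AgentError {
    /// HeaderTimeout shares the retry arm with HttpError and ChunkTimeout.
    pub fn is_retryable(&self) -> bool {
        matches!(
            self,
            AgentError::HttpError(_) | AgentError::ChunkTimeout | AgentError::HeaderTimeout
        )
    }
}

/// Simulated request: a worker thread "delivers" headers after `header_delay`;
/// the caller gives up with HeaderTimeout once `timeout` has elapsed.
pub fn wait_for_headers(
    header_delay: Duration,
    timeout: Duration,
) -> Result<&'static str, AgentError> {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        thread::sleep(header_delay);
        // The receiver may already be gone after a timeout; ignore that.
        let _ = tx.send("200 OK");
    });
    rx.recv_timeout(timeout).map_err(|_| AgentError::HeaderTimeout)
}

/// Retry only retryable errors, up to `max_attempts` total attempts.
/// The closure receives the 1-based attempt number.
pub fn run_with_retry<T>(
    max_attempts: u32,
    mut attempt: impl FnMut(u32) -> Result<T, AgentError>,
) -> Result<T, AgentError> {
    let mut n = 0;
    loop {
        n += 1;
        match attempt(n) {
            Ok(v) => return Ok(v),
            Err(e) if e.is_retryable() && n < max_attempts => continue,
            Err(e) => return Err(e),
        }
    }
}
```

A stalled first attempt followed by a prompt second one then resolves to `Ok` under `run_with_retry(3, ...)`, while a non-retryable error is returned after a single attempt.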

The eval harness sets LlmPolicy::bench() so the frozen binary uses the same HTTP transport shape opencode does on the same router. Always on: part of the harness identity, no CLI flag.

Also mirror AgentEvent::Error { retrying, message } to stderr so the grid's stderr_tail captures LLM retry attempts directly. Without this the result JSON only records the final outcome, leaving us blind to whether retries fired.

Validated off-grid on the chronic dead cell (teleport-1b08 × kimi × ON): old binary died on decode at turns 6, 9, 11, 20 across 4 attempts; this binary completed cleanly at 45 turns with empty stderr (no retries triggered; the underlying truncations stopped happening at the transport layer).

adithyn7 added 2 commits June 12, 2026 17:05

adithyn7 added the patch Patch version bump label Jun 12, 2026

adithyn7 merged commit 6e1048e into main Jun 12, 2026
4 checks passed

adithyn7 changed the title ~~Bench-runner LLM transport fix: HTTP/1.1 + no pool + 15s header timeout (LlmPolicy)~~ Bench-runner v0.1.7: HTTP/1.1 transport fix + tunable LLM/tool policy via CLI/env (LlmPolicy) Jun 13, 2026

adithyn7 changed the title ~~Bench-runner v0.1.7: HTTP/1.1 transport fix + tunable LLM/tool policy via CLI/env (LlmPolicy)~~ Bench-runner LLM transport fix: HTTP/1.1 + no pool + 15s header timeout (LlmPolicy) Jun 13, 2026

adithyn7 mentioned this pull request Jun 13, 2026

Bench-runner v0.1.7: expose LLM + tool policy knobs as CLI / env tunables #136

Merged

adithyn7 merged 2 commits into main from feat/bench-runner-llm-policy
