feat(pricing): market/limit price + Surplus credits in chat, agents, and pi #41

Open
drewstone wants to merge 1 commit into main from feat/router-chat-agent-transport

@drewstone


What

Lets a user pick market vs limit price and spend Surplus prepaid credits from every tcloud surface — SDK, agent harness, pi extension, and CLI. This is the client side of the Surplus redemption seam (Phase 8 box in the Surplus ROADMAP: “a harness call spends a credit”).

A Surplus credit is a prepaid, price-locked, metered quota: N tokens of model:tokenKind at a locked strike (micro-tsUSD per 1M tokens), debited per call, with the operator paid from the credit's escrowed backing instead of the buyer's USD balance.

Surface

  • ChatOptions.pricing: { mode: 'market' | 'limit', maxInputMicroPerM?, maxOutputMicroPerM?, credits?: boolean | { creditId } }. Serialized as snake_case body.pricing on /v1/chat/completions. Limit mode without a cap is rejected client-side before any request is sent, and pricing is blocked from the providerOptions escape hatch. serializePricing is exported for raw router callers.
  • ChatCompletion.surplus: per-token-kind redemption blocks of the form { credit_id, token_kind, tokens_debited, overflow_tokens, strike_micro_per_m, payout_micro }. trackCost excludes credit-debited tokens from the USD spend meter (they were prepaid at purchase); overflow tokens still bill normally.
  • Agent harness: routerChatTransport(client, { pricing }) flows pricing into every call; non-streaming runs aggregate redemptions into AgentRunResult.surplus.
  • pi: a pricing block in ~/.tcloud/config.json applies to all extension inference; the tangle chat capability takes pricing per call; the widget and wallet status report credit tokens spent; 402 limit_price_exceeded gets a distinct message from credit exhaustion.
  • CLI: tcloud chat --max-input-price/--max-output-price <usd per 1M> (implies limit mode) and --credits [creditId|off]; non-stream output prints each credit debit at its strike.
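As a rough illustration of the serialization and client-side validation described above, here is a minimal sketch. The camelCase option names come from this description; the snake_case wire field names, the function name serializePricingSketch, and the USD-to-micro conversion helper are assumptions made for the example and may not match the exported serializePricing.

```typescript
// Hypothetical shapes. Only the camelCase option names are taken from the PR text.
type CreditSelector = boolean | { creditId: string };

interface PricingOptions {
  mode: 'market' | 'limit';
  maxInputMicroPerM?: number;  // integer micro-tsUSD per 1M input tokens
  maxOutputMicroPerM?: number; // integer micro-tsUSD per 1M output tokens
  credits?: CreditSelector;
}

// Assumed wire names: a mechanical snake_case mapping of the options above.
interface WirePricing {
  mode: 'market' | 'limit';
  max_input_micro_per_m?: number;
  max_output_micro_per_m?: number;
  credits?: boolean | { credit_id: string };
}

function serializePricingSketch(p: PricingOptions): WirePricing {
  const hasCap =
    p.maxInputMicroPerM !== undefined || p.maxOutputMicroPerM !== undefined;
  // Reject limit mode without a cap before any request would be sent.
  if (p.mode === 'limit' && !hasCap) {
    throw new Error('limit mode requires at least one price cap');
  }
  const wire: WirePricing = { mode: p.mode };
  if (p.maxInputMicroPerM !== undefined) {
    wire.max_input_micro_per_m = p.maxInputMicroPerM;
  }
  if (p.maxOutputMicroPerM !== undefined) {
    wire.max_output_micro_per_m = p.maxOutputMicroPerM;
  }
  if (p.credits !== undefined) {
    wire.credits =
      typeof p.credits === 'boolean'
        ? p.credits
        : { credit_id: p.credits.creditId };
  }
  return wire;
}

// Assumption: a CLI price in USD per 1M tokens maps to 1,000,000 micro units per USD.
function usdPerMToMicro(usd: number): number {
  return Math.round(usd * 1_000_000);
}

const body = serializePricingSketch({
  mode: 'limit',
  maxInputMicroPerM: usdPerMToMicro(0.5),
  credits: { creditId: 'c1' },
});
console.log(JSON.stringify(body));
```

Omitting undefined fields, rather than sending nulls, keeps a market-mode request body identical to one that carries only the mode.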

Wire contract (for the router-side implementation)

The router implements the matching debit via the Surplus RedemptionAdapter (@surplus/redemption, proven by its simulated-router harness): select credit pre-flight → serve + meter as today → debit metered tokens at strike → pay operator from backing → bill overflow_tokens at market. Units are identical on both sides: integer micro-tsUSD per 1M tokens, per-token-kind debits.
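The debit step in that flow can be sketched as follows. This is an illustration under stated assumptions, not the RedemptionAdapter itself: the Credit shape, the function name debitAtStrike, the 'input' | 'output' token kinds, and the payout rule (debited tokens valued at strike, rounded down to an integer micro amount) are all hypothetical. Only the redemption block's field names are taken from the description.

```typescript
type TokenKind = 'input' | 'output'; // assumed token kinds

// Hypothetical in-memory view of a credit.
interface Credit {
  creditId: string;
  tokenKind: TokenKind;
  remainingTokens: number;
  strikeMicroPerM: number; // integer micro-tsUSD per 1M tokens
}

// Field names follow the ChatCompletion.surplus block in the PR description.
interface Redemption {
  credit_id: string;
  token_kind: TokenKind;
  tokens_debited: number;
  overflow_tokens: number;
  strike_micro_per_m: number;
  payout_micro: number;
}

function debitAtStrike(credit: Credit, meteredTokens: number): Redemption {
  // Debit as many metered tokens as the credit still covers.
  const tokensDebited = Math.min(meteredTokens, credit.remainingTokens);
  // Whatever the credit cannot cover is overflow, billed at market elsewhere.
  const overflowTokens = meteredTokens - tokensDebited;
  // Assumed payout rule: debited tokens valued at the locked strike, rounded down.
  const payoutMicro = Math.floor(
    (tokensDebited * credit.strikeMicroPerM) / 1_000_000,
  );
  credit.remainingTokens -= tokensDebited;
  return {
    credit_id: credit.creditId,
    token_kind: credit.tokenKind,
    tokens_debited: tokensDebited,
    overflow_tokens: overflowTokens,
    strike_micro_per_m: credit.strikeMicroPerM,
    payout_micro: payoutMicro,
  };
}

// A credit with 1,000 tokens left at a strike of 2,000,000 micro-tsUSD per 1M,
// asked to cover a 1,500-token call.
const credit: Credit = {
  creditId: 'c1',
  tokenKind: 'output',
  remainingTokens: 1_000,
  strikeMicroPerM: 2_000_000,
};
console.log(debitAtStrike(credit, 1_500));
```

In this example the call debits 1,000 tokens, leaves 500 overflow tokens for market billing, and yields a payout of 2,000 micro-tsUSD.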

Tests

  • packages/tcloud/tests/pricing.test.ts (9): wire serialization (limit + caps + credits, pinned credit, omission), client-side validation, providerOptions smuggling blocked, credit-funded / partial-overflow / no-credit cost tracking.
  • packages/tcloud-agent/tests/agent-runner.test.ts (+2): a harness call spends a credit — pricing reaches the router and redemptions land in AgentRunResult.surplus; surplus: null when no credit funded the run.
  • Full suites green except the pre-existing cli-version corepack/pnpm environment failure (fails identically on a clean checkout of main).
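The three cost-tracking cases named in the test list (credit-funded, partial-overflow, no-credit) can be illustrated with a small sketch. The function billableUsd, the Usage shape, and the market price table are hypothetical and are not the actual trackCost implementation; the sketch only shows the stated rule that credit-debited tokens stay off the USD spend meter while overflow tokens are billed.

```typescript
type TokenKind = 'input' | 'output'; // assumed token kinds

interface Usage {
  inputTokens: number;
  outputTokens: number;
}

// Subset of the ChatCompletion.surplus block fields used here.
interface SurplusBlock {
  token_kind: TokenKind;
  tokens_debited: number;
  overflow_tokens: number;
}

// Hypothetical market prices in USD per 1M tokens.
interface MarketPrice {
  input: number;
  output: number;
}

function billableUsd(
  usage: Usage,
  surplus: SurplusBlock[] | null,
  price: MarketPrice,
): number {
  let billableInput = usage.inputTokens;
  let billableOutput = usage.outputTokens;
  // Credit-debited tokens were prepaid at purchase, so remove them from the meter.
  for (const block of surplus ?? []) {
    if (block.token_kind === 'input') {
      billableInput = Math.max(0, billableInput - block.tokens_debited);
    } else {
      billableOutput = Math.max(0, billableOutput - block.tokens_debited);
    }
  }
  return (billableInput * price.input + billableOutput * price.output) / 1_000_000;
}

const price: MarketPrice = { input: 2, output: 4 };
const usage: Usage = { inputTokens: 1_000_000, outputTokens: 0 };

// Credit-funded: every input token was debited from a credit.
const funded = billableUsd(
  usage,
  [{ token_kind: 'input', tokens_debited: 1_000_000, overflow_tokens: 0 }],
  price,
);
// Partial overflow: 250,000 tokens spill over and are billed at market.
const partial = billableUsd(
  usage,
  [{ token_kind: 'input', tokens_debited: 750_000, overflow_tokens: 250_000 }],
  price,
);
// No credit: the whole call is billed at market.
const none = billableUsd(usage, null, price);
console.log({ funded, partial, none });
```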

🤖 Generated with Claude Code

@tangletools tangletools left a comment

✅ Auto-approved PR — 60ebd71d

Blanket team auto-approval is enabled for this reviewer service.
The full PR reviewer audit still runs separately and will publish findings if it detects issues.

tangletools · auto-approval · reason: blanket_auto_approve · 2026-06-11T01:18:30Z

@tangletools


✅ No Blockers — 60ebd71d

Readiness 70/100 · Confidence 70/100 · 12 findings (2 medium, 10 low)

| | deepseek | glm | aggregate |
| --- | --- | --- | --- |
| Readiness | 70 | 70 | 70 |
| Confidence | 70 | 70 | 70 |
| Correctness | 70 | 70 | 70 |
| Security | 70 | 70 | 70 |
| Testing | 70 | 70 | 70 |
| Architecture | 70 | 70 | 70 |

Full multi-shot audit completed 2/2 planned shots over 3 changed files. Global verifier still owns final merge decision.

🟠 MEDIUM resume silently ignored in routerChatTransport — packages/tcloud-agent/src/agent-runner.ts

RouterChatAgentSession constructor (line 346) receives input.resume as the id parameter and stores it as a readonly property (line 352), but never uses it. In contrast, BridgeAgentSessionTransport passes resume through to the bridge API as both a bridge config field and a sandbox sessionId (lines 288-295), and SandboxSdkAgentSessionTransport uses it as the Sandbox SDK sessionId (

🟠 MEDIUM No test for non-streaming chat() path on routerChatTransport — packages/tcloud-agent/tests/agent-runner.test.ts

RouterChatAgentSession.chat() (agent-runner.ts:364-368) has a distinct code path from chatStream(): it calls this.client.chat() and appends the assistant response synchronously (no try/finally). When stream:false or budget.usd is set, the agent loop uses chat(). No test exercises this path with routerChatTransport. The existing 'forces non-streaming when usd budget is set' test only covers the bridge transport. A test like:

    const result = await agent({
      transport: routerChatTransport(client as any),
      profile: 'openai/gpt-4o-mini',
      brief: 'hi',
      budget: { usd: 100 },
    }).run()
    expect(chats[0].__mode).toBe('chat')

would close this gap.

🟡 LOW Empty assistant message appended when API returns null/empty content — packages/tcloud-agent/src/agent-runner.ts

Line 367: this.appendAssistant(completion.choices?.[0]?.message?.content ?? '') — if the API returns no content, an empty-string assistant message is pushed to history. The next iteration's messages array will contain a no-op assistant turn. Not harmful but slightly wasteful context. Consider guarding: if (content) this.appendAssistant(content).
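A minimal sketch of the suggested guard, assuming a simplified session that keeps a flat message history. The class HistorySketch, its recordCompletion method, and the CompletionLike shape are hypothetical stand-ins and do not mirror the real RouterChatAgentSession.

```typescript
type Message = { role: 'user' | 'assistant'; content: string };

// Loose stand-in for a chat completion response.
interface CompletionLike {
  choices?: Array<{ message?: { content?: string | null } }>;
}

class HistorySketch {
  readonly history: Message[] = [];

  appendAssistant(content: string): void {
    this.history.push({ role: 'assistant', content });
  }

  recordCompletion(completion: CompletionLike): void {
    const content = completion.choices?.[0]?.message?.content;
    // Skip null, undefined, and empty-string content so no no-op turn is stored.
    if (content) this.appendAssistant(content);
  }
}

const session = new HistorySketch();
session.recordCompletion({ choices: [{ message: { content: null } }] });
session.recordCompletion({});
session.recordCompletion({ choices: [{ message: { content: 'hello' } }] });
console.log(session.history.length);
```

Only the third call adds a turn, so the history holds a single assistant message.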

🟡 LOW RouterChatAgentSession.prepareMessages mutates history before chat call; stale on stream error — packages/tcloud-agent/src/agent-runner.ts

prepareMessages (line 388) pushes the new user messages to this.history before chatStream is called. In chatStream (line 371), the finally block (line 379) appends whatever partial assistant text was accumulated. If the stream errors mid-flight, the session's internal history contains user messages plus a truncated assistant response — inconsistent state. This is currently harmless beca

🟡 LOW profileSystemPrompt evaluated twice in routerChatProfile — packages/tcloud-agent/src/agent-runner.ts

Line 681: profileSystemPrompt(profile) is called twice — once for the truthiness guard, once for the value. Since it's a pure function with no side effects, this is not a bug, just a minor inefficiency. Could cache in a local: const sys = profileSystemPrompt(profile); return { ...(sys ? { systemPrompt: sys } : {}) }.

🟡 LOW streaming partial-consume history drift under early break — packages/tcloud-agent/src/agent-runner.ts

Lines 371-382: If a consumer breaks out of the chatStream async iterable early (e.g. wallSec budget breach mid-stream), the finally block saves partial content to history. The outer Agent.stream() loop would then also have partial assistantContent in its transcript. This is consistent (both partial) but the consumer should be aware. The existing Agent.stream() wallSec check at line 527 only fires between iterations, not mid-stream, so this is currently theoretical — the for-await in Agent.stream() always drains

🟡 LOW No coverage for allowSandboxProfileFields escape hatch — packages/tcloud-agent/tests/agent-runner.test.ts

routerChatTransport accepts allowSandboxProfileFields: true to intentionally bypass the sandbox-field guard. The fail-closed test (line 342) validates rejection when it's false (default), but no test validates acceptance when it's true. The unsupportedRouterProfileFields guard and hasProfileValue() helper both have only one tested branch. Add a test with allowSandboxProfileFields: true confirming the profile is accepted and the sandbox fields are silently dropped.

🟡 LOW No coverage for routerChatTransport non-streaming chat() path — packages/tcloud-agent/tests/agent-runner.test.ts

RouterChatAgentSession implements both chat() and chatStream(). The new tests only exercise chatStream() (the default streaming path). The non-streaming chat() path — used when stream:false or a usd budget is set — is untested for routerChatTransport. This is masked at the Agent.stream() level by the existing non-streaming test (line 567) which uses the bridge transport. Add a test with stream:false or budget.usd covering routerChatTransport's chat() path through RouterChatAgentSession.prepareMessages → chatOptions → appendAssistant.

🟡 LOW No coverage for streaming error mid-iteration with routerChatTransport — packages/tcloud-agent/tests/agent-runner.test.ts

RouterChatAgentSession.chatStream has a try/finally that appends partial assistant content to history even on error. The existing bridge-transport error test (line 248) validates the agent loop's catch block, but that error path flows through bridge(). No equivalent test exercises a chatStream() error through routerChatTransport to validate that the partial-history append in finally doesn't corrupt the session state or produce unexpected behavior on resume. Add a test with makeFakeChatClient([new Error('router down')]) exercising the error-verdict path.
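To make the behaviour under test concrete, here is a self-contained sketch of a streaming method whose finally block records partial assistant text when the upstream source fails mid-stream. StreamSessionSketch, failingSource, and runMidStreamError are hypothetical names for illustration; the real chatStream and makeFakeChatClient may differ in shape.

```typescript
type Message = { role: 'user' | 'assistant'; content: string };

class StreamSessionSketch {
  readonly history: Message[] = [];

  // Re-yields chunks from the source while accumulating them.
  async *chatStream(source: AsyncIterable<string>): AsyncGenerator<string> {
    let partial = '';
    try {
      for await (const chunk of source) {
        partial += chunk;
        yield chunk;
      }
    } finally {
      // Runs on normal completion, on error, and on early consumer break.
      this.history.push({ role: 'assistant', content: partial });
    }
  }
}

// Fake upstream that emits two chunks and then fails.
async function* failingSource(): AsyncGenerator<string> {
  yield 'par';
  yield 'tial';
  throw new Error('router down');
}

async function runMidStreamError(): Promise<{
  error: string | null;
  history: Message[];
}> {
  const session = new StreamSessionSketch();
  let error: string | null = null;
  try {
    for await (const _chunk of session.chatStream(failingSource())) {
      // Drain the stream; the consumer does nothing with each chunk here.
    }
  } catch (e) {
    error = e instanceof Error ? e.message : String(e);
  }
  return { error, history: session.history };
}

runMidStreamError().then((result) => console.log(result));
```

A test along these lines would assert both that the error surfaces to the caller and that the history contains exactly the truncated assistant text, which is the state a later resume would see.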

🟡 LOW No test for mid-stream error in routerChatTransport — packages/tcloud-agent/tests/agent-runner.test.ts

The 'fails closed' test (line ~340) only covers startup-time validation errors (profile field rejection in routerChatProfile). There is no test for errors thrown during chatStream iteration (e.g., client.chatStream throwing mid-yield). The chatStream method has a try/finally that calls appendAssistant with partial content — this behavior is untested for the routerChat path. The bridge transport tests cover this pattern (makeFakeClient with Error responses), but routerChatTransport's different history management means the failure mode could differ.

🟡 LOW No test for resume/sessionId propagation with routerChatTransport — packages/tcloud-agent/tests/agent-runner.test.ts

RouterChatAgentSession accepts an id parameter (agent-runner.ts:358) from input.resume, which is passed through from transport.start(). No test verifies this id is correctly set on the session. While this is a minor surface, the localCliBridgeTransport has a dedicated test for resume='feature-1' asserting sessionId propagation — parity would be valuable.

🟡 LOW no test coverage for non-streaming routerChatTransport path or resume — packages/tcloud-agent/tests/agent-runner.test.ts

The four routerChatTransport tests (lines 293-356) only exercise the streaming path via chatStream (confirmed by chats[0].__mode === 'chatStream'). No test verifies the non-streaming fallback (stream: false) with routerChatTransport, nor does any test verify behavior when resume is passed to a routerChatTransport session. The production code checks wantStream at line 481 which forces non-streaming when usd budget is set — if usd budget is used with routerChatTransport, the non-streaming path would


tangletools · 2026-06-11T01:29:01Z · trace
