Node.js subprocess crashes with V8 OOM (SIGABRT) during long sessions with MiniMax-M3 and Kimi2.7 #33705

Description

@ketema

Environment

  • opencode: 1.17.9
  • macOS: 26.5.1 (ARM64, 36GB RAM)
  • Node.js: v26.3.1 (Homebrew)
  • Models: minimax-m3 (via opencode-go), kimi2.7 (via opencode-go)

Symptom

After ~1–2 hours of use in a long agentic session, the UI displays "process has terminated." The TUI remains open but no further prompts execute. Using /connect to switch providers at this point causes full application termination (Abort trap: 6).

After restart, the session is intact and continues normally.

Crash Evidence (macOS Diagnostic Report)

exception:   EXC_CRASH / SIGABRT
termination: "Abort trap: 6" (byProc: node)
asi:         { "libsystem_c.dylib": ["abort() called"] }

Faulting thread stack:

node::OOMErrorHandler
v8::Utils::ReportOOMFailure
v8::internal::V8::FatalProcessOutOfMemory
v8::internal::Heap::FatalProcessOutOfMemory
v8::internal::Heap::CheckIneffectiveMarkCompact   <- GC running but ineffective
v8::internal::Heap::CollectGarbage
[V8 JIT frames]
Builtins_ArrayPrototypeSlice                      <- slicing message array at crash
node::StreamBase::CallJSOnreadMethod              <- actively receiving stream data

All 4 V8 worker threads were in Sweeper::RawSweep at crash time — the GC was actively running but could not reclaim enough heap before the ceiling was hit.

macOS incident ID: CFCB28C1-762E-4BCB-B4F1-CAD72B7F463B (2026-06-20)

Reproduction Pattern

  • Occurred twice with minimax-m3, once with kimi2.7
  • Both models route through providerID=opencode-go per opencode logs
  • Sessions were ~1.5–2 hours with frequent tool use (bash, file reads, git operations)
  • Mid-session model switches (via /connect) preceded at least two of the crashes
  • Has not occurred (to date) with Anthropic or OpenAI models

opencode log at time of crash:

message=stream providerID=opencode-go modelID=minimax-m3 ...
message="llm runtime selected" llm.runtime=ai-sdk llm.provider=opencode-go llm.model=minimax-m3

Root Cause Hypothesis

Part 1 — Unbounded messages array growth to 95% ceiling

The AI SDK defers message pruning until the messages array reaches ~95% of the current model's context window. For a 1M-token model like MiniMax-M3 or Kimi2.7, this means the array is allowed to accumulate up to ~950,000 tokens of content before any pruning occurs.

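950,000 tokens of structured AI SDK message objects take up real space in the V8 heap, but a single copy is far smaller than the heap ceiling (assuming ~4 characters per token):

Layer                                Estimated size
Raw UTF-16 text                      ~7.6 MB (950,000 tokens x ~4 chars x 2 bytes)
With AI SDK object overhead (~10x)   ~76 MB

A single copy of the array therefore cannot fill the heap by itself. To reach the ~4.2GB at which the V8 heap ran out (the Node.js default ceiling on this machine), many copies or snapshots of the array would have to be retained at once, for example one slice per streamed chunk or tool step. That is consistent with Builtins_ArrayPrototypeSlice in the faulting stack, but it has not been confirmed. It would also explain why all 4 Sweeper threads were active yet the process still hit FatalProcessOutOfMemory: the GC was running, but the retained copies were still reachable and could not be reclaimed.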
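
As a rough cross-check of the estimate above, the arithmetic can be written out as a short sketch. The ~4 characters per token and the ~10x object overhead are assumptions for illustration, not measured values, and the function names are hypothetical.

```typescript
// Back-of-the-envelope size of a message history at the pruning threshold.
// CHARS_PER_TOKEN and OVERHEAD_FACTOR are assumptions, not measurements.
const CHARS_PER_TOKEN = 4;      // assumed average for English-like text
const BYTES_PER_UTF16_UNIT = 2; // size of one UTF-16 code unit
const OVERHEAD_FACTOR = 10;     // assumed multiplier for message object overhead

// Token count at which pruning would start for a given context window.
function pruneThresholdTokens(contextWindow: number, ratio: number = 0.95): number {
  return Math.round(contextWindow * ratio);
}

// Bytes needed to hold that many tokens as raw UTF-16 text.
function rawTextBytes(tokenCount: number): number {
  return tokenCount * CHARS_PER_TOKEN * BYTES_PER_UTF16_UNIT;
}

const thresholdTokens = pruneThresholdTokens(1_000_000);
const rawMB = rawTextBytes(thresholdTokens) / 1e6;
console.log(
  `${thresholdTokens} tokens: ~${rawMB.toFixed(1)} MB raw text, ` +
    `~${(rawMB * OVERHEAD_FACTOR).toFixed(0)} MB with assumed overhead`,
);
```

Under these assumptions one copy of the history is tens of megabytes, so a multi-gigabyte heap implies many retained copies rather than one large array.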

Part 2 — Mid-session model switching triggers the transient spike

When the user switches models mid-session via /connect (e.g., from MiniMax-M3 at 1M context to a model with a 200K context window), the messages array may already be sized for the previous model's 95% threshold (~950K tokens).

The AI SDK must now re-evaluate the array against the new model's threshold (190K tokens) and create a pruned copy. During this operation both the original and the new array exist simultaneously in the V8 heap:

Before switch:  ~4.2GB heap (at ceiling, full MiniMax session)
During pruning: original + copy -> transient spike beyond ceiling
Result:         OOM -> SIGABRT

If this hypothesis holds, it would explain why /connect reliably triggers the crash once the session is already "hung": it is the operation that forces a copy-and-prune of the full messages array.

Workaround

Raise the Node.js heap ceiling in the shell environment:

# Add to ~/.zshenv (sourced for all shells, inherited by subprocesses)
export NODE_OPTIONS="--max-old-space-size=16384"

This delays but does not fix the root cause.
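
To confirm that the setting reached the subprocess, the requested value can be compared with the limit V8 reports. The helper names below are hypothetical; `v8.getHeapStatistics()` and its `heap_size_limit` field are part of Node's `v8` module. The reported limit is typically somewhat higher than the requested old-space size because it covers other heap spaces as well.

```typescript
// Hypothetical helpers for checking the heap ceiling.
// Extracts the requested old-space size (in MiB) from a NODE_OPTIONS string.
function requestedOldSpaceMiB(nodeOptions: string): number | undefined {
  const match = /--max[-_]old[-_]space[-_]size=(\d+)/.exec(nodeOptions);
  return match ? Number(match[1]) : undefined;
}

// Converts a byte count (e.g. heap_size_limit) to MiB.
function bytesToMiB(bytes: number): number {
  return Math.round(bytes / (1024 * 1024));
}

// Inside a Node.js process the live values would come from:
//   import v8 from "node:v8";
//   bytesToMiB(v8.getHeapStatistics().heap_size_limit)
//   requestedOldSpaceMiB(process.env.NODE_OPTIONS ?? "")
console.log(requestedOldSpaceMiB("--max-old-space-size=16384")); // 16384
```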

Suggested Fix

Two complementary changes:

1. Proactive context budgeting per session:
Maintain a running token estimate for the messages array. Once it exceeds a configurable threshold (e.g. 75% of the model's declared context window), proactively drop the oldest non-system messages rather than waiting until 95%. This keeps heap usage stable throughout a session.

2. Context window rebase on model switch:
When the user switches models mid-session, re-evaluate and prune the messages array against the new model's context window before resuming streaming — not lazily during the next API call. This avoids the transient double-allocation spike described above.
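
The two changes above could be combined along these lines. This is a minimal sketch, not opencode's or the AI SDK's implementation: `SessionHistory`, `Message`, and `estimateTokens` are hypothetical names, and the token count is a crude characters/4 estimate.

```typescript
type Role = "system" | "user" | "assistant" | "tool";
interface Message {
  role: Role;
  content: string;
}

// Crude estimate (assumption): ~4 characters per token.
const estimateTokens = (m: Message): number => Math.ceil(m.content.length / 4);

class SessionHistory {
  private messages: Message[] = [];
  private tokenTotal = 0; // running estimate, updated on every change
  private contextWindow: number;
  private ratio: number;

  constructor(contextWindow: number, ratio: number = 0.75) {
    this.contextWindow = contextWindow;
    this.ratio = ratio;
  }

  get tokens(): number {
    return this.tokenTotal;
  }

  get messageCount(): number {
    return this.messages.length;
  }

  // Fix 1: prune proactively on every append instead of waiting for 95%.
  append(m: Message): void {
    this.messages.push(m);
    this.tokenTotal += estimateTokens(m);
    this.prune();
  }

  // Fix 2: on a model switch, prune against the new window before streaming resumes.
  rebase(newContextWindow: number): void {
    this.contextWindow = newContextWindow;
    this.prune();
  }

  // Drops the oldest non-system messages in place, so no second array is allocated.
  private prune(): void {
    const budget = this.contextWindow * this.ratio;
    let i = 0;
    while (this.tokenTotal > budget && i < this.messages.length) {
      const m = this.messages[i];
      if (m === undefined) break;
      if (m.role === "system") {
        i++; // system messages are kept
        continue;
      }
      this.tokenTotal -= estimateTokens(m);
      this.messages.splice(i, 1);
    }
  }
}

const sessionHistory = new SessionHistory(1_000); // budget: 750 estimated tokens
sessionHistory.append({ role: "system", content: "s".repeat(400) });
for (let n = 0; n < 10; n++) {
  sessionHistory.append({ role: "user", content: "u".repeat(400) });
}
sessionHistory.rebase(400); // switch to a smaller model: budget drops to 300
console.log(sessionHistory.tokens, sessionHistory.messageCount);
```

Pruning in place keeps peak memory close to steady-state memory. A real implementation would likely also need to keep tool calls paired with their results when dropping messages, which this sketch does not attempt.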

An alternative lightweight mitigation: expose a per-session max_context_tokens config option so users working with large-context models can set a conservative ceiling explicitly.
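
One way such an option could be resolved is sketched below. `max_context_tokens` is the option proposed above, not an existing setting, and `SessionConfig` and `effectiveContextLimit` are hypothetical names.

```typescript
// Hypothetical per-session config carrying the proposed option.
interface SessionConfig {
  max_context_tokens?: number;
}

// The user's cap can only lower the limit, never raise it above the model's window.
function effectiveContextLimit(modelContextWindow: number, config: SessionConfig): number {
  const cap = config.max_context_tokens;
  if (cap === undefined || cap <= 0) return modelContextWindow; // unset or invalid: ignore
  return Math.min(modelContextWindow, cap);
}

console.log(effectiveContextLimit(1_000_000, { max_context_tokens: 200_000 })); // 200000
```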
