Node.js subprocess crashes with V8 OOM (SIGABRT) during long sessions with MiniMax-M3 and Kimi2.7 #33705

## Environment

- **opencode**: 1.17.9
- **macOS**: 26.5.1 (ARM64, 36GB RAM)
- **Node.js**: v26.3.1 (Homebrew)
- **Models**: `minimax-m3` (via `opencode-go`), `kimi2.7` (via `opencode-go`)

## Symptom

After ~1–2 hours of use in a long agentic session, the UI displays **"process has terminated."** The TUI remains open but no further prompts execute. Using `/connect` to switch providers at this point causes full application termination (Abort trap: 6).

After restart, the session is intact and continues normally.

## Crash Evidence (macOS Diagnostic Report)

```
exception:   EXC_CRASH / SIGABRT
termination: "Abort trap: 6" (byProc: node)
asi:         { "libsystem_c.dylib": ["abort() called"] }
```

**Faulting thread stack:**

```
node::OOMErrorHandler
v8::Utils::ReportOOMFailure
v8::internal::V8::FatalProcessOutOfMemory
v8::internal::Heap::FatalProcessOutOfMemory
v8::internal::Heap::CheckIneffectiveMarkCompact   <- GC running but ineffective
v8::internal::Heap::CollectGarbage
[V8 JIT frames]
Builtins_ArrayPrototypeSlice                      <- slicing message array at crash
node::StreamBase::CallJSOnreadMethod              <- actively receiving stream data
```

All 4 V8 worker threads were in `Sweeper::RawSweep` at crash time — the GC was actively running but could not reclaim enough heap before the ceiling was hit.

**macOS incident ID**: `CFCB28C1-762E-4BCB-B4F1-CAD72B7F463B` (2026-06-20)

## Reproduction Pattern

- Occurred **twice with `minimax-m3`** and once with **`kimi2.7`**
- Both models route through `providerID=opencode-go` per opencode logs
- Sessions were ~1.5–2 hours with frequent tool use (bash, file reads, git operations)
- Mid-session model switches (via `/connect`) preceded at least two of the crashes
- Has **not** occurred (to date) with Anthropic or OpenAI models

opencode log at time of crash:
```
message=stream providerID=opencode-go modelID=minimax-m3 ...
message="llm runtime selected" llm.runtime=ai-sdk llm.provider=opencode-go llm.model=minimax-m3
```

## Root Cause Hypothesis

### Part 1 — Unbounded messages array growth to 95% ceiling

The AI SDK defers message pruning until the messages array reaches ~95% of the current model's context window. For a 1M-token model like MiniMax-M3 or Kimi2.7, this means the array is allowed to accumulate up to ~950,000 tokens of content before any pruning occurs.

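950,000 tokens of structured AI SDK message objects is a large live set, although the raw text itself is small relative to the heap:

| Layer | Estimated size |
|---|---|
| Raw UTF-16 text (~4 characters per token, 2 bytes per character) | ~7.6 MB |
| AI SDK object overhead (~10x, assumed) | ~76 MB per retained copy |

Neither figure approaches the heap ceiling on its own, so this hypothesis depends on many copies or snapshots of the history being retained at once. The `Builtins_ArrayPrototypeSlice` frame in the crash stack is consistent with repeated copying but does not prove it. In practice the V8 heap runs out at ~4.2GB (Node.js default ceiling) well before steady state. The `CheckIneffectiveMarkCompact` frame indicates that mark-compact cycles were running but reclaiming too little memory, which is why all 4 Sweeper threads were active but the process still hit `FatalProcessOutOfMemory`.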
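
The sizes involved can be cross-checked with a few lines of arithmetic. This is a rough sketch only: the 4 characters per token ratio, the 10x overhead multiplier, and the function names are assumptions for illustration, not measurements from opencode or the AI SDK.

```typescript
// Back-of-envelope heap estimate for a message history.
// Every constant here is an assumption for illustration, not a measured value.

const BYTES_PER_UTF16_CODE_UNIT = 2;

/** Bytes needed to hold `tokens` worth of text as UTF-16, at an assumed chars-per-token ratio. */
function estimateTextBytes(tokens: number, charsPerToken: number = 4): number {
  return tokens * charsPerToken * BYTES_PER_UTF16_CODE_UNIT;
}

/** How many full copies of a history fit under a heap ceiling, given an assumed overhead multiplier. */
function copiesUnderCeiling(
  tokens: number,
  heapCeilingBytes: number,
  overheadFactor: number = 10,
): number {
  const bytesPerCopy = estimateTextBytes(tokens) * overheadFactor;
  return Math.floor(heapCeilingBytes / bytesPerCopy);
}

const textBytes = estimateTextBytes(950000);            // 7600000 bytes, about 7.6 MB
const copiesAtCeiling = copiesUnderCeiling(950000, 4.2e9); // 55
console.log(textBytes, copiesAtCeiling);
```

Under these assumptions a single copy of a 950K-token history is in the tens of megabytes, so reaching a ~4.2GB ceiling would take on the order of dozens of retained copies, or much larger per-message overhead than assumed here.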

### Part 2 — Mid-session model switching triggers the transient spike

When the user switches models mid-session via `/connect` (e.g., from MiniMax-M3 at 1M context to a model with a 200K context window), the messages array may already be sized for the previous model's 95% threshold (~950K tokens).

The AI SDK must now re-evaluate the array against the **new model's** threshold (~190K tokens, i.e. 95% of 200K) and create a pruned copy. During this operation both the original and the new array exist simultaneously in the V8 heap:

```
Before switch:  ~4.2GB heap (at ceiling, full MiniMax session)
During pruning: original + copy -> transient spike beyond ceiling
Result:         OOM -> SIGABRT
```

This would explain why `/connect` is the reliable crash trigger once the session is already "hung": it is the specific operation that forces a copy-and-prune of the full messages array.
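
How large the transient spike is depends on how the pruned copy is made. The sketch below uses a simplified, hypothetical `ChatMessage` type (not the AI SDK's own) to contrast a shallow copy, which shares the existing message objects, with a deep copy, which duplicates their contents. Whether the real pruning path clones message contents is not confirmed here.

```typescript
// Illustrative only: `ChatMessage` is a simplified stand-in, not the AI SDK's type.
interface ChatMessage {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
}

/** Keep the last `keepLast` messages; the new array shares the message objects. */
function pruneShallow(messages: ChatMessage[], keepLast: number): ChatMessage[] {
  return messages.slice(Math.max(0, messages.length - keepLast));
}

/** Same pruning, but every kept message is duplicated via a JSON round trip. */
function pruneDeep(messages: ChatMessage[], keepLast: number): ChatMessage[] {
  return JSON.parse(JSON.stringify(pruneShallow(messages, keepLast))) as ChatMessage[];
}

const sessionMessages: ChatMessage[] = [
  { role: "system", content: "You are a coding agent." },
  { role: "user", content: "Run the tests." },
  { role: "tool", content: "42 passed, 0 failed" },
];

const shallow = pruneShallow(sessionMessages, 2);
const deep = pruneDeep(sessionMessages, 2);

console.log(shallow[1] === sessionMessages[2]); // true: same object, no extra string storage
console.log(deep[1] === sessionMessages[2]);    // false: a second copy of the content exists
```

If pruning only slices the array, the extra allocation is one array of references. A spike comparable to the full history would require the contents to be cloned or re-serialized while the original is still reachable.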

## Workaround

Raise the Node.js heap ceiling in the shell environment:

```sh
# Add to ~/.zshenv (read by every zsh invocation; exported variables are inherited by subprocesses)
# The value is in MB: 16384 MB = 16 GB old-space limit
export NODE_OPTIONS="--max-old-space-size=16384"
```

This delays but does not fix the root cause.
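
To check that the raised ceiling actually reaches the subprocess, and to see how close a session is to it, heap statistics can be compared against the limit. The sketch below is a pure helper: the field names mirror the object returned by Node's `v8.getHeapStatistics()`, while the function names and the 0.8 warning threshold are hypothetical.

```typescript
// Field names mirror the object returned by Node's `v8.getHeapStatistics()`;
// the threshold and function names are hypothetical.
interface HeapStats {
  used_heap_size: number;
  heap_size_limit: number;
}

/** Fraction of the configured heap limit currently in use (0..1). */
function heapUsageRatio(stats: HeapStats): number {
  return stats.used_heap_size / stats.heap_size_limit;
}

/** True once usage crosses `warnAt`, e.g. to log a warning or prune early. */
function isNearHeapLimit(stats: HeapStats, warnAt: number = 0.8): boolean {
  return heapUsageRatio(stats) >= warnAt;
}

// Example with made-up numbers: 3.6 GB used under a 4 GB limit.
const sampleStats: HeapStats = { used_heap_size: 3.6e9, heap_size_limit: 4.0e9 };
console.log(isNearHeapLimit(sampleStats)); // true (0.9 >= 0.8)
```

In a Node process the input would come from `v8.getHeapStatistics()` in the `node:v8` module. Its `heap_size_limit` should be noticeably larger after the `NODE_OPTIONS` change if the setting was inherited.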

## Suggested Fix

Two complementary changes:

**1. Proactive context budgeting per session:**
Maintain a running token estimate for the `messages` array. Once it exceeds a configurable threshold (e.g. 75% of the model's declared context window), proactively drop the oldest non-system messages rather than waiting until 95%. This keeps heap usage stable throughout a session.

**2. Context window rebase on model switch:**
When the user switches models mid-session, re-evaluate and prune the messages array against the *new* model's context window *before* resuming streaming — not lazily during the next API call. This avoids the transient double-allocation spike described above.
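
The two changes above can be sketched together. This is a minimal illustration under stated assumptions, not a patch: `ChatMessage`, `estimateTokens`, `pruneToBudget`, and `rebaseOnModelSwitch` are hypothetical names, and the token estimate is a crude ~4 characters per token heuristic rather than a real tokenizer.

```typescript
// Hypothetical types and helpers; none of these names come from opencode or the AI SDK.
interface ChatMessage {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
}

/** Crude token estimate (~4 characters per token); a real tokenizer would replace this. */
function estimateTokens(message: ChatMessage): number {
  return Math.ceil(message.content.length / 4);
}

/**
 * Drop the oldest non-system messages until the estimated total fits within
 * `ratio` of `contextWindow`. System messages are always kept.
 * The result is a new array that shares the kept message objects.
 */
function pruneToBudget(
  messages: ChatMessage[],
  contextWindow: number,
  ratio: number = 0.75,
): ChatMessage[] {
  const budget = Math.floor(contextWindow * ratio);
  let total = messages.reduce((sum, m) => sum + estimateTokens(m), 0);

  const kept: ChatMessage[] = [];
  messages.forEach((m) => {
    if (m.role !== "system" && total > budget) {
      total -= estimateTokens(m); // over budget: drop this (oldest remaining) message
    } else {
      kept.push(m);
    }
  });
  return kept;
}

/** On a model switch, prune eagerly against the new window before the next request. */
function rebaseOnModelSwitch(messages: ChatMessage[], newContextWindow: number): ChatMessage[] {
  return pruneToBudget(messages, newContextWindow);
}

// Tiny example: 1 + 2 + 2 + 2 = 7 estimated tokens.
const demoMessages: ChatMessage[] = [
  { role: "system", content: "sys!" },
  { role: "user", content: "aaaaaaaa" },
  { role: "assistant", content: "bbbbbbbb" },
  { role: "user", content: "cccccccc" },
];

console.log(pruneToBudget(demoMessages, 8).length);       // 3 (budget 6: oldest user message dropped)
console.log(rebaseOnModelSwitch(demoMessages, 4).length); // 2 (budget 3: only system + newest kept)
```

Dropping from the oldest end keeps recent tool results available to the model. A production version would also need to keep tool calls paired with their results, which this sketch ignores.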

An alternative lightweight mitigation: expose a per-session `max_context_tokens` config option so users working with large-context models can set a conservative ceiling explicitly.
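
A sketch of how such an option could combine with the model-derived budget follows. The `SessionConfig` shape and `effectiveBudget` helper are hypothetical, and `max_context_tokens` is the option proposed here rather than an existing opencode setting.

```typescript
// Hypothetical config shape; `max_context_tokens` is the proposed option, not an existing setting.
interface SessionConfig {
  max_context_tokens?: number;
}

/** Effective pruning ceiling: the smaller of the model-derived budget and the user's explicit cap. */
function effectiveBudget(
  contextWindow: number,
  config: SessionConfig,
  ratio: number = 0.75,
): number {
  const modelBudget = Math.floor(contextWindow * ratio);
  const cap = config.max_context_tokens;
  return cap === undefined ? modelBudget : Math.min(modelBudget, cap);
}

console.log(effectiveBudget(1000000, {}));                              // 750000
console.log(effectiveBudget(1000000, { max_context_tokens: 200000 })); // 200000
```

Taking the minimum means a user-supplied cap can only tighten the limit, never push it past what the model's window allows.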

