openai-compatible: can't disable thinking on ollama models (snake_case reasoning_effort dropped; no --variant none) #75

@byapparov

Summary

When a local ollama thinking model (gemma3 / "gemma4", qwen3, etc.) runs through the default @ai-sdk/openai-compatible provider, there is no ergonomic or discoverable way to disable the model's thinking. ollama's /v1/chat/completions only turns thinking off when the request body contains reasoning_effort: "none" (refs: ollama/ollama#14820, ollama/ollama#12004). The CLI sends that field only if the user sets the exact camelCase reasoningEffort option. Every other natural attempt silently no-ops, and the run then hangs or returns empty content because the model spends its budget in the hidden reasoning channel.

Impact

Thinking-capable ollama models are effectively unusable for structured-output tasks (e.g. code review) via the CLI unless the user happens to know the exact camelCase key. Symptom: aictrl run produces empty output / appears to hang (130–240 s, no result).

What works / what doesn't (measured: gemma 12B @ local ollama, openai-compatible provider → http://localhost:11434/v1)

| model options / flag | result |
| --- | --- |
| { "reasoningEffort": "none" } (camelCase) | ✅ thinking off, real output, ~14 s |
| { "reasoning_effort": "none" } (snake_case) | ❌ silently dropped → thinking on → empty content |
| { "think": false } / { "reasoning": false } / { "thinking": {"type":"disabled"} } | ❌ not /v1 fields → ignored |
| --variant none / --variant minimal | ❌ no such variant for openai-compatible |

Raw curl /v1/chat/completions with reasoning_effort:"none" confirms ollama honours it (real content, empty reasoning), so the gap is purely on the CLI side.
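
For anyone reproducing the raw check from a script rather than curl, here is a minimal sketch of the same request. The helper names buildChatRequest and sendChat are hypothetical and not part of the CLI. The only facts taken from this report are the endpoint (http://localhost:11434/v1) and the reasoning_effort: "none" field. It assumes a runtime with a global fetch (for example Node 18+).

```typescript
// Hypothetical helpers for reproducing the raw request; not part of the CLI.
interface ChatRequestBody {
  model: string;
  messages: { role: "system" | "user" | "assistant"; content: string }[];
  reasoning_effort?: "none" | "low" | "medium" | "high";
}

// Builds the JSON body. The snake_case field is only added when thinking
// should be disabled, mirroring the curl check described above.
function buildChatRequest(
  model: string,
  prompt: string,
  disableThinking: boolean,
): ChatRequestBody {
  const body: ChatRequestBody = {
    model,
    messages: [{ role: "user", content: prompt }],
  };
  if (disableThinking) body.reasoning_effort = "none";
  return body;
}

// Sends the body to an OpenAI-compatible endpoint and returns the parsed JSON.
// baseURL is assumed to already include the /v1 prefix.
async function sendChat(baseURL: string, body: ChatRequestBody): Promise<unknown> {
  const res = await fetch(`${baseURL}/chat/completions`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  });
  if (!res.ok) throw new Error(`HTTP ${res.status}`);
  return res.json();
}
```

Usage would look like sendChat("http://localhost:11434/v1", buildChatRequest("your-model-id", "Say hi", true)), where your-model-id is a placeholder for whichever model is pulled locally.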

Root cause (traced in packages/cli/src/provider/)

  1. snake_case silently dropped, then overwritten with undefined. The options schema declares only camelCase reasoningEffort (sdk/copilot/chat/openai-compatible-chat-options.ts:15), so snake_case reasoning_effort is stripped by parseProviderOptions. The body builder then unconditionally sets reasoning_effort: compatibleOptions.reasoningEffort (sdk/copilot/chat/openai-compatible-chat-language-model.ts:175), overwriting any spread snake_case value with undefined (dropped by JSON.stringify).
  2. No none/minimal variant for @ai-sdk/openai-compatible. ProviderTransform.variants() (transform.ts:~467) emits only low/medium/high (WIDELY_SUPPORTED_EFFORTS) for this provider, and only when capabilities.reasoning === true. So --variant none resolves to {} (no-op). OPENAI_EFFORTS (which already includes none/minimal) is defined nearby but unused for this provider.
  3. Robustness: with thinking left on, ollama returns a reasoning field that the upstream openai-compatible handling chokes on (see anomalyco/opencode#21903, "Bug: OpenAI-compatible provider rejects Ollama's reasoning field, causing infinite spin/hang"), so the run spins or returns empty content instead of failing with a clear error.
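
The mechanism in item 1 can be reproduced in isolation. The sketch below is not the CLI's code: parseOptions is a hypothetical stand-in for the schema-based parse (it keeps only the declared camelCase key), and buildBody imitates a body builder that spreads undeclared keys and then sets reasoning_effort unconditionally. It shows why the snake_case value disappears from the serialized request.

```typescript
// Hypothetical stand-ins; names and shapes are assumptions for illustration.
type CompatibleOptions = { reasoningEffort?: string };

// Imitates a schema that declares only camelCase reasoningEffort:
// anything else, including reasoning_effort, is not carried over.
function parseOptions(raw: Record<string, unknown>): CompatibleOptions {
  const value = raw["reasoningEffort"];
  return typeof value === "string" ? { reasoningEffort: value } : {};
}

function buildBody(raw: Record<string, unknown>): Record<string, unknown> {
  const compatibleOptions = parseOptions(raw);

  // Undeclared keys are passed through by spreading them into the body.
  const passthrough: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(raw)) {
    if (key !== "reasoningEffort") passthrough[key] = value;
  }

  return {
    model: "example-model",
    ...passthrough, // a snake_case reasoning_effort lands here ...
    reasoning_effort: compatibleOptions.reasoningEffort, // ... and is overwritten
  };
}

// snake_case input: the later key is undefined, and JSON.stringify omits it.
const snakeBody = JSON.stringify(buildBody({ reasoning_effort: "none" }));
// → {"model":"example-model"}

// camelCase input: the parsed value survives and is sent as snake_case.
const camelBody = JSON.stringify(buildBody({ reasoningEffort: "none" }));
// → {"model":"example-model","reasoning_effort":"none"}
```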

Proposed fix

  • A (primary): accept snake_case and prefer it — add reasoning_effort to openaiCompatibleProviderOptions, and at openai-compatible-chat-language-model.ts:175 use
    reasoning_effort: compatibleOptions.reasoning_effort ?? compatibleOptions.reasoningEffort.
  • B: populate none/minimal variants for @ai-sdk/openai-compatible (use OPENAI_EFFORTS at transform.ts:~467) so --variant none works; don't require capabilities.reasoning:true just to expose a "disable" variant.
  • C (docs): document reasoningEffort: "none" as the way to disable thinking for ollama models.
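
A rough sketch of what fixes A and B could look like, under stated assumptions. The options interface, resolveReasoningEffort, and variantsFor are hypothetical names. The contents of the two effort lists are assumed from the description above (low/medium/high for WIDELY_SUPPORTED_EFFORTS; OPENAI_EFFORTS is only known to include none and minimal). The real variants() signature may differ.

```typescript
type Effort = "none" | "minimal" | "low" | "medium" | "high";

// Fix A (sketch): declare both spellings so neither is stripped by the parse.
interface CompatibleOptions {
  reasoningEffort?: Effort;
  reasoning_effort?: Effort;
}

// Prefer the snake_case value, fall back to camelCase, as proposed in A.
function resolveReasoningEffort(options: CompatibleOptions): Effort | undefined {
  return options.reasoning_effort ?? options.reasoningEffort;
}

// Fix B (sketch): assumed contents of the two lists named in the issue.
const WIDELY_SUPPORTED_EFFORTS: readonly Effort[] = ["low", "medium", "high"];
const OPENAI_EFFORTS: readonly Effort[] = ["none", "minimal", "low", "medium", "high"];

// Hypothetical variant table: one entry per effort, keyed by variant name.
function variantsFor(efforts: readonly Effort[]): Record<string, CompatibleOptions> {
  const variants: Record<string, CompatibleOptions> = {};
  for (const effort of efforts) {
    variants[effort] = { reasoningEffort: effort };
  }
  return variants;
}

// With the narrower list, "none" has no entry, so --variant none is a no-op.
const today = variantsFor(WIDELY_SUPPORTED_EFFORTS);
// With the wider list, "none" resolves to an option that disables thinking.
const proposed = variantsFor(OPENAI_EFFORTS);
```

Using ?? rather than || keeps the fallback limited to a missing snake_case value, which matters if an empty string ever becomes a legal setting.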

Workaround

Set "reasoningEffort": "none" (camelCase) in the model's options in the provider config.
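
As an illustration only, the workaround might look like the object below once serialized to the provider config. The surrounding layout (provider → id → npm / options / models) is an assumption based on opencode-style configs and may not match this CLI exactly. The provider id ollama and the model id your-model-id are placeholders. The only part confirmed in this report is the camelCase reasoningEffort: "none" entry in the model's options.

```typescript
// Assumed config layout; only the reasoningEffort entry is confirmed by the report.
const providerConfig = {
  provider: {
    // Placeholder provider id.
    ollama: {
      npm: "@ai-sdk/openai-compatible",
      options: { baseURL: "http://localhost:11434/v1" },
      models: {
        // Placeholder model id: replace with the locally pulled model.
        "your-model-id": {
          options: { reasoningEffort: "none" }, // camelCase is required
        },
      },
    },
  },
};

// JSON form, ready to paste into a config file.
const providerConfigJson = JSON.stringify(providerConfig, null, 2);
```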

Refs

Filed from the skill-md-research cr-local-kg experiment (local-model KG-augmented code review).
