Summary
Running a local ollama thinking-model (gemma3 / "gemma4", qwen3, etc.) through the default @ai-sdk/openai-compatible provider, there is no ergonomic/discoverable way to disable the model's thinking. ollama's /v1/chat/completions only turns thinking off when the request body contains reasoning_effort: "none" (refs: ollama/ollama#14820, ollama/ollama#12004). The CLI sends that field only if the user sets the exact camelCase reasoningEffort option; every other natural attempt silently no-ops, and the run then hangs / returns empty content because the model spends its budget in the hidden reasoning channel.
Impact
Thinking-capable ollama models are effectively unusable for structured-output tasks (e.g. code review) via the CLI unless the user happens to know the exact camelCase key. Symptom: aictrl run produces empty output / appears to hang (130–240 s, no result).
What works / what doesn't (measured: gemma 12B @ local ollama, openai-compatible provider → http://localhost:11434/v1)
| options / flag | result |
| --- | --- |
| { "reasoningEffort": "none" } (camelCase) | works, thinking is disabled |
| { "reasoning_effort": "none" } (snake_case) | silently dropped, empty content |
| { "think": false } / { "reasoning": false } / { "thinking": {"type":"disabled"} } | not /v1 fields → ignored |
| --variant none / --variant minimal | no-op (resolves to {}) |
Raw curl /v1/chat/completions with reasoning_effort:"none" confirms ollama honours it (real content, empty reasoning), so the gap is purely on the CLI side.
Root cause (traced in packages/cli/src/provider/)
snake_case silently dropped, then overwritten with undefined. The options schema declares only camelCase reasoningEffort (sdk/copilot/chat/openai-compatible-chat-options.ts:15), so snake_case reasoning_effort is stripped by parseProviderOptions. The body builder then unconditionally sets reasoning_effort: compatibleOptions.reasoningEffort (sdk/copilot/chat/openai-compatible-chat-language-model.ts:175), overwriting any spread snake_case value with undefined (dropped by JSON.stringify).
No none/minimal variant for @ai-sdk/openai-compatible. ProviderTransform.variants() (transform.ts:~467) emits only low/medium/high (WIDELY_SUPPORTED_EFFORTS) for this provider, and only when capabilities.reasoning === true. So --variant none resolves to {} (no-op). OPENAI_EFFORTS (which already includes none/minimal) is defined nearby but unused for this provider.
ollama's response carries a reasoning field that the upstream openai-compatible handling chokes on (anomalyco/opencode#21903, "Bug: OpenAI-compatible provider rejects Ollama's reasoning field, causing infinite spin/hang") → spin/empty instead of a clear error.
Proposed fix
A (primary): accept snake_case and prefer it. Add reasoning_effort to openaiCompatibleProviderOptions, and at openai-compatible-chat-language-model.ts:175 use reasoning_effort: compatibleOptions.reasoning_effort ?? compatibleOptions.reasoningEffort.
B: populate none/minimal variants for @ai-sdk/openai-compatible (use OPENAI_EFFORTS at transform.ts:~467) so --variant none works; don't require capabilities.reasoning:true just to expose a "disable" variant.
C (docs): document reasoningEffort: "none" as the way to disable thinking for ollama models.
Workaround
Set "reasoningEffort": "none" (camelCase) in the model's options in the provider config.
Refs
ollama/ollama#14820, "reasoning_effort support in OpenAI-compatible /v1/chat/completions API" (document reasoning_effort on /v1)
ollama/ollama#12004 (reasoning_effort: false errors, minimal unsupported, none disables)
anomalyco/opencode#21903, "Bug: OpenAI-compatible provider rejects Ollama's reasoning field, causing infinite spin/hang" (hang on ollama reasoning field)
Filed from the skill-md-research cr-local-kg experiment (local-model KG-augmented code review).
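The overwrite described under Root cause, and the precedence proposed in fix A, can be reproduced in isolation. The sketch below is not the code in openai-compatible-chat-language-model.ts: buildBodyCurrent, buildBodyFixed, asString and the "placeholder-model" name are hypothetical stand-ins, and the spread of leftover options is an assumption made only to mirror the "spread, then unconditionally assign" pattern the issue describes.

```typescript
// Hypothetical stand-in for user-supplied provider options.
type RawOptions = Record<string, unknown>;

// Helper for this sketch only: keep string values, drop everything else.
const asString = (v: unknown): string | undefined =>
  typeof v === "string" ? v : undefined;

// Pattern described in the issue: only the camelCase key is read, and
// reasoning_effort is assigned unconditionally after the spread.
function buildBodyCurrent(raw: RawOptions): string {
  const { reasoningEffort, ...passthrough } = raw;
  return JSON.stringify({
    model: "placeholder-model",
    ...passthrough,
    // When the camelCase key is absent this is undefined, which replaces any
    // spread snake_case value; JSON.stringify then omits the key entirely.
    reasoning_effort: asString(reasoningEffort),
  });
}

// Fix A as proposed: accept snake_case and prefer it over camelCase.
function buildBodyFixed(raw: RawOptions): string {
  const { reasoningEffort, reasoning_effort, ...passthrough } = raw;
  return JSON.stringify({
    model: "placeholder-model",
    ...passthrough,
    reasoning_effort: asString(reasoning_effort) ?? asString(reasoningEffort),
  });
}

// snake_case is lost with the current pattern and survives with the fix.
console.log(buildBodyCurrent({ reasoning_effort: "none" }));
console.log(buildBodyCurrent({ reasoningEffort: "none" }));
console.log(buildBodyFixed({ reasoning_effort: "none" }));
console.log(buildBodyFixed({ reasoningEffort: "none" }));
```

With the current pattern only the camelCase call produces a body containing reasoning_effort; with the fix both spellings do, and snake_case wins if both are set.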