feat: add retry mechanism for ModelBehaviorError#3587
Conversation
Adds max_model_retries to RunConfig. When set, ModelBehaviorError triggers automatic retry with error feedback to the model. The error message is fed back as a synthetic user message so the model can self-correct malformed responses (invalid tool call JSON, nonexistent tools, etc.). Supports both non-streaming (run_single_turn) and streaming (run_single_turn_streamed) paths. Closes openai#325 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 62fe089a74
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| ), | ||
| stream_failed_retry_attempts[0], | ||
| async for event in retry_stream: | ||
| streamed_result._event_queue.put_nowait(RawResponsesStreamEvent(data=event)) |
There was a problem hiding this comment.
Buffer streamed events before retrying malformed turns
When max_model_retries is enabled for streamed runs, a malformed completed response (for example, a nonexistent tool) is only detected later in get_single_step_result_from_response, but this line has already delivered all raw events from the failed attempt to the caller. The retry then emits a second attempt into the same stream, so stream consumers can observe invalid response.output_item.done/tool_called events that are not part of the final accepted turn and may act on a tool call the SDK is about to discard. Consider buffering per-attempt events until the response is accepted, or disabling this retry path for streaming after any events are emitted.
Useful? React with 👍 / 👎.
| tool_use_tracker=tool_use_tracker, | ||
| server_manages_conversation=server_conversation_tracker is not None, | ||
| ) | ||
| except ModelBehaviorError as e: |
There was a problem hiding this comment.
Avoid retrying after tool side effects
This catches ModelBehaviorError around the entire step resolution, not just response parsing. When the error is raised from tool execution (for example, one malformed shell/apply_patch call while another function/computer/custom tool in the same turn has already run, since _execute_tool_plan runs tool groups concurrently), the retry asks the model again and can execute those side-effecting tools a second time. Restrict the retry to validation before any tool side effects, or only retry when no tool execution has started.
Useful? React with 👍 / 👎.
Summary
Adds automatic retry for
ModelBehaviorErrorwith configurablemax_model_retriesonRunConfig.When a model produces a malformed response (invalid tool call JSON, nonexistent tool, etc.), the SDK raises
ModelBehaviorError. Previously there was no built-in retry — the agent just failed. This PR adds an automatic retry mechanism that feeds the error message back to the model so it can self-correct.Changes
RunConfig.max_model_retries: int = 0— new parameter controlling how many automatic retries are attempted whenModelBehaviorErroroccurs (defaults to 0, preserving existing behavior)run_single_turn(non-streaming) — wraps model call + response processing in a retry loop; on error, appends a synthetic user message with the error details and retries the model callrun_single_turn_streamed(streaming) — same retry logic for the streaming path; events from failed attempts are emitted (consistent with existing network-level retry behavior)How it works
get_single_step_result_from_responseraisesModelBehaviorError"Your previous response was invalid: <error message>"Usage
Closes #325