fix(bidi): add configurable event_queue_size to prevent choppy audio #2260
Draft
madhavi-joshi-nutrien wants to merge 1 commit into strands-agents:main from
Conversation
The internal event queue between the model receive loop and the output handler was hardcoded to maxsize=1. This causes the model receive loop to block whenever the output handler performs any async I/O (e.g., websocket.send_json()), resulting in audio chunks piling up and being delivered in bursts, perceived as choppy audio. This is particularly noticeable with Gemini Live, which sends many small audio chunks rapidly (~50/sec), unlike Nova Sonic, which sends fewer, larger chunks.

This change adds a configurable event_queue_size parameter to BidiAgent (default: 1, preserving existing behavior). Users experiencing choppy audio with fast-delivering models can increase this value to provide buffering between the model receive loop and the output handler.
Motivation
When using BidiAgent with BidiGeminiLiveModel, audio playback is choppy and bursty. The internal event queue between the model receive loop and the output handler is hardcoded to maxsize=1. This causes the model receive loop to block whenever the output handler performs any async I/O (e.g., websocket.send_json()), resulting in audio chunks piling up on the model SDK side and arriving in bursts.
This is particularly noticeable with Gemini Live, which sends many small audio chunks rapidly (~50 chunks/sec), unlike Nova Sonic, which sends fewer, larger chunks with built-in TTS buffering.
Public API Changes
BidiAgent.__init__() accepts a new event_queue_size parameter:
```python
# Default behavior preserved (maxsize=1)
agent = BidiAgent(model=model, tools=[...])

# Opt in to a larger buffer for smooth audio with fast-delivering models
agent = BidiAgent(
    model=BidiGeminiLiveModel(...),
    tools=[...],
    event_queue_size=32,  # ~640ms buffer at typical Gemini chunk rates
)
```
The parameter controls the asyncio.Queue maxsize between the model receive loop and the output handler. Higher values absorb I/O latency spikes without stalling the model loop.
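The blocking behavior can be reproduced with plain asyncio, independent of BidiAgent. The sketch below (a simulation, not BidiAgent code) uses a producer that emits "chunks" every ~20 ms, as Gemini Live roughly does, and a consumer whose each "send" takes ~50 ms of simulated I/O. With maxsize=1 the producer stalls behind the slow consumer; with a larger buffer it keeps its cadence:

```python
import asyncio
import time

async def producer(queue, n_chunks):
    """Simulates the model receive loop: chunks arrive every ~20 ms."""
    put_times = []
    for i in range(n_chunks):
        await asyncio.sleep(0.02)   # chunk arrival cadence (~50/sec)
        await queue.put(i)          # blocks while the queue is full
        put_times.append(time.monotonic())
    await queue.put(None)           # sentinel: no more chunks
    return put_times

async def consumer(queue):
    """Simulates the output handler: each send has ~50 ms of I/O latency."""
    while True:
        item = await queue.get()
        if item is None:
            break
        await asyncio.sleep(0.05)   # stand-in for websocket.send_json()

async def run(maxsize, n_chunks=10):
    queue = asyncio.Queue(maxsize=maxsize)
    put_times, _ = await asyncio.gather(
        producer(queue, n_chunks), consumer(queue)
    )
    # The largest gap between successive puts shows how long the
    # producer (model receive loop) was stalled by backpressure.
    gaps = [b - a for a, b in zip(put_times, put_times[1:])]
    return max(gaps)

stall_small = asyncio.run(run(maxsize=1))   # gaps grow toward ~50 ms
stall_large = asyncio.run(run(maxsize=32))  # gaps stay near ~20 ms
```

The same dynamic is why increasing event_queue_size smooths playback: the queue absorbs the handler's I/O latency instead of propagating it back to the model receive loop.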
Use Cases
- **Gemini Live audio streaming:** prevents choppy playback when the output handler has non-trivial async I/O latency (WebSocket, WebRTC)
- **Custom output handlers:** any output handler doing network I/O benefits from decoupling via a larger buffer
- **Tuning backpressure:** users can balance memory usage vs. smoothness for their specific model and handler combination
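For tuning, a full queue represents roughly size × chunk interval of buffered audio. The helper below is purely illustrative (queue_buffer_ms is not part of the API) and shows the arithmetic behind the "~640ms" figure in the example above:

```python
def queue_buffer_ms(queue_size: int, chunk_interval_ms: float) -> float:
    """Approximate audio (ms) a full event queue can absorb."""
    return queue_size * chunk_interval_ms

# Gemini Live at ~50 chunks/sec means ~20 ms per chunk:
buffer_ms = queue_buffer_ms(32, 20)  # 640 ms
```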