-
Notifications
You must be signed in to change notification settings - Fork 3.3k
fix(openai-realtime): cancel timed-out response.create to prevent late playback #6238
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -1584,6 +1584,9 @@ def _on_timeout() -> None: | |
| self._response_created_futures.pop(event_id, None) | ||
| if fut and not fut.done(): | ||
| fut.set_exception(llm.RealtimeError("generate_reply timed out.")) | ||
| # The response.create was already sent; ask the server to cancel it so | ||
| # that any audio it produces does not arrive and play back unexpectedly. | ||
| self.send_event(ResponseCancelEvent(type="response.cancel")) | ||
|
|
||
| handle = asyncio.get_event_loop().call_later(10.0, _on_timeout) | ||
|
|
||
|
|
@@ -1722,6 +1725,7 @@ def _handle_response_created(self, event: ResponseCreatedEvent) -> None: | |
| response_id=event.response.id, | ||
| ) | ||
|
|
||
| timed_out = False | ||
| if ( | ||
| isinstance(event.response.metadata, dict) | ||
| and (client_event_id := event.response.metadata.get("client_event_id")) | ||
|
|
@@ -1731,9 +1735,22 @@ def _handle_response_created(self, event: ResponseCreatedEvent) -> None: | |
| generation_ev.user_initiated = True | ||
| fut.set_result(generation_ev) | ||
| else: | ||
| logger.warning("response of generate_reply received after it's timed out.") | ||
| # The generate_reply caller already received a timeout error. The | ||
| # server kept running and delivered response.created anyway. Cancel it | ||
| # so no audio frames are queued for playback. We keep | ||
| # _current_generation alive so that the subsequent response.output_item.* | ||
| # and response.done events (which may arrive before the server honours | ||
| # the cancel) do not trip their assertions; response.done will close the | ||
| # generation normally. | ||
| logger.warning( | ||
| "response of generate_reply received after it's timed out; " | ||
| "cancelling to prevent unexpected playback." | ||
| ) | ||
| self.send_event(ResponseCancelEvent(type="response.cancel")) | ||
| timed_out = True | ||
|
|
||
| self.emit("generation_created", generation_ev) | ||
| if not timed_out: | ||
| self.emit("generation_created", generation_ev) | ||
|
Comment on lines
1735
to
+1753
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 🔴 Timeout detection in response handler is unreachable, so unexpected audio playback still occurs after timeout The late-arriving response is never detected as timed-out ( Impact: After a generate_reply timeout, audio from the late server response can still play back to the user — the exact scenario the PR intends to prevent. Detailed mechanism: _on_timeout pops the future before _handle_response_created can find itIn
Later, when the server delivers
The To fix this, (Refers to lines 1728-1753) Prompt for agentsWas this helpful? React with 👍 or 👎 to provide feedback. |
||
|
|
||
| def _handle_response_output_item_added(self, event: ResponseOutputItemAddedEvent) -> None: | ||
| assert self._current_generation is not None, "current_generation is None" | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚩 The cancel sent by _on_timeout may cancel an unrelated active response
At
livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/realtime/realtime_model.py:1589, the newly addedResponseCancelEventis sent without specifying aresponse_id. If the generate_reply's response hasn't actually started on the server (noresponse.createdreceived yet), but another response IS active (e.g., a VAD-triggered server-initiated response), the cancel may target that unrelated response instead. The OpenAI Realtime API'sresponse.cancelcancels the currently in-progress response, which may not be the one that timed out. This is the same pattern used elsewhere in the file (e.g., line 1610 ininterrupt()), so it's consistent with existing behavior, but worth noting as a potential race condition in multi-response scenarios.Was this helpful? React with 👍 or 👎 to provide feedback.