Skip to content

feat: add buffered Chat Completions tool-call streaming#3506

Merged
seratch merged 7 commits into
openai:mainfrom
incoffeemonster:feat/buffer-chatcmpl-tool-streams
Jun 22, 2026
Merged

feat: add buffered Chat Completions tool-call streaming#3506
seratch merged 7 commits into
openai:mainfrom
incoffeemonster:feat/buffer-chatcmpl-tool-streams

Conversation

@incoffeemonster

@incoffeemonster incoffeemonster commented May 26, 2026

Copy link
Copy Markdown
Contributor

Summary

Some OpenAI-compatible Chat Completions providers can return valid function tool calls in the final assembled stream, but their streamed tool_calls chunks may not be reliable enough for the SDK to process incrementally. This can cause the Agents SDK to emit partial tool-call events too early or build invalid assistant/tool message ordering for later turns.

This PR adds an opt-in buffered streaming mode for those providers. When buffer_streamed_tool_calls=True is enabled, OpenAIChatCompletionsModel continues to stream normal text deltas immediately, but buffers function tool-call deltas until the provider stream completes. It then emits one complete synthetic tool-call chunk into the existing ChatCmplStreamHandler, so downstream Responses-format events and session history remain protocol-safe.

Default behavior is unchanged.

Key changes:

  • Adds buffer_streamed_tool_calls to OpenAIChatCompletionsModel and OpenAIProvider.
  • Adds openai_buffer_streamed_tool_calls passthrough to MultiProvider.
  • Preserves usage chunks and provider-specific fields such as Gemini/LiteLLM thought signatures.
  • Raises ModelBehaviorError when a stream finishes with finish_reason="tool_calls" but no tool-call deltas were observed.

Test plan

Automated checks run locally:

python3 -m py_compile src/agents/models/chatcmpl_stream_handler.py src/agents/models/openai_chatcompletions.py src/agents/models/openai_provider.py src/agents/models/multi_provider.py tests/models/test_openai_chatcompletions_stream.py
ruff check src/agents/models/chatcmpl_stream_handler.py src/agents/models/openai_chatcompletions.py src/agents/models/openai_provider.py src/agents/models/multi_provider.py tests/models/test_openai_chatcompletions_stream.py
ruff format --check src/agents/models/chatcmpl_stream_handler.py src/agents/models/openai_chatcompletions.py src/agents/models/openai_provider.py src/agents/models/multi_provider.py tests/models/test_openai_chatcompletions_stream.py
mypy src/agents/models/chatcmpl_stream_handler.py src/agents/models/openai_chatcompletions.py src/agents/models/openai_provider.py src/agents/models/multi_provider.py tests/models/test_openai_chatcompletions_stream.py
pyright src/agents/models/chatcmpl_stream_handler.py src/agents/models/openai_chatcompletions.py src/agents/models/openai_provider.py src/agents/models/multi_provider.py tests/models/test_openai_chatcompletions_stream.py
pytest tests/models/test_openai_chatcompletions_stream.py

Result: 27 passed.

Manual smoke test:

  • Loaded the local fork via test.py.
  • Confirmed agents imports from local src/agents.
  • Confirmed OpenAIChatCompletionsModel(..., buffer_streamed_tool_calls=True).
  • Verified normal text streaming.
  • Verified a streamed tool-call run with DeepSeek completed without tool_calls/tool_call_id protocol errors.
  • Verified two consecutive tool questions in the same SQLiteSession completed without stale assistant/tool message ordering errors.

Issue number

N/A

Checks

  • I've added new tests (if relevant)
  • [] I've added/updated the relevant documentation
  • I've run make lint and make format
  • I've made sure tests pass

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 09033e9b6c

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread src/agents/models/chatcmpl_stream_handler.py Outdated
Comment thread src/agents/models/chatcmpl_stream_handler.py
@incoffeemonster

Copy link
Copy Markdown
Contributor Author

@codex review

@chatgpt-codex-connector

Copy link
Copy Markdown

Codex Review: Didn't find any major issues. Delightful!

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

@github-actions

github-actions Bot commented Jun 9, 2026

Copy link
Copy Markdown
Contributor

This PR is stale because it has been open for 10 days with no activity.

@github-actions github-actions Bot added the stale label Jun 9, 2026

@seratch seratch left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution. The underlying problem is real: when an OpenAI-compatible provider emits tool-call deltas before later assistant text, the current output can be replayed as assistant(tool_calls) -> assistant(text) -> tool, which is invalid. Buffering produces a single assistant message containing both the content and tool calls, followed by the tool output, so this direction is viable.

Before merging, please add a regression test that reproduces this exact ordering through the next-turn Chat Completions conversion, ideally using the session path described in the PR. The current tests cover coalescing and edge cases but not the reported multi-turn failure.

@incoffeemonster incoffeemonster requested a review from seratch June 22, 2026 06:16
@seratch

seratch commented Jun 22, 2026

Copy link
Copy Markdown
Member

Thanks for the update. The new regression test covers the next-turn replay failure I was concerned about, and the buffered Chat Completions approach looks viable.

Before merging, please add one focused test that verifies MultiProvider(openai_buffer_streamed_tool_calls=True) forwards the setting into the OpenAI Chat Completions model.

@seratch seratch changed the title feat: Add buffered Chat Completions tool-call streaming feat: add buffered Chat Completions tool-call streaming Jun 22, 2026
@seratch seratch added this to the 0.17.x milestone Jun 22, 2026
@seratch seratch enabled auto-merge (squash) June 22, 2026 07:11
@incoffeemonster incoffeemonster requested a review from seratch June 22, 2026 07:12
@seratch seratch merged commit 28d2a6c into openai:main Jun 22, 2026
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants