examples: add multi-turn tool-call loop for the Responses API #3439

Open: akrishnash wants to merge 2 commits into openai:main from akrishnash:examples/responses-tool-call-loop

@akrishnash

Summary

Every existing example in examples/responses/ stops after the first model turn: they show how a function_call is generated but not what happens next. The complete agent pattern (execute tool locally, pass result back, repeat until final text answer) is not demonstrated anywhere.

This PR adds examples/responses/tool_call_loop.py to fill that gap.

What the example shows

  • Multi-turn loop using previous_response_id so conversation state is carried automatically across turns (no manual input-list rebuilding)
  • Tool execution: two local tools (get_weather and calculate) are called, and their results are passed back as function_call_output items in the next turn
  • Bounded by MAX_TURNS to prevent runaway loops
  • Exits cleanly when the model produces a turn with no function_call items
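
As a rough illustration of the tool-execution bullet above, the sketch below shows one way a function_call could be dispatched to a local function and wrapped as a function_call_output item. The tool bodies, the TOOLS schemas, and the helper name to_function_call_output are assumptions made for illustration; they are not the code in tool_call_loop.py.

```python
import json

# Hypothetical stand-ins for the two local tools; the PR's real bodies may differ.
def get_weather(city: str) -> str:
    canned = {"Paris": "18C, cloudy", "Tokyo": "24C, clear"}
    return canned.get(city, "unknown")

def calculate(expression: str) -> str:
    # Placeholder that only handles integer products such as "7 * 24 * 3600".
    product = 1
    for factor in expression.split("*"):
        product *= int(factor)
    return str(product)

# Tool schemas, assumed to follow the flat function-tool shape used by the Responses API.
TOOLS = [
    {
        "type": "function",
        "name": "get_weather",
        "description": "Return the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
    {
        "type": "function",
        "name": "calculate",
        "description": "Evaluate an arithmetic expression.",
        "parameters": {
            "type": "object",
            "properties": {"expression": {"type": "string"}},
            "required": ["expression"],
        },
    },
]

TOOL_REGISTRY = {"get_weather": get_weather, "calculate": calculate}

def to_function_call_output(call_id: str, name: str, arguments_json: str) -> dict:
    """Run one requested tool and wrap its result for the next turn's input."""
    try:
        kwargs = json.loads(arguments_json)
        result = TOOL_REGISTRY[name](**kwargs)
    except Exception as exc:  # report failures to the model instead of crashing
        result = json.dumps({"error": str(exc)})
    return {"type": "function_call_output", "call_id": call_id, "output": result}
```

Returning an error string rather than raising lets the model see the failure and decide whether to retry with different arguments.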

How to run

OPENAI_API_KEY=your-key python examples/responses/tool_call_loop.py

The example asks about weather in two cities and a seconds-in-N-weeks calculation, exercising parallel tool calls across two turns before producing a final text answer.

Every existing example in examples/responses/ shows a single turn —
the model generates a function_call, and the example stops there.
None show what to do next: execute the tool, feed the result back,
and loop until the model produces a final text answer.

This example fills that gap with a minimal, self-contained agent loop:
- Two local tools: get_weather() and calculate()
- Uses previous_response_id to carry conversation state across turns
  instead of manually reconstructing the input list each round
- Guards against unbounded loops with MAX_TURNS
- Prints tool invocations so the flow is easy to follow

The complete pattern is:
  send message
    → model returns function_call items
    → execute tools locally
    → pass function_call_output items + previous_response_id
    → repeat until model returns plain text
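
The pattern above could be sketched roughly as follows. This is not the PR's file: run_tool_loop and the execute_tool callback are hypothetical names, the client is assumed to be an openai.OpenAI() instance (or anything exposing a compatible responses.create), and the model name is left to the caller.

```python
MAX_TURNS = 10  # assumed bound; the PR's actual value is not shown here

def run_tool_loop(client, model, user_message, tools, execute_tool, max_turns=MAX_TURNS):
    """Send a message, run requested tools, and repeat until the model answers in text.

    execute_tool(name, arguments_json) is a caller-supplied function that
    returns the tool result as a string.
    """
    next_input = user_message
    previous_id = None
    for _ in range(max_turns):
        kwargs = {"model": model, "input": next_input, "tools": tools}
        if previous_id is not None:
            # Server-side state: only the new items are sent on later turns.
            kwargs["previous_response_id"] = previous_id
        response = client.responses.create(**kwargs)

        calls = [item for item in response.output if item.type == "function_call"]
        if not calls:
            return response.output_text  # no tool requests left: final answer

        # One function_call_output per requested call, matched by call_id.
        next_input = [
            {
                "type": "function_call_output",
                "call_id": call.call_id,
                "output": execute_tool(call.name, call.arguments),
            }
            for call in calls
        ]
        previous_id = response.id
    raise RuntimeError(f"no final answer after {max_turns} turns")
```

Passing the client in as a parameter keeps the loop easy to exercise against a stub without network access.
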
@akrishnash requested a review from a team as a code owner June 25, 2026 11:38

@chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0b7ed73c02

Comment thread examples/responses/tool_call_loop.py Outdated
def calculate(expression: str) -> str:
    """Safely evaluate a Python arithmetic expression (no builtins, math module available)."""
    try:
        result = eval(expression, {"__builtins__": {}}, vars(math))  # noqa: S307

P2: Replace unsafe eval in calculate

If this example is reused with arbitrary user prompts, the model controls expression, so this eval can run non-arithmetic Python expressions despite the empty __builtins__ sandbox; for example, dunder introspection or resource-exhausting expressions can execute on the host before the error handler returns. Since the docstring presents this as safe, use an AST/operator whitelist or a small arithmetic parser instead of evaluating model-supplied text.
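
A small, harmless demonstration of the concern, assuming a function that mirrors the flagged line (naive_calculate is a name made up for this sketch): attribute access on a literal needs no builtins, so non-arithmetic expressions still evaluate.

```python
import math

def naive_calculate(expression: str):
    # Mirrors the flagged call: empty __builtins__, math names as locals.
    return eval(expression, {"__builtins__": {}}, vars(math))

# Intended use still works.
print(naive_calculate("7 * 24 * 3600"))           # 604800

# But dunder introspection also runs, with no builtins required.
print(naive_calculate("(1).__class__.__name__"))  # int

# From any object, the full list of loaded classes is reachable.
subclasses = naive_calculate("().__class__.__base__.__subclasses__()")
print(type(subclasses).__name__)                  # list
```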

@akrishnash (Author)

Good catch, fixed in 82173e8. Replaced eval() with a whitelisted AST walk in calculate: only arithmetic operators (+ - * / // % **) and a small math subset (sqrt, floor, ceil, abs, pi, e, tau) are permitted; everything else is rejected rather than executed. Attribute access ((1).__class__), out-of-whitelist calls (__import__(...)), and resource bombs (9**9**9, via an exponent cap) now return a structured error. Verified the example's own use case still works (7 * 24 * 3600 → 604800).

Codex review flagged that eval(), even with empty __builtins__, can still
run dunder introspection or resource-exhausting expressions on the host,
since the model controls the expression string. Walk the AST and permit only
arithmetic operators plus a small math subset; cap exponents. Rejects
attribute access, function calls outside the whitelist, and huge powers.
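
For readers who want to see the shape of such a fix, here is a minimal sketch of a whitelisted AST walk. It is not the code from commit 82173e8: the name safe_calculate, the JSON result format, and the MAX_EXPONENT and MAX_POW_BASE limits are assumptions, and the base cap is an extra guard added in this sketch.

```python
import ast
import json
import math
import operator

_BIN_OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul,
    ast.Div: operator.truediv, ast.FloorDiv: operator.floordiv,
    ast.Mod: operator.mod, ast.Pow: operator.pow,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}
_FUNCS = {"sqrt": math.sqrt, "floor": math.floor, "ceil": math.ceil, "abs": abs}
_CONSTS = {"pi": math.pi, "e": math.e, "tau": math.tau}
MAX_EXPONENT = 1000    # assumed cap, not the PR's value
MAX_POW_BASE = 10**6   # assumed extra guard against huge bases

def _eval_node(node):
    # Plain int/float literals only (bool, str, complex are rejected).
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
        left, right = _eval_node(node.left), _eval_node(node.right)
        if isinstance(node.op, ast.Pow) and (
            abs(right) > MAX_EXPONENT or abs(left) > MAX_POW_BASE
        ):
            raise ValueError("power operands too large")
        return _BIN_OPS[type(node.op)](left, right)
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_eval_node(node.operand))
    if isinstance(node, ast.Name) and node.id in _CONSTS:
        return _CONSTS[node.id]
    if (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
            and node.func.id in _FUNCS and not node.keywords):
        return _FUNCS[node.func.id](*[_eval_node(arg) for arg in node.args])
    # Attribute access, subscripts, lambdas, unknown names, etc. end up here.
    raise ValueError(f"disallowed expression: {type(node).__name__}")

def safe_calculate(expression: str) -> str:
    """Evaluate whitelisted arithmetic; return a JSON result or a JSON error."""
    try:
        tree = ast.parse(expression, mode="eval")
        return json.dumps({"result": _eval_node(tree.body)})
    except Exception as exc:
        return json.dumps({"error": str(exc)})
```

Because nothing is ever passed to eval, an unrecognized node is rejected before any of it runs, which is the property the review comment asked for.
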
@DTiming24

This example is quite useful because it shows the full tool-call loop end to end. A small note about where to place per-turn timeout or retry logic would make the pattern even safer to copy into production code.
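
One possible placement, sketched under assumptions: wrap each per-turn model request (and, separately, each tool execution) in a small retry helper, so a transient failure repeats only that step rather than the whole conversation. The helper name call_with_retry, the default limits, and the choice of retryable exception types are all hypothetical; the OpenAI SDK's own client-level timeout and max_retries settings may already cover the request side.

```python
import random
import time

def call_with_retry(make_request, *, max_attempts=3, base_delay=0.5,
                    retry_on=(TimeoutError, ConnectionError), sleep=time.sleep):
    """Call make_request(), retrying transient failures with exponential backoff.

    make_request is a zero-argument callable, e.g. a lambda wrapping one
    per-turn request. Exceptions outside retry_on propagate immediately.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return make_request()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error to the caller
            # Backoff doubles each attempt, plus a little jitter.
            sleep(base_delay * 2 ** (attempt - 1) + random.uniform(0, 0.1))
```

Inside a loop like the one in this PR, the per-turn request would become something like `call_with_retry(lambda: client.responses.create(...))`, leaving the MAX_TURNS bound to count conversation turns rather than retries.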
