# Automatically use Responses API `previous_response_id` instead of sending message history when possible #2756
## Conversation
```python
if isinstance(message, ModelResponse) and message.model_name:
    if self._model_name in message.model_name:
        previous_response_id = message.provider_response_id
        messages = [messages[-1]]
```
If the matched `ModelResponse` was not the most recent one, we should include all the messages that came after it, not just the last one.
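
A minimal sketch of that fix as a standalone helper (the name `trim_history` and the exact signature are illustrative, not this PR's code; the PR's own helper is referred to later as `_get_response_id_and_trim`):

```python
from __future__ import annotations

from pydantic_ai.messages import ModelMessage, ModelResponse


def trim_history(messages: list[ModelMessage], model_name: str) -> tuple[list[ModelMessage], str | None]:
    """Find the newest response from the matching model; keep everything after it."""
    for i in range(len(messages) - 1, -1, -1):
        message = messages[i]
        if isinstance(message, ModelResponse) and message.model_name:
            if model_name in message.model_name:
                # Keep all messages *after* the match, not just messages[-1],
                # so later turns (possibly from other models) aren't dropped.
                return messages[i + 1 :], message.provider_response_id
    return messages, None
```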
Here, I was also thinking: if the user sends `message_history` with results from different models combined, it becomes tricky to handle. Something like:

```python
[
    ModelResponse(..., model_name='claude-3-5-sonnet-20241022', ...),
    ModelResponse(..., model_name='gpt-5-2025-08-07', ..., provider_response_id='resp_123'),
    ModelResponse(..., model_name='claude-3-5-sonnet-20241022', ...),
]
```

The first Claude response will be excluded. wdyt?

That should be OK since `model_name='gpt-5-2025-08-07', ..., provider_response_id='resp_123'` indicates that that response (and the history before it) has already been sent to the Responses API, right? So it will have that model response in the server-side history, even though it originally came from another model.

Indeed that is true. I was thinking of a scenario like (not sure how common this is):

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIResponsesModel

agent = Agent('anthropic:claude-3-5-sonnet-latest')
r1 = await agent.run(user_prompt='tell me something about rockets')

model = OpenAIResponsesModel('gpt-5')
agent = Agent(model)
r2 = await agent.run('what are rockets used for')

agent = Agent('anthropic:claude-3-5-sonnet-latest')
r3 = await agent.run(user_prompt='which countries have sent rockets to space')

history = r1.all_messages() + r2.all_messages() + r3.all_messages()

model = OpenAIResponsesModel('gpt-5')
agent = Agent(model)
result = await agent.run('summarize the conversation so far', message_history=history)
```
@GDaamn Hmm, I don't think that's common, but it is indeed technically supported. Then maybe we should only send `previous_response_id` if all responses came from the matching model.
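
A hedged sketch of that stricter condition, as a standalone helper (names illustrative, not this PR's code); `previous_response_id` would then only be sent when this returns `True`:

```python
from __future__ import annotations

from pydantic_ai.messages import ModelMessage, ModelResponse


def all_responses_from_model(messages: list[ModelMessage], model_name: str) -> bool:
    """True only if every ModelResponse in the history came from the matching model."""
    responses = [m for m in messages if isinstance(m, ModelResponse)]
    return bool(responses) and all(
        m.model_name is not None and model_name in m.model_name for m in responses
    )
```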
```python
# pass it to the next ModelRequest as previous_response_id to preserve context.
# Since the full history isn't needed, only the latest message is kept.
if isinstance(message, ModelResponse) and message.model_name:
    if self._model_name in message.model_name:
```
This can be `==`, right? And be merged with the previous `if` statement?
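
i.e. collapsing the two checks into something like this sketch (later revisited, since OpenAI reports dated model names; see the next comment):

```python
# the reviewer's proposed merge of the isinstance and model-name checks
if isinstance(message, ModelResponse) and message.model_name == self._model_name:
    previous_response_id = message.provider_response_id
    messages = [messages[-1]]
```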
Even when the model is set:

```python
model = OpenAIResponsesModel('gpt-5')
agent = Agent(model)
result = await agent.run('what is the capital of france')
print(result.all_messages())
#> [ModelRequest(parts=[UserPromptPart(content='what is the capital of france', timestamp=datetime.datetime(2025, 9, 3, 14, 12, 42, 548683, tzinfo=datetime.timezone.utc))]),
#>  ModelResponse(parts=[TextPart(content='Paris.')], usage=RequestUsage(input_tokens=12, output_tokens=8, details={'reasoning_tokens': 0}), model_name='gpt-5-2025-08-07', timestamp=datetime.datetime(2025, 9, 3, 14, 12, 42, tzinfo=TzInfo(UTC)), provider_name='openai', provider_response_id='resp_68b84cdac694819493f5b822d10a152c0946e65b063355c0')]
```

OpenAI responds with a dated version (`gpt-5-2025-08-07`) of the given model.
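
So a strict equality check would miss the dated alias, which is presumably why the substring check is used; roughly:

```python
# self._model_name is the configured alias, e.g. 'gpt-5';
# message.model_name is what OpenAI reports back, e.g. 'gpt-5-2025-08-07'.
# 'gpt-5' == 'gpt-5-2025-08-07'  -> False: equality breaks for aliases
# 'gpt-5' in 'gpt-5-2025-08-07'  -> True:  substring check still matches
if isinstance(message, ModelResponse) and message.model_name and self._model_name in message.model_name:
    previous_response_id = message.provider_response_id
```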
Ah OK, that's worth adding a comment then
```diff
@@ -932,6 +952,8 @@ async def _responses_create(
     truncation=model_settings.get('openai_truncation', NOT_GIVEN),
     timeout=model_settings.get('timeout', NOT_GIVEN),
     service_tier=model_settings.get('openai_service_tier', NOT_GIVEN),
+    previous_response_id=previous_response_id
+    or model_settings.get('openai_previous_response_id', NOT_GIVEN),
```
If an `openai_previous_response_id` was explicitly set, I think we should not do the stuff we do above, and assume the user only passed in the messages in message history that they actually meant to send.

**Commit:** Automatically use Responses API `previous_response_id` instead of sending message history when possible (Co-authored-by: Douwe Maan <[email protected]>)
```python
else:
    # Mixed model responses invalidate response_id,
    # so the history is kept intact.
    response_id = None
```
If we find a `ModelResponse` without a `model_name`, we also want to reset `response_id` + `break`, right? I think having the third branch made sense. We can change the `if isinstance(m, ModelResponse) and m.model_name` line to `if isinstance(m, ModelResponse) and m.model_name and self.model_name in m.model_name`.
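
Putting those suggestions together, the loop could look roughly like this (a sketch combining the merged condition and the reset-and-break branch, not the final diff):

```python
response_id = None
for m in reversed(messages):
    if isinstance(m, ModelResponse) and m.model_name and self.model_name in m.model_name:
        # Newest response came from the matching model: reuse its server-side context.
        response_id = m.provider_response_id
        break
    elif isinstance(m, ModelResponse):
        # A response without a model_name, or from a different model,
        # invalidates response_id, so the history is kept intact.
        response_id = None
        break
```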
> By passing the `provider_response_id` from an earlier run, you can allow the model to build on its own prior reasoning without needing to resend the full message history.
>
> If message history is provided and all responses come from the same OpenAI model, only the latest message is sent along with `previous_response_id`.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should account for https://ai.pydantic.dev/message-history/#processing-message-history. It would be unexpected if that feature is ignored entirely when using the Responses API. Maybe we should not automatically use previous_response_id
if a history processor was configured?

Additionally, https://platform.openai.com/docs/guides/reasoning?api-mode=responses#encrypted-reasoning-items mentions the Responses API "stateless mode (either with store set to false, or when an organization is enrolled in zero data retention)", in which case `previous_response_id` wouldn't work. So there needs to be a way to turn it off. Perhaps it's better to make it opt-in by setting `OpenAIModelSettings(openai_previous_response_id='auto')` or something?
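
If it were opt-in, usage might look like this (hypothetical: assumes the `'auto'` value proposed above, and reuses `history` from the earlier example):

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModelSettings, OpenAIResponsesModel

model = OpenAIResponsesModel('gpt-5')
agent = Agent(model)

# 'auto' opts in to the previous_response_id optimization; leaving it unset
# keeps the full-history behavior that stateless mode / ZDR orgs require.
result = await agent.run(
    'summarize the conversation so far',
    message_history=history,  # assembled as in the earlier example
    model_settings=OpenAIModelSettings(openai_previous_response_id='auto'),
)
```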

- For `history_processor`, I did some quick experiments and it seems it isn't interfering. Also, checking the code, I can see that the history is processed inside `_prepare_request`, a step before we pass the `messages` to the model request where the `previous_response_id` logic happens.

pydantic-ai/pydantic_ai_slim/pydantic_ai/_agent_graph.py, lines 422 to 442 in 3c2f1cf:

```python
        model_settings, model_request_parameters, message_history, _ = await self._prepare_request(ctx)
        model_response = await ctx.deps.model.request(message_history, model_settings, model_request_parameters)
        ctx.state.usage.requests += 1
        return self._finish_handling(ctx, model_response)

    async def _prepare_request(
        self, ctx: GraphRunContext[GraphAgentState, GraphAgentDeps[DepsT, NodeRunEndT]]
    ) -> tuple[ModelSettings | None, models.ModelRequestParameters, list[_messages.ModelMessage], RunContext[DepsT]]:
        ctx.state.message_history.append(self.request)
        ctx.state.run_step += 1

        run_context = build_run_context(ctx)

        # This will raise errors for any tool name conflicts
        ctx.deps.tool_manager = await ctx.deps.tool_manager.for_run_step(run_context)

        message_history = await _process_message_history(ctx.state, ctx.deps.history_processors, run_context)
```

Please lmk if there are other scenarios where this is not the case or I misunderstood the flow.

- Regarding "stateless mode", thanks for pointing it out. In that case we can modify this:

```python
previous_response_id = model_settings.get('openai_previous_response_id')
if not previous_response_id:
    messages, previous_response_id = self._get_response_id_and_trim(messages)
```

to this, right?

```python
previous_response_id = model_settings.get('openai_previous_response_id')
if previous_response_id == 'auto':
    messages, previous_response_id = self._get_response_id_and_trim(messages)
```

-> When the user passes `message_history`, it will be trimmed only when `openai_previous_response_id = 'auto'` and the `ModelResponse`s are from the same model. But in this case, we'll still have to explicitly set `openai_previous_response_id = 'auto'` when multiple requests are made within the same `agent.run` (like as part of a tool call, etc.).

**Commit:** set response_id=None when there is a valid response_id but latest_model_request doesn't exist (Co-authored-by: Douwe Maan <[email protected]>)

Closes #2663

- Adds an `openai_previous_response_id` field to store the response id (`provider_response_id`) from the latest response sent by the Responses API, which can then be reused as `previous_response_id` in subsequent requests.
- Sends the stored `provider_response_id` as `previous_response_id` in the next request, allowing the Responses API to access prior reasoning on the server side. This is used when message history is passed instead of `openai_previous_response_id`, or during a single `agent.run` with multiple request/response cycles.
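
A rough usage sketch of the two modes described above (the setting and attribute names come from this PR; the exact behavior may differ from this sketch):

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModelSettings, OpenAIResponsesModel

model = OpenAIResponsesModel('gpt-5')
agent = Agent(model)

# Automatic: when the whole history comes from this model, only the latest
# message is sent and previous_response_id carries the server-side context.
r1 = await agent.run('tell me something about rockets')
r2 = await agent.run('what are they used for?', message_history=r1.all_messages())

# Explicit: reuse a stored provider_response_id directly.
last = r1.all_messages()[-1]  # the ModelResponse from the first run
r3 = await agent.run(
    'expand on that',
    model_settings=OpenAIModelSettings(openai_previous_response_id=last.provider_response_id),
)
```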