# Automatically use Responses API `previous_response_id` instead of sending message history when possible #2756
## Conversation
```python
if isinstance(message, ModelResponse) and message.model_name:
    if self._model_name in message.model_name:
        previous_response_id = message.provider_response_id
        messages = [messages[-1]]
```
If the matched `ModelResponse` was not the most recent one, we should include all the messages that came after it, not just the last one.
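
A minimal sketch of that fix as a standalone helper (the name `trim_history` and the exact signature are illustrative, not this PR's code; the PR's own helper is referred to later as `_get_response_id_and_trim`):

```python
from __future__ import annotations

from pydantic_ai.messages import ModelMessage, ModelResponse


def trim_history(messages: list[ModelMessage], model_name: str) -> tuple[list[ModelMessage], str | None]:
    """Find the newest response from the matching model; keep everything after it."""
    for i in range(len(messages) - 1, -1, -1):
        message = messages[i]
        if isinstance(message, ModelResponse) and message.model_name:
            if model_name in message.model_name:
                # Keep all messages *after* the match, not just messages[-1],
                # so later turns (possibly from other models) aren't dropped.
                return messages[i + 1 :], message.provider_response_id
    return messages, None
```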
Here, I was also thinking: if the user sends `message_history` with results from different models combined, it becomes tricky to handle. Something like:

```python
[
    ModelResponse(..., model_name='claude-3-5-sonnet-20241022', ...),
    ModelResponse(..., model_name='gpt-5-2025-08-07', ..., provider_response_id='resp_123'),
    ModelResponse(..., model_name='claude-3-5-sonnet-20241022', ...),
]
```

The first Claude response will be excluded. wdyt?

That should be OK since `model_name='gpt-5-2025-08-07', ..., provider_response_id='resp_123'` indicates that that response (and the history before it) has already been sent to the Responses API, right? So it will have that model response in the server-side history, even though it originally came from another model.

Indeed that is true. I was thinking of a scenario like (not sure how common this is):

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIResponsesModel

agent = Agent('anthropic:claude-3-5-sonnet-latest')
r1 = await agent.run(user_prompt='tell me something about rockets')

model = OpenAIResponsesModel('gpt-5')
agent = Agent(model)
r2 = await agent.run('what are rockets used for')

agent = Agent('anthropic:claude-3-5-sonnet-latest')
r3 = await agent.run(user_prompt='which countries have sent rockets to space')

history = r1.all_messages() + r2.all_messages() + r3.all_messages()

model = OpenAIResponsesModel('gpt-5')
agent = Agent(model)
result = await agent.run('summarize the conversation so far', message_history=history)
```
@GDaamn Hmm, I don't think that's common, but it is indeed technically supported. Then maybe we should only send `previous_response_id` if all responses came from the matching model.
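
A hedged sketch of that stricter condition, as a standalone helper (names illustrative, not this PR's code); `previous_response_id` would then only be sent when this returns `True`:

```python
from __future__ import annotations

from pydantic_ai.messages import ModelMessage, ModelResponse


def all_responses_from_model(messages: list[ModelMessage], model_name: str) -> bool:
    """True only if every ModelResponse in the history came from the matching model."""
    responses = [m for m in messages if isinstance(m, ModelResponse)]
    return bool(responses) and all(
        m.model_name is not None and model_name in m.model_name for m in responses
    )
```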
```python
# pass it to the next ModelRequest as previous_response_id to preserve context.
# Since the full history isn't needed, only the latest message is kept.
if isinstance(message, ModelResponse) and message.model_name:
    if self._model_name in message.model_name:
```
This can be `==`, right? And be merged with the previous `if` statement?
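
i.e. collapsing the two checks into something like this sketch (later revisited, since OpenAI reports dated model names; see the next comment):

```python
# the reviewer's proposed merge of the isinstance and model-name checks
if isinstance(message, ModelResponse) and message.model_name == self._model_name:
    previous_response_id = message.provider_response_id
    messages = [messages[-1]]
```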
Even when the model is set:

```python
model = OpenAIResponsesModel('gpt-5')
agent = Agent(model)
result = await agent.run('what is the capital of france')
print(result.all_messages())
#> [ModelRequest(parts=[UserPromptPart(content='what is the capital of france', timestamp=datetime.datetime(2025, 9, 3, 14, 12, 42, 548683, tzinfo=datetime.timezone.utc))]),
#>  ModelResponse(parts=[TextPart(content='Paris.')], usage=RequestUsage(input_tokens=12, output_tokens=8, details={'reasoning_tokens': 0}), model_name='gpt-5-2025-08-07', timestamp=datetime.datetime(2025, 9, 3, 14, 12, 42, tzinfo=TzInfo(UTC)), provider_name='openai', provider_response_id='resp_68b84cdac694819493f5b822d10a152c0946e65b063355c0')]
```

OpenAI responds with a dated version (`gpt-5-2025-08-07`) of the given model.
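
So a strict equality check would miss the dated alias, which is presumably why the substring check is used; roughly:

```python
# self._model_name is the configured alias, e.g. 'gpt-5';
# message.model_name is what OpenAI reports back, e.g. 'gpt-5-2025-08-07'.
# 'gpt-5' == 'gpt-5-2025-08-07'  -> False: equality breaks for aliases
# 'gpt-5' in 'gpt-5-2025-08-07'  -> True:  substring check still matches
if isinstance(message, ModelResponse) and message.model_name and self._model_name in message.model_name:
    previous_response_id = message.provider_response_id
```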
Ah OK, that's worth adding a comment then
```diff
@@ -932,6 +952,8 @@ async def _responses_create(
     truncation=model_settings.get('openai_truncation', NOT_GIVEN),
     timeout=model_settings.get('timeout', NOT_GIVEN),
     service_tier=model_settings.get('openai_service_tier', NOT_GIVEN),
+    previous_response_id=previous_response_id
+    or model_settings.get('openai_previous_response_id', NOT_GIVEN),
```
If an `openai_previous_response_id` was explicitly set, I think we should not do the stuff we do above, and assume the user only passed in the messages in message history that they actually meant to send.

**Commit:** Automatically use Responses API `previous_response_id` instead of sending message history when possible (Co-authored-by: Douwe Maan <[email protected]>)
```python
else:
    # Mixed model responses invalidate response_id,
    # so the history is kept intact.
    response_id = None
```
If we find a `ModelResponse` without a `model_name`, we also want to reset `response_id` + `break`, right? I think having the third branch made sense. We can change the `if isinstance(m, ModelResponse) and m.model_name` line to `if isinstance(m, ModelResponse) and m.model_name and self.model_name in m.model_name`.
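
Putting those suggestions together, the loop could look roughly like this (a sketch combining the merged condition and the reset-and-break branch, not the final diff):

```python
response_id = None
for m in reversed(messages):
    if isinstance(m, ModelResponse) and m.model_name and self.model_name in m.model_name:
        # Newest response came from the matching model: reuse its server-side context.
        response_id = m.provider_response_id
        break
    elif isinstance(m, ModelResponse):
        # A response without a model_name, or from a different model,
        # invalidates response_id, so the history is kept intact.
        response_id = None
        break
```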
> By passing the `provider_response_id` from an earlier run, you can allow the model to build on its own prior reasoning without needing to resend the full message history.
>
> If message history is provided and all responses come from the same OpenAI model, only the latest message is sent along with `previous_response_id`.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should account for https://ai.pydantic.dev/message-history/#processing-message-history. It would be unexpected if that feature is ignored entirely when using the Responses API. Maybe we should not automatically use previous_response_id
if a history processor was configured?

Additionally, https://platform.openai.com/docs/guides/reasoning?api-mode=responses#encrypted-reasoning-items mentions the Responses API "stateless mode (either with store set to false, or when an organization is enrolled in zero data retention)", in which case `previous_response_id` wouldn't work. So there needs to be a way to turn it off. Perhaps it's better to make it opt-in by setting `OpenAIModelSettings(openai_previous_response_id='auto')` or something?
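
If it were opt-in, usage might look like this (hypothetical: assumes the `'auto'` value proposed above, and reuses `history` from the earlier example):

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModelSettings, OpenAIResponsesModel

model = OpenAIResponsesModel('gpt-5')
agent = Agent(model)

# 'auto' opts in to the previous_response_id optimization; leaving it unset
# keeps the full-history behavior that stateless mode / ZDR orgs require.
result = await agent.run(
    'summarize the conversation so far',
    message_history=history,  # assembled as in the earlier example
    model_settings=OpenAIModelSettings(openai_previous_response_id='auto'),
)
```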

- For `history_processor`, I did some quick experiments and it seems it isn't interfering. Also, checking the code, I can see that the history is processed inside `_prepare_request`, a step before we pass the `messages` to the model request where the `previous_response_id` logic happens.

pydantic-ai/pydantic_ai_slim/pydantic_ai/_agent_graph.py, lines 422 to 442 in 3c2f1cf:

```python
        model_settings, model_request_parameters, message_history, _ = await self._prepare_request(ctx)
        model_response = await ctx.deps.model.request(message_history, model_settings, model_request_parameters)
        ctx.state.usage.requests += 1
        return self._finish_handling(ctx, model_response)

    async def _prepare_request(
        self, ctx: GraphRunContext[GraphAgentState, GraphAgentDeps[DepsT, NodeRunEndT]]
    ) -> tuple[ModelSettings | None, models.ModelRequestParameters, list[_messages.ModelMessage], RunContext[DepsT]]:
        ctx.state.message_history.append(self.request)
        ctx.state.run_step += 1

        run_context = build_run_context(ctx)

        # This will raise errors for any tool name conflicts
        ctx.deps.tool_manager = await ctx.deps.tool_manager.for_run_step(run_context)

        message_history = await _process_message_history(ctx.state, ctx.deps.history_processors, run_context)
```

Please lmk if there are other scenarios where this is not the case or I misunderstood the flow.

- Regarding "stateless mode", thanks for pointing it out. In that case we can modify this:

```python
previous_response_id = model_settings.get('openai_previous_response_id')
if not previous_response_id:
    messages, previous_response_id = self._get_response_id_and_trim(messages)
```

to this, right?

```python
previous_response_id = model_settings.get('openai_previous_response_id')
if previous_response_id == 'auto':
    messages, previous_response_id = self._get_response_id_and_trim(messages)
```

-> When the user passes `message_history`, it will be trimmed only when `openai_previous_response_id = 'auto'` and the `ModelResponse`s are from the same model. But in this case, we'll still have to explicitly set `openai_previous_response_id = 'auto'` when multiple requests are made within the same `agent.run` (like as part of a tool call, etc.).

**Commit:** set response_id=None when there is a valid response_id but latest_model_request doesn't exist (Co-authored-by: Douwe Maan <[email protected]>)

Closes #2663

- Adds an `openai_previous_response_id` field to store the response id (`provider_response_id`) from the latest response sent by the Responses API, which can then be reused as `previous_response_id` in subsequent requests.
- Sends the stored `provider_response_id` as `previous_response_id` in the next request, allowing the Responses API to access prior reasoning on the server side. This is used when message history is passed instead of `openai_previous_response_id`, or during a single `agent.run` with multiple request/response cycles.
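
A rough usage sketch of the two modes described above (the setting and attribute names come from this PR; the exact behavior may differ from this sketch):

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModelSettings, OpenAIResponsesModel

model = OpenAIResponsesModel('gpt-5')
agent = Agent(model)

# Automatic: when the whole history comes from this model, only the latest
# message is sent and previous_response_id carries the server-side context.
r1 = await agent.run('tell me something about rockets')
r2 = await agent.run('what are they used for?', message_history=r1.all_messages())

# Explicit: reuse a stored provider_response_id directly.
last = r1.all_messages()[-1]  # the ModelResponse from the first run
r3 = await agent.run(
    'expand on that',
    model_settings=OpenAIModelSettings(openai_previous_response_id=last.provider_response_id),
)
```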