[agentserver-responses] Harden response model, type safety, and builder API#46302
Open
Default model to empty string when not provided in the request, ensuring the field is always present in the response payload. The OpenAI SDK requires model to be present to deserialize the response object.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pull request overview
Ensures the Responses hosting layer always stamps a model field into response payloads (even when omitted from the request), preventing downstream clients (notably the OpenAI SDK) from failing to deserialize responses when model is missing.
Changes:
- Default model to "" when building the per-request execution context so apply_common_defaults() will always include model in lifecycle snapshots.
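The defaulting described above can be sketched roughly as follows. This is a minimal illustration, not the package's code: build_execution_context and apply_defaults are hypothetical stand-ins (the latter for apply_common_defaults()), and treating an explicit null model the same as a missing one is an assumption.

```python
from typing import Any


def build_execution_context(payload: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical stand-in for the per-request execution context builder."""
    # Assumption: a missing or null model falls back to "", so the key is
    # always present for later stamping.
    model = payload.get("model") or ""
    return {"model": model, "stream": bool(payload.get("stream", False))}


def apply_defaults(response: dict[str, Any], context: dict[str, Any]) -> dict[str, Any]:
    """Simplified stand-in for apply_common_defaults(): copy context fields
    onto the response without overwriting values that are already set."""
    stamped = dict(response)
    stamped.setdefault("model", context["model"])
    return stamped


snapshot = apply_defaults({"id": "resp_1"}, build_execution_context({}))
```

With an empty request payload, snapshot still carries a model key, which is what lets a strict client deserialize the response.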
RaviPidaparthi
approved these changes
Apr 14, 2026
Address PR review feedback: add contract tests verifying the model field is present in the response payload when omitted from the request, for both sync (stream=False) and streaming (stream=True) modes.
…eleted'
The OpenAI spec returns {id, object: 'response', deleted: true} for
DELETE /responses/{id}. Our handler was returning 'response.deleted'
which doesn't match. Fixed the handler and updated all 5 test
assertions.
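For reference, the corrected DELETE payload shape can be sketched as below. build_delete_result is a hypothetical helper name used only for illustration; the three fields come from the commit message above.

```python
def build_delete_result(response_id: str) -> dict[str, object]:
    """Hypothetical helper: body returned for DELETE /responses/{id}."""
    # The object field stays "response"; the deleted flag signals removal.
    return {"id": response_id, "object": "response", "deleted": True}


result = build_delete_result("resp_123")
```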
ResponseExecution now carries agent_session_id and conversation_id so that _RuntimeState.to_snapshot can forcibly stamp them (S-038/S-040) on both the response.as_dict() path and the minimal fallback dict. All four orchestrator ResponseExecution creation sites pass both fields from the execution context.
The manual _patch.py override of ResponseObject.output erased the element type (list instead of list[OutputItem]), preventing the model framework from deserializing nested dicts into OutputItem instances. This caused get_history to return plain dicts instead of typed models.
Changes:
- Remove output: list override; use generated list[OutputItem]
- Remove ToolChoiceAllowed override (generated type is identical)
- Move Sphinx docstring fixes into models_patch.py shim so make generate-models preserves them instead of overwriting
- Accept emitter upgrade to model_base.py (XML refactor)
- Regenerate _validators.py from current TypeSpec sources
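Why a bare list annotation loses the element type can be shown with the standard typing helpers. This is a sketch only: the classes below are simplified stand-ins, and element_type is a hypothetical helper, not how the generated model framework inspects fields.

```python
from typing import Any, Optional, get_args, get_type_hints


class OutputItem:
    """Simplified stand-in for the generated OutputItem model."""


class ErasedResponse:
    output: list  # element type erased, as in the old manual override


class TypedResponse:
    output: list[OutputItem]  # element type preserved, as generated


def element_type(model_cls: type, field: str) -> Optional[Any]:
    """Return the declared element type of a list field, or None if erased."""
    args = get_args(get_type_hints(model_cls)[field])
    return args[0] if args else None


erased = element_type(ErasedResponse, "output")
typed = element_type(TypedResponse, "output")
```

A deserializer that relies on the annotation has nothing to convert nested dicts into when the result is None, which is consistent with get_history returning plain dicts before the fix.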
6fa2b47 to
a141311
…type tests
- Fix track_completed_output_item to use OutputItem._deserialize(dict, []) instead of OutputItem(dict) so response.output contains proper discriminated subtypes (OutputItemMessage, OutputItemFunctionToolCall, etc.) instead of base OutputItem instances. This ensures handler devs can use isinstance() and attribute access on output items.
- Add test_public_contract_types.py with 22 tests covering every public handler/consumer surface for type fidelity:
  * context.request → CreateResponse
  * context.get_input_items() → Item subtypes
  * context.get_input_text() → str
  * context.get_history() → OutputItem subtypes (first-ever coverage)
  * stream.response → ResponseObject
  * stream.response.output → OutputItem subtypes
  * Builder emit_* → ResponseStreamEvent subtypes
  * Generator convenience → ResponseStreamEvent subtypes
  * InMemoryProvider round-trip preserves subtypes
- Add isinstance assertions to existing tests in test_builders.py, test_event_stream_generators.py, and test_response_event_stream_builder.py
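The difference between constructing the base class and deserializing through a discriminator can be illustrated with a small registry. This is a simplified sketch under stated assumptions: the classes are stand-ins, the deserialize classmethod and _registry are hypothetical (the real code calls OutputItem._deserialize), and the type values "message" and "function_call" are assumed wire values.

```python
from __future__ import annotations

from typing import Any, ClassVar


class OutputItem:
    """Simplified stand-in for the generated base model."""

    _registry: ClassVar[dict[str, type[OutputItem]]] = {}
    type_name: ClassVar[str] = ""

    def __init__(self, data: dict[str, Any]) -> None:
        self.data = dict(data)

    def __init_subclass__(cls, **kwargs: Any) -> None:
        super().__init_subclass__(**kwargs)
        if cls.type_name:
            # Register each subtype under its discriminator value.
            OutputItem._registry[cls.type_name] = cls

    @classmethod
    def deserialize(cls, data: dict[str, Any]) -> OutputItem:
        """Pick the subtype from the 'type' field; fall back to the base."""
        target = cls._registry.get(data.get("type", ""), cls)
        return target(data)


class OutputItemMessage(OutputItem):
    type_name = "message"


class OutputItemFunctionToolCall(OutputItem):
    type_name = "function_call"


plain = OutputItem({"type": "message"})
typed = OutputItem.deserialize({"type": "message"})
```

Here plain is only a base OutputItem, while typed is an OutputItemMessage, so isinstance() checks in handler code behave as expected only on the second path.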
Replace random UUID fallback for agent_session_id with deterministic SHA-256 derivation matching .NET SessionIdDerivation logic.
Priority chain:
1. Explicit agent_session_id from payload (unchanged)
2. Platform env FOUNDRY_AGENT_SESSION_ID (unchanged)
3. Deterministic: SHA256(agent_name:agent_version:partition_hint), where partition_hint is extracted from conversation_id or previous_response_id via IdGenerator.extract_partition_key
4. Random 63-char lowercase hex (one-shot, no conversational context)
This ensures session affinity: the same conversation + agent identity always resolves to the same session ID, enabling stateful backends to route consistently without requiring explicit session IDs.
New functions in _request_parsing.py:
- derive_session_id(): public deterministic derivation
- _compute_hex_hash(): SHA-256 → 63-char hex
- _generate_random_hex(): os.urandom fallback
- _extract_agent_identity(): name/version from agent_reference
Updated _resolve_session_id() signature to accept agent_reference.
Updated call site in _endpoint_handler.py to pass agent_reference.
Updated all tests (unit + contract) from UUID to 63-char hex format.
Added 14 new derivation tests covering determinism, agent isolation, version isolation, priority, and non-standard ID formats.
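The priority chain can be sketched as below. This is an illustration, not the package's implementation: the function names and signatures are hypothetical, and producing 63 characters by truncating the 64-character SHA-256 hex digest is an assumption, since the commit message does not say how the length is reached.

```python
import hashlib
import os
from typing import Mapping, Optional

SESSION_ID_LENGTH = 63  # length stated in the commit message


def compute_hex_hash(value: str) -> str:
    # Assumption: truncate the 64-char SHA-256 hex digest to 63 chars.
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:SESSION_ID_LENGTH]


def generate_random_hex() -> str:
    # 32 random bytes give 64 hex chars; trim to the same length.
    return os.urandom(32).hex()[:SESSION_ID_LENGTH]


def resolve_session_id(
    explicit: Optional[str],
    agent_name: str,
    agent_version: str,
    partition_hint: Optional[str],
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Hypothetical resolver following the four-step priority chain."""
    env = os.environ if env is None else env
    if explicit:  # 1. explicit ID from the payload
        return explicit
    platform_id = env.get("FOUNDRY_AGENT_SESSION_ID")
    if platform_id:  # 2. platform-provided ID
        return platform_id
    if partition_hint:  # 3. deterministic derivation
        return compute_hex_hash(f"{agent_name}:{agent_version}:{partition_hint}")
    return generate_random_hex()  # 4. one-shot random fallback


first = resolve_session_id(None, "agent", "1", "conv_abc", env={})
second = resolve_session_id(None, "agent", "1", "conv_abc", env={})
```

The same agent identity and partition hint always yield the same ID (first equals second), while changing the agent name or version yields a different one.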
Port .NET pattern: every emit_* method now returns its specific event subtype (e.g. ResponseCreatedEvent, ResponseOutputItemAddedEvent) via typing.cast() instead of the base ResponseStreamEvent.
Covers all builders:
- ResponseEventStream: 6 lifecycle methods
- OutputItemBuilder / BaseOutputItemBuilder: emit_added, emit_done
- OutputItemMessageBuilder, TextContentBuilder, RefusalContentBuilder
- FunctionCallBuilder, FunctionCallOutputBuilder
- ReasoningSummaryPartBuilder, ReasoningItemBuilder
- FileSearchCall, WebSearchCall, CodeInterpreter, ImageGen, McpCall, McpListTools, CustomToolCall builders
Adds test_emit_return_types.py with 70 isinstance assertions covering every public emit_* method across all 16 builder classes.
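A minimal sketch of the return-type narrowing follows. The classes are simplified stand-ins and _EVENT_CLASSES is a hypothetical lookup. Note that typing.cast() does nothing at runtime, so the isinstance assertions only hold if the internal emitter actually constructs the subtype; the sketch assumes it does.

```python
from typing import Any, cast


class ResponseStreamEvent(dict):
    """Simplified stand-in for the generated base event model."""


class ResponseCreatedEvent(ResponseStreamEvent):
    """Simplified stand-in for the generated response.created event."""


# Hypothetical lookup from wire type to concrete event class.
_EVENT_CLASSES: dict[str, type[ResponseStreamEvent]] = {
    "response.created": ResponseCreatedEvent,
}


class ResponseEventStream:
    def _emit_event(self, payload: dict[str, Any]) -> ResponseStreamEvent:
        # The internal emitter is typed against the base class but builds
        # the concrete subtype for the given event type.
        event_cls = _EVENT_CLASSES.get(payload["type"], ResponseStreamEvent)
        return event_cls(payload)

    def emit_created(self) -> ResponseCreatedEvent:
        event = self._emit_event({"type": "response.created"})
        # cast() only informs the type checker; the object is unchanged.
        return cast(ResponseCreatedEvent, event)


created = ResponseEventStream().emit_created()
```

Callers get attribute completion and static checking for ResponseCreatedEvent without an isinstance() narrowing step of their own.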
…tputItem only
Remove dict[str, Any] from the public signature; all item types are generated models. Internal callers use _emit_added/_emit_done directly.
Also: fix handler guide (emit_failed/emit_incomplete kwargs, request= pattern), revert CHANGELOG to initial-release form, remove session ID derivation docs (internal detail).
…del types
- ResponseEventStream constructor: agent_reference, request, response now accept only their respective model types (no dict[str, Any])
- Terminal methods (emit_completed/failed/incomplete): usage accepts only ResponseUsage (no dict[str, Any])
- Convenience generators (output_item_computer_call, _computer_call_output, _local_shell_call, _function_shell_call, _function_shell_call_output, _apply_patch_call): all action/output/environment params accept only their respective generated model types (no dict[str, Any])
- Async mirrors: same tightening as sync counterparts
- emit_annotation_added: annotation accepts only Annotation (no dict)
- _set_terminal_fields: usage tightened
- Internal _build_events: coerce dict→AgentReference before passing to ResponseEventStream
- Tests updated to use model constructors instead of raw dicts
- Docs updated to show ResponseUsage model usage
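The pattern of a strict public signature plus coercion at the internal boundary might look like the sketch below. Everything here is illustrative: AgentReference is a simplified dataclass stand-in for the generated model, coerce_agent_reference is a hypothetical helper, and the runtime TypeError is an assumption added to make the contract visible (the commit only describes tightening the type annotations).

```python
from dataclasses import dataclass
from typing import Any, Mapping, Union


@dataclass(frozen=True)
class AgentReference:
    """Simplified stand-in for the generated AgentReference model."""

    name: str
    version: str = ""


def coerce_agent_reference(
    value: Union[AgentReference, Mapping[str, Any]]
) -> AgentReference:
    """Hypothetical internal helper: accept a dict at the boundary and
    hand a model instance to the strictly typed public constructor."""
    if isinstance(value, AgentReference):
        return value
    return AgentReference(name=value["name"], version=value.get("version", ""))


class ResponseEventStream:
    """Simplified stand-in: the public constructor takes the model only."""

    def __init__(self, *, agent_reference: AgentReference) -> None:
        # Assumption: a runtime check, shown here only for illustration.
        if not isinstance(agent_reference, AgentReference):
            raise TypeError("agent_reference must be an AgentReference")
        self.agent_reference = agent_reference


stream = ResponseEventStream(
    agent_reference=coerce_agent_reference({"name": "planner", "version": "2"})
)
```

Keeping the dict handling in one internal helper means public callers see a single accepted type, while raw request payloads are still handled where they enter the system.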
…[Any] types
- emit_event → _emit_event: internal only, all callers are sibling emit_* methods and _builders subpackage
- with_output_item_defaults → _with_output_item_defaults: internal only, called only by _builders._base
- validate_response_event_stream → _validate_response_event_stream: internal only, called only by _normalize_lifecycle_events
- normalize_lifecycle_events → _normalize_lifecycle_events: internal only, called only by hosting._endpoint_handler
- Removed both from streaming/__init__.py exports
- output_item_custom_tool_call_output: output tightened from str | list[Any] to str | list[FunctionAndCustomToolCallOutput]
- OutputItemFunctionCallOutputBuilder.emit_added/emit_done: output tightened from str | list[Any] to str | list[InputTextContentParam | InputImageContentParamAutoParam | InputFileContentParam]
- Removed unused Any import from _function.py
…alize 22 symbols
- Remove EVENT_TYPE alias: replaced all ~80 usages across 10 files with generated_models.ResponseStreamEventType directly
- Remove from streaming exports: EVENT_TYPE, encode_sse_event, encode_keep_alive_comment
- Remove from hosting exports: CreateSpan, CreateSpanHook, InMemoryCreateSpanHook, RecordedSpan, build_create_span_tags, build_platform_server_header, start_create_span, build_api_error_response, build_invalid_mode_error_response, build_not_found_error_response, parse_and_validate_create_response, parse_create_response, to_api_error_response, validate_create_response
- Remove from models exports: ResponseExecution, StreamEventRecord, StreamReplayState, get_instruction_items, get_output_item_id
- Remove from top-level exports: to_output_item
- Keep public: get_conversation_id, get_input_expanded, get_content_expanded, get_conversation_expanded, get_tool_choice_expanded, all builder classes, ResponseEventStream, TextResponse, all store/Foundry types
...ntserver/azure-ai-agentserver-responses/azure/ai/agentserver/responses/streaming/_helpers.py
Outdated
...r-responses/azure/ai/agentserver/responses/models/_generated/sdk/models/_utils/model_base.py
...er/azure-ai-agentserver-responses/azure/ai/agentserver/responses/hosting/_request_parsing.py
...r-responses/azure/ai/agentserver/responses/models/_generated/sdk/models/_utils/model_base.py
…ve .NET references
…utput, ToolChoiceAllowed.tools)
RaviPidaparthi
approved these changes
Apr 15, 2026
…valid TYPE_CHECKING import) CI failures
[agentserver-responses] Harden response model, type safety, and builder API
Summary
Comprehensive hardening of the azure-ai-agentserver-responses package to ensure strict type safety, correct model lifecycle, robust builder APIs, and minimal public API surface.
Changes
Response Model Always Present (Bug Fix)
- ResponseEventStream now always initialises a ResponseObject envelope at construction
- Avoids None-reference errors when handlers access stream.response before emit_created()
Type Safety Audit
- emit_* methods across 16 builder classes return specific event subtypes via typing.cast() instead of the base ResponseStreamEvent
Contract Type Tests
Deterministic Session ID Derivation
- derive_session_id() produces SHA-256 based IDs from conversation context, matching .NET SessionIdDerivation.Derive
OutputItemBuilder Tightening
- OutputItemBuilder.emit_added() / emit_done() accept only OutputItem model instances (no raw dicts)
Public API Parameter Tightening (dict → generated models)
- ResponseEventStream constructor: agent_reference, request, response accept only their respective model types
- Terminal methods (emit_completed / emit_failed / emit_incomplete): usage accepts only ResponseUsage
- Convenience generators: action, output, environment, operation params accept only generated model types
- emit_annotation_added: accepts only Annotation (no dict)
- output_item_custom_tool_call_output: output tightened to str | list[FunctionAndCustomToolCallOutput]
- OutputItemFunctionCallOutputBuilder.emit_added / emit_done: output tightened to str | list[InputTextContentParam | InputImageContentParamAutoParam | InputFileContentParam]
API Surface Reduction: Internalized Methods
- emit_event() → _emit_event(): low-level dict-based emitter
- with_output_item_defaults() → _with_output_item_defaults(): item stamping helper
- validate_response_event_stream() → _validate_response_event_stream()
- normalize_lifecycle_events() → _normalize_lifecycle_events()
API Surface Reduction: Removed Exports & EVENT_TYPE Alias
- Removed the EVENT_TYPE alias entirely: replaced ~80 usages across 10 files with generated_models.ResponseStreamEventType directly
- Removed from streaming exports: EVENT_TYPE, encode_sse_event, encode_keep_alive_comment
- Removed from hosting exports: the span symbols (CreateSpan, CreateSpanHook, InMemoryCreateSpanHook, RecordedSpan, build_create_span_tags, build_platform_server_header, start_create_span) and all validation functions (build_api_error_response, build_invalid_mode_error_response, build_not_found_error_response, parse_and_validate_create_response, parse_create_response, to_api_error_response, validate_create_response)
- Removed from models exports: ResponseExecution, StreamEventRecord, StreamReplayState, get_instruction_items, get_output_item_id
- Removed from top-level exports: to_output_item
Docs & Samples
- model= → request= pattern, usage examples use ResponseUsage model
Test Results