Improve OpenAI Agents conformance and metrics#49
Conversation
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Updates the OpenAI Agents v2 instrumentation to better support current Agents SDK tracing payload shapes, while updating the supported openai-agents version range and expanding test coverage for newer span types.
Changes:
- Add support for additional Agents SDK span data types (task/turn/custom/MCP tools/speech group) in the span processor.
- Improve message normalization to handle string
inputpayloads (e.g., response spans). - Update compatibility range to
openai-agents >= 0.17.0and expand tests accordingly.
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| instrumentation/opentelemetry-instrumentation-openai-agents-v2/tests/test_z_span_processor_unit.py | Updates unit test expectation for unknown span operation handling (now None). |
| instrumentation/opentelemetry-instrumentation-openai-agents-v2/tests/test_tracer.py | Adds new tracer tests for string inputs and current Agents SDK span types. |
| instrumentation/opentelemetry-instrumentation-openai-agents-v2/tests/stubs/agents/tracing/init.py | Extends tracing stubs with additional span types used in tests. |
| instrumentation/opentelemetry-instrumentation-openai-agents-v2/tests/requirements.oldest.txt | Bumps oldest tested openai-agents to 0.17.0. |
| instrumentation/opentelemetry-instrumentation-openai-agents-v2/tests/requirements.latest.txt | Bumps latest tested openai-agents to 0.17.2. |
| instrumentation/opentelemetry-instrumentation-openai-agents-v2/src/opentelemetry/instrumentation/openai_agents/span_processor.py | Adds new span-type handling, safer naming, usage extraction, and string message normalization. |
| instrumentation/opentelemetry-instrumentation-openai-agents-v2/src/opentelemetry/instrumentation/openai_agents/package.py | Updates declared instrumented package minimum version. |
| instrumentation/opentelemetry-instrumentation-openai-agents-v2/pyproject.toml | Updates optional dependency range for instruments extra. |
| instrumentation/opentelemetry-instrumentation-openai-agents-v2/.changelog/49.fixed | Adds changelog entry for the fix. |
3b2fd00 to
1eefc83
Compare
36e29d1 to
6b63fd3
Compare
|
Hi @lzchen, PR is updated. Could you please take another look when you have a chance? |
6b63fd3 to
1e23f36
Compare
|
I think it would be best to fix existing issues in OpenAI Agents instrumentation before adding more features to it - #86 |
0f7ef4e to
a9dbc10
Compare
8697ef8 to
7e22d8a
Compare
7e22d8a to
1a107d1
Compare
|
Hi @nagkumar91, @hectorhdzg, @rads-1996, could one of you take a look at this PR when you have a chance? |
|
|
||
| # ---- Normalization utilities (embedded from utils.py) ---- | ||
|
|
||
| _CUSTOM_ATTRIBUTE_RESERVED_PREFIXES = ( |
There was a problem hiding this comment.
Would it make more sense to move all of these constants to a constant.py file instead of adding it in this one?
There was a problem hiding this comment.
I kept them here because they are only used in span_processor.py, and this package does not currently have a constants module. I’d prefer to avoid the extra refactor in this PR, but we can do it in a follow-up PR once this is merged.
e34d0c9 to
4610bb0
Compare
| GEN_AI_EMBEDDINGS_DIMENSION_COUNT = "gen_ai.embeddings.dimension.count" | ||
| GEN_AI_TOKEN_TYPE = _attr("GEN_AI_TOKEN_TYPE", "gen_ai.token.type") | ||
|
|
||
| _DEFAULT_FINISH_REASON = "unknown" |
There was a problem hiding this comment.
Do we need a default if gen_ai.response.finish_reasons is recommended and not mandatory?
There was a problem hiding this comment.
Updated, this now only emits gen_ai.response.finish_reasons when the SDK provides a finish reason.
| attributes = { | ||
| GEN_AI_PROVIDER_NAME: self.system_name, | ||
| GEN_AI_SYSTEM_KEY: self.system_name, | ||
| GEN_AI_OPERATION_NAME: GenAIOperationName.INVOKE_AGENT, |
There was a problem hiding this comment.
This is kind of problematic as [gen_ai.operation.name](https://github.com/open-telemetry/semantic-conventions-genai/blob/main/docs/registry/attributes/gen-ai.md) is a required field. I understand that the root workflow span shouldn't be marked as invoke_agent however. @lmolkova any thoughts?
| """End root span when trace ends.""" | ||
| if root_span := self._root_spans.pop(trace.trace_id, None): | ||
| if root_span.is_recording(): | ||
| root_span.set_status(Status(StatusCode.OK)) |
9d404c0 to
cdc1908
Compare
Description
Updates the OpenAI Agents instrumentation to better handle current Agents SDK tracing data and the GenAI semantic-convention issues reported in #86.
High-level changes:
gen_ai.operation.name = unknownfor non-GenAI Agents SDK spans such as task, turn, MCP list-tools, speech group, and custom spans.gen_ai.systemfrom OpenAI Agents spans and leaves successful spans with unset status.gen_ai.response.finish_reasonsand output-messagefinish_reasonfallback values.gen_ai.tool.call.idfor tool spans, falling back to the SDK span ID when no call ID is available.invoke_agent, so arbitrary workflow trace names are not validated as invoke-agent span names.meter_provider.CustomSpanData.data(for examplesandbox.*and process exit attributes) without assigning them a GenAI operation.openai-agentstest range for the current Agents SDK.This addresses the OpenAI Agents instrumentation failures called out in #86 without marking that issue closed here, since this PR still does not add a live weaver scenario for the package.
Type of change
How has this been tested?
Latest checks on the final pushed diff:
uvx --with tox-uv tox -e py310-test-instrumentation-genai-openai_agents-oldest,py310-test-instrumentation-genai-openai_agents-latest,py311-test-instrumentation-genai-openai_agents-oldest,py311-test-instrumentation-genai-openai_agents-latest,py312-test-instrumentation-genai-openai_agents-oldest,py312-test-instrumentation-genai-openai_agents-latest,py313-test-instrumentation-genai-openai_agents-oldest,py313-test-instrumentation-genai-openai_agents-latest,py314-test-instrumentation-genai-openai_agents-oldest,py314-test-instrumentation-genai-openai_agents-latest,lint-instrumentation-genai-openai_agents -- -quvx --with tox-uv tox -e precommitgit diff --checkEarlier validation for this PR:
uvx --with tox-uv tox -e lint-license-header-checkuvx ruff format --check instrumentation/opentelemetry-instrumentation-genai-openai-agentsuvx --from towncrier==25.8.0 towncrier build --draft --version Unreleasedopenai-agents==0.17.0andopenai-agents==0.17.2prior_auth_confusion_ct; verified service name, GenAI attrs, no unknown operation names, and sandbox attrs in Jaeger tracedea30d9909ec5238e7347130f25dc4c2.Checklist
See CONTRIBUTING.md for the style guide, changelog guidance, and more.