chore(llmobs): span linking for oai agents sdk #13072

lievan · 2025-04-04T14:48:46Z

Add span linking between tool & llm spans for the openai agents sdk.

We use the core dispatch api since span linking requires cross-integration communication in the case where someone selects "chat completions" as the llm api to use for the agents sdk.

Signals are dispatched:

- when LLM spans finish (chat completions api) in the oai integration
- when LLM spans finish (responses api) in the agents sdk integration
- when tool calls/handoffs finish in the agents sdk integration

ToolCallTracker in ddtrace.llmobs._utils contains the functions that handle these signals and add the span links.
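To illustrate the dispatch pattern described above, here is a minimal, self-contained sketch of one integration emitting a signal and a handler in another module reacting to it. The `on`/`dispatch` helpers, the event names, and the handler signatures are all hypothetical stand-ins written for this example; they are not the actual core dispatch API or the real signal names used in this PR.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, List, Tuple

# Hypothetical stand-in for a dispatch hub (assumption: not ddtrace's real core API).
_listeners: DefaultDict[str, List[Callable[..., None]]] = defaultdict(list)


def on(event: str, handler: Callable[..., None]) -> None:
    """Register a handler to be called whenever `event` is dispatched."""
    _listeners[event].append(handler)


def dispatch(event: str, args: Tuple[Any, ...] = ()) -> None:
    """Call every handler registered for `event` with the given arguments."""
    for handler in _listeners[event]:
        handler(*args)


# Records what the handlers saw, so the flow is easy to inspect.
received: List[Tuple[str, str, Any]] = []


def handle_llm_finished(span_name: str, tool_calls: List[str]) -> None:
    # In the PR, a handler like this would record which tools the LLM chose.
    received.append(("llm", span_name, tool_calls))


def handle_tool_finished(span_name: str, tool_name: str) -> None:
    # In the PR, a handler like this would link the tool span to an LLM span.
    received.append(("tool", span_name, tool_name))


# Illustrative event names (assumptions), one per kind of signal.
on("example.llm.finished", handle_llm_finished)
on("example.tool.finished", handle_tool_finished)

# One integration dispatches; the handlers may live in a different module.
dispatch("example.llm.finished", ("chat_completion", ["get_weather"]))
dispatch("example.tool.finished", ("tool_span", "get_weather"))
```

The point of the indirection is that the emitting integration does not need to import or know about the code that consumes the signal, which is what makes cross-integration linking possible.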

Links created

[LLM output -> tool input] for the case where an LLM span chooses a tool and that tool is later executed via the agents sdk. We do this by mapping the tool name & arguments to its tool id. When the tool call is triggered, we have access to its name and arguments. From there, we can look up its tool id and the LLM span that generated those arguments. We pop the tool name/arg from the lookup dictionary after it's used.

[Tool output -> LLM input] for the case where a tool's output is fed back into a later LLM call, either in the same agent or another agent. We can tell this since the tool_id is present in the LLM's input messages. We then use this tool id to look up the tool span.

So the general lifecycle is:

An LLM chooses a tool. We save the tool id, tool name, and tool arguments and correlate them with the LLM span.
The tool is run.
- We look at the argument and name of the tool and use them to look up the LLM span that chose this tool. We then delete the name/arg from the lookup dict.
- We save the span/trace id of the tool and correlate it with the tool_id.
The tool output is used as input for an LLM span. We have access to the tool id here, and look up the span/trace id of the tool to link it to the LLM span.
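The lifecycle above can be sketched with two lookup dictionaries. This is a simplified illustration under stated assumptions, not the real ToolCallTracker: the class name `SketchToolCallTracker`, the `SpanRef` and `Link` types, the method names, and the use of a JSON string for the arguments are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class SpanRef:
    """Hypothetical minimal span identity: just the ids needed for a link."""
    span_id: str
    trace_id: str


@dataclass
class Link:
    """Hypothetical record of a span link and its direction."""
    from_span: SpanRef
    to_span: SpanRef
    kind: str


class SketchToolCallTracker:
    """Illustrative tracker; names and structure are assumptions."""

    def __init__(self) -> None:
        # (tool name, arguments) -> (tool id, LLM span that chose the tool)
        self._pending: Dict[Tuple[str, str], Tuple[str, SpanRef]] = {}
        # tool id -> span of the executed tool
        self._tool_spans: Dict[str, SpanRef] = {}
        self.links: List[Link] = []

    def on_llm_tool_choice(self, llm_span: SpanRef, tool_id: str, tool_name: str, arguments: str) -> None:
        # Step 1: an LLM chose a tool; remember who chose it.
        self._pending[(tool_name, arguments)] = (tool_id, llm_span)

    def on_tool_call(self, tool_span: SpanRef, tool_name: str, arguments: str) -> None:
        # Step 2: the tool runs; pop the entry so it is only used once.
        entry = self._pending.pop((tool_name, arguments), None)
        if entry is None:
            return  # no LLM span is known to have chosen this call
        tool_id, llm_span = entry
        self.links.append(Link(llm_span, tool_span, "llm_output->tool_input"))
        self._tool_spans[tool_id] = tool_span

    def on_llm_input(self, llm_span: SpanRef, tool_ids: List[str]) -> None:
        # Step 3: tool ids found in the LLM's input messages point back to tool spans.
        for tool_id in tool_ids:
            tool_span = self._tool_spans.get(tool_id)
            if tool_span is not None:
                self.links.append(Link(tool_span, llm_span, "tool_output->llm_input"))


tracker = SketchToolCallTracker()
first_llm = SpanRef("span-1", "trace-1")
tool = SpanRef("span-2", "trace-1")
second_llm = SpanRef("span-3", "trace-1")

tracker.on_llm_tool_choice(first_llm, "call_1", "get_weather", '{"city": "Paris"}')
tracker.on_tool_call(tool, "get_weather", '{"city": "Paris"}')
tracker.on_llm_input(second_llm, ["call_1"])
```

After these three calls, `tracker.links` holds one link from the first LLM span to the tool span and one from the tool span to the second LLM span. Keying the first lookup on name and arguments is needed because, per the description, the tool id is not available when the tool call is triggered.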

A note on handoffs

Hand-offs are implemented as tool calls in the agents SDK, so the span linking logic is largely the same. Two notes:

- there are no arguments for handoffs, so we use a dummy default lookup key for the [LLM output -> tool input] linking step
- the tool_id representing a handoff may be continually used as input for an LLM call since the list of messages is kept and added to across agent runs. However, it realistically should only be linked to the first LLM call of the agent being handed off to. Unlike other tool calls, a handoff is only an orchestration step and it doesn't provide extra context actually "used" in downstream llm generations

There are two brittle parts of hand-off linking that rely on some implementation details internal to the agents sdk:
- We are re-constructing the raw tool name used for hand-offs
  handoff_tool_name = "transfer_to_{}".format("_".join(oai_span.to_agent.split(" ")).lower())
- We are using {} as the placeholder for the hand-off tool call argument. This is what's generated by the LLM when it chooses a handoff.

We can improve on this by inferring these values when an LLM chooses a handoff tool, but this requires a bit more exploring
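As a concrete illustration of the two brittle parts, the sketch below wraps the name reconstruction from the description in a function and pairs it with the `{}` placeholder to form a lookup key. The function names, the constant name, and the representation of the placeholder as the string `"{}"` are assumptions made for this example.

```python
from typing import Tuple

# Assumption: the placeholder is the literal string "{}" emitted as the handoff's arguments.
HANDOFF_ARGUMENTS_PLACEHOLDER = "{}"


def handoff_tool_name(to_agent: str) -> str:
    """Rebuild the raw handoff tool name from the target agent's display name."""
    return "transfer_to_{}".format("_".join(to_agent.split(" ")).lower())


def handoff_lookup_key(to_agent: str) -> Tuple[str, str]:
    """Hypothetical (tool name, arguments) key for the LLM output -> tool input lookup."""
    return (handoff_tool_name(to_agent), HANDOFF_ARGUMENTS_PLACEHOLDER)


name = handoff_tool_name("Billing Agent")  # "transfer_to_billing_agent"
key = handoff_lookup_key("Billing Agent")  # ("transfer_to_billing_agent", "{}")
```

If the agents sdk changes how it derives handoff tool names, or the LLM emits different arguments for a handoff, a key built this way would no longer match and the link would be silently skipped.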

Checklist

PR author has checked that all the criteria below are met
The PR description includes an overview of the change
The PR description articulates the motivation for the change
The change includes tests OR the PR description describes a testing strategy
The PR description notes risks associated with the change, if any
Newly-added code is easy to change
The change follows the library release note guidelines
The change includes or references documentation updates if necessary
Backport labels are set (if applicable)

Reviewer Checklist

Reviewer has checked that all the criteria below are met
Title is accurate
All changes are related to the pull request's stated goal
Avoids breaking API changes
Testing strategy adequately addresses listed risks
Newly-added code is easy to change
Release note makes sense to a user of the library
If necessary, author has acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment
Backport labels are set in a manner that is consistent with the release branch maintenance policy

github-actions · 2025-04-04T14:49:18Z

CODEOWNERS have been resolved as:

tests/contrib/openai_agents/cassettes/test_multiple_agent_handoffs_with_chat_completions.yaml  @DataDog/apm-core-python @DataDog/apm-idm-python
ddtrace/llmobs/_constants.py                                            @DataDog/ml-observability
ddtrace/llmobs/_integrations/openai.py                                  @DataDog/ml-observability
ddtrace/llmobs/_integrations/openai_agents.py                           @DataDog/ml-observability
ddtrace/llmobs/_integrations/utils.py                                   @DataDog/ml-observability
ddtrace/llmobs/_llmobs.py                                               @DataDog/ml-observability
ddtrace/llmobs/_utils.py                                                @DataDog/ml-observability
tests/contrib/openai_agents/conftest.py                                 @DataDog/apm-core-python @DataDog/apm-idm-python
tests/contrib/openai_agents/test_openai_agents_llmobs.py                @DataDog/apm-core-python @DataDog/apm-idm-python
tests/llmobs/_utils.py                                                  @DataDog/ml-observability

github-actions · 2025-04-04T15:09:30Z

Bootstrap import analysis

Comparison of import times between this PR and base.

Summary

The average import time from this PR is: 228 ± 2 ms.

The average import time from base is: 232 ± 4 ms.

The import time difference between this PR and base is: -3.9 ± 0.1 ms.

Import time breakdown

The following import paths have shrunk:

ddtrace.auto 2.103 ms (0.92%)

ddtrace.bootstrap.sitecustomize 1.434 ms (0.63%)

ddtrace.bootstrap.preload 1.434 ms (0.63%)

ddtrace.internal.products 1.434 ms (0.63%)

ddtrace.internal.remoteconfig.client 0.662 ms (0.29%)

ddtrace 0.669 ms (0.29%)

pr-commenter · 2025-04-04T15:29:55Z

Benchmarks

Benchmark execution time: 2025-04-04 15:29:53

Comparing candidate commit 210c362 in PR branch evan.li/span-linking-agents with baseline commit 534fa86 in branch main.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 498 metrics, 2 unstable metrics.

sabrenner

general logic lgtm, and the examples you provided look really nice! just a couple of small style nits and one question, will approve after it's answered!

agents span linking

210c362

lievan marked this pull request as ready for review April 4, 2025 14:49

lievan requested review from a team as code owners April 4, 2025 14:49

lievan requested review from erikayasuda and quinna-h April 4, 2025 14:49

lievan changed the title ~~chore(llmobs): span linking for oai agents sdk~~ chore(llmobs): span linking for oai agents sdk Apr 4, 2025

emmettbutler approved these changes Apr 4, 2025

Merge branch 'evan.li/oai-agents' of github.com:DataDog/dd-trace-py i…

6f9897c

…nto evan.li/span-linking-agents

lievan changed the base branch from main to evan.li/oai-agents April 4, 2025 21:58

Merge branch 'evan.li/oai-agents' into evan.li/span-linking-agents

7613a71

sabrenner reviewed Apr 7, 2025

ddtrace/llmobs/_utils.py (outdated, resolved)

ddtrace/llmobs/_integrations/openai.py (outdated, resolved)

ddtrace/llmobs/_utils.py (outdated, resolved)

ddtrace/llmobs/_utils.py (resolved)

apply suggestions

ac8595e

sabrenner approved these changes Apr 7, 2025

lievan and others added 3 commits April 7, 2025 23:05

sam suggestion

320a939

Merge branch 'evan.li/span-linking-agents' of github.com:DataDog/dd-t…

8a475d4

…race-py into evan.li/span-linking-agents

Merge branch 'evan.li/oai-agents' into evan.li/span-linking-agents

6c00e93

lievan merged commit 65b08a4 into evan.li/oai-agents Apr 8, 2025
37 of 42 checks passed

lievan deleted the evan.li/span-linking-agents branch April 8, 2025 03:07
