Skip to content

fix(llmobs): emit session_id as top-level field + propagate via context#11371

Draft
jessicagamio wants to merge 2 commits into
masterfrom
jessica.gamio/llmobs-session-id-mlos-646
Draft

fix(llmobs): emit session_id as top-level field + propagate via context#11371
jessicagamio wants to merge 2 commits into
masterfrom
jessica.gamio/llmobs-session-id-mlos-646

Conversation

@jessicagamio
Copy link
Copy Markdown

Summary

Two related dd-trace-java fixes bringing the LLMObs SDK to parity with the documented HTTP intake API span schema:

  1. Emit session_id as a top-level field on the wire — currently it's only placed inside the tags[] array as a key:value string.
  2. Propagate session_id from parent context to child LLMObs spans when no explicit sessionId is passed to a child constructor.

The LLM Trace Explorer's Sessions filter keys off the top-level session_id field (per dd-source/.../shared/libs/spanquerymatcher/field_mapping.go:14). Before this PR, dd-trace-java spans were invisible to that filter — a long-standing parity gap (since LLMObs v2 API launched on 2024-06-21, MLOB-955), not a regression. dd-trace-py and dd-trace-js have always conformed to this schema.

Customer escalation: MLOS-646
MLOB tracking: MLOB-7479
Confluence writeup: https://datadoghq.atlassian.net/wiki/x/RQGckAE

Companion PR

The auto-instrumented openai.request span in dd-java-agent's openai-java module also doesn't inherit session_id from LLMObs context. Tracked separately in MLOB-7480 and stacked on this PR (it uses LLMObsContext.currentSessionId(), an API added here).

Changes

Commit 1 — fix(llmobs): emit session_id as top-level field in LLMObs span event

  • dd-trace-core/.../LLMObsSpanMapper.java — read _ml_obs_tag.session_id from each span; emit as a top-level field in the msgpack wire payload (conditional 11→12 map size). Mirrors the existing parent_id lift pattern, but deliberately does not remove the tag — keeps session_id:<value> in tags[] to match dd-trace-py and dd-trace-js behavior.
  • dd-trace-core/.../LLMObsSpanMapperTest.groovy — 2 new tests that decode the actual msgpack bytes and assert on the wire payload structure.

Commit 2 — fix(llmobs): propagate session_id from parent context to child LLMObs spans

  • internal-api/.../LLMObsContext.java — add SESSION_ID_KEY ContextKey<String>, attach(ctx, sessionId) overload, and currentSessionId() accessor.
  • dd-java-agent/agent-llmobs/.../DDLLMObsSpan.java — when no explicit sessionId is passed, fall back to LLMObsContext.currentSessionId(). Propagate effective sessionId via the new context key so descendants inherit.
  • dd-java-agent/agent-llmobs/.../DDLLMObsSpanTest.groovy — 2 new tests covering inheritance and absence cases.

Test plan

  • ./gradlew :dd-trace-core:test --tests "*LLMObsSpanMapperTest*" — 5 tests pass (2 new + 3 existing)
  • ./gradlew :dd-java-agent:agent-llmobs:test — 11 tests pass (2 new + 9 existing)
  • ./gradlew :dd-trace-core:spotlessCheck :internal-api:spotlessCheck :dd-java-agent:agent-llmobs:spotlessCheck clean
  • End-to-end verified via Spring Boot tourist-guide reproduction app: top-level session_id field confirmed in raw indexed payload; manual child LLMObs spans inherit session_id from workflow root via LLMObsContext

Note on EvP-side tag duplication (not from this PR)

Post-fix indexed events show a duplicate session_id:<value> entry in tags[]. This is not introduced by this change — it originates in Event Platform tag promotion at dd-source/.../apis/llm-obs/internal/core/services/trace.go:138-140, which appends session_id:<value> to tags whenever the top-level field is set. dd-trace-py and dd-trace-js post-EvP events exhibit the same duplicate. Pre-existing and benign.

dd-trace-java was placing session_id only inside the `tags[]` array of
each LLMObs span event on the wire, never as a top-level span field.
The LLM Trace Explorer's Sessions filter keys off the top-level
`session_id` field per the public HTTP intake API schema
(https://docs.datadoghq.com/llm_observability/instrumentation/api?tab=model#span),
so Java spans were invisible to that filter. dd-trace-py and
dd-trace-js have conformed to this schema since the LLMObs v2 API
launched (2024-06-21, MLOB-955).

In LLMObsSpanMapper.map(), read the _ml_obs_tag.session_id tag from
each span before opening the msgpack map; if present, extend the map
size from 11 to 12 entries and emit the value as a top-level field.
Mirrors the existing parent_id lift pattern, with one deliberate
difference: the tag is not removed so the `session_id:<value>` entry
remains in the tags[] array, matching dd-trace-py and dd-trace-js
behavior.

Originating issue: MLOS-646.
… spans

dd-trace-java's DDLLMObsSpan constructor only honored an explicitly
passed sessionId argument. When a caller used the SDK's documented
"set session_id on the root span only" pattern and started a child
LLMObs span with sessionId=null, the child carried no session_id.
dd-trace-py (ddtrace/llmobs/_llmobs.py:1911) and dd-trace-js
(packages/dd-trace/src/llmobs/tagger.js:105-106) both auto-propagate
session_id from parent context to descendant LLMObs spans.

Add a SESSION_ID ContextKey to LLMObsContext alongside the existing
span-context key, plus an attach(ctx, sessionId) overload and a
currentSessionId() accessor. In DDLLMObsSpan's constructor, fall back
to LLMObsContext.currentSessionId() when no explicit sessionId is
passed. The constructor now also propagates the effective sessionId
through the new context key so subsequent descendant spans can inherit
it.

Originating issue: MLOS-646.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant