fix(openai): include responses instructions in captured prompt #1565
D-Joey-G wants to merge 1 commit into langfuse:main from
```python
if isinstance(input_value, list):
    return [{"role": "system", "content": instructions}, *input_value]
```
Possible duplicate system message in captured prompt
When `input_value` is a list, `instructions` is prepended unconditionally as a `role: system` entry. If the caller already included a `{"role": "system", ...}` message inside their input list (valid in the Responses API), the captured Langfuse prompt will contain two system-role entries, which can confuse prompt replay or evaluation flows.
Consider checking whether a system message is already present before prepending:
```python
if isinstance(input_value, list):
    already_has_system = any(
        isinstance(m, dict) and m.get("role") == "system"
        for m in input_value
    )
    if already_has_system:
        return input_value
    return [{"role": "system", "content": instructions}, *input_value]
```

Alternatively, document clearly that `instructions` will always be surfaced as the first system message in the captured prompt regardless of existing list contents.
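Since the Responses API also accepts "developer"-role messages, a variant of the suggested guard could treat those as system-level too. This is a minimal sketch with hypothetical helper names, not the code under review:

```python
def _has_system_message(input_value):
    """Return True if the list already carries a system-level entry.

    The Responses API accepts both "system" and "developer" roles,
    so the guard checks for either before prepending instructions.
    """
    return any(
        isinstance(m, dict) and m.get("role") in ("system", "developer")
        for m in input_value
    )


def merge_instructions(instructions, input_value):
    # Hypothetical sketch: prepend instructions only when no
    # system-level message is already present in the caller's list.
    if isinstance(input_value, list):
        if _has_system_message(input_value):
            return input_value
        return [{"role": "system", "content": instructions}, *input_value]
    return input_value
```

Whether "developer" should count as an existing system message is a design choice the maintainers would need to settle; the sketch simply shows where that decision lives.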
Multiple developer/system messages aren't uncommon, so that concern doesn't seem important.
I believe that it is safe/correct to put the instructions arg first in what we trace, as earlier docs for the Responses API said the parameter inserts a system (or developer) message as the first item in the model's context.
That said, the docs now say:
> A system (or developer) message inserted into the model's context.
>
> When using along with `previous_response_id`, the instructions from a previous response will not be carried over to the next response. This makes it simple to swap out system (or developer) messages in new responses.
That is admittedly less certain on that point.
Corrects behaviour such that arguments passed to the `instructions` parameter of the OpenAI Responses API are treated like `'role': 'system'` arguments in the chat.completions API (see langfuse/langfuse#9775, langfuse/langfuse#10143, or langfuse/langfuse#8763). Adds tests to validate behaviour.
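The merging behaviour the PR describes can be sketched as a standalone function. This is an illustration of the documented rules, assuming a simplified signature; it is not the merged `_extract_responses_prompt` from `langfuse/openai.py`:

```python
from typing import Any, Optional


def extract_responses_prompt(kwargs: dict) -> Any:
    """Sketch of merging instructions + input for prompt capture.

    Surfaces instructions as a leading system message, mirroring
    chat.completions semantics, per the behaviour this PR describes.
    """
    instructions: Optional[str] = kwargs.get("instructions")
    input_value = kwargs.get("input")

    if instructions is None:
        return input_value  # nothing to merge; capture input as-is
    if isinstance(input_value, str):
        return [
            {"role": "system", "content": instructions},
            {"role": "user", "content": input_value},
        ]
    if isinstance(input_value, list):
        return [{"role": "system", "content": instructions}, *input_value]
    return {"instructions": instructions}  # dict fallback when input is absent
```

Note this only affects what Langfuse records; the kwargs forwarded to the OpenAI client are unchanged.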
Disclaimer: Experimental PR review
Greptile Summary
This PR fixes the OpenAI Responses API integration so that the `instructions` parameter is captured in the Langfuse prompt alongside `input`, mirroring how `role: system` messages are treated in the Chat Completions API. It adds the `_extract_responses_prompt` helper, updates the existing streaming integration test, and adds new unit tests.

Key changes:
- `_extract_responses_prompt` function merges `instructions` + `input` into a unified `[{role: system}, {role: user}]` message list (or a dict fallback) before logging to Langfuse
- `NotGiven` sentinel values are normalized to `None`
- `_get_langfuse_data_from_kwargs` now delegates to `_extract_responses_prompt` instead of reading `input` directly for the `Responses`/`AsyncResponses` objects
- New unit tests (`test_openai_prompt_extraction.py`) cover all documented input combinations

Issues noted:
- When `input` is a list that already contains a `{"role": "system", ...}` entry and `instructions` is also provided, the captured Langfuse prompt will have two system messages, with the new one prepended unconditionally. This could mislead prompt replay or evaluation tools.
- No integration test covers the non-streaming `instructions` + `input` path; only the streaming case is covered at the integration level.

Confidence Score: 4/5
The main risk is the edge case where a system message already exists in the `input` list: `instructions` will be prepended unconditionally, producing duplicate system entries in the captured Langfuse prompt. This doesn't affect the actual API call (only what Langfuse records) and is unlikely in practice, but it's a correctness issue worth addressing. A missing non-streaming integration test is a minor coverage gap. No regressions are expected for existing callers. See `langfuse/openai.py` lines 271-272, the unconditional prepend of the system message when `input` is a list.

Important Files Changed
- `langfuse/openai.py`: adds the `_extract_responses_prompt` function that merges `instructions` and `input` into a unified prompt structure for the Responses API; logic is correct but has an untested edge case where a list `input` already containing a system message would get a duplicate prepended system entry.
- Streaming integration test: updates `test_response_api_streaming` to assert the new merged `[system, user]` input format; no new non-streaming integration test covers the `instructions` + `input` path.
- `test_openai_prompt_extraction.py`: new unit tests for `_extract_responses_prompt`, including `NOT_GIVEN` sentinel values and string/list inputs; well-structured and complete for the happy paths.

Sequence Diagram
```mermaid
sequenceDiagram
    participant User
    participant LangfuseWrapper as Langfuse Wrapper (_wrap)
    participant ExtractPrompt as _extract_responses_prompt
    participant Langfuse
    participant OpenAI as OpenAI Responses API
    User->>LangfuseWrapper: responses.create(model, instructions, input, ...)
    LangfuseWrapper->>ExtractPrompt: kwargs (instructions, input)
    alt instructions is None
        ExtractPrompt-->>LangfuseWrapper: input as-is
    else input is str
        ExtractPrompt-->>LangfuseWrapper: [{role:system, content:instructions}, {role:user, content:input}]
    else input is list
        ExtractPrompt-->>LangfuseWrapper: [{role:system, content:instructions}, ...input]
    else input is None
        ExtractPrompt-->>LangfuseWrapper: {instructions: instructions}
    end
    LangfuseWrapper->>Langfuse: start_observation(input=merged_prompt)
    LangfuseWrapper->>OpenAI: original call (unchanged kwargs)
    OpenAI-->>LangfuseWrapper: response
    LangfuseWrapper->>Langfuse: update generation (output, usage, model)
    LangfuseWrapper-->>User: response
```

Last reviewed commit: 4c8353f
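The `NotGiven` normalization mentioned in the summary can be sketched as below. The sentinel class here is a local stand-in so the snippet runs without the `openai` package installed; the real code would check against `openai.NOT_GIVEN`:

```python
class _NotGiven:
    """Stand-in for openai's NOT_GIVEN sentinel (illustration only)."""

    def __repr__(self) -> str:
        return "NOT_GIVEN"


NOT_GIVEN = _NotGiven()


def normalize(value):
    """Map the NOT_GIVEN sentinel to None.

    Normalizing up front means downstream merging logic only ever has
    to distinguish None from a real str/list value.
    """
    return None if isinstance(value, _NotGiven) else value
```

With this normalization applied to `instructions` before merging, a caller who omits the parameter and one who passes the sentinel explicitly are handled identically.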