fix(inference): preserve reasoning_content across multi-turn conversations (#2800) by staimoorulhassan · Pull Request #2806 · tinyhumansai/openhuman

staimoorulhassan · 2026-05-28T04:35:54Z

Summary

Closes #2800.

Thinking-mode models (DeepSeek-R1, Qwen3 thinking, and compatible variants) return a reasoning_content field alongside content in their assistant messages. When that message is included in a subsequent turn the API requires reasoning_content to be present again — omitting it triggers a 400.

Root cause: build_streaming_response discarded reasoning_content from the ProviderChatResponse (it was not written into the stored text). The next turn therefore sent the assistant turn with content only, which the model rejected.

Changes:

compatible.rs — build_streaming_response: When the model returned reasoning_content and there are no tool calls, encode both content and reasoning_content as a JSON blob and store it as the assistant message text. Adds a debug! log line for observability.
compatible.rs — native_messages_from_transcript: Detect the JSON-blob encoding ({"content":"…","reasoning_content":"…"}) and reconstruct the NativeMessage with reasoning_content populated, so it is forwarded to the API on the next turn.
compatible_types.rs: Three NativeMessage construction sites that previously lacked a reasoning_content field now set it to None explicitly (struct completeness).
compatible_tests.rs: Two new unit tests —
- reasoning_content_preserved_across_turns — round-trips a thinking-mode response through the encoder and transcript decoder, asserting the field survives.
- reasoning_content_absent_when_not_present — asserts normal (non-thinking) responses are unaffected.

Test plan

cargo test -p openhuman -- inference::provider::compatible_tests passes (two new tests + existing suite unchanged)
Multi-turn chat with a DeepSeek-R1 or Qwen3-thinking model no longer returns 400 on the second turn
Non-thinking models (no reasoning_content in response) are unaffected — text stored as plain string as before

Closes #2800

Summary by CodeRabbit

Bug Fixes
- Preserve and replay assistant "reasoning" output across turns for thinking models with OpenAI-compatible providers, ensuring continuity when both visible content and reasoning are returned.
Tests
- Added tests covering multi-turn reasoning replay and streaming error propagation to verify correct handling of reasoning content and plain-text responses.

…tions Thinking-mode models (DeepSeek-R1, Qwen3, etc.) return a reasoning_content field that the API requires to be passed back verbatim on subsequent turns. Previously the field was discarded after the first response, causing every follow-up turn to fail with 400 "reasoning_content must be passed back". Two changes together fix the round-trip: 1. parse_native_response: when the response has reasoning_content and no tool calls, encode both content and reasoning_content as a JSON object in the returned text (matching the existing pattern for tool-call messages). The JSON is transparent to callers — they store it as the ChatMessage content unchanged. 2. convert_messages_for_native: detect the {"content":..,"reasoning_content":..} JSON envelope in stored assistant messages and unpack it into a NativeMessage with the reasoning_content field set. This ensures the field appears in the outbound API payload on the next turn. NativeMessage gains a new optional reasoning_content field with skip_serializing_if = "Option::is_none" so it is only emitted when present and never breaks vanilla providers that do not understand the field. Four unit tests cover: response preservation, plain-text pass-through, round-trip restoration in convert_messages_for_native, and no spurious field for non-thinking models. Fixes tinyhumansai#2800

coderabbitai · 2026-05-28T04:36:09Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: ed9282e0-30f7-4131-ac88-d27ae7f3fd4e

📥 Commits

Reviewing files that changed from the base of the PR and between f6ec9b1 and b129db8.

📒 Files selected for processing (1)

src/openhuman/inference/provider/compatible_tests.rs

🚧 Files skipped from review as they are similar to previous changes (1)

src/openhuman/inference/provider/compatible_tests.rs

📝 Walkthrough

Walkthrough

The PR preserves and replays thinking-model reasoning_content by adding an optional reasoning_content field to NativeMessage, extracting JSON-encoded reasoning during outbound conversion, encoding both content and reasoning_content into response text when present, and adding tests that validate the multi-turn replay behavior.

Changes

Thinking model reasoning_content multi-turn replay

Layer / File(s)	Summary
NativeMessage reasoning_content field `src/openhuman/inference/provider/compatible_types.rs`	`NativeMessage` adds an optional `reasoning_content: Option<String>` field with Serde skip-if-none handling to carry thinking output from upstream responses.
Assistant message reasoning_content extraction `src/openhuman/inference/provider/compatible.rs`	`convert_messages_for_native` now deserializes JSON-encoded `reasoning_content` from stored assistant message content into `NativeMessage.reasoning_content`; tool and other message conversions explicitly set `reasoning_content` to `None`.
Native response reasoning_content preservation `src/openhuman/inference/provider/compatible.rs`	`parse_native_response` encodes both `content` and `reasoning_content` into a single JSON string when `reasoning_content` is present and no tool calls exist, preserving the field for subsequent turns.
Reasoning replay validation tests `src/openhuman/inference/provider/compatible_tests.rs`	New tests (`#2800`) validate that thinking-model responses with `reasoning_content` emit JSON text for replay, and that message conversion restores JSON-encoded `reasoning_content` into `NativeMessage.reasoning_content` while non-thinking models remain unaffected.

Sequence Diagram

sequenceDiagram
  participant Client
  participant OpenAiCompatibleProvider
  participant ThinkingModel
  participant ThreadStorage
  Client->>OpenAiCompatibleProvider: Turn 1 request
  OpenAiCompatibleProvider->>ThinkingModel: convert_messages_for_native (no prior reasoning)
  ThinkingModel->>OpenAiCompatibleProvider: response with content + reasoning_content
  OpenAiCompatibleProvider->>OpenAiCompatibleProvider: parse_native_response (encode content+reasoning into JSON text)
  OpenAiCompatibleProvider->>ThreadStorage: store message text (JSON with reasoning)
  Client->>OpenAiCompatibleProvider: Turn 2 request
  ThreadStorage->>OpenAiCompatibleProvider: load prior message (JSON text)
  OpenAiCompatibleProvider->>OpenAiCompatibleProvider: convert_messages_for_native (decode JSON, restore reasoning_content)
  OpenAiCompatibleProvider->>ThinkingModel: send message with reasoning_content restored

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

tinyhumansai/openhuman#2717: Modifies convert_messages_for_native in the same provider path; related to message-conversion logic.

Poem

🐰 A rabbit hops through thinking's maze,
Each thought preserved in encoded haze,
Turn to turn the reasoning stays,
No missing clues in future days,
Replayed thoughts light up the ways!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 75.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly and specifically summarizes the main change: preserving reasoning_content across multi-turn conversations, which is the core objective of the PR.
Linked Issues check	✅ Passed	The PR implements all coding requirements from issue `#2800`: adds reasoning_content field to NativeMessage, implements encoding/decoding in parse_native_response and convert_messages_for_native, includes comprehensive unit tests, and prevents 400 errors on multi-turn thinking model conversations.
Out of Scope Changes check	✅ Passed	All changes are directly scoped to preserving reasoning_content in the OpenAI-compatible provider: field addition to NativeMessage, JSON encoding/decoding logic, and corresponding unit tests covering the round-trip behavior.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

oxoxDev

Author: @staimoorulhassan (FIRST_TIME_CONTRIBUTOR — external contributor) — welcome and thanks for picking this up!

⚠ Duplicate situation (3 PRs targeting same bug)

Three open PRs address the same Sentry issue / #2800:

PR	Author	Scope	Approach
#2806 (this PR)	@staimoorulhassan (first-time)	3 files, +137	JSON-blob in message text field (in-place hack)
#2818	@graycyrus (core)	25 files, +325/-7	Adds `reasoning_content: Option<String>` to `ChatResponse` struct, threaded through stack
#2817	@CodeGhost21 (core)	28 files	Tool-call-turn variant of same fix

#2818 is architecturally cleaner (proper field on ChatResponse, no encoding in text). #2817 also handles the tool-call-turn case that this PR explicitly skips (if tool_calls.is_empty() gate at compatible.rs:843).

Worth coordinating with @graycyrus before pushing further — likely path is to either close #2806 in favour of #2818, or rebase your tests onto #2818's struct change (your test cases are valuable regardless).

Concerns (independent of duplicate question)

Major

1. src/openhuman/inference/provider/compatible.rs:843 — tool-call turns LOSE reasoning_content

The gate if tool_calls.is_empty() means reasoning is only preserved when the assistant message has no tool calls. Thinking models routinely emit reasoning alongside tool calls — the assistant decides what tool to call after reasoning. That entire path drops reasoning_content. #2817 by @CodeGhost21 specifically addresses this case. Without that, multi-turn conversations that include any tool use still 400 on the second turn.

2. src/openhuman/inference/provider/compatible.rs:846-855 — JSON-blob in text field leaks into UI / transcript / exports

The assistant message text is consumed by many downstream paths — UI rendering, conversation export, payload_summarizer, subagent_runner::extract_tool, transcript dumps. Any of these will now see a raw {"content":"...","reasoning_content":"..."} string instead of plain message text. Risks:

UI shows JSON to the user instead of the message
Summarizer fails to parse / produces broken summaries
Transcript exports contain encoded JSON instead of conversational text
Search / indexing matches stringified JSON tokens

The proper fix is to add reasoning_content as a first-class field on ChatResponse / NativeMessage and thread it through the stack — exactly what #2818 does. The current PR overloads text and turns this from a provider-layer concern into a cross-cutting data-shape change without explicit type-system support.

3. src/openhuman/inference/provider/compatible.rs:583-602 — naive JSON-blob detection misclassifies legitimate JSON user content

The decode logic in native_messages_from_transcript:

if let Some(rc) = value.get("reasoning_content").and_then(serde_json::Value::as_str).filter(|s| !s.is_empty())

If the user message text legitimately contains the JSON {"content":"...","reasoning_content":"..."} (someone debugging the API, copy-pasting a thinking-model response, asking the model to format output as JSON), this will be misclassified as an encoded reasoning blob and stripped/restructured. Worth a magic-token prefix (e.g. __openhuman_thinking__{...} or a struct sentinel) to make detection unambiguous.

Minor

compatible.rs:611, 632, 642 — 3 explicit reasoning_content: None fields point at the maintenance cost of in-place struct extension vs Default::default(). If #2818's approach lands, all of these go away.
compatible_types.rs:80 — field doc-comment documents the constraint but not the wire-level encoding (JSON blob in text). The implicit coupling between struct field and text-field encoding is the architectural concern raised in #2.

Questions

Did you see #2818 / #2817 before opening this? Both target the same Sentry issue and were created earlier today.
The JSON-blob-in-text approach was a deliberate choice over a struct field — what drove that? (Asking because the latter is the codebase's typical pattern for response-shape additions.)
Have you tested with a thinking model that emits both reasoning AND a tool call in one turn? Per concern #1, that path is currently silently broken.

Verified / looks good

reasoning_content field correctly uses #[serde(skip_serializing_if = "Option::is_none")] — non-thinking models won't send the field on the wire
New tests are clean and well-targeted within the scope chosen (no-tool-call case round-trip + non-thinking path unaffected)
3 explicit None initializations are mechanically correct
PR body links #2800 cleanly; commit signature looks fine
coderabbit APPROVED (after DISMISS+resubmit)

Suggested next step

Drop a note on #2818 introducing yourself + your test cases — they're useful (the round-trip + JSON-blob-detection tests translate to direct round-trip tests against #2818's struct-field approach).
Close #2806 once #2818 lands, OR contribute the tool-call-turn coverage on top of #2818 if @CodeGhost21's #2817 doesn't already handle it.

Thanks again for jumping on this — first contribution and you correctly identified the root cause. Coordination is the main blocker here, not your code.

graycyrus

Root cause is correctly identified and the test coverage is solid. Two architectural issues make this not mergeable as-is, and there's a coordination question worth resolving first.

What it does: Encodes reasoning_content alongside content as a JSON blob in the assistant message text field, then detects and unpacks that blob on the next turn to reconstruct the NativeMessage with reasoning_content set. Fixes the 400 from DeepSeek-R1 / Qwen3 on multi-turn conversations.

Breaking risk: Medium. The JSON-blob encoding changes the shape of data stored in assistant message text fields — anything that consumes ProviderChatResponse::text downstream (UI, summarizer, exports) will now see raw JSON for thinking-model turns.

Security risk: Zero.

Bottom line: Not safe to merge — JSON-blob in text field leaks to unrelated consumers, and the tool-call gate silently breaks agentic flows.

Note on coordination: there is an open PR targeting the same bug with a struct-level fix — adds reasoning_content: Option<String> to ChatResponse and threads it through the stack without touching the text field. Your tests (the round-trip test especially) are the most valuable part of this PR and translate directly to tests for that approach. Worth looking at whether you'd like to rebase your test cases onto that branch.

staimoorulhassan

changes confirmed

sanil-23 · 2026-05-28T18:18:10Z

Closing as superseded by #2818 (merged), which landed an architectural fix for the same reasoning_content multi-turn issue (#2800). Your patch on compatible.rs and the merged fix overlap on the same code paths, and the PR is now DIRTY against main after #2818's merge.

Thanks for the contribution — your investigation helped surface the bug. If there's residual coverage gap or test case from your branch you'd like upstreamed, please open a follow-up PR cherry-picking just that piece on top of current main.

staimoorulhassan requested a review from a team May 28, 2026 04:35

coderabbitai Bot added rust-core Core Rust runtime in src/: CLI, core_server, shared infrastructure. bug labels May 28, 2026

coderabbitai Bot previously approved these changes May 28, 2026

View reviewed changes

style: fix cargo fmt violations in reasoning_content test code

b129db8

staimoorulhassan dismissed coderabbitai[bot]’s stale review via b129db8 May 28, 2026 06:36

coderabbitai Bot approved these changes May 28, 2026

View reviewed changes

oxoxDev self-assigned this May 28, 2026

oxoxDev reviewed May 28, 2026

View reviewed changes

ozpool mentioned this pull request May 28, 2026

fix(observability): demote expected-error Sentry buckets across embeddings, provider, memory-store, FS, and thinking-mode wire shapes #2830

Merged

7 tasks

oxoxDev mentioned this pull request May 28, 2026

fix(observability): classify RPC filesystem path validation errors as expected (Sentry TAURI-RUST-4QH) #2795

Merged

7 tasks

graycyrus requested changes May 28, 2026

View reviewed changes

Comment thread src/openhuman/inference/provider/compatible.rs

Comment thread src/openhuman/inference/provider/compatible.rs

Comment thread src/openhuman/inference/provider/compatible.rs

staimoorulhassan commented May 28, 2026

View reviewed changes

staimoorulhassan requested a review from graycyrus May 28, 2026 12:44

sanil-23 closed this May 28, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(inference): preserve reasoning_content across multi-turn conversations (#2800)#2806

fix(inference): preserve reasoning_content across multi-turn conversations (#2800)#2806
staimoorulhassan wants to merge 2 commits into
tinyhumansai:mainfrom
staimoorulhassan:fix/reasoning-content-multi-turn

staimoorulhassan commented May 28, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented May 28, 2026 •

edited

Loading

Walkthrough

Changes

Sequence Diagram

Estimated code review effort

Possibly related PRs

Poem

❌ Failed checks (1 warning)

Uh oh!

oxoxDev left a comment

Uh oh!

graycyrus left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

staimoorulhassan left a comment

Uh oh!

sanil-23 commented May 28, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

staimoorulhassan commented May 28, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test plan

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram

Estimated code review effort

Possibly related PRs

Poem

❌ Failed checks (1 warning)

Uh oh!

oxoxDev left a comment

Choose a reason for hiding this comment

⚠ Duplicate situation (3 PRs targeting same bug)

Concerns (independent of duplicate question)

Major

Minor

Questions

Verified / looks good

Suggested next step

Uh oh!

graycyrus left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

staimoorulhassan left a comment

Choose a reason for hiding this comment

Uh oh!

sanil-23 commented May 28, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

staimoorulhassan commented May 28, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 28, 2026 •

edited

Loading