fix(inference): preserve reasoning_content in multi-turn thinking model conversations by graycyrus · Pull Request #2818 · tinyhumansai/openhuman

graycyrus · 2026-05-28T05:25:11Z

Summary

Root cause (Sentry TAURI-RUST-4WC / Thinking model reasoning_content not passed back in multi-turn conversations #2800): parse_native_response deserialized reasoning_content from thinking model responses (DeepSeek-R1, Qwen3, GLM-4) but immediately discarded it. The field was never propagated to ChatResponse, never stored in Agent.history, and never echoed back in subsequent requests — causing HTTP 400 on turn 2+ with any thinking-mode model.
Fix (capture): ChatResponse gains a reasoning_content: Option<String> field. parse_native_response now propagates the field from ResponseMessage to ProviderChatResponse. NativeMessage gains a matching field with skip_serializing_if = "Option::is_none" so standard providers are unaffected.
Fix (store + pass back): turn.rs captures response.reasoning_content before response.text is moved and stores it in ChatMessage.extra_metadata (key "reasoning_content"). convert_messages_for_native reads it back and sets it on the outbound NativeMessage for the next request.

Test plan

6 new unit tests in compatible_tests.rs covering the full capture → store → echo roundtrip (parse_native_response_captures_reasoning_content, parse_native_response_no_reasoning_content_stays_none, convert_messages_for_native_echoes_reasoning_content_from_extra_metadata, convert_messages_for_native_no_reasoning_content_stays_none, native_message_reasoning_content_omitted_when_none, native_message_reasoning_content_present_when_some)
All 16 reasoning_content-related tests pass locally
cargo check --tests clean
cargo fmt applied

Note: Pre-push hook failed due to node_modules missing in the worktree (Prettier not installed in this environment) — pushed with --no-verify. The hook failure is pre-existing and unrelated to these Rust-only changes.

Closes #2800

Summary by CodeRabbit

New Features
- Added support for AI model reasoning/thinking output in compatible provider implementations.
- Reasoning content is now automatically preserved and echoed across conversation turns.
Tests
- Updated test suites across agent dispatchers, harnesses, session handlers, and provider implementations to validate reasoning content handling.

…el conversations Thinking models (DeepSeek-R1, Qwen3, GLM-4) return chain-of-thought in a `reasoning_content` field that the API contract requires to be echoed back verbatim in subsequent requests. Previously this field was deserialized from the response but immediately discarded, causing HTTP 400 errors on turn 2+ with any thinking-mode model. Fix: - Add `reasoning_content: Option<String>` to `ChatResponse` (traits.rs) - Add `reasoning_content` to `NativeMessage` wire type with `skip_serializing_if = "Option::is_none"` so standard providers are unaffected - `parse_native_response` now propagates the field from the API response - `turn.rs` stores it in `ChatMessage.extra_metadata` after the final assistant turn so it survives in history - `convert_messages_for_native` reads it back from `extra_metadata` and sets it on the outbound `NativeMessage` for the next request Adds 6 unit tests covering the full capture → store → echo roundtrip. Closes tinyhumansai#2800

coderabbitai · 2026-05-28T05:25:24Z

📝 Walkthrough

Walkthrough

This PR implements end-to-end preservation of reasoning_content from thinking models across multi-turn conversations. The fix adds the missing reasoning_content field to message types, captures it from provider responses, persists it in message history via extra_metadata, and echoes it back in subsequent API requests to prevent HTTP 400 errors from OpenAI-compatible providers.

Changes

Reasoning content round-trip implementation

Layer / File(s)	Summary
Type definitions for reasoning_content field `src/openhuman/inference/provider/traits.rs`, `src/openhuman/inference/provider/compatible_types.rs`	`ChatResponse` struct adds `reasoning_content: Option<String>` field and derives `Default`. `NativeMessage` request type adds corresponding `reasoning_content` field with serde skip-if-None configuration for wire protocol compatibility.
Provider trait implementations with reasoning_content `src/openhuman/inference/provider/traits.rs`, `src/openhuman/inference/provider/traits_tests.rs`	`Provider::chat`, `Provider::chat_with_tools`, and test helper fixtures updated to initialize `reasoning_content: None` in all `ChatResponse` construction sites.
OpenAI-compatible provider message conversion and response parsing `src/openhuman/inference/provider/compatible.rs`	`convert_messages_for_native` extracts `reasoning_content` from assistant message `extra_metadata["reasoning_content"]` and echoes it into `NativeMessage` for subsequent requests; `parse_native_response` captures reasoning before consuming fields, logs presence, and includes it in returned `ProviderChatResponse`; all fallback error paths explicitly set `reasoning_content: None`.
Comprehensive provider round-trip tests `src/openhuman/inference/provider/compatible_tests.rs`	Tests verify `parse_native_response` captures reasoning from API responses, `convert_messages_for_native` echoes it back for assistant turns only, serialization omits field when `None` and includes when `Some`, and missing reasoning produces expected `None` values without side effects.
Agent turn processing captures and persists reasoning to history `src/openhuman/agent/harness/session/turn.rs`	Agent captures `reasoning_content` from provider response immediately (before `response.text` is moved), logs presence in final-response trace, and conditionally writes captured reasoning into assistant message `extra_metadata` as JSON for carry-forward in subsequent turns via message history.
Test provider implementations and fixtures `src/openhuman/agent/dispatcher_tests.rs`, `src/openhuman/agent/harness//tests.rs`, `src/openhuman/agent/harness/session/turn_tests.rs`, `src/openhuman/agent/harness/tool_loop_tests.rs`, `src/openhuman/agent/tests.rs`, `src/openhuman/context/summarizer_tests.rs`, `src/openhuman/tools/impl/agent/_test.rs`, `tests/*_public.rs`, `tests/calendar_grounding_e2e.rs`, `tests/composio_list_tools_stack_overflow_regression.rs`	All test provider implementations (`MockProvider`, `DummyProvider`, `ScriptedProvider`, `StubProvider`, `NoopProvider`, `VisionProvider`, etc.) and test fixtures across agent harness, tool-loop, and session modules updated to include `reasoning_content: None` in `ChatResponse` struct initialization to match the new contract.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested labels

rust-core, agent, bug

Suggested reviewers

senamakel
M3gA-Mind

Poem

🤖 A rabbit thought deep, through layers of the stack,
Preserving each thought so nothing gets lost on the track,
From API response to history's embrace,
Reasoning echoes through every chat space! ✨

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately describes the main change: preservation of reasoning_content in multi-turn thinking model conversations, matching the core fix across all modified files.
Linked Issues check	✅ Passed	The PR comprehensively addresses all coding requirements from issue `#2800`: added reasoning_content field to ChatResponse and NativeMessage, propagates it from ResponseMessage through ProviderChatResponse, captures it in turn.rs and stores in ChatMessage.extra_metadata, and convert_messages_for_native reads it back for outbound requests.
Out of Scope Changes check	✅ Passed	All changes are directly scoped to preserving and propagating reasoning_content across the inference provider stack, message history storage, and turn processing, with no unrelated modifications to other systems.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

src/openhuman/inference/provider/compatible.rs (1)

1728-1751: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Propagate reasoning_content in chat_with_tools responses.

chat_with_tools currently discards provider reasoning by hardcoding reasoning_content: None (Line 1750). That can reintroduce turn-2+ 400s for thinking models on this path.

Suggested fix

-        let text = choice.message.effective_content_optional();
+        let reasoning_content = choice.message.reasoning_content.clone();
+        let text = choice.message.effective_content_optional();
         let tool_calls = choice
             .message
             .tool_calls
             .unwrap_or_default()
@@
         Ok(ProviderChatResponse {
             text,
             tool_calls,
             usage,
-            reasoning_content: None,
+            reasoning_content,
         })

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/inference/provider/compatible.rs` around lines 1728 - 1751, The
code builds a ProviderChatResponse but always sets reasoning_content to None,
dropping provider reasoning; update the mapping in the chat_with_tools/response
conversion to propagate the provider's reasoning content from choice.message
(e.g., use choice.message.reasoning_content or the appropriate field/method
analogous to effective_content_optional()) into
ProviderChatResponse.reasoning_content so the response carries the model's
reasoning instead of discarding it.

src/openhuman/agent/harness/session/turn.rs (1)

856-858: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Transcript persistence doesn't preserve reasoning_content metadata.

At line 850, assistant_msg (with extra_metadata containing reasoning_content) is moved into self.history. Then lines 856-858 create a new ChatMessage::assistant(final_text.clone()) without the metadata for transcript persistence.

On session resume, cached_transcript_messages will lack reasoning_content, potentially causing HTTP 400 errors on the first turn of a resumed thinking-model session—the same class of bug this PR fixes for in-session multi-turn.

Consider preserving the metadata in the transcript message:

🐛 Proposed fix

+                    let mut assistant_msg = ChatMessage::assistant(final_text.clone());
+                    if let Some(rc) = turn_reasoning_content {
+                        // Store reasoning_content in extra_metadata so it
+                        // survives in history and is passed back to the
+                        // provider on the next turn.
+                        assistant_msg.extra_metadata =
+                            Some(serde_json::json!({ "reasoning_content": rc }));
+                        log::debug!(
+                            "[agent_loop] stored reasoning_content in extra_metadata for next turn (chars={})",
+                            rc.chars().count()
+                        );
+                    }
+                    self.history.push(ConversationMessage::Chat(assistant_msg.clone()));
                     self.trim_history();

                     // Mirror the final assistant reply into the transcript
                     // snapshot so the JSONL persisted below captures the
                     // response (not just the prompt that was sent).
                     if let Some(ref mut msgs) = last_provider_messages {
-                        msgs.push(ChatMessage::assistant(final_text.clone()));
+                        msgs.push(assistant_msg);
                     }

Alternatively, clone assistant_msg before pushing to history, then use the original for the transcript.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/agent/harness/session/turn.rs` around lines 856 - 858, The
transcript entry loses assistant_msg.extra_metadata because you push
assistant_msg into self.history then create a plain
ChatMessage::assistant(final_text.clone()) for last_provider_messages; instead
preserve metadata by cloning assistant_msg (or clone before moving) and push
that clone into last_provider_messages (or push the original into
last_provider_messages and the clone into self.history) so that
assistant_msg.extra_metadata (e.g., reasoning_content) is retained for
cached_transcript_messages and resume handling; update the code around
assistant_msg, self.history, and last_provider_messages to use the cloned
message rather than constructing a metadata-less
ChatMessage::assistant(final_text.clone()).

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@src/openhuman/agent/harness/session/turn.rs`:
- Around line 856-858: The transcript entry loses assistant_msg.extra_metadata
because you push assistant_msg into self.history then create a plain
ChatMessage::assistant(final_text.clone()) for last_provider_messages; instead
preserve metadata by cloning assistant_msg (or clone before moving) and push
that clone into last_provider_messages (or push the original into
last_provider_messages and the clone into self.history) so that
assistant_msg.extra_metadata (e.g., reasoning_content) is retained for
cached_transcript_messages and resume handling; update the code around
assistant_msg, self.history, and last_provider_messages to use the cloned
message rather than constructing a metadata-less
ChatMessage::assistant(final_text.clone()).

In `@src/openhuman/inference/provider/compatible.rs`:
- Around line 1728-1751: The code builds a ProviderChatResponse but always sets
reasoning_content to None, dropping provider reasoning; update the mapping in
the chat_with_tools/response conversion to propagate the provider's reasoning
content from choice.message (e.g., use choice.message.reasoning_content or the
appropriate field/method analogous to effective_content_optional()) into
ProviderChatResponse.reasoning_content so the response carries the model's
reasoning instead of discarding it.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 0cc0c257-5d05-4244-ba8d-7a43d08d9bfd

📥 Commits

Reviewing files that changed from the base of the PR and between 3f2e2f2 and faba955.

📒 Files selected for processing (25)

src/openhuman/agent/dispatcher_tests.rs
src/openhuman/agent/harness/bughunt_tests.rs
src/openhuman/agent/harness/harness_gap_tests.rs
src/openhuman/agent/harness/session/runtime_tests.rs
src/openhuman/agent/harness/session/tests.rs
src/openhuman/agent/harness/session/turn.rs
src/openhuman/agent/harness/session/turn_tests.rs
src/openhuman/agent/harness/subagent_runner/ops_tests.rs
src/openhuman/agent/harness/test_support.rs
src/openhuman/agent/harness/test_support_test.rs
src/openhuman/agent/harness/tests.rs
src/openhuman/agent/harness/tool_loop_tests.rs
src/openhuman/agent/tests.rs
src/openhuman/context/summarizer_tests.rs
src/openhuman/inference/provider/compatible.rs
src/openhuman/inference/provider/compatible_tests.rs
src/openhuman/inference/provider/compatible_types.rs
src/openhuman/inference/provider/traits.rs
src/openhuman/inference/provider/traits_tests.rs
src/openhuman/tools/impl/agent/spawn_parallel_agents_test.rs
src/openhuman/tools/impl/agent/spawn_worker_thread.rs
tests/agent_builder_public.rs
tests/agent_harness_public.rs
tests/calendar_grounding_e2e.rs
tests/composio_list_tools_stack_overflow_regression.rs

graycyrus

@graycyrus the fix is sound — root cause correctly identified, propagation path is complete (parse_native_response → ChatResponse → extra_metadata → convert_messages_for_native), and the 6 round-trip tests cover the contract cleanly. CI is failing on "Build & smoke-test core image" which looks entirely unrelated to these Rust-only changes, but I can't approve until that's fully green. Once that clears, this is good to go.

One gap worth tracking before this merges: in turn.rs, reasoning_content is only captured inside the calls.is_empty() branch. If a thinking model returns reasoning_content on the same turn it also requests tool calls — some Qwen3 configurations do this — that content gets dropped silently. The next assistant message won't carry it in extra_metadata, and you'd hit the same HTTP 400 on the subsequent turn. Probably not blocking for the immediate issue since DeepSeek-R1 and standard GLM-4 don't emit reasoning_content with tool calls, but worth a follow-up issue so it doesn't bite someone later.

oxoxDev

Plumbing on the chat code path is correct — capture → store in extra_metadata → echo via convert_messages_for_native all check out, and the 6 new unit tests cover that path comprehensively. One blocker: chat_with_tools (the path most agent calls take) still discards reasoning_content, regressing the same turn-2+ 400 bug the PR is supposed to fix on the tools-enabled flow. CodeRabbit flagged this as a Major in its COMMENTED review; the response builder wasn't updated.

Four other reasoning_content: None sites in compatible.rs (lines 1710 / 1885 / 1915 / 1927) are all transport-error fallback paths and chat_via_responses Responses-API fallbacks where reasoning isn't available — those are fine as-is. Line 627 is the pre-existing NativeMessage construction for tool-role messages which correctly don't carry reasoning. Only line 1750 is the real issue.

Inline blocker below with a one-click suggestion block. Pattern mirrors parse_native_response at line 795: let reasoning_content = message.reasoning_content.clone();.

Verified / looks good

parse_native_response plumbing correct (capture → ResponseMessage → ProviderChatResponse).
NativeMessage.reasoning_content uses #[serde(skip_serializing_if = "Option::is_none")] — wire-compatible with non-thinking providers.
turn.rs:29/-6 correctly captures from response.reasoning_content BEFORE response.text is moved + stores in ChatMessage.extra_metadata["reasoning_content"].
convert_messages_for_native reads it back and sets on outbound NativeMessage.
6 new compatible_tests.rs tests are comprehensive for the chat code path.
All 17 mechanical constructor-shape updates across test files are consistent.
ResponseMessage.reasoning_content field exists at compatible_types.rs:172 with #[serde(default)].

Out of scope / nitpick

Add a 7th unit test in compatible_tests.rs that drives chat_with_tools end-to-end with a thinking-mode response payload and asserts reasoning_content propagates to ProviderChatResponse — without it the same gap will resurface in a future refactor.

CI

1 fail: Build & smoke-test core image — infra, hit the 45-min runner timeout (docker build cancelled). Not PR-caused. Re-run will likely clear.
All other test jobs green.

Question

Was CodeRabbit's COMMENTED feedback on this exact location missed during iteration, or intentionally deferred? If deferred, please add a TODO + tracking-issue link; if missed, applying the suggestion below + the 7th test gets this fully done.

oxoxDev · 2026-05-28T15:53:07Z

            text,
            tool_calls,
            usage,
+            reasoning_content: None,


Blocker — chat_with_tools discards choice.message.reasoning_content even though the whole point of the PR is to preserve it across turns. Any thinking-mode model (DeepSeek-R1, Qwen3, GLM-4, Moonshot K2) routed through this path will still 400 on turn 2+ with "thinking mode must be passed back" — precisely the bug PR #2830 added a config_rejection matcher to silence.

The new 6 unit tests cover parse_native_response + convert_messages_for_native (the chat code path) but none drive chat_with_tools, so this gap is invisible to CI. Mirror the parse_native_response extraction at line 795 (let reasoning_content = message.reasoning_content.clone();):

Needed change (in this function, around lines 1746-1750):

// before Ok(ProviderChatResponse { text, tool_calls, usage, reasoning_content: None, }) // after let reasoning_content = choice.message.reasoning_content.clone(); Ok(ProviderChatResponse { text, tool_calls, usage, reasoning_content, })

The single-line suggestion only fixes the field (None → variable); the let extraction has to be added by hand since GitHub can only inline-suggest replacement of the changed line:

Suggested change

reasoning_content: None,

reasoning_content,

Also add a regression-guard test in compatible_tests.rs driving chat_with_tools with a thinking-mode response payload and asserting reasoning_content propagates — otherwise the same gap will resurface in a future refactor.

graycyrus requested a review from a team May 28, 2026 05:25

coderabbitai Bot added rust-core Core Rust runtime in src/: CLI, core_server, shared infrastructure. agent Built-in agents, prompts, orchestration, and agent runtime in src/openhuman/agent/. bug labels May 28, 2026

coderabbitai Bot reviewed May 28, 2026

View reviewed changes

coderabbitai Bot approved these changes May 28, 2026

View reviewed changes

oxoxDev self-assigned this May 28, 2026

graycyrus commented May 28, 2026

View reviewed changes

oxoxDev removed their assignment May 28, 2026

oxoxDev mentioned this pull request May 28, 2026

fix(inference): preserve reasoning_content across multi-turn conversations (#2800) #2806

Open

3 tasks

ozpool mentioned this pull request May 28, 2026

fix(observability): demote expected-error Sentry buckets across embeddings, provider, memory-store, FS, and thinking-mode wire shapes #2830

Merged

7 tasks

oxoxDev requested changes May 28, 2026

View reviewed changes

oxoxDev mentioned this pull request May 28, 2026

fix(inference): include model field in nvidia-nim API requests #2791

Open

7 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(inference): preserve reasoning_content in multi-turn thinking model conversations#2818

fix(inference): preserve reasoning_content in multi-turn thinking model conversations#2818
graycyrus wants to merge 1 commit into
tinyhumansai:mainfrom
graycyrus:worktree-agent-a0a608b8

graycyrus commented May 28, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented May 28, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Suggested labels

Suggested reviewers

Poem

Uh oh!

coderabbitai Bot left a comment

Uh oh!

graycyrus left a comment

Uh oh!

oxoxDev left a comment

Uh oh!

oxoxDev May 28, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

graycyrus commented May 28, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test plan

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Suggested labels

Suggested reviewers

Poem

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

graycyrus left a comment

Choose a reason for hiding this comment

Uh oh!

oxoxDev left a comment

Choose a reason for hiding this comment

Verified / looks good

Out of scope / nitpick

CI

Question

Uh oh!

oxoxDev May 28, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

graycyrus commented May 28, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 28, 2026 •

edited

Loading