fix: handle thinking model reasoning-only responses as valid content #11486
Open
neorrk wants to merge 4 commits into RooCodeInc:main from
Conversation
Thinking models (Kimi K2.5, DeepSeek-R1, QwQ) may produce only reasoning_content with no regular text content or tool calls. Previously this was treated as an empty/failed response, triggering the "language model did not provide any assistant messages" error and unnecessary retries.

Changes:
- Task.ts: Include reasoningMessage in the content check so reasoning-only responses are recognized as valid model output instead of triggering the empty response error path
- openai.ts: Handle both "reasoning_content" (standard) and "reasoning" (Ollama /v1/) field names in streaming responses
- openai.ts: Yield reasoning content from non-streaming responses

Fixes RooCodeInc#10603
Related: RooCodeInc#9959, RooCodeInc#10064, RooCodeInc#9551

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
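For context, a minimal sketch of the two delta shapes this commit has to accept. The field names come from the commit message above; the wrapper interface and sample values are purely illustrative and not code from this PR.

```typescript
// Illustrative only: the two field names are the point of the fix; the delta shape
// around them is the generic OpenAI-compatible format, not this extension's types.
interface ReasoningDelta {
	content?: string | null
	reasoning_content?: string // standard field (DeepSeek-R1, QwQ, etc.)
	reasoning?: string // field name used by Ollama's /v1/ endpoint
}

// A reasoning-only delta as most OpenAI-compatible providers stream it:
const standardDelta: ReasoningDelta = { reasoning_content: "Let me check the file first…" }

// The same delta as Ollama's /v1/chat/completions streams it:
const ollamaDelta: ReasoningDelta = { reasoning: "Let me check the file first…" }

// Neither delta carries content or tool_calls, which is exactly the case that
// used to be treated as an empty response.
console.log(standardDelta, ollamaDelta)
```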
Add comprehensive tests verifying correct handling of Ollama's non-standard streaming format used by thinking models (kimi-k2.5, etc.):

- NativeToolCallParser: Test single-chunk tool calls with non-standard IDs (functions.read_file:0), finalization via both finalizeRawChunks and processFinishReason, multiple sequential tool calls
- OpenAI handler: Test reasoning_content and reasoning field extraction, single-chunk tool call with non-standard ID, non-streaming response with reasoning and tool calls

These tests verify the fix from the previous commit works correctly with Ollama's /v1/ endpoint behavior.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
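A rough idea of what a field-extraction test can look like. This is a vitest-style sketch (an assumption about the repo's test runner) around a hypothetical `extractReasoning` helper, not the handler's actual API or the tests added in this commit.

```typescript
import { describe, expect, it } from "vitest"

// Hypothetical helper mirroring the behavior under test: prefer the standard
// "reasoning_content" field, fall back to Ollama's "reasoning" field.
const extractReasoning = (delta: { reasoning_content?: string; reasoning?: string }) =>
	delta.reasoning_content ?? delta.reasoning

describe("reasoning field extraction (sketch)", () => {
	it("reads the standard reasoning_content field", () => {
		expect(extractReasoning({ reasoning_content: "thinking…" })).toBe("thinking…")
	})

	it("falls back to Ollama's reasoning field", () => {
		expect(extractReasoning({ reasoning: "thinking…" })).toBe("thinking…")
	})

	it("returns undefined when neither field is present", () => {
		expect(extractReasoning({})).toBeUndefined()
	})
})
```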
Both previously flagged issues are now resolved.
The first content gate (line 3416) was updated to include hasReasoningContent, but the second gate (line 3567) was not. This caused reasoning-only responses to save an assistant message to history but then fall through to the "no assistant messages" error path, incrementing the retry counter and leaving an orphaned assistant message in conversation history.

Both gates now consistently check: if (hasTextContent || hasToolUses || hasReasoningContent)

Addresses review feedback from @roomote.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
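A minimal sketch of that shared condition, using the flag names quoted above; the surrounding function and the way the flags are derived here are illustrative, not the actual Task.ts code.

```typescript
// Sketch only: flag names come from the PR description; the structure is simplified.
function classifyAssistantOutput(assistantText: string, toolCallCount: number, reasoningMessage: string) {
	const hasTextContent = assistantText.trim().length > 0
	const hasToolUses = toolCallCount > 0
	const hasReasoningContent = reasoningMessage.trim().length > 0

	// Both gates must use this same condition. If only the first gate includes
	// hasReasoningContent, a reasoning-only response is saved to history but then
	// still falls through to the "no assistant messages" error path at the second gate.
	const isValid = hasTextContent || hasToolUses || hasReasoningContent
	return { hasTextContent, hasToolUses, hasReasoningContent, isValid }
}

// A reasoning-only response now counts as valid at both gates:
console.log(classifyAssistantOutput("", 0, "Let me think about which file to read…").isValid) // true
```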
When a thinking model returns only reasoning_content (no text, no tool calls), assistantMessageContent is empty. Since no content blocks exist, presentAssistantMessage is never called and userMessageContentReady is never set to true, causing pWaitFor to block indefinitely.

The fix sets userMessageContentReady = true directly when assistantMessageContent is empty, allowing the flow to proceed to the didToolUse check, which will correctly prompt the model to use tools.

Addresses second review from @roomote.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
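Roughly, the deadlock and the fix look like this. It is a simplified sketch assuming pWaitFor refers to the p-wait-for npm package; the property names mirror the description, but the class and the interval value are invented for illustration.

```typescript
import pWaitFor from "p-wait-for"

// Simplified stand-in for the relevant Task state; not the actual Task.ts class.
class TaskSketch {
	assistantMessageContent: unknown[] = []
	userMessageContentReady = false

	async afterStreamCompletes(): Promise<void> {
		if (this.assistantMessageContent.length === 0) {
			// Reasoning-only response: no content blocks means presentAssistantMessage
			// is never called, so nothing would ever flip userMessageContentReady.
			// Setting it here lets the flow proceed to the didToolUse check.
			this.userMessageContentReady = true
		}

		// Without the branch above, this wait never resolves for reasoning-only responses.
		await pWaitFor(() => this.userMessageContentReady, { interval: 100 })
	}
}
```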
Summary
Thinking models (Kimi K2.5, DeepSeek-R1, QwQ, etc.) often produce responses containing only `reasoning_content` with no regular text `content` or `tool_calls`. This triggers the "The language model did not provide any assistant messages" error and unnecessary retries, even though the model did respond — just entirely in reasoning tokens.

This PR fixes five issues:
1. Reasoning-only responses treated as empty (`Task.ts` — first content gate)

When a thinking model returns only `reasoning_content` (no text, no tool calls), `hasTextContent` is `false` and `hasToolUses` is `false`, causing the response to be treated as completely empty. The fix adds `hasReasoningContent` to the validity check at line 3416 so reasoning-only responses are recognized as valid model output.
2. Second content gate missing `hasReasoningContent` (`Task.ts` — line 3567)

The first content gate was updated to include `hasReasoningContent`, but the second gate at line 3567 was not. This caused reasoning-only responses to save an assistant message to history but then fall through to the "no assistant messages" error path, incrementing `consecutiveNoAssistantMessagesCount` and leaving an orphaned assistant message in conversation history. Both gates now consistently check `hasReasoningContent`.
3. `pWaitFor` hang on reasoning-only responses (`Task.ts`)

When only reasoning content is present, `assistantMessageContent` is empty and `presentAssistantMessage` is never called, so `userMessageContentReady` stays `false` — causing `pWaitFor` to block forever. The fix sets `userMessageContentReady = true` directly when `assistantMessageContent` is empty, allowing the flow to proceed to the `didToolUse` check, which will correctly prompt the model to use tools.
4. Missing `reasoning` field support in OpenAI streaming handler (`openai.ts`)

Ollama's `/v1/chat/completions` endpoint sends reasoning tokens under the `reasoning` field (not `reasoning_content`). The streaming handler only checked for `reasoning_content`, so users connecting through Ollama would lose all thinking output. The fix handles both field names.
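As a sketch of what handling both field names can look like in the streaming loop: the delta fields are from the item above, while the async-generator shape and the yielded chunk type are assumptions, not the provider's actual stream types.

```typescript
// Sketch: deltas from an OpenAI-compatible /chat/completions stream.
interface StreamDelta {
	content?: string | null
	reasoning_content?: string // standard thinking-model field
	reasoning?: string // Ollama /v1/ field name
}

// Illustrative chunk type, not the extension's real ApiStream type.
type SketchChunk = { type: "text"; text: string } | { type: "reasoning"; text: string }

async function* mapDeltas(deltas: AsyncIterable<StreamDelta>): AsyncGenerator<SketchChunk> {
	for await (const delta of deltas) {
		// Accept either reasoning field name so Ollama users don't lose thinking output.
		const reasoning = delta.reasoning_content ?? delta.reasoning
		if (reasoning) {
			yield { type: "reasoning", text: reasoning }
		}
		if (delta.content) {
			yield { type: "text", text: delta.content }
		}
	}
}
```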
5. Missing reasoning in non-streaming responses (`openai.ts`)

The non-streaming path didn't yield `reasoning_content` or `reasoning` at all, so thinking model responses via non-streaming mode would lose their reasoning output.
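And the non-streaming counterpart, again as a sketch: neither reasoning field is part of the official OpenAI SDK message types, so they are declared locally here just to keep the example typed, and the yielded chunk shape is illustrative.

```typescript
// Sketch: non-streaming response handling for an OpenAI-compatible endpoint.
interface NonStreamingMessage {
	content?: string | null
	reasoning_content?: string // not in the official SDK types; real code would read it via a cast
	reasoning?: string // Ollama /v1/ field name, likewise untyped in the SDK
}

type YieldedChunk = { type: "text" | "reasoning"; text: string } // illustrative chunk shape

function* emitNonStreaming(message: NonStreamingMessage): Generator<YieldedChunk> {
	const reasoning = message.reasoning_content ?? message.reasoning
	if (reasoning) {
		yield { type: "reasoning", text: reasoning } // previously dropped entirely
	}
	if (message.content) {
		yield { type: "text", text: message.content }
	}
}
```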
How It Was Discovered

Running Kimi K2.5 through an Ollama backend via the OpenAI Compatible provider. The model frequently returns reasoning + tool_calls with zero `content` text. Without this fix, every such response triggers the empty response error and retry loop, making the model unusable.
Test Plan

- Content gates recognize reasoning-only responses as valid via `hasReasoningContent`
- `pWaitFor` does not hang when `assistantMessageContent` is empty (reasoning-only response)
- Against Ollama's `/v1/chat/completions` — reasoning-only responses no longer trigger the "no assistant messages" error or hang

Fixes #10603
Related: #9959, #10064, #9551
🤖 Generated with Claude Code