Skip to content

Conversation

mzegla
Copy link
Collaborator

@mzegla mzegla commented Sep 30, 2025

No description provided.

@mzegla mzegla force-pushed the phi4_tool_streaming branch from 62e82ee to 38706c5 Compare September 30, 2025 13:31
@mzegla mzegla requested a review from Copilot September 30, 2025 13:34
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds streaming functionality to the Phi4 tool parser, enabling real-time processing of tool calls during generation rather than waiting for complete output.

Key changes:

  • Implements streaming logic with state machine in Phi4ToolParser::parseChunk()
  • Adds comprehensive test coverage for streaming scenarios including holistic streaming and edge cases
  • Enhances PartialJsonBuilder with completion detection and buffer management methods

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
src/llm/io_processing/phi4/tool_parser.hpp Adds streaming state machine members and enums to track parsing progress
src/llm/io_processing/phi4/tool_parser.cpp Implements complete streaming logic with 4-state parsing and JSON delta generation
src/llm/io_processing/partial_json_builder.hpp Adds methods to check completion status and retrieve unprocessed buffer
src/llm/io_processing/partial_json_builder.cpp Implements completion detection and buffer management functionality
src/test/llm/output_parsers/phi4_output_parser_test.cpp Adds comprehensive streaming tests covering various scenarios and edge cases
src/llm/io_processing/hermes3/tool_parser.cpp Minor comment clarification about closing brace handling

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

} else {
return parseChunk(remainingChunk, finishReason);
}
} else { // modifiedChunk.length() == parsingStartTag.length() as at this state, chunk cannot be smaller
Copy link
Preview

Copilot AI Sep 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment indicates this condition should only occur when chunk length equals tag length, but the code structure suggests this could also handle cases where the chunk is shorter than the tag. Consider adding an assertion or clearer logic to handle the case where modifiedChunk.length() < parsingStartTag.length().

Suggested change
} else { // modifiedChunk.length() == parsingStartTag.length() as at this state, chunk cannot be smaller
} else { // modifiedChunk.length() == parsingStartTag.length() as at this state, chunk cannot be smaller
if (modifiedChunk.length() < parsingStartTag.length()) {
SPDLOG_LOGGER_ERROR(llm_calculator_logger, "Unexpected state: modifiedChunk.length() < parsingStartTag.length() ({} < {}). This may indicate a logic error or malformed input.", modifiedChunk.length(), parsingStartTag.length());
}

Copilot uses AI. Check for mistakes.

Comment on lines 186 to 213
for (char c : modifiedChunk) {
if (c == '{') {
openBracesCount++;
} else if (c == '}') {
openBracesCount--;
}
Copy link
Preview

Copilot AI Sep 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment indicates a known limitation in the brace counting logic that could lead to incorrect parsing when JSON string values contain unescaped braces. Consider implementing proper JSON-aware brace counting to avoid potential parsing errors.

Suggested change
for (char c : modifiedChunk) {
if (c == '{') {
openBracesCount++;
} else if (c == '}') {
openBracesCount--;
}
// JSON-aware brace counting: only count braces outside of string values
bool inString = false;
bool escape = false;
for (char c : modifiedChunk) {
if (escape) {
escape = false;
continue;
}
if (c == '\\') {
escape = true;
} else if (c == '"') {
inString = !inString;
} else if (!inString) {
if (c == '{') {
openBracesCount++;
} else if (c == '}') {
openBracesCount--;
}
}

Copilot uses AI. Check for mistakes.

Comment on lines +273 to +290
// Remove potential escape characters added in arguments processing logic from the unprocessedBuffer as we move to the next tool call
unprocessedBuffer.erase(
std::remove(unprocessedBuffer.begin(), unprocessedBuffer.end(), '\\'),
unprocessedBuffer.end());
Copy link
Preview

Copilot AI Sep 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing all backslash characters from the unprocessed buffer could corrupt legitimate escape sequences in JSON strings. This approach may break valid JSON content that contains escaped characters like \\n, \\\", or \\\\.

Suggested change
// Remove potential escape characters added in arguments processing logic from the unprocessedBuffer as we move to the next tool call
unprocessedBuffer.erase(
std::remove(unprocessedBuffer.begin(), unprocessedBuffer.end(), '\\'),
unprocessedBuffer.end());
// If needed, clean up unprocessedBuffer here, but do not remove all backslashes as this may corrupt valid JSON escape sequences.
// (Blanket backslash removal removed to preserve valid JSON content.)

Copilot uses AI. Check for mistakes.

if (currentPosition == buffer.size()) {
doc.Parse(buffer.c_str());
} else {
doc.Parse(buffer.c_str(), currentPosition);
Copy link
Preview

Copilot AI Sep 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Parse method is being called with a length parameter, but RapidJSON's Parse(const char*, size_t) expects the second parameter to be the length of the string to parse. Consider using buffer.substr(0, currentPosition) or verify that this usage is correct for the intended JSON parsing behavior.

Suggested change
doc.Parse(buffer.c_str(), currentPosition);
std::string jsonSubstring = buffer.substr(0, currentPosition);
doc.Parse(jsonSubstring.c_str(), jsonSubstring.size());

Copilot uses AI. Check for mistakes.

@mzegla mzegla force-pushed the phi4_tool_streaming branch from 78b0df0 to e5b46c9 Compare October 6, 2025 13:21
@mzegla mzegla requested review from dtrawins and atobiszei October 6, 2025 15:01
auto endIt = buffer.end();

for (auto it = beginIt; it != endIt; ++it, currentPosition++) {
for (auto it = beginIt; it != endIt && state != IteratorState::END; ++it, ++currentPosition) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we change implementation in partial json builder, aren't we missing some unit tests of partial json buidler that would uncover those gaps?

Comment on lines +174 to +177
lastJson.Clear();
jsonBuilder.clear();
toolCallIndex++;
argumentsQuotesOpened = false;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could this be wrapped into helper function?
reset_state?

Comment on lines +205 to +213
for (char c : modifiedChunk) {
if (c == '{') {
openBracesCount++;
} else if (c == '}') {
openBracesCount--;
if (openBracesCount == 0) {
break; // No need to count further if we balanced the braces
}
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

helper function?
updateOpenBracesCount?

modifiedChunk.insert(lastClosingBrace, "\"");
}
} else if (openBracesCount == 0) {
// If we balanced the braces, we are at the end of the tool call object, so we add closing quote before the last closing brace
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Helper function
handleEndOfToolCall?

Comment on lines +266 to +278
std::string functionName;
if (lastJson.HasMember("name") && lastJson["name"].IsString()) {
functionName = lastJson["name"].GetString();
} else if (newJson.HasMember("name") && newJson["name"].IsString()) {
// We received big chunk with both full function name and arguments, so we get function name from the new JSON
functionName = newJson["name"].GetString();
} else {
SPDLOG_LOGGER_DEBUG(llm_calculator_logger, "Tool call name has not been generated and arguments already started");
throw std::runtime_error("Tool call name is missing in generated output");
}
// Wrap first delta in {"tool_calls":[{"id":<id>,"type":"function","index":<toolCallIndex>,"function":{"name": <functionName>}}]}
doc = wrapFirstDelta(functionName, toolCallIndex);
lastJson.CopyFrom(newJson, lastJson.GetAllocator());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

helper:
sendArgumentsFirstTime?


// Before we have 'arguments' in the JSON, we do not want to process both key and value in the same call due to special handling of arguments value.
// We look for colon after 'arguments' key and move everything after it to unprocessedBuffer to be processed in the next call.
if (!lastJson.HasMember("arguments")) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't we wrap this:
lastJson.HasMember("arguments") as sth more meaningfull? I presume this means:
weAlreadySentDeltaWithArguments?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants