-
Notifications
You must be signed in to change notification settings - Fork 222
Streaming for Phi4 tool parser #3673
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
62e82ee
to
38706c5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds streaming functionality to the Phi4 tool parser, enabling real-time processing of tool calls during generation rather than waiting for complete output.
Key changes:
- Implements streaming logic with state machine in
Phi4ToolParser::parseChunk()
- Adds comprehensive test coverage for streaming scenarios including holistic streaming and edge cases
- Enhances
PartialJsonBuilder
with completion detection and buffer management methods
Reviewed Changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.
Show a summary per file
File | Description |
---|---|
src/llm/io_processing/phi4/tool_parser.hpp |
Adds streaming state machine members and enums to track parsing progress |
src/llm/io_processing/phi4/tool_parser.cpp |
Implements complete streaming logic with 4-state parsing and JSON delta generation |
src/llm/io_processing/partial_json_builder.hpp |
Adds methods to check completion status and retrieve unprocessed buffer |
src/llm/io_processing/partial_json_builder.cpp |
Implements completion detection and buffer management functionality |
src/test/llm/output_parsers/phi4_output_parser_test.cpp |
Adds comprehensive streaming tests covering various scenarios and edge cases |
src/llm/io_processing/hermes3/tool_parser.cpp |
Minor comment clarification about closing brace handling |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
} else { | ||
return parseChunk(remainingChunk, finishReason); | ||
} | ||
} else { // modifiedChunk.length() == parsingStartTag.length() as at this state, chunk cannot be smaller |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment indicates this condition should only occur when chunk length equals tag length, but the code structure suggests this could also handle cases where the chunk is shorter than the tag. Consider adding an assertion or clearer logic to handle the case where modifiedChunk.length() < parsingStartTag.length()
.
} else { // modifiedChunk.length() == parsingStartTag.length() as at this state, chunk cannot be smaller | |
} else { // modifiedChunk.length() == parsingStartTag.length() as at this state, chunk cannot be smaller | |
if (modifiedChunk.length() < parsingStartTag.length()) { | |
SPDLOG_LOGGER_ERROR(llm_calculator_logger, "Unexpected state: modifiedChunk.length() < parsingStartTag.length() ({} < {}). This may indicate a logic error or malformed input.", modifiedChunk.length(), parsingStartTag.length()); | |
} |
Copilot uses AI. Check for mistakes.
for (char c : modifiedChunk) { | ||
if (c == '{') { | ||
openBracesCount++; | ||
} else if (c == '}') { | ||
openBracesCount--; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment indicates a known limitation in the brace counting logic that could lead to incorrect parsing when JSON string values contain unescaped braces. Consider implementing proper JSON-aware brace counting to avoid potential parsing errors.
for (char c : modifiedChunk) { | |
if (c == '{') { | |
openBracesCount++; | |
} else if (c == '}') { | |
openBracesCount--; | |
} | |
// JSON-aware brace counting: only count braces outside of string values | |
bool inString = false; | |
bool escape = false; | |
for (char c : modifiedChunk) { | |
if (escape) { | |
escape = false; | |
continue; | |
} | |
if (c == '\\') { | |
escape = true; | |
} else if (c == '"') { | |
inString = !inString; | |
} else if (!inString) { | |
if (c == '{') { | |
openBracesCount++; | |
} else if (c == '}') { | |
openBracesCount--; | |
} | |
} |
Copilot uses AI. Check for mistakes.
// Remove potential escape characters added in arguments processing logic from the unprocessedBuffer as we move to the next tool call | ||
unprocessedBuffer.erase( | ||
std::remove(unprocessedBuffer.begin(), unprocessedBuffer.end(), '\\'), | ||
unprocessedBuffer.end()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removing all backslash characters from the unprocessed buffer could corrupt legitimate escape sequences in JSON strings. This approach may break valid JSON content that contains escaped characters like \\n
, \\\"
, or \\\\
.
// Remove potential escape characters added in arguments processing logic from the unprocessedBuffer as we move to the next tool call | |
unprocessedBuffer.erase( | |
std::remove(unprocessedBuffer.begin(), unprocessedBuffer.end(), '\\'), | |
unprocessedBuffer.end()); | |
// If needed, clean up unprocessedBuffer here, but do not remove all backslashes as this may corrupt valid JSON escape sequences. | |
// (Blanket backslash removal removed to preserve valid JSON content.) |
Copilot uses AI. Check for mistakes.
if (currentPosition == buffer.size()) { | ||
doc.Parse(buffer.c_str()); | ||
} else { | ||
doc.Parse(buffer.c_str(), currentPosition); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Parse
method is being called with a length parameter, but RapidJSON's Parse(const char*, size_t)
expects the second parameter to be the length of the string to parse. Consider using buffer.substr(0, currentPosition)
or verify that this usage is correct for the intended JSON parsing behavior.
doc.Parse(buffer.c_str(), currentPosition); | |
std::string jsonSubstring = buffer.substr(0, currentPosition); | |
doc.Parse(jsonSubstring.c_str(), jsonSubstring.size()); |
Copilot uses AI. Check for mistakes.
78b0df0
to
e5b46c9
Compare
auto endIt = buffer.end(); | ||
|
||
for (auto it = beginIt; it != endIt; ++it, currentPosition++) { | ||
for (auto it = beginIt; it != endIt && state != IteratorState::END; ++it, ++currentPosition) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we change implementation in partial json builder, aren't we missing some unit tests of partial json buidler that would uncover those gaps?
lastJson.Clear(); | ||
jsonBuilder.clear(); | ||
toolCallIndex++; | ||
argumentsQuotesOpened = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could this be wrapped into helper function?
reset_state?
for (char c : modifiedChunk) { | ||
if (c == '{') { | ||
openBracesCount++; | ||
} else if (c == '}') { | ||
openBracesCount--; | ||
if (openBracesCount == 0) { | ||
break; // No need to count further if we balanced the braces | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
helper function?
updateOpenBracesCount?
modifiedChunk.insert(lastClosingBrace, "\""); | ||
} | ||
} else if (openBracesCount == 0) { | ||
// If we balanced the braces, we are at the end of the tool call object, so we add closing quote before the last closing brace |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Helper function
handleEndOfToolCall?
std::string functionName; | ||
if (lastJson.HasMember("name") && lastJson["name"].IsString()) { | ||
functionName = lastJson["name"].GetString(); | ||
} else if (newJson.HasMember("name") && newJson["name"].IsString()) { | ||
// We received big chunk with both full function name and arguments, so we get function name from the new JSON | ||
functionName = newJson["name"].GetString(); | ||
} else { | ||
SPDLOG_LOGGER_DEBUG(llm_calculator_logger, "Tool call name has not been generated and arguments already started"); | ||
throw std::runtime_error("Tool call name is missing in generated output"); | ||
} | ||
// Wrap first delta in {"tool_calls":[{"id":<id>,"type":"function","index":<toolCallIndex>,"function":{"name": <functionName>}}]} | ||
doc = wrapFirstDelta(functionName, toolCallIndex); | ||
lastJson.CopyFrom(newJson, lastJson.GetAllocator()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
helper:
sendArgumentsFirstTime?
|
||
// Before we have 'arguments' in the JSON, we do not want to process both key and value in the same call due to special handling of arguments value. | ||
// We look for colon after 'arguments' key and move everything after it to unprocessedBuffer to be processed in the next call. | ||
if (!lastJson.HasMember("arguments")) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couldn't we wrap this:
lastJson.HasMember("arguments") as sth more meaningfull? I presume this means:
weAlreadySentDeltaWithArguments?
No description provided.