-
Notifications
You must be signed in to change notification settings - Fork 222
Qwen3Coder unary output parser #3664
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
18b9c2c
to
f9b33d7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR implements a Qwen3Coder unary output parser to handle tool calls in the Qwen3Coder model format. The parser supports a unique XML-style format for function calls with parameters.
Key changes:
- Adds a new Qwen3Coder tool parser with XML-style tag parsing
- Updates all existing parsers to accept tool schemas as a parameter
- Modifies the parser interface to include tool schema information
Reviewed Changes
Copilot reviewed 27 out of 27 changed files in this pull request and generated 5 comments.
Show a summary per file
File | Description |
---|---|
src/llm/io_processing/qwen3coder/qwen3coder_tool_parser.hpp |
Header for new Qwen3Coder tool parser with XML-style tag definitions |
src/llm/io_processing/qwen3coder/qwen3coder_tool_parser.cpp |
Implementation of Qwen3Coder parser with state machine for XML tag parsing |
src/test/llm/output_parsers/qwen3coder_output_parser_test.cpp |
Comprehensive test suite for the new parser functionality |
src/llm/io_processing/base_output_parser.hpp |
Updates parser interface to include tool schemas parameter |
src/llm/io_processing/output_parser.cpp |
Registers new Qwen3Coder parser and passes tool schemas to parsers |
Multiple parser files | Updates existing parsers to accept tool schemas parameter |
Multiple test files | Updates test calls to include tool schemas parameter |
Comments suppressed due to low confidence (2)
src/llm/io_processing/qwen3coder/qwen3coder_tool_parser.cpp:1
- Commented-out member variable should be removed if not needed, or implemented if it serves a purpose.
//*****************************************************************************
src/llm/BUILD:144
- Removed line that may be needed for proper Python compilation options. Verify this removal doesn't break Python integration.
additional_copts = COPTS_PYTHON
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
bba2dd1
to
e095f8b
Compare
src/llm/BUILD
Outdated
"io_processing/gptoss/reasoning_parser.hpp", | ||
"io_processing/gptoss/tool_parser.hpp", | ||
"io_processing/gptoss/harmony.hpp", | ||
"io_processing/qwen3coder/qwen3coder_tool_parser.hpp", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This naming breaks the convention. File is in "qwen3coder" catalog, so adding "qwen3coder_" prefix seems redundant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's for logging purposes -> in spdlog we have filename butif we have tool parser inside qwen3coder dir it does not show there. I would rename all tool parsers this way for faster debugging.
src/llm/apis/openai_completions.cpp
Outdated
choice.AddMember("logprobs", Value(), allocator); | ||
if (endpoint == Endpoint::CHAT_COMPLETIONS) { | ||
if (outputParser != nullptr) { | ||
// FIXME need tool maps for streaming |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it still relevant?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure if github issue, but I still see that FIXME here
std::string id; | ||
std::string name; | ||
std::string arguments; | ||
std::string arguments; // JSON "{"a":1, "b":"SOME_STRING"}" TODO rename to know in context that's JSON |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding the "TODO" I would prefer to keep that name as it maps exactly to OpenAI response field we need to fill.
Maybe we could have a comment explaining that ToolCall
struct is supposed to mirror tool call structure in OpenAI API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok i see your point. It was confusing for me that it was JSON wrapped as stirng in arguments field initially, so I would at least keep the first part of the comment.
/* | ||
{"type":"response.output_item.added","response_id":"resp_1234xyz","output_index":0,"item":{"type":"function_call","id":"fc_1234xyz","call_id":"call_1234xyz","name":"get_weather","arguments":""}} | ||
{"type":"response.function_call_arguments.delta","response_id":"resp_1234xyz","item_id":"fc_1234xyz","output_index":0,"delta":"{\""} | ||
{"type":"response.function_call_arguments.delta","response_id":"resp_1234xyz","item_id":"fc_1234xyz","output_index":0,"delta":"location"} | ||
{"type":"response.function_call_arguments.delta","response_id":"resp_1234xyz","item_id":"fc_1234xyz","output_index":0,"delta":"\":\""} | ||
{"type":"response.function_call_arguments.delta","response_id":"resp_1234xyz","item_id":"fc_1234xyz","output_index":0,"delta":"Paris"} | ||
{"type":"response.function_call_arguments.delta","response_id":"resp_1234xyz","item_id":"fc_1234xyz","output_index":0,"delta":","} | ||
{"type":"response.function_call_arguments.delta","response_id":"resp_1234xyz","item_id":"fc_1234xyz","output_index":0,"delta":" France"} | ||
{"type":"response.function_call_arguments.delta","response_id":"resp_1234xyz","item_id":"fc_1234xyz","output_index":0,"delta":"\"}"} | ||
{"type":"response.function_call_arguments.done","response_id":"resp_1234xyz","item_id":"fc_1234xyz","output_index":0,"arguments":"{\"location\":\"Paris, France\"}"} | ||
{"type":"response.output_item.done","response_id":"resp_1234xyz","output_index":0,"item":{"type":"function_call","id":"fc_1234xyz","call_id":"call_1234xyz","name":"get_weather","arguments":"{\"location\":\"Paris, France\"}"}} | ||
*/ | ||
// example1 {"location":"San Francisco"} | ||
// example1 {"city":"San Francisco","state":"CA", "length":5, "is_day":true, "temperatures":[5,6,7], "details":{"humidity":80,"condition":"sunny"}}]} | ||
// index is toolCallId |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it leftover? Looks like Response API and shows deltas that your parser will not return (arguments broken into smaller pieces)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I kept this initially to underline the difference in qwen3coder parser that we will not do that incremental sending, but it may be confusing so I will remove it.
static std::string documentToString(const rapidjson::Document& doc) { | ||
rapidjson::StringBuffer buffer; | ||
rapidjson::Writer<rapidjson::StringBuffer> writer(buffer); | ||
doc.Accept(writer); | ||
return buffer.GetString(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could be moved to some common utilities place
std::optional<rapidjson::Document> Qwen3CoderToolParser::sendFullDelta(std::optional<ToolCalls>& toolCallsOpt) { | ||
auto& toolCalls = toolCallsOpt.value(); | ||
if (toolCalls.size() != 1) { | ||
SPDLOG_ERROR("For streaming we expected one tool call, got: {}", toolCalls.size()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Logging as error and if we want to treat it as error then maybe we should return nullopt here or throw?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We shouldnt crash ovms if we have error in tool parser as this is potential attack surface. I want this in error level so it will be easier potentially to track such occurences. Or we may consider returning error in all parsers but this will be bigger overhaul.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure throwing here crashes OVMS? I would say it will be some generic "Response generation error" on the client side only.
|
||
std::optional<rapidjson::Document> Qwen3CoderToolParser::sendFirstDeltaIfNeeded(const std::string& toolCallName) { | ||
if (this->returnedFirstDeltas.size() != this->returnedCompleteDeltas.size()) { | ||
SPDLOG_TRACE("Skipping first delta, already sent for current function, fi:{} co:{}", returnedFirstDeltas.size(), returnedCompleteDeltas.size()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fi:{} co:{}
part is not understandable without looking into the code.
} | ||
|
||
std::optional<rapidjson::Document> Qwen3CoderToolParser::sendFirstDeltaIfNeeded(const std::string& toolCallName) { | ||
if (this->returnedFirstDeltas.size() != this->returnedCompleteDeltas.size()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could it be this->returnedFirstDeltas.size() > this->returnedCompleteDeltas.size()
?
That would better indicate the connection between first deltas and complete deltas (like we always have either one more first delta returned when we already returned it for current function or the number is equal when we still wait for the first delta for current function). Is that right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i will change it to comparison with returnedFirst == (returnedComplete + 1)
0.825 parallel multiple
Accuracy on BFCL simple multiple as in unary
BFCL: parallel_multiple. 🎯 Accuracy: 0.83 simple. 🎯 Accuracy: 0.9575 multiple. 🎯 Accuracy: 0.935
6aebb03
to
55e4cc0
Compare
return toolsStartTag; | ||
} | ||
const std::unordered_set<std::string>& getSpecialParsingStartTags() const override { | ||
static const std::unordered_set<std::string> specialParsingStartTags = {toolsStartTag}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should probably be empty. If qwen3 coder has only one way (one tag) to start tool parsing, it should not be repeated in special tags.
} | ||
// Tools calls are expected to be the last part of the content, so we do not specify an end tag. | ||
const std::string& getParsingEndTag() const override { | ||
return toolsEndTag; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should return empty string here as well. Your end tag is also end tag for a single tool call and in that context is parsing end tag - meaning outer parser will switch phase (get out of tool parsing mode) after seeing the string you return here. It works, because you get opening tag right after it when you have multiple tool calls, so outer parser will get back to tool calls processing again, but nevertheless we should change that.
getParsingEndTag() should return tag that indicates end of parser work (like </think>
in reasoning parser), not a separator between tool calls. If there is no definite tag that indicates "no more tool calls", we leave it empty and assume all remaining output is tool calls.
dffd55e
to
1023820
Compare
parametersIt->value.Accept(writer); | ||
std::string parametersStr = buffer.GetString(); | ||
request.toolNameSchemaMap[nameIt->value.GetString()] = parametersStr; | ||
std::pair<rapidjson::Value*, std::string> schemaReprs = {¶metersIt->value, std::move(parametersStr)}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it might be clearer and more extensible if we have it in a struct like:
struct ToolSchema:
rapidjson::Value& rapidjsonRepresentation;
std::string stringRepresentation;
...
(nlohmann::json nlohmannRepresentation) etc.
src/llm/apis/openai_completions.cpp
Outdated
choice.AddMember("logprobs", Value(), allocator); | ||
if (endpoint == Endpoint::CHAT_COMPLETIONS) { | ||
if (outputParser != nullptr) { | ||
// FIXME need tool maps for streaming |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure if github issue, but I still see that FIXME here
|
||
bool reasoningParserExistsAndSupportsStreaming = reasoningParser && !reasoningParser->getParsingStartTag().empty() && !reasoningParser->getParsingEndTag().empty(); | ||
bool toolParserExistsAndSupportsStreaming = toolParser && !toolParser->getParsingStartTag().empty(); | ||
bool toolParserExistsAndSupportsStreaming = toolParser && !toolParser->getParsingStartTag().empty(); // FIXME why not check for parsingEntTag not empty? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please remove fixme comment if it is clear
ToolsParameterTypeMap_t toolsParametersTypes; | ||
for (const auto& [toolName, schemaPair] : toolsSchemas) { | ||
SPDLOG_TRACE("Creating tools parameters types for tool: {}, schema: {}", toolName, schemaPair.second); | ||
toolsParametersTypes.emplace(toolName, parseToolSchema(toolName, *schemaPair.first)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure schemaPair is always valid at this point? Shouldn't we check it before access here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this->lazyFillInitToolParamatersTypsMap(); | ||
auto toolCallsOpt = this->streamParser.parseChunk(parsedOutput.content); | ||
if (toolCallsOpt.has_value()) { | ||
// TODO do we want to support not ending in content state? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other parser parsers do not return tool calls if they are not properly finished (in unary mode). We could stick to that for now as changing this behavior would need to be done for all parsers.
// For Qwen3 model we use hermes3 tool parser (due to the same format of generated tool calls) and qwen3 reasoning parser | ||
outputParser = std::make_unique<OutputParser>(*qwen3Tokenizer, "qwen3coder", "", toolsSchemas); | ||
} | ||
std::tuple<ov::Tensor, std::vector<int64_t>, ParsedOutput> doTheWork(const std::string& input) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we have more precise name here?
} | ||
}; | ||
TEST_F(Qwen3CoderOutputParserTest, Parse1ToolCall1Function1ArgumentTagsNewline) { | ||
std::string input = R"(io_processing/hermes3/generation_config_builder.cpp |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this hermes3 path intended here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added by mistake
EXPECT_EQ(parsedOutput.toolCalls[0].arguments, "{\"arg1\": \"<value=abc>value1</value>\"}"); | ||
EXPECT_EQ(parsedOutput.toolCalls[0].id.empty(), false); | ||
} | ||
// FIXME check if two tool calls is a vali for outputparser as well not only for parser imple |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
change to TODO?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test added.
// since unary reuses streaming we don't need to test for partial tool calls | ||
// if we don't get closing tag we don't emit tool call | ||
int i = -1; | ||
// FIXME add content in between tool_calls and test what happens |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FIXME -> TODO?
Also can we have case with more than one parameter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FIXME aready done, will add functin with 2nd parameter
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 35 out of 35 changed files in this pull request and generated 3 comments.
Comments suppressed due to low confidence (1)
src/llm/io_processing/qwen3coder/qwen3coder_tool_parser.cpp:1
- Corrected spelling of 'prametersIt' to 'parametersIt'.
//*****************************************************************************
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
fbd7ae7
to
e648478
Compare
No description provided.