Skip to content

Conversation

atobiszei
Copy link
Collaborator

No description provided.

@atobiszei atobiszei added the WIP Do not merge until resolved label Sep 22, 2025
@atobiszei atobiszei force-pushed the atobisze_qwen3_tool_parser branch from 18b9c2c to f9b33d7 Compare September 22, 2025 08:59
@atobiszei atobiszei requested a review from Copilot September 22, 2025 10:43
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR implements a Qwen3Coder unary output parser to handle tool calls in the Qwen3Coder model format. The parser supports a unique XML-style format for function calls with parameters.

Key changes:

  • Adds a new Qwen3Coder tool parser with XML-style tag parsing
  • Updates all existing parsers to accept tool schemas as a parameter
  • Modifies the parser interface to include tool schema information

Reviewed Changes

Copilot reviewed 27 out of 27 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
src/llm/io_processing/qwen3coder/qwen3coder_tool_parser.hpp Header for new Qwen3Coder tool parser with XML-style tag definitions
src/llm/io_processing/qwen3coder/qwen3coder_tool_parser.cpp Implementation of Qwen3Coder parser with state machine for XML tag parsing
src/test/llm/output_parsers/qwen3coder_output_parser_test.cpp Comprehensive test suite for the new parser functionality
src/llm/io_processing/base_output_parser.hpp Updates parser interface to include tool schemas parameter
src/llm/io_processing/output_parser.cpp Registers new Qwen3Coder parser and passes tool schemas to parsers
Multiple parser files Updates existing parsers to accept tool schemas parameter
Multiple test files Updates test calls to include tool schemas parameter
Comments suppressed due to low confidence (2)

src/llm/io_processing/qwen3coder/qwen3coder_tool_parser.cpp:1

  • Commented-out member variable should be removed if not needed, or implemented if it serves a purpose.
//*****************************************************************************

src/llm/BUILD:144

  • Removed line that may be needed for proper Python compilation options. Verify this removal doesn't break Python integration.
    additional_copts = COPTS_PYTHON

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

@atobiszei atobiszei force-pushed the atobisze_qwen3_tool_parser branch from bba2dd1 to e095f8b Compare September 29, 2025 13:23
src/llm/BUILD Outdated
"io_processing/gptoss/reasoning_parser.hpp",
"io_processing/gptoss/tool_parser.hpp",
"io_processing/gptoss/harmony.hpp",
"io_processing/qwen3coder/qwen3coder_tool_parser.hpp",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This naming breaks the convention. File is in "qwen3coder" catalog, so adding "qwen3coder_" prefix seems redundant.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's for logging purposes -> in spdlog we have filename butif we have tool parser inside qwen3coder dir it does not show there. I would rename all tool parsers this way for faster debugging.

choice.AddMember("logprobs", Value(), allocator);
if (endpoint == Endpoint::CHAT_COMPLETIONS) {
if (outputParser != nullptr) {
// FIXME need tool maps for streaming
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it still relevant?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure if github issue, but I still see that FIXME here

std::string id;
std::string name;
std::string arguments;
std::string arguments; // JSON "{"a":1, "b":"SOME_STRING"}" TODO rename to know in context that's JSON
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding the "TODO" I would prefer to keep that name as it maps exactly to OpenAI response field we need to fill.
Maybe we could have a comment explaining that ToolCall struct is supposed to mirror tool call structure in OpenAI API.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok i see your point. It was confusing for me that it was JSON wrapped as stirng in arguments field initially, so I would at least keep the first part of the comment.

Comment on lines 325 to 340
/*
{"type":"response.output_item.added","response_id":"resp_1234xyz","output_index":0,"item":{"type":"function_call","id":"fc_1234xyz","call_id":"call_1234xyz","name":"get_weather","arguments":""}}
{"type":"response.function_call_arguments.delta","response_id":"resp_1234xyz","item_id":"fc_1234xyz","output_index":0,"delta":"{\""}
{"type":"response.function_call_arguments.delta","response_id":"resp_1234xyz","item_id":"fc_1234xyz","output_index":0,"delta":"location"}
{"type":"response.function_call_arguments.delta","response_id":"resp_1234xyz","item_id":"fc_1234xyz","output_index":0,"delta":"\":\""}
{"type":"response.function_call_arguments.delta","response_id":"resp_1234xyz","item_id":"fc_1234xyz","output_index":0,"delta":"Paris"}
{"type":"response.function_call_arguments.delta","response_id":"resp_1234xyz","item_id":"fc_1234xyz","output_index":0,"delta":","}
{"type":"response.function_call_arguments.delta","response_id":"resp_1234xyz","item_id":"fc_1234xyz","output_index":0,"delta":" France"}
{"type":"response.function_call_arguments.delta","response_id":"resp_1234xyz","item_id":"fc_1234xyz","output_index":0,"delta":"\"}"}
{"type":"response.function_call_arguments.done","response_id":"resp_1234xyz","item_id":"fc_1234xyz","output_index":0,"arguments":"{\"location\":\"Paris, France\"}"}
{"type":"response.output_item.done","response_id":"resp_1234xyz","output_index":0,"item":{"type":"function_call","id":"fc_1234xyz","call_id":"call_1234xyz","name":"get_weather","arguments":"{\"location\":\"Paris, France\"}"}}
*/
// example1 {"location":"San Francisco"}
// example1 {"city":"San Francisco","state":"CA", "length":5, "is_day":true, "temperatures":[5,6,7], "details":{"humidity":80,"condition":"sunny"}}]}
// index is toolCallId
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it leftover? Looks like Response API and shows deltas that your parser will not return (arguments broken into smaller pieces)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kept this initially to underline the difference in qwen3coder parser that we will not do that incremental sending, but it may be confusing so I will remove it.

Comment on lines 344 to 350
static std::string documentToString(const rapidjson::Document& doc) {
rapidjson::StringBuffer buffer;
rapidjson::Writer<rapidjson::StringBuffer> writer(buffer);
doc.Accept(writer);
return buffer.GetString();
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be moved to some common utilities place

std::optional<rapidjson::Document> Qwen3CoderToolParser::sendFullDelta(std::optional<ToolCalls>& toolCallsOpt) {
auto& toolCalls = toolCallsOpt.value();
if (toolCalls.size() != 1) {
SPDLOG_ERROR("For streaming we expected one tool call, got: {}", toolCalls.size());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logging as error and if we want to treat it as error then maybe we should return nullopt here or throw?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldnt crash ovms if we have error in tool parser as this is potential attack surface. I want this in error level so it will be easier potentially to track such occurences. Or we may consider returning error in all parsers but this will be bigger overhaul.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure throwing here crashes OVMS? I would say it will be some generic "Response generation error" on the client side only.


std::optional<rapidjson::Document> Qwen3CoderToolParser::sendFirstDeltaIfNeeded(const std::string& toolCallName) {
if (this->returnedFirstDeltas.size() != this->returnedCompleteDeltas.size()) {
SPDLOG_TRACE("Skipping first delta, already sent for current function, fi:{} co:{}", returnedFirstDeltas.size(), returnedCompleteDeltas.size());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fi:{} co:{} part is not understandable without looking into the code.

}

std::optional<rapidjson::Document> Qwen3CoderToolParser::sendFirstDeltaIfNeeded(const std::string& toolCallName) {
if (this->returnedFirstDeltas.size() != this->returnedCompleteDeltas.size()) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could it be this->returnedFirstDeltas.size() > this->returnedCompleteDeltas.size() ?
That would better indicate the connection between first deltas and complete deltas (like we always have either one more first delta returned when we already returned it for current function or the number is equal when we still wait for the first delta for current function). Is that right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i will change it to comparison with returnedFirst == (returnedComplete + 1)

TODO: Extract param type from request as discovered by bfcl that
otherwise we can't be sure eg with string ids
Accuracy on BFCL simple multiple as in unary
BFCL:
parallel_multiple. 🎯 Accuracy: 0.83
simple. 🎯 Accuracy: 0.9575
multiple. 🎯 Accuracy: 0.935
@atobiszei atobiszei force-pushed the atobisze_qwen3_tool_parser branch from 6aebb03 to 55e4cc0 Compare October 2, 2025 09:19
return toolsStartTag;
}
const std::unordered_set<std::string>& getSpecialParsingStartTags() const override {
static const std::unordered_set<std::string> specialParsingStartTags = {toolsStartTag};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably be empty. If qwen3 coder has only one way (one tag) to start tool parsing, it should not be repeated in special tags.

}
// Tools calls are expected to be the last part of the content, so we do not specify an end tag.
const std::string& getParsingEndTag() const override {
return toolsEndTag;
Copy link
Collaborator

@mzegla mzegla Oct 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should return empty string here as well. Your end tag is also end tag for a single tool call and in that context is parsing end tag - meaning outer parser will switch phase (get out of tool parsing mode) after seeing the string you return here. It works, because you get opening tag right after it when you have multiple tool calls, so outer parser will get back to tool calls processing again, but nevertheless we should change that.

getParsingEndTag() should return tag that indicates end of parser work (like </think> in reasoning parser), not a separator between tool calls. If there is no definite tag that indicates "no more tool calls", we leave it empty and assume all remaining output is tool calls.

@atobiszei atobiszei removed the WIP Do not merge until resolved label Oct 2, 2025
@atobiszei atobiszei force-pushed the atobisze_qwen3_tool_parser branch from dffd55e to 1023820 Compare October 3, 2025 13:29
@atobiszei atobiszei requested a review from dtrawins October 3, 2025 13:57
parametersIt->value.Accept(writer);
std::string parametersStr = buffer.GetString();
request.toolNameSchemaMap[nameIt->value.GetString()] = parametersStr;
std::pair<rapidjson::Value*, std::string> schemaReprs = {&parametersIt->value, std::move(parametersStr)};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it might be clearer and more extensible if we have it in a struct like:

struct ToolSchema:
	rapidjson::Value& rapidjsonRepresentation;
	std::string stringRepresentation;
	...
	(nlohmann::json nlohmannRepresentation) etc.

choice.AddMember("logprobs", Value(), allocator);
if (endpoint == Endpoint::CHAT_COMPLETIONS) {
if (outputParser != nullptr) {
// FIXME need tool maps for streaming
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure if github issue, but I still see that FIXME here


bool reasoningParserExistsAndSupportsStreaming = reasoningParser && !reasoningParser->getParsingStartTag().empty() && !reasoningParser->getParsingEndTag().empty();
bool toolParserExistsAndSupportsStreaming = toolParser && !toolParser->getParsingStartTag().empty();
bool toolParserExistsAndSupportsStreaming = toolParser && !toolParser->getParsingStartTag().empty(); // FIXME why not check for parsingEntTag not empty?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please remove fixme comment if it is clear

ToolsParameterTypeMap_t toolsParametersTypes;
for (const auto& [toolName, schemaPair] : toolsSchemas) {
SPDLOG_TRACE("Creating tools parameters types for tool: {}, schema: {}", toolName, schemaPair.second);
toolsParametersTypes.emplace(toolName, parseToolSchema(toolName, *schemaPair.first));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure schemaPair is always valid at this point? Shouldn't we check it before access here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this->lazyFillInitToolParamatersTypsMap();
auto toolCallsOpt = this->streamParser.parseChunk(parsedOutput.content);
if (toolCallsOpt.has_value()) {
// TODO do we want to support not ending in content state?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other parser parsers do not return tool calls if they are not properly finished (in unary mode). We could stick to that for now as changing this behavior would need to be done for all parsers.

// For Qwen3 model we use hermes3 tool parser (due to the same format of generated tool calls) and qwen3 reasoning parser
outputParser = std::make_unique<OutputParser>(*qwen3Tokenizer, "qwen3coder", "", toolsSchemas);
}
std::tuple<ov::Tensor, std::vector<int64_t>, ParsedOutput> doTheWork(const std::string& input) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we have more precise name here?

}
};
TEST_F(Qwen3CoderOutputParserTest, Parse1ToolCall1Function1ArgumentTagsNewline) {
std::string input = R"(io_processing/hermes3/generation_config_builder.cpp
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this hermes3 path intended here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added by mistake

EXPECT_EQ(parsedOutput.toolCalls[0].arguments, "{\"arg1\": \"<value=abc>value1</value>\"}");
EXPECT_EQ(parsedOutput.toolCalls[0].id.empty(), false);
}
// FIXME check if two tool calls is a vali for outputparser as well not only for parser imple
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

change to TODO?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test added.

// since unary reuses streaming we don't need to test for partial tool calls
// if we don't get closing tag we don't emit tool call
int i = -1;
// FIXME add content in between tool_calls and test what happens
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FIXME -> TODO?
Also can we have case with more than one parameter?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FIXME aready done, will add functin with 2nd parameter

@mzegla mzegla requested a review from Copilot October 6, 2025 10:09
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 35 out of 35 changed files in this pull request and generated 3 comments.

Comments suppressed due to low confidence (1)

src/llm/io_processing/qwen3coder/qwen3coder_tool_parser.cpp:1

  • Corrected spelling of 'prametersIt' to 'parametersIt'.
//*****************************************************************************

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

@atobiszei atobiszei force-pushed the atobisze_qwen3_tool_parser branch from fbd7ae7 to e648478 Compare October 6, 2025 14:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants