[Frontend] Adding the "User Defined Custom Tool Calling" parser for the Llama models #12752
Conversation
👋 Hi! Thank you for contributing to the vLLM project. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can do one of these:
Hi @lulmer,
This pull request has merge conflicts that must be resolved before it can be merged.
Thank you for your pointers @paolovic, I hadn't noticed I needed to address the linting issues; I updated my branch accordingly, as it was simple. I rebased onto the current main branch as well.
This pull request has merge conflicts that must be resolved before it can be merged.
@paolovic Thank you, I followed the procedure you pointed out and the DCO check is now passing, but a lot of unrelated files/labels have been added to this PR. I had to resolve a merge conflict (I systematically chose the latest updates from the main branch) and add back pre-commits for
Hi @lulmer,
I made sure I synced my fork when I did the rebase (if you look at the history, Mergebot told me so on this commit). When I go to my fork in the GitHub UI, the "Sync fork" tab on the right says my fork is up to date with the latest main branch; see below.
@paolovic are there any additional steps I should take now?
@lulmer tbh: I would close this PR and create a new, clean one. Furthermore, fix the failing checks; without them it won't be merged. FYI: I cannot and won't do the code review.
@paolovic Finally, I managed to remove the changes from
@lulmer nice, super cool! By the way, could you provide an example of how to use it?
edit: I guess something like this:
```
vllm serve --model meta/llama... \
    --chat-template examples/usr_defined... \
    --enable-auto-tool-choice --tool-call-parser llama
```
That's one way to use it! So far I have developed it as an external plugin, but I would assume it can be launched this way:
```
vllm serve meta-llama/Llama-3.1-8B-Instruct \
    --enable-auto-tool-choice \
    --tool-call-parser llama3_user_defined_custom \
    --chat-template examples/tool_chat_template_llama3.1_usr_def_tool_call.jinja
```
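Once the server is up, a request through the OpenAI-compatible API should exercise the parser. A minimal client-side sketch, assuming the server runs at `http://localhost:8000/v1`; the `get_current_weather` tool, its parameters, and the prompt are illustrative, not part of this PR:

```python
# Minimal sketch; assumes `pip install openai` and a vLLM server started with
# the command above. The tool definition and prompt are illustrative only.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

tools = [{
    "type": "function",
    "function": {
        "name": "get_current_weather",  # hypothetical tool for this example
        "description": "Get the current weather in a given city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",
    messages=[{"role": "user", "content": "What is the weather in Paris?"}],
    tools=tools,
    tool_choice="auto",
)

# If the parser recognized a tool call, it is surfaced as structured data.
print(response.choices[0].message.tool_calls)
```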
This pull request has merge conflicts that must be resolved before it can be merged.
Description
The current Llama tool parsing in vLLM is based on JSON tool calling, following the procedure given by Meta. However, another tool-parsing strategy is mentioned on the same website: user-defined custom tool calling.
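Concretely, with this strategy the model wraps each call in a `<function=...>` tag instead of emitting a bare JSON object, along these lines (the function name and arguments below are illustrative):

```
<function=get_current_weather>{"city": "Paris"}</function>
```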
The gain is substantial: after testing this approach as a plugin on a private function-calling benchmark (more than 120 different scenarios tested with a set of 30 complex and lengthy fintech tool definitions), I observed significantly higher function-calling accuracy compared to the current JSON-based tool parser. I also ran some experiments on the BFCL benchmark (AST non-live bench) and observed the same kind of improvements.
This PR introduces a new `Llama3UserDefinedCustomToolParser` class that extends the `ToolParser` base class. The new parser allows for streaming support when using custom tools with the Llama models. It handles the extraction of tool calls and arguments from the model's response in streaming mode too, enabling real-time processing of tool calls. The flow looks like this:

Main Changes
- A new `Llama3UserDefinedCustomToolParser` class is added to handle streaming tool calls for Llama models.
- An example chat template is added at `examples/tool_chat_template_llama3.1_usr_def_tool_call.jinja`.
Remarks
This is my first PR on the vLLM project, and I believe there is still some stuff I need guidance on: