Tool usage support in tokenizers for Agentic RL #2821

Open
August-murr opened this issue Feb 10, 2025 · 1 comment
Labels
✨ enhancement New feature or request 🏋 GRPO Related to GRPO

Comments

@August-murr
Collaborator

As explained by @Rocketknight1 in "Tool Use, Unified", all newer models were expected to ship with tool use supported by their tokenizer, using XML tags like <tool>, but that's not the case.

Qwen's chat template does support it:

<|im_start|>system
You are a helpful assistant.

# Tools

You may call one or more functions to assist with the user query.

You are provided with function signatures within <tools></tools> XML tags:
<tools>
{"type": "function", "function": {"name": "get_current_temperature", "description": "Gets the temperature at a given location.", "parameters": {"type": "object", "properties": {"location": {"type": "string", "description": "The location to get the temperature for"}}, "required": ["location"]}}}
</tools>

For each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:
<tool_call>
{"name": <function-name>, "arguments": <args-json-object>}
</tool_call><|im_end|>
<|im_start|>user
What is the current temperature in London?<|im_end|>
<|im_start|>assistant
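
For reference, here is a minimal sketch of how a completion in this format could be parsed on the training side. It assumes the model replies exactly as the prompt above asks (a JSON object with `name` and `arguments` inside `<tool_call>` tags); `parse_tool_calls` is a hypothetical helper, not part of any library.

```python
import json
import re

# Matches one <tool_call>...</tool_call> block; DOTALL lets the JSON span lines.
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)


def parse_tool_calls(completion: str) -> list[dict]:
    """Return the well-formed tool calls found in a model completion.

    Hypothetical helper: assumes the <tool_call> format from the prompt above.
    """
    calls = []
    for payload in TOOL_CALL_RE.findall(completion):
        try:
            call = json.loads(payload)
        except json.JSONDecodeError:
            continue  # skip blocks whose body is not valid JSON
        # Keep only objects shaped like {"name": ..., "arguments": {...}}.
        if (
            isinstance(call, dict)
            and "name" in call
            and isinstance(call.get("arguments"), dict)
        ):
            calls.append(call)
    return calls


completion = (
    "<tool_call>\n"
    '{"name": "get_current_temperature", "arguments": {"location": "London"}}\n'
    "</tool_call>"
)
print(parse_tool_calls(completion))
```

Malformed blocks are skipped rather than raised on, which seems like the safer default inside a reward function, but that is a design choice.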

DeepSeek-R1's chat template does not:

<beginofsentence><User>What's the current temperature in London?<Assistant><think>
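
One rough way to detect this case up front is to check whether the template source references a `tools` variable at all. The sketch below is only a heuristic: `template_mentions_tools` is a hypothetical helper, and the two template strings are made-up miniatures rather than real model templates. With transformers, the string to inspect would presumably come from `tokenizer.chat_template`, assuming it is a plain Jinja string.

```python
import re

# Heuristic: a Jinja chat template that handles tools must reference the
# `tools` variable inside a {{ ... }} expression or a {% ... %} statement.
JINJA_TOOLS_RE = re.compile(r"\{[{%][^}]*\btools\b[^}]*[%}]\}")


def template_mentions_tools(chat_template: str) -> bool:
    """Return True if the template source appears to use a `tools` variable."""
    return bool(JINJA_TOOLS_RE.search(chat_template))


# Made-up miniature templates, for illustration only.
with_tools = (
    "{%- if tools %}<tools>{{ tools | tojson }}</tools>{%- endif %}"
    "{% for m in messages %}{{ m['content'] }}{% endfor %}"
)
without_tools = "{% for m in messages %}{{ m['content'] }}{% endfor %}"

print(template_mentions_tools(with_tools))
print(template_mentions_tools(without_tools))
```

This only says whether the template looks at `tools`; it says nothing about whether the model was trained to call them.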

Some other models, such as Llama, do include the JSON schemas, but their prompts are misleading and don't instruct the model to use the XML tags:

<|begin_of_text|><|start_header_id|>system<|end_header_id|>

Environment: ipython
Cutting Knowledge Date: December 2023
Today Date: 26 Jul 2024

<|eot_id|><|start_header_id|>user<|end_header_id|>

Given the following functions, please respond with a JSON for a function call with its proper arguments that best answers the given prompt.

Respond in the format {"name": function name, "parameters": dictionary of argument name and its value}.Do not use variables.

{
    "type": "function",
    "function": {
        "name": "get_current_temperature",
        "description": "Gets the temperature at a given location.",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "The location to get the temperature for"
                }
            },
            "required": [
                "location"
            ]
        }
    }
}

What's the current temperature in London?<|eot_id|><|start_header_id|>assistant<|end_header_id|>
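
To make the mismatch concrete: the Llama prompt above asks for a bare JSON object with a `parameters` key, while the Qwen-style prompt asks for `arguments` inside `<tool_call>` tags. A minimal sketch of a normalizer that accepts either style could look like this; `normalize_tool_call` is a hypothetical helper, and it assumes a completion holds at most one call.

```python
import json
import re

TOOL_CALL_RE = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)


def normalize_tool_call(completion: str):
    """Return {"name", "arguments"} for either output style, or None.

    Hypothetical helper: handles one call per completion.
    """
    match = TOOL_CALL_RE.search(completion)
    # XML-tagged style: parse the tag body. Otherwise treat the whole
    # completion as a bare JSON object.
    payload = match.group(1) if match else completion.strip()
    try:
        call = json.loads(payload)
    except json.JSONDecodeError:
        return None
    if not isinstance(call, dict) or "name" not in call:
        return None
    # The XML-style prompt asks for "arguments"; the bare-JSON prompt asks
    # for "parameters". Map both onto "arguments".
    arguments = call.get("arguments", call.get("parameters"))
    if not isinstance(arguments, dict):
        return None
    return {"name": call["name"], "arguments": arguments}


bare_json = '{"name": "get_current_temperature", "parameters": {"location": "London"}}'
xml_tagged = (
    "<tool_call>\n"
    '{"name": "get_current_temperature", "arguments": {"location": "London"}}\n'
    "</tool_call>"
)
print(normalize_tool_call(bare_json))
print(normalize_tool_call(xml_tagged))
```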

This inconsistency could complicate agent training, and especially the use of the trained agents afterward.

It is essential that:

  1. The tokenizer includes the JSON schemas of the tools in the system prompt, even when the chat template does not support tool use.
  2. The system prompt clearly specifies how to use XML tags to call functions.
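
A minimal sketch of the two points above, assuming the fallback simply reuses the wording of the Qwen prompt shown earlier. `build_tool_system_prompt` is a hypothetical helper, not an existing tokenizer or TRL function; the resulting message list would then go through the model's ordinary chat template without a `tools` argument.

```python
import json


def build_tool_system_prompt(base_prompt: str, tools: list[dict]) -> str:
    """Append tool JSON schemas and XML calling instructions to a system prompt.

    Hypothetical helper: the instruction text mirrors the Qwen prompt above.
    """
    # One JSON schema per line, as in the Qwen-rendered prompt.
    schemas = "\n".join(json.dumps(tool) for tool in tools)
    return (
        f"{base_prompt}\n\n"
        "# Tools\n\n"
        "You may call one or more functions to assist with the user query.\n\n"
        "You are provided with function signatures within <tools></tools> XML tags:\n"
        f"<tools>\n{schemas}\n</tools>\n\n"
        "For each function call, return a json object with function name and "
        "arguments within <tool_call></tool_call> XML tags:\n"
        "<tool_call>\n"
        '{"name": <function-name>, "arguments": <args-json-object>}\n'
        "</tool_call>"
    )


get_current_temperature = {
    "type": "function",
    "function": {
        "name": "get_current_temperature",
        "description": "Gets the temperature at a given location.",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "The location to get the temperature for",
                }
            },
            "required": ["location"],
        },
    },
}

system_prompt = build_tool_system_prompt(
    "You are a helpful assistant.", [get_current_temperature]
)
messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": "What is the current temperature in London?"},
]
print(system_prompt)
```

Whether a model that was never trained on this format will follow it reliably is a separate question.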

Any ideas on how to implement this?

@github-actions github-actions bot added ✨ enhancement New feature or request 🏋 SFT Related to SFT labels Feb 10, 2025
@Rocketknight1
Member

Not all models are expected to support tool use! When they do support tool use, we encourage support for that in their chat template, but I'm not sure if models like DeepSeek-R1 are trained to use tools.

cc @aymeric-roucher for agentic workflows, though!

@August-murr August-murr added 🏋 GRPO Related to GRPO and removed 🏋 SFT Related to SFT labels Feb 11, 2025