Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How to deal with agents not calling the appropriate tools #178

Open
roaguirre opened this issue Mar 15, 2025 · 9 comments
Open

How to deal with agents not calling the appropriate tools #178

roaguirre opened this issue Mar 15, 2025 · 9 comments
Labels
question Question about using the SDK

Comments

@roaguirre
Copy link

roaguirre commented Mar 15, 2025

Question

I'm building agents that run within Github actions (repo).

I have the problem that the agents do not always call the appropriate tools, even when the instructions request them to do so.

Here's an example:

class PRReviewResponse(BaseModel):
    """Response model for PR Review agent."""

    summary: str = Field(description="A summary of the changes in the PR")
    code_quality: str = Field(description="Assessment of code quality")
    issues: List[str] = Field(description="List of potential issues or bugs found")
    suggestions: List[str] = Field(description="List of suggestions for improvement")
    assessment: str = Field(
        description="Overall assessment (approve, request changes, comment)"
    )
    review_event: PRReviewEvent = Field(
        description="The review event type to use for the PR review"
    )

instructions = """
  You are a reviewer who helps analyze GitHub pull requests.

  Your task is to use the tools provided to review the PR files, analyze the code changes, and provide:
  1. A summary of the changes
  2. Code quality assessment
  3. Potential issues or bugs
  4. Suggestions for improvement
  5. Overall assessment (APPROVE, REQUEST_CHANGES, COMMENT)

  Always provide constructive feedback with specific examples and suggestions.

  IMPORTANT (Follow these steps in order):
  1. You MUST use the get_pull_request tool to get information about the PR.
  2. You MUST use the get_pull_request_files tool to fetch the diff of the files in the PR.
  3. You can use the get_repository_file_content tool to get more context about the files in the PR.
  4. You can use the search_code tool to search for code in the repository.
  5. You MUST call the create_pull_request_review tool to submit your review.
  """

tools = [
    get_pull_request,
    get_pull_request_files,
    get_repository_info,
    get_repository_file_content,
    search_code,
    create_pull_request_review,
]

agent = Agent(
        name="PR Review Agent",
        instructions=instructions,
        tools=tools,
        model=model,
        output_type=PRReviewResponse,
    )

There are instances where the agent only runs the create_pull_request_review tool and hallucinates the PR changes (I'm using o3-mini).

Any suggestions?

I had success by removing the specific tools from the agent, running them manually, and providing the tool response as part of the input message. Still, I'd prefer if the agent were to follow the instructions.

When the tools are wrapped with the @function_tool decorator we can only get a string response from the original function (when calling FunctionTool.on_invoke_tool), so this workaround is still odd

@roaguirre roaguirre added the question Question about using the SDK label Mar 15, 2025
@rohan-mehta
Copy link

You can use tool_choice for that. Setting it to required means the agent has to use a tool (any tool it chooses), or you can set it to a specific tool name to force it to use that tool.

@Muhammadzainattiq
Copy link

The best way is to put a much detailed doc string in the tool functions. Because that doc string is passed to the llm along with the tool and it tells llm the functionality of that tool and when to call it and also about the required arguments to call that particular tool.
Additionally, the tool calling capability also depends upon the model you are using. Some models are very good at it, and others struggle with calling tools. Here is the leaderboard:

https://gorilla.cs.berkeley.edu/leaderboard.html

So try with a different model which is good at it.

@roaguirre
Copy link
Author

You can use tool_choice for that. Setting it to required means the agent has to use a tool (any tool it chooses), or you can set it to a specific tool name to force it to use that tool.

Thank you for the suggestion. I tried it and there's the issue where it eventually gets into an infinite loop of the model always calling the create_pull_request_review tool.

Do you know how I should set it up for the example I shared?

@roaguirre
Copy link
Author

roaguirre commented Mar 16, 2025

The best way is to put a much detailed doc string in the tool functions. Because that doc string is passed to the llm along with the tool and it tells llm the functionality of that tool and when to call it and also about the required arguments to call that particular tool. Additionally, the tool calling capability also depends upon the model you are using. Some models are very good at it, and others struggle with calling tools. Here is the leaderboard:

https://gorilla.cs.berkeley.edu/leaderboard.html

So try with a different model which is good at it.

Thanks a lot for your response. I have been improving the tool documentation and it has helped greatly.
Also, the leaderboard shows me that I should switch from o3-mini to 4o-mini.

Still, it will not ensure the right tools are always called.

It would be nice to have a method in the Agent or Runner class to force a tool call.

@Muhammadzainattiq
Copy link

You can use tool_choice for that. Setting it to required means the agent has to use a tool (any tool it chooses), or you can set it to a specific tool name to force it to use that tool.

Thank you for the suggestion. I tried it and there's the issue where it eventually gets into an infinite loop of the model always calling the create_pull_request_review tool.

Do you know how I should set it up for the example I shared?

You shouldn't use tool_choice for the example mentioned because it's only suitable for the cases when you want your llm to always call some tool. But that's not your case. You want to call different tools for different prompts.

@Muhammadzainattiq
Copy link

The best way is to put a much detailed doc string in the tool functions. Because that doc string is passed to the llm along with the tool and it tells llm the functionality of that tool and when to call it and also about the required arguments to call that particular tool. Additionally, the tool calling capability also depends upon the model you are using. Some models are very good at it, and others struggle with calling tools. Here is the leaderboard:
https://gorilla.cs.berkeley.edu/leaderboard.html
So try with a different model which is good at it.

Thanks a lot for your response. I have been improving the tool documentation and it has helped greatly. Also, the leaderboard shows me that I should switch from o3-mini to 4o-mini.

Still, it will not ensure the right tools are always called.

It would be nice to have a method in the Agent or Runner class to force a tool call.

What do you meant by force a tool call? Don't you want the LLM to decide which tool to call?

@roaguirre
Copy link
Author

roaguirre commented Mar 17, 2025

The best way is to put a much detailed doc string in the tool functions. Because that doc string is passed to the llm along with the tool and it tells llm the functionality of that tool and when to call it and also about the required arguments to call that particular tool. Additionally, the tool calling capability also depends upon the model you are using. Some models are very good at it, and others struggle with calling tools. Here is the leaderboard:
https://gorilla.cs.berkeley.edu/leaderboard.html
So try with a different model which is good at it.

Thanks a lot for your response. I have been improving the tool documentation and it has helped greatly. Also, the leaderboard shows me that I should switch from o3-mini to 4o-mini.
Still, it will not ensure the right tools are always called.
It would be nice to have a method in the Agent or Runner class to force a tool call.

What do you meant by force a tool call? Don't you want the LLM to decide which tool to call?

By that, I mean feeding the tool call/response to the conversation. In my example, I'd prepend the tool response from get_pull_request_files to the conversation, since the agent will always need the files to review the PR. Not sure if it would work though since the LLM wouldn't have requested the tool call first.

@Muhammadzainattiq
Copy link

The best way is to put a much detailed doc string in the tool functions. Because that doc string is passed to the llm along with the tool and it tells llm the functionality of that tool and when to call it and also about the required arguments to call that particular tool. Additionally, the tool calling capability also depends upon the model you are using. Some models are very good at it, and others struggle with calling tools. Here is the leaderboard:
https://gorilla.cs.berkeley.edu/leaderboard.html
So try with a different model which is good at it.

Thanks a lot for your response. I have been improving the tool documentation and it has helped greatly. Also, the leaderboard shows me that I should switch from o3-mini to 4o-mini.
Still, it will not ensure the right tools are always called.
It would be nice to have a method in the Agent or Runner class to force a tool call.

What do you meant by force a tool call? Don't you want the LLM to decide which tool to call?

By that, I mean feeding the tool call/response to the conversation. In my example, I'd prepend the tool response from get_pull_request_files to the conversation, since the agent will always need the files to review the PR. Not sure if it would work though since the LLM wouldn't have requested the tool call first.

but the agent will automatically append the tool message to the conversation. You have to make sure that the tool return the response in a well formatted way understandable by the LLM

@rm-openai
Copy link
Collaborator

@roaguirre:

  1. You can use tool_choice=required to require any tool to be called, or tool_choice=<tool_name> to require a specific tool to be called.
  2. You'll probably need the changes in Introduce tool_use_behavior on agents #203 to ensure the call doesn't infinite loop

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Question about using the SDK
Projects
None yet
Development

No branches or pull requests

4 participants