Conversation

@vblagoje
Member

@vblagoje vblagoje commented Oct 16, 2025

Why:

Enhances tool flexibility across Haystack components by enabling mixed lists of Tool and Toolset objects, allowing developers to organize tools into logical groupings while maintaining the ability to include standalone tools all within the same tools parameter.

This change is fully backward compatible: existing code continues to work unchanged.

What:

  • Added ToolsType type alias (Union[list[Union[Tool, Toolset]], Toolset]) to standardize tool parameter types
  • Created flatten_tools_or_toolsets() utility function for consistent tool flattening across components
  • Enhanced serialize_tools_or_toolset() and deserialize_tools_or_toolset_inplace() to preserve Tool/Toolset boundaries during serialization
  • Added _ToolSchemaPlaceholder and _ToolsetSchemaPlaceholder to enable JSON schema generation for non-serializable Tool/Toolset types
  • Updated all tool-accepting components: Agent, ToolInvoker, OpenAIChatGenerator, AzureOpenAIChatGenerator, HuggingFaceAPIChatGenerator, HuggingFaceLocalChatGenerator
  • Added comprehensive test coverage for mixed tool configurations, serialization round-trips, and edge cases
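
To make the flattening step concrete, here is a minimal sketch of how a mixed list could be reduced to a flat list of tools. It is not the code from this PR: `Tool` and `Toolset` are hypothetical stand-ins for the Haystack classes, and `flatten_mixed_tools` is an illustrative name rather than the real `flatten_tools_or_toolsets()`.

```python
from dataclasses import dataclass, field
from typing import Optional, Union


@dataclass
class Tool:
    """Hypothetical stand-in for haystack.tools.Tool (only a name)."""

    name: str


@dataclass
class Toolset:
    """Hypothetical stand-in for haystack.tools.Toolset."""

    tools: list[Tool] = field(default_factory=list)

    def __iter__(self):
        return iter(self.tools)


ToolsType = Union[list[Union[Tool, Toolset]], Toolset]


def flatten_mixed_tools(tools: Optional[ToolsType]) -> list[Tool]:
    """Flatten a Toolset or a mixed list of Tool/Toolset into a flat list of Tool."""
    if tools is None:
        return []
    if isinstance(tools, Toolset):
        return list(tools)
    if isinstance(tools, list):
        flat: list[Tool] = []
        for item in tools:
            if isinstance(item, Toolset):
                flat.extend(item)  # expand the group in place, keeping order
            elif isinstance(item, Tool):
                flat.append(item)
            else:
                raise TypeError("Items in the tools list must be Tool or Toolset instances")
        return flat
    raise TypeError("tools must be Toolset, list[Union[Tool, Toolset]], or None")


math_toolset = Toolset([Tool("add"), Tool("multiply")])
flat = flatten_mixed_tools([math_toolset, Tool("calendar")])
print([t.name for t in flat])  # ['add', 'multiply', 'calendar']
```

The sketch keeps the input order, so tools from a toolset appear where the toolset was listed.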

How can it be used:

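from haystack import Pipeline
from haystack.components.agents import Agent
from haystack.components.tools import ToolInvoker
from haystack.tools import Tool, Toolset

# The individual tools, toolsets, and `generator` used below are assumed to be defined elsewhere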

# Organize tools into logical groups
math_toolset = Toolset([add_tool, multiply_tool])
weather_toolset = Toolset([weather_tool, forecast_tool])

# Mix toolsets with standalone tools
agent = Agent(
    chat_generator=generator,
    tools=[math_toolset, weather_toolset, calendar_tool]  # ✨ Now supported!
)

# Works with all tool-accepting components
tool_invoker = ToolInvoker(tools=[toolset1, toolset2, standalone_tool])

# Serialization preserves structure
agent_dict = agent.to_dict()
restored_agent = Agent.from_dict(agent_dict)  # Perfect round-trip
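
As an illustration of what "preserves structure" means here, the following sketch serializes a mixed list entry by entry instead of flattening it first, so each toolset comes back as a toolset. Everything in it is an assumption made for the example: the stand-in classes, the function names, and in particular the `"kind"` discriminator key, which is not Haystack's actual serialization format.

```python
from dataclasses import dataclass, field
from typing import Any, Union


@dataclass
class Tool:
    """Hypothetical stand-in for haystack.tools.Tool."""

    name: str

    def to_dict(self) -> dict[str, Any]:
        return {"kind": "tool", "name": self.name}


@dataclass
class Toolset:
    """Hypothetical stand-in for haystack.tools.Toolset."""

    tools: list[Tool] = field(default_factory=list)

    def to_dict(self) -> dict[str, Any]:
        return {"kind": "toolset", "tools": [t.to_dict() for t in self.tools]}


def serialize_mixed(tools: list[Union[Tool, Toolset]]) -> list[dict[str, Any]]:
    """Serialize each entry on its own so Tool/Toolset boundaries survive."""
    return [item.to_dict() for item in tools]


def deserialize_mixed(data: list[dict[str, Any]]) -> list[Union[Tool, Toolset]]:
    """Rebuild the original mixed list from its serialized form."""
    restored: list[Union[Tool, Toolset]] = []
    for entry in data:
        if entry["kind"] == "toolset":
            restored.append(Toolset([Tool(t["name"]) for t in entry["tools"]]))
        elif entry["kind"] == "tool":
            restored.append(Tool(entry["name"]))
        else:
            raise ValueError(f"Unknown entry kind: {entry['kind']!r}")
    return restored


original = [Toolset([Tool("add"), Tool("multiply")]), Tool("calendar")]
assert deserialize_mixed(serialize_mixed(original)) == original
```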

How did you test it:

  • Added unit tests in test_agent.py, test_openai.py, test_serde_utils.py, and test_tools_utils.py
  • Tested serialization/deserialization round-trips with mixed Tool/Toolset configurations
  • Verified backward compatibility with existing code (all existing tests pass)
  • Validated error handling for invalid types and edge cases (empty toolsets, None values)
  • Confirmed all generators properly flatten tools before provider-specific conversion

Notes for the reviewer:

Please review the flatten_tools_or_toolsets() function in haystack/tools/utils.py: it is the core utility that enables the feature and is used consistently across all components. Also note that the serialization logic in serde_utils.py preserves the original list structure (it does not automatically flatten) so that serialization round-trips exactly. The _ToolSchemaPlaceholder types in parameters_schema_utils.py solve the JSON schema generation issue for callable-containing types.

@vblagoje vblagoje added the ignore-for-release-notes label (PRs with this flag won't be included in the release notes) Oct 16, 2025
@coveralls
Collaborator

coveralls commented Oct 16, 2025

Pull Request Test Coverage Report for Build 18595249452

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • 69 unchanged lines in 10 files lost coverage.
  • Overall coverage increased (+0.02%) to 92.22%

Files with Coverage Reduction                       New Missed Lines    %
tools/tool.py                                       1                   96.47%
tools/serde_utils.py                                3                   93.02%
components/generators/chat/azure.py                 4                   93.55%
tools/parameters_schema_utils.py                    4                   95.4%
components/generators/chat/openai.py                5                   97.27%
components/generators/chat/hugging_face_api.py      7                   96.39%
components/tools/tool_invoker.py                    7                   95.9%
components/agents/agent.py                          9                   96.4%
core/pipeline/breakpoint.py                         13                  89.94%
components/generators/chat/hugging_face_local.py    16                  85.86%

Totals
Change from base Build 18594671768: 0.02%
Covered Lines: 13382
Relevant Lines: 14511

💛 - Coveralls

@vblagoje
Member Author

@sjrl @julian-risch @anakin87 I'm still double-checking everything, but the transition to the new signature seems to work well except in one remote edge case:

Two tests failed before adding ffa6436:

  • test_enable_streaming_callback_passthrough
  • test_enable_streaming_callback_passthrough_runtime

Both tests were creating a ComponentTool from OpenAIChatGenerator. So when ComponentTool tried to generate the JSON schema for OpenAIChatGenerator's run method parameters, it hit the tools parameter with the new signature:

  • Optional[Union[list[Union[Tool, Toolset]], Toolset]] and generated schema:
"tools": {
      "anyOf": [
        { "type": "array", "items": { "anyOf": [] } },
        { "type": "null" }
      ],
      "default": null,
      "description": "A list of tools or a Toolset the model can use. Overrides the `tools` parameter from initialization.\nAccepts either a list of `Tool` objects or a `Toolset` instance."
    },

instead of the expected

"tools": {
      "type": "null",
      "default": null,
      "description": "A list of tools or a Toolset the model can use. Overrides the `tools` parameter from initialization.\nAccepts either a list of `Tool` objects or a `Toolset` instance."
    }

for the old signature. Which triggered this failure:

@classmethod
    def check_schema(cls, schema, format_checker=_UNSET):
        Validator = validator_for(cls.META_SCHEMA, default=cls)
        if format_checker is _UNSET:
            format_checker = Validator.FORMAT_CHECKER
        validator = Validator(
            schema=cls.META_SCHEMA,
            format_checker=format_checker,
        )
        for error in validator.iter_errors(schema):
>           raise exceptions.SchemaError.create_from(error)
E           jsonschema.exceptions.SchemaError: [] should be non-empty
E           
E           Failed validating 'minItems' in metaschema['allOf'][1]['properties']['properties']['additionalProperties']['$dynamicRef']['allOf'][1]['properties']['anyOf']['items']['$dynamicRef']['allOf'][1]['properties']['items']['$dynamicRef']['allOf'][1]['properties']['anyOf']:
E               {'type': 'array', 'minItems': 1, 'items': {'$dynamicRef': '#meta'}}
E           
E           On schema['properties']['tools']['anyOf'][0]['items']['anyOf']:
E               []

Full failure trace available here
I think @sjrl would know best how important this is.
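
The failure above comes down to an empty `anyOf` array inside the generated schema. As a rough illustration, the following standard-library sketch walks a schema dictionary and reports where such empty arrays occur. The helper name `find_empty_any_of` is made up for this example and is not part of Haystack or jsonschema.

```python
from typing import Any


def find_empty_any_of(schema: Any, path: str = "schema") -> list[str]:
    """Return the path of every empty `anyOf` array in a JSON-schema-like structure.

    Note: this is a plain structural walk, so a property that is literally
    named "anyOf" would also be inspected.
    """
    found: list[str] = []
    if isinstance(schema, dict):
        for key, value in schema.items():
            child = f"{path}[{key!r}]"
            if key == "anyOf" and value == []:
                found.append(child)
            found.extend(find_empty_any_of(value, child))
    elif isinstance(schema, list):
        for index, value in enumerate(schema):
            found.extend(find_empty_any_of(value, f"{path}[{index}]"))
    return found


# The schema fragment reported in the comment above, wrapped in "properties"
broken = {
    "properties": {
        "tools": {
            "anyOf": [
                {"type": "array", "items": {"anyOf": []}},
                {"type": "null"},
            ],
            "default": None,
        }
    }
}

print(find_empty_any_of(broken))
# ["schema['properties']['tools']['anyOf'][0]['items']['anyOf']"]
```

The reported path matches the location named in the SchemaError message.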

Comment on lines 280 to 282
# Parameters that should be excluded from the schema generation
# These are typically configuration parameters that aren't meant to be provided by LLMs
excluded_params = {"tools", "tools_strict"}
Contributor

I don't fully understand the error that's caused when switching to the new tools type, but I'm not convinced this is the right solution.

I'd rather get the parameter schema generation to work properly for tools, such that it is valid from OpenAI's perspective.

Member Author

No no, I agree; it was just to isolate the error and pinpoint the cause. The issue is that with the new signature, when Pydantic generates the schema, it descends into the nested Union and ends up producing an empty anyOf array, while previously it produced:

"tools": {
      "type": "null",
      "default": null,
      "description": "A list of tools or a Toolset the model can use. Overrides the `tools` parameter from initialization.\nAccepts either a list of `Tool` objects or a `Toolset` instance."
    }

which was valid.

But also, think about it: the tools parameter configures the component's internal behavior; it is not an input argument the LLM is supposed to supply! This is a kind of recursion into tools that doesn't make much sense.
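
For readers following this thread, the exclusion idea under discussion (used here only to isolate the error, not as the agreed fix) can be sketched with the standard library. The helper `llm_visible_parameters` and the example `run` function are hypothetical; only the `excluded_params` set mirrors the snippet under review.

```python
import inspect
from typing import Any, Callable, Optional

# Mirrors the set from the snippet under review
EXCLUDED_PARAMS = {"tools", "tools_strict"}


def llm_visible_parameters(func: Callable[..., Any]) -> list[str]:
    """List the parameter names that would be exposed in a generated schema."""
    return [
        name
        for name in inspect.signature(func).parameters
        if name != "self" and name not in EXCLUDED_PARAMS
    ]


# Hypothetical stand-in for a component's run method
def run(
    messages: list[str],
    generation_kwargs: Optional[dict[str, Any]] = None,
    tools: Optional[list[Any]] = None,
    tools_strict: Optional[bool] = None,
) -> dict[str, Any]:
    return {}


print(llm_visible_parameters(run))  # ['messages', 'generation_kwargs']
```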

@vblagoje
Member Author

I think this solution should work now. The outcome is that tools can be translated into an OpenAI API schema:

"tools": {
      "anyOf": [
        {
          "type": "array",
          "items": {
            "anyOf": [
              { "$ref": "#/$defs/_ToolSchemaPlaceholder" },
              { "$ref": "#/$defs/_ToolsetSchemaPlaceholder" }
            ]
          }
        },
        { "$ref": "#/$defs/_ToolsetSchemaPlaceholder" },
        { "type": "null" }
      ],
      "default": null,
      "description": "A list of tools or a Toolset for which the model can prepare calls. If set, it will override the `tools` parameter set during component initialization. This parameter can accept either a list of `Tool` objects or a `Toolset` instance."
    },

which is acceptable to Draft202012Validator.check_schema()
LMK your thoughts @sjrl @julian-risch @anakin87 @tstadel
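
To show how placeholder references keep the nested `anyOf` non-empty, here is a small standard-library sketch that maps the `tools` annotation to a schema of the shape shown above. It does not use Pydantic and is not the implementation in parameters_schema_utils.py. `Tool` and `Toolset` are empty stand-in classes, and `schema_for` is a name invented for the example.

```python
from typing import Any, Optional, Union, get_args, get_origin


class Tool:
    """Hypothetical stand-in for haystack.tools.Tool."""


class Toolset:
    """Hypothetical stand-in for haystack.tools.Toolset."""


# Each non-serializable type is replaced by a reference to a placeholder definition
PLACEHOLDER_REFS = {
    Tool: "#/$defs/_ToolSchemaPlaceholder",
    Toolset: "#/$defs/_ToolsetSchemaPlaceholder",
}


def schema_for(annotation: Any) -> dict[str, Any]:
    """Translate a (very small) subset of type annotations into JSON schema."""
    if annotation is type(None):
        return {"type": "null"}
    if annotation in (Tool, Toolset):
        return {"$ref": PLACEHOLDER_REFS[annotation]}
    origin = get_origin(annotation)
    if origin is Union:
        return {"anyOf": [schema_for(arg) for arg in get_args(annotation)]}
    if origin is list:
        (item_type,) = get_args(annotation)
        return {"type": "array", "items": schema_for(item_type)}
    raise TypeError(f"Unsupported annotation: {annotation!r}")


tools_annotation = Optional[Union[list[Union[Tool, Toolset]], Toolset]]
tools_schema = schema_for(tools_annotation)
print(tools_schema["anyOf"][0]["items"])
```

Because every branch of the union resolves to either a `$ref` or a concrete type, no `anyOf` array in the result is empty.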

@anakin87
Member

Very minor contribution, but I suggest replacing Union[list[Union[Tool, Toolset]], Toolset] with a type alias.
Something like ToolsType = Union[list[Union[Tool, Toolset]], Toolset].

@vblagoje
Member Author

> Very minor contribution, but I suggest replacing Union[list[Union[Tool, Toolset]], Toolset] with a type alias. Something like ToolsType = Union[list[Union[Tool, Toolset]], Toolset].

Very minor but brilliant - would simplify things quite a lot. Where would we put the definition of this alias?
How about haystack/tools/__init__.py and export it right there so it can be used throughout?
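
A minimal sketch of the suggested alias and how it shortens a signature, assuming stand-in `Tool` and `Toolset` classes and a hypothetical `run` function (the real definitions live in Haystack):

```python
from typing import Optional, Union, get_type_hints


class Tool:
    """Hypothetical stand-in for haystack.tools.Tool."""


class Toolset:
    """Hypothetical stand-in for haystack.tools.Toolset."""


# Define the alias once, then import it wherever a tools parameter is declared
ToolsType = Union[list[Union[Tool, Toolset]], Toolset]


def run(tools: Optional[ToolsType] = None) -> None:
    """Hypothetical component method using the alias in its signature."""


# The alias expands to exactly the long-form annotation it replaces
long_form = Optional[Union[list[Union[Tool, Toolset]], Toolset]]
assert get_type_hints(run)["tools"] == long_form
```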

@tstadel
Member

tstadel commented Oct 16, 2025

Nice! I didn't go into too much detail, but the signatures and (de)serialization logic are looking good.

@vblagoje
Member Author

Thanks @tstadel, I'll go through the details tonight and this one should be ready for review tomorrow.

@vblagoje vblagoje marked this pull request as ready for review October 17, 2025 07:34
@vblagoje vblagoje requested a review from a team as a code owner October 17, 2025 07:34
@vblagoje vblagoje requested review from mpangrazzi and removed request for a team October 17, 2025 07:34
@vblagoje
Member Author

vblagoje commented Oct 17, 2025

OK, the PR should be ready for review now. I'll add more edge test cases today, but I wanted to give you a heads-up to start reviewing. Thx all @sjrl @tstadel @julian-risch @anakin87 @mpangrazzi

@vblagoje vblagoje changed the title from "Update tools param to Optional[Union[list[Union[Tool, Toolset]], Toolset]]" to "feat: Update tools param to Optional[Union[list[Union[Tool, Toolset]], Toolset]]" Oct 17, 2025
        return tools

    if isinstance(tools, list):
        return cast(list[Union[Tool, Toolset]], tools)
Contributor

let's also add a dev comment here explaining the need for cast

Comment on lines 162 to 163
def test_flatten_nested_toolsets(self, add_tool, multiply_tool, subtract_tool):
    """Test flattening multiple levels of Toolsets."""
Member

nit: I don't think this is actually testing multiple levels here.

raise TypeError("tools must be Toolset, list[Union[Tool, Toolset]], or None")


def deserialize_tools_or_toolset_inplace(data: dict[str, Any], key: str = "tools") -> None:
Contributor

We should update the docstring of this function to reflect the new type that serialized tools can have. E.g., "Deserialize a list of Tools or a Toolset in a dictionary inplace." is no longer fully correct.

class TestToolsetList:
    """Tests for list[Toolset] functionality."""

    def test_tool_invoker_with_list_of_toolsets(self, weather_tool):
Member

nit: mixed case (Tool and Toolset) seems to be missing here

Member

@tstadel tstadel left a comment

Looks great already. Left a few nits.

@vblagoje
Member Author

Addressed the feedback @sjrl @tstadel, see if there is anything else we can fix now 🙏

  generation_kwargs: Optional[dict[str, Any]] = None,
  streaming_callback: Optional[StreamingCallbackT] = None,
- tools: Optional[Union[list[Tool], Toolset]] = None,
+ tools: Optional[ToolsType] = None,
Contributor

docstring for this function needs to be updated

  generation_kwargs: Optional[dict[str, Any]] = None,
  *,
- tools: Optional[Union[list[Tool], Toolset]] = None,
+ tools: Optional[ToolsType] = None,
Contributor

docstring for this needs to be updated below

  generation_kwargs: Optional[dict[str, Any]] = None,
  *,
- tools: Optional[Union[list[Tool], Toolset]] = None,
+ tools: Optional[ToolsType] = None,
Contributor

docstring for this needs to be updated below


  @staticmethod
- def _validate_and_prepare_tools(tools: Union[list[Tool], Toolset]) -> dict[str, Tool]:
+ def _validate_and_prepare_tools(tools: ToolsType) -> dict[str, Tool]:
Contributor

docstring for tools needs to be updated for this function
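
To illustrate what a function with this signature might do with the wider input type, the sketch below flattens the input and indexes the tools by name. It is only a guess at the behavior: the stand-in classes, the name `prepare_tools_by_name`, and the rules for empty input and duplicate names are assumptions for the example, not the ToolInvoker implementation.

```python
from dataclasses import dataclass, field
from typing import Union


@dataclass
class Tool:
    """Hypothetical stand-in for haystack.tools.Tool."""

    name: str


@dataclass
class Toolset:
    """Hypothetical stand-in for haystack.tools.Toolset."""

    tools: list[Tool] = field(default_factory=list)


ToolsType = Union[list[Union[Tool, Toolset]], Toolset]


def prepare_tools_by_name(tools: ToolsType) -> dict[str, Tool]:
    """Flatten a Toolset or mixed list and return a name -> Tool mapping."""
    items = [tools] if isinstance(tools, Toolset) else tools
    flat: list[Tool] = []
    for item in items:
        if isinstance(item, Toolset):
            flat.extend(item.tools)
        elif isinstance(item, Tool):
            flat.append(item)
        else:
            raise TypeError("tools must contain only Tool or Toolset instances")
    if not flat:
        # Assumption: an invoker without any tool is treated as a configuration error
        raise ValueError("At least one tool is required")
    by_name: dict[str, Tool] = {}
    for tool in flat:
        if tool.name in by_name:
            # Assumption: names must be unique across standalone tools and toolsets
            raise ValueError(f"Duplicate tool name: {tool.name}")
        by_name[tool.name] = tool
    return by_name


prepared = prepare_tools_by_name([Toolset([Tool("add"), Tool("multiply")]), Tool("calendar")])
print(list(prepared))  # ['add', 'multiply', 'calendar']
```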

  *,
  enable_streaming_callback_passthrough: Optional[bool] = None,
- tools: Optional[Union[list[Tool], Toolset]] = None,
+ tools: Optional[ToolsType] = None,
Contributor

update docstring for tools below

  *,
  enable_streaming_callback_passthrough: Optional[bool] = None,
- tools: Optional[Union[list[Tool], Toolset]] = None,
+ tools: Optional[ToolsType] = None,
Contributor

update docstrings below

Contributor

@sjrl sjrl left a comment

Looks good! A few minor comments regarding docstrings

@julian-risch julian-risch added this to the 2.19.0 milestone Oct 17, 2025
@vblagoje
Member Author

@sjrl I suggest we first integrate #9859, then update this PR with the new tools param for it, and then integrate this one. Cool?

@vblagoje
Member Author

Nice, thank you all, please let me see how to coordinate merging this PR and #9856 i.e. what to merge first and how to minimize pain

@vblagoje vblagoje merged commit 8098e9c into main Oct 20, 2025
21 checks passed
@vblagoje vblagoje deleted the toolset_thomas branch October 20, 2025 07:26

Labels

  • ignore-for-release-notes: PRs with this flag won't be included in the release notes.
  • topic:core
  • topic:tests
  • type:documentation: Improvements on the docs

7 participants