
Conversation

@MackinnonBuck (Member) commented Sep 8, 2025

Summary

This draft introduces the tool reduction feature set:

  • New abstractions (IToolReductionStrategy) for selecting a subset of tools per request.
  • Middleware (ToolReducingChatClient) that applies a configured strategy automatically.
  • An initial embedding-based implementation (EmbeddingToolReductionStrategy) as a first strategy.
  • Supporting builder extensions and tests.

All components are marked experimental and may evolve. Additional strategies can be added in follow-up work.
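For illustration, a minimal sketch of how these pieces might compose. The `UseToolReduction` extension name and the `EmbeddingToolReductionStrategy` constructor shape are assumptions, not confirmed signatures from this PR:

```csharp
using Microsoft.Extensions.AI;

// Hypothetical composition: the strategy picks the tool subset, and the
// reducing client applies it before function invocation sees the request.
IChatClient BuildClient(
    IChatClient innerClient,
    IEmbeddingGenerator<string, Embedding<float>> embeddingGenerator) =>
    new ChatClientBuilder(innerClient)
        .UseToolReduction(new EmbeddingToolReductionStrategy(embeddingGenerator)) // assumed extension/ctor shape
        .UseFunctionInvocation()
        .Build();
```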

Next Steps

  • Additional tool reduction strategy implementations.
  • Refinements to abstractions as necessary.

Feedback welcome before expanding scope.

Fixes #6670


@github-actions bot added the area-ai (Microsoft.Extensions.AI libraries) label Sep 8, 2025
Comment on lines 50 to 51
```csharp
// Concatenate the non-empty text of each message into a single newline-joined string.
var messageTexts = messages.Select(m => m.Text).Where(s => !string.IsNullOrEmpty(s));
return string.Join("\n", messageTexts);
```
Member:

We have a helper that ChatMessage and ChatResponse use to do this a bit more efficiently. Maybe we should look at exposing that in a way this could use as well.

That said, I wonder whether we'd want the default implementation here to also include reasoning content, which this implementation currently doesn't, since ChatMessage.Text only includes TextContent and not TextReasoningContent.
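For illustration, a minimal sketch of an extraction that would also pick up reasoning content. This is a hypothetical helper, not the PR's implementation:

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.Extensions.AI;

// Hypothetical sketch: gather text from both TextContent and TextReasoningContent,
// rather than relying on ChatMessage.Text (which only includes TextContent).
static string GetEmbeddingText(IEnumerable<ChatMessage> messages) =>
    string.Join("\n", messages
        .SelectMany(m => m.Contents)
        .Select(c => c switch
        {
            TextContent t => t.Text,
            TextReasoningContent r => r.Text,
            _ => null,
        })
        .Where(s => !string.IsNullOrEmpty(s)));
```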

Comment on lines +105 to +106
```csharp
/// If <see langword="false"/> (default), tools are ordered by descending similarity.
/// If <see langword="true"/>, the top-N tools by similarity are re-emitted in their original order.
```
Member:

Curious where this ordering comes into play?

Member Author:

I was thinking that consistent ordering might matter if there's some kind of downstream cache that uses the set of available tools as a key, or a chat client that expects a meaningful relative tool order. We can probably remove this though, as these are just hypothetical scenarios for now.

@PederHP Sep 15, 2025:

Ordering matters for caching, yes. Caching, whether automatic (OpenAI) or explicit (Anthropic), is prefix-based, so it requires everything up to a certain point in the raw request to be identical. Stable tool ordering is important.
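For illustration, a minimal sketch of the "top-N by similarity, re-emitted in original order" behavior the doc comment describes (helper name and parameters are assumptions):

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.Extensions.AI;

// Hypothetical sketch: pick the N most similar tools, then restore their
// original relative order so the request keeps a stable, cache-friendly prefix.
static IEnumerable<AITool> TopNInOriginalOrder(
    IReadOnlyList<AITool> tools, IReadOnlyList<float> similarities, int n) =>
    Enumerable.Range(0, tools.Count)
        .OrderByDescending(i => similarities[i])
        .Take(n)
        .OrderBy(i => i) // re-emit winners in their original order
        .Select(i => tools[i]);
```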


```csharp
if (options?.Tools is not { Count: > 0 } tools)
{
    return options?.Tools ?? [];
}
```
Member:

Why not just []?

Member Author:

We could do that, but ToolReducingChatClient has an optimization where, if the returned IEnumerable<AITool> is reference-equal to the original tools list, it simply returns the original options rather than copying the result into a List<AITool> and using a cloned ChatOptions.

I can add a comment to clarify this.
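For illustration, a sketch of the optimization described above. The `SelectToolsAsync` member name and signature on `IToolReductionStrategy` are assumptions:

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.AI;

// Hypothetical sketch: if the strategy hands back the same IEnumerable<AITool>
// instance, skip cloning ChatOptions and forward the original options untouched.
async Task<ChatOptions?> ApplyReductionAsync(
    IToolReductionStrategy strategy, IEnumerable<ChatMessage> messages, ChatOptions? options, CancellationToken ct)
{
    IEnumerable<AITool> reduced = await strategy.SelectToolsAsync(messages, options, ct); // assumed signature

    if (ReferenceEquals(reduced, options?.Tools))
    {
        return options; // no change; avoid allocating a cloned ChatOptions
    }

    ChatOptions cloned = (options ?? new ChatOptions()).Clone();
    cloned.Tools = reduced.ToList();
    return cloned;
}
```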

```csharp
/// A delegating chat client that applies a tool reduction strategy before invoking the inner client.
/// </summary>
/// <remarks>
/// Insert this into a pipeline (typically before function invocation middleware) to automatically
```
Member:

I wonder how we'd implement a "virtual tools" strategy à la what VS Code uses. Tools come in, they're grouped by an LLM into buckets, and then fake tools are created for each bucket; those tools are passed along to the LLM, which can then invoke one of them to expand it out into the full set from that bucket. Probably requires some prototyping. It might be that we leverage AIFunctionDeclaration so that the FICC lets a virtual tool pass through back to the TRCC. Or maybe it's another feature in FICC that enables a tool to modify the options that'll be sent back on a subsequent request. Not sure.

@PederHP Sep 15, 2025:

When I've used the "virtual tools" strategy, I've had success with a single tool that allows expansion. There's no need for a tool per bucket: the bucket can just be an argument to the expand and collapse tools, with the bucket descriptions passed in the meta-tool description or injected into the system prompt. This is far more token-efficient and scales better with tool count. It also makes it a well-known tool with some dynamic context rather than an entirely dynamic tool.
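For illustration, a sketch of that single meta-tool using AIFunctionFactory; the tool name, argument, and bucket names here are placeholders:

```csharp
using Microsoft.Extensions.AI;

// Hypothetical sketch: one well-known "expand" tool whose argument names the
// bucket, with bucket descriptions carried in the tool description itself.
AITool expandTool = AIFunctionFactory.Create(
    (string bucket) => $"Tool bucket '{bucket}' is now expanded.",
    name: "expand_tools",
    description: "Expands a named tool bucket into its full tool set. " +
                 "Available buckets: file_ops (read/write files), web (fetch/search), math (calculations).");
```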

Member Author:

@PederHP, that sounds like a promising approach. Just to clarify: is the LLM responsible for both expanding and collapsing tool groups, or does it only expand them, with collapsing happening automatically later (e.g. after a response completes)? In other words, how long does an expanded group remain open?

I also wonder if tool grouping might be more naturally handled through agents. Each agent has a limited toolset tailored to its purpose, so as work is handed off between agents, the available tools change without needing stateful expand/collapse logic in a single chat client. A lightweight "tool reduction" middleware in the chat client (such as what's prototyped in this PR) could still help narrow focus further, but in a more stateless way.

If we do want multiple tool groups in a single client, one option could be allowing only one group to be expanded at a time. This would mimic the agent handoff scenario I described above. All groups are collapsed at the start of a response, and the LLM can call a well-known, non-invocable "expand" function with a tool group name. A middleware earlier in the pipeline can then generate another response with the expanded set of tools. Further expansions close the already-expanded tool group. This approach reduces the responsibility of the LLM to decide when to collapse a tool group, and it allows tool reduction middleware to run after each expansion, if needed. That type of composition might be harder to achieve if functions can directly mutate the tools list inside FunctionInvokingChatClient iterations.
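For illustration, a rough sketch of the middleware shape described above. All names, the "bucket" argument convention, and the single-retry flow are assumptions, and only the non-streaming path is shown:

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.AI;

// Hypothetical sketch: middleware earlier in the pipeline that watches for a
// well-known, non-invocable "expand_tools" call and generates another response
// with that group's tools, so only one group is expanded at a time.
public sealed class ToolGroupExpandingChatClient(
    IChatClient innerClient,
    IReadOnlyDictionary<string, IList<AITool>> toolGroups)
    : DelegatingChatClient(innerClient)
{
    public override async Task<ChatResponse> GetResponseAsync(
        IEnumerable<ChatMessage> messages,
        ChatOptions? options = null,
        CancellationToken cancellationToken = default)
    {
        ChatResponse response = await base.GetResponseAsync(messages, options, cancellationToken);

        // Did the model ask to expand a bucket?
        FunctionCallContent? expandCall = response.Messages
            .SelectMany(m => m.Contents)
            .OfType<FunctionCallContent>()
            .FirstOrDefault(c => c.Name == "expand_tools");

        if (expandCall?.Arguments is { } args &&
            args.TryGetValue("bucket", out object? value) &&
            value is string bucket &&
            toolGroups.TryGetValue(bucket, out IList<AITool>? groupTools))
        {
            // Re-issue the request with only the expanded group's tools;
            // any previously expanded group is implicitly collapsed.
            ChatOptions expanded = (options ?? new ChatOptions()).Clone();
            expanded.Tools = [.. groupTools];
            response = await base.GetResponseAsync(messages, expanded, cancellationToken);
        }

        return response;
    }
}
```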
