A lightweight Python library that simplifies the process of exposing functions as tools for Large Language Models.
LLM Tooling (or function calling) is a powerful way to connect LLMs to the real world. However, defining tool definitions can be cumbersome, as it requires defining both the tool function, and a JSON schema that describes the tool. Additionally, in some cases, you may want to control certain parameters passed into tools rather than have the LLM decide what to pass in. PyToolsmith aims to solve this by providing a simple API to define tools from function definitions and automatically generate the JSON schema to pass to the LLM.
- Generates JSON schemas directly from your function definitions.
- Parses Google-style docstrings to describe your tools in the schema.
- Pass the same tools into different LLM providers with a simple method call.
- Define custom type mappings to extend functionality.
- Anthropic
- AWS Bedrock
- OpenAI
- Google Gemini (limited)
Part of being able to define schemas is mapping certain types to a JSON-compatible format. As such, PyToolsmith allows you to define custom type maps to be used to generate the JSON schema. However, it comes out-of-the-box with support for:
- Standard based objects
str,int,float,bool, etc. - UUIDs
Simply define a tool definition as such:
from pytoolsmith import ToolDefinition
# 1. Define your function
def get_user_by_id(user_id: str, tenant_id: str) -> dict:
"""
This a tool that gets a user by its ID.
Args:
user_id: The user to search for.
tenant_id: The tenant to search inside of
Returns: A dictionary representing the user.
"""
# Your tool logic here.
return {
"user_id": user_id,
"tenant_id": tenant_id,
"user_name": "John Doe",
"email": "john.doe@example.com",
"phone": "123-456-7890"
}
# 2. Make a tool definition, calling out the injected parameter.
tool_definition = ToolDefinition(
function=get_user_by_id,
injected_parameters=["tenant_id"],
# You can also define a message that can be sent to the user here.
# Use mustache template syntax to inject parameter values into the message.
user_message="Looking up a user using id {{user_id}}."
)
# 3. Get a schema representing the tool automatically
schema = tool_definition.build_json_schema()
# 4. Get a tool definition ready to pass directly into LLM calls.
# Note that the LLM does not have the context for the controlled parameter.
schema.to_openai()
schema.to_anthropic()
schema.to_bedrock()
# Bedrock Output:
# {
# "name": "get_user_by_id",
# "inputSchema": {
# "json": {
# "type": "object",
# "properties": {
# "user_id": {
# "type": "string",
# "description": "The user to search for.",
# }
# },
# "required": ["my_param"],
# }
# },
# "description": "This a tool that gets a user by its ID. "
# "Returns: A dictionary representing the user.",
# }Additionally, you can use the ToolLibrary class to make it easy to pass in a list of tools directly to your LLM call.
# ^ continuing from above
from pytoolsmith import ToolLibrary
# Make a library:
tool_library = ToolLibrary()
tool_library.add_tool(tool_definition)
tool_library.add_tool(other_tool_definition)
# Get a tool list ready to pass directly into LLM calls.
tool_library.to_openai()
tool_library.to_anthropic()
tool_library.to_bedrock()
# All of these are a list that can be passed in directly to your LLM call.When you implement your LLM call, you can use the tool library to get the tool list, and call it directly.
# ^ continuing from above
from anthropic import Anthropic
from anthropic.types import MessageParam, TextBlockParam
from pytoolsmith import ToolDefinition
client = Anthropic()
hardset_parameters = {"tenant_id": "1234"}
# Get the tool name and parameters from the LLM call
llm_result = client.messages.create(
system="You are a helpful assistant who can look up users by their ID.",
tools=tool_library.to_anthropic(),
model="claude-3-7-sonnet-latest",
messages=[
MessageParam(
role="user",
content=[TextBlockParam(text="Can you help me look up my account? My id is 5678", type="text")],
)
],
max_tokens=100,
)
llm_set_params, tool_name = parse_llm_result(llm_result)
# `llm_set_params` would be like: {"user_id": "5678"}
# This was decided by the LLM and passed back as something to call.
tool: ToolDefinition = tool_library.get_tool_from_name(tool_name)
user_message = tool.format_message_for_call(llm_parameters=llm_set_params, hardset_parameters=hardset_parameters)
# The user message will be `Looking up a user using id 5678.`
tool_result = tool.call_tool(
llm_parameters=llm_set_params,
hardset_parameters=hardset_parameters,
)
# Result is ready to be passed back to the next LLM call.
# The user message will be `Looking up a user using id 5678.`Additionally, you can control the serialization parameters:
from bson import ObjectId
from pytoolsmith import pytoolsmith_config
pytoolsmith_config.update_type_map({ObjectId: "string"})
# Now you can define the following function as a tool:
def get_object_by_id(object_id: ObjectId) -> dict:
...One of the limitations of declaring your tool descriptions as docstrings is that they require code changes &
redeployment to update a prompt. Services like
Langfuse's Prompt Management System allow you to manage your prompts
outside of code. You can take advantage of this functionality and set prompts on the library as shown below, or on your
call to build_json_schema:
from pytoolsmith import ToolLibrary, ToolDefinition
def new_tool(user_id: str) -> str:
"""
{{MAIN_DESCRIPTION}}
Args:
user_id: {{USER_ID_DESCRIPTION}}
"""
pass
tool_library = ToolLibrary()
tool_library.add_tool(ToolDefinition(function=new_tool))
# When we are about to use the tool library, inject the prompt variables:
tool_library.set_schema_vars(
{
"MAIN_DESCRIPTION": "This is the main description",
"USER_ID_DESCRIPTION": "This is the user id description"
}
)
tool_library.to_anthropic()
# Will now output with the injected variables.
tool_library.clear_schema_vars()
# Now will output without the injected variables.Library Subsetting
Sometimes you may not want to pass all of your tools to the LLM.
Subsetting allows you to select a smaller set of tools to pass in- either individually or by tagging them with
a tool_group parameter on your ToolDefinitions.
To use, call the subset() method on a ToolLibrary instance to get a smaller library generated. Additionally, you can
use exclude() to get the opposite effect.
Field Exclusion
Sometimes, your tool definitions may have fields that you don't want to pass to the LLM. You can use
the exclude_fields parameter on the to_<provider> methods to remove these.
Batch Tool
When using Claude 3.7, Anthropic suggests adding
a Batch Tool to be able to
call multiple tools at once. To use within PyToolsmith, set include_batch_tool=True when creating your tool library.
You can also set a custom serialization function to load the LLM's arguments into function called from the batch tool
with
pytoolsmith_config.set_batch_tool_serializer(custom_serializer).
The default batch tool will run each tool call in series. However, you can create your own function to run them in
parallel by setting pytoolsmith_config.set_batch_runner(custom_runner). This function takes a function that can
process a list of callables and returns the list of results in order.
Vendor-Specific Options
If needed, additional OpenAPI spec can be passed into a ToolDefinition constructor with the additional_parameters
parameter.
The following client-specific configuration options are also available as options on the to_<provider> methods:
- OpenAI Strict Mode with
strict_model=True - Anthropic Prompt Caching with
use_cache_control=Trueon theto_anthropic()andto_bedrock()methods
- Support for tuples as fixed-length lists
- Extendable serialization support (for LLM messages -> function inputs and vise versa)
I was heavily inspired by FastAPI's batteries-included ability to create automatic OpenAPI specs for web applications. Having a single source of truth for your API docs defined as code speeds up development and reduces the chance of errors - why not apply that to LLM interfaces? 🤠