Merge branch 'main' into colesmcintosh/main

huggingface · Feb 20, 2025 · bda33be · bda33be
2 parents 3360d11 + 1f8f987
commit bda33be
Show file tree

Hide file tree

Showing 80 changed files with 2,710 additions and 1,020 deletions.
diff --git a/.github/workflows/build_documentation.yml b/.github/workflows/build_documentation.yml
@@ -20,6 +20,7 @@ jobs:
       commit_sha: ${{ github.sha }}
       package: smolagents
       languages: en
+      notebook_folder: smolagents_doc
       # additional_args: --not_python_module # use this arg if repository is documentation only
     secrets:
       token: ${{ secrets.HUGGINGFACE_PUSH }}

diff --git a/.github/workflows/tests.yml b/.github/workflows/tests.yml
@@ -93,6 +93,11 @@ jobs:
           uv run pytest ./tests/test_tools.py
         if: ${{ success() || failure() }}
 
+      - name: Tool validation tests
+        run: |
+          uv run pytest ./tests/test_tool_validation.py
+        if: ${{ success() || failure() }}
+
       - name: Types tests
         run: |
           uv run pytest ./tests/test_types.py
@@ -103,6 +108,11 @@ jobs:
           uv run pytest ./tests/test_utils.py
         if: ${{ success() || failure() }}
 
+      - name: Gradio UI tests
+        run: |
+          uv run pytest ./tests/test_gradio_ui.py
+        if: ${{ success() || failure() }}
+
       - name: Function type hints utils tests
         run: |
           uv run pytest ./tests/test_function_type_hints_utils.py

diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
@@ -33,19 +33,17 @@ However you choose to contribute, please be mindful and respect our
 
 There are several ways you can contribute to smolagents.
 
-* Fix outstanding issues with the existing code.
 * Submit issues related to bugs or desired new features.
 * Contribute to the examples or to the documentation.
+* Fix outstanding issues with the existing code.
 
 > All contributions are equally valuable to the community. 🥰
 
-## Fixing outstanding issues
-
-If you notice an issue with the existing code and have a fix in mind, feel free to [start contributing](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/creating-a-pull-request) and open
-a Pull Request!
-
 ## Submitting a bug-related issue or feature request
 
+At any moment, feel welcome to open an issue, citing your exact error traces and package versions if it's a bug.
+It's often even better to open a PR with your proposed fixes/changes!
+
 Do your best to follow these guidelines when submitting a bug-related issue or a feature
 request. It will make it easier for us to come back to you quickly and with good
 feedback.
@@ -89,10 +87,41 @@ We're always looking for improvements to the documentation that make it more cle
 how the documentation can be improved such as typos and any content that is missing, unclear or inaccurate. We'll be 
 happy to make the changes or help you make a contribution if you're interested!
 
+## Fixing outstanding issues
+
+If you notice an issue with the existing code and have a fix in mind, feel free to [start contributing](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/creating-a-pull-request) and open
+a Pull Request!
+
+### Making code changes
+
+To install dev dependencies, run:
+```
+pip install -e ".[dev]"
+```
+
+When making changes to the codebase, please check that it follows the repo's code quality requirements by running:
+To check code quality of the source code:
+```
+make quality
+```
+
+If the checks fail, you can run the formatter with:
+```
+make style
+```
+
+And commit the changes.
+
+To run tests locally, run this command:
+```bash
+make test
+```
+</details>
+
 ## I want to become a maintainer of the project. How do I get there?
 
 smolagents is a project led and managed by Hugging Face. We are more than
 happy to have motivated individuals from other organizations join us as maintainers with the goal of helping smolagents
 make a dent in the world of Agents.
 
-If you are such an individual (or organization), please reach out to us and let's collaborate.
+If you are such an individual (or organization), please reach out to us and let's collaborate.
diff --git a/README.md b/README.md
@@ -32,7 +32,7 @@ limitations under the License.
 
 `smolagents` is a library that enables you to run powerful agents in a few lines of code. It offers:
 
-✨ **Simplicity**: the logic for agents fits in 1,000 lines of code (see [agents.py](https://github.com/huggingface/smolagents/blob/main/src/smolagents/agents.py)). We kept abstractions to their minimal shape above raw code!
+✨ **Simplicity**: the logic for agents fits in ~1,000 lines of code (see [agents.py](https://github.com/huggingface/smolagents/blob/main/src/smolagents/agents.py)). We kept abstractions to their minimal shape above raw code!
 
 🧑‍💻 **First-class support for Code Agents**. Our [`CodeAgent`](https://huggingface.co/docs/smolagents/reference/agents#smolagents.CodeAgent) writes its actions in code (as opposed to "agents being used to write code"). To make it secure, we support executing in sandboxed environments via [E2B](https://e2b.dev/).
 
@@ -49,16 +49,6 @@ Full documentation can be found [here](https://huggingface.co/docs/smolagents/in
 > [!NOTE]
 > Check the our [launch blog post](https://huggingface.co/blog/smolagents) to learn more about `smolagents`!
 
-## Table of Contents
-- [Introduction](#introduction)
-- [Quick Demo](#quick-demo)
-- [Command Line Interface](#command-line-interface)
-- [Code Agents](#code-agents)
-- [How smol is this library?](#how-smol-is-this-library)
-- [How Strong are Open Models for Agentic Workflows?](#how-strong-are-open-models-for-agentic-workflows)
-- [Contributing](#contributing)
-- [Citing smolagents](#citing-smolagents)
-
 ## Quick demo
 
 First install the package.
@@ -77,6 +67,13 @@ agent.run("How many seconds would it take for a leopard at full speed to run thr
 
 https://github.com/user-attachments/assets/cd0226e2-7479-4102-aea0-57c22ca47884
 
+You can even share your agent to hub:
+```py
+agent.push_to_hub("m-ric/my_agent")
+
+# agent.from_hub("m-ric/my_agent") to load an agent from Hub
+```
+
 Our library is LLM-agnostic: you could switch the example above to any inference provider.
 
 <details>
@@ -147,42 +144,60 @@ model = AzureOpenAIServerModel(
 ```
 </details>
 
-## Command Line Interface
-
-You can run agents from CLI using two commands: `smolagent` and `webagent`. `smolagent` is a generalist command to run a multi-step `CodeAgent` that can be equipped with various tools, meanwhile `webagent` is a specific web-browsing agent using [helium](https://github.com/mherrmann/helium).
+## CLI
 
-**Web Browser Agent in CLI**
+You can run agents from CLI using two commands: `smolagent` and `webagent`.
 
-`webagent` allows users to automate web browsing tasks. It uses the [helium](https://github.com/mherrmann/helium) library to interact with web pages and uses defined tools to browse the web. Read more about this agent [here](https://github.com/huggingface/smolagents/blob/main/src/smolagents/vision_web_browser.py).
+`smolagent` is a generalist command to run a multi-step `CodeAgent` that can be equipped with various tools.
 
-Run the following command to get started:
 ```bash
-webagent {YOUR_PROMPT_HERE} --model "LiteLLMModel" --model-id "gpt-4o"
+smolagent "Plan a trip to Tokyo, Kyoto and Osaka between Mar 28 and Apr 7."  --model-type "HfApiModel" --model-id "Qwen/Qwen2.5-Coder-32B-Instruct" --imports "pandas numpy" --tools "web_search"
 ```
 
+Meanwhile `webagent` is a specific web-browsing agent using [helium](https://github.com/mherrmann/helium) (read more [here](https://github.com/huggingface/smolagents/blob/main/src/smolagents/vision_web_browser.py)).
+
 For instance:
 ```bash
-webagent --prompt "go to xyz.com/women, get to sale section, click the first clothing item you see. Get the product details, and the price, return them. note that I'm shopping from France"
+webagent "go to xyz.com/men, get to sale section, click the first clothing item you see. Get the product details, and the price, return them. note that I'm shopping from France" --model-type "LiteLLMModel" --model-id "gpt-4o"
 ```
-We redacted the website here, modify it with the website of your choice.
 
-**CodeAgent in CLI**
+## How do Code agents work?
 
-Use `smolagent` to run a multi-step agent with [tools](https://huggingface.co/docs/smolagents/en/reference/tools). It uses web search tool by default.
-You can easily get started with `$ smolagent {YOUR_PROMPT_HERE}`. You can customize this as follows (more details [here](https://github.com/huggingface/smolagents/blob/main/src/smolagents/cli.py)).
+Our [`CodeAgent`](https://huggingface.co/docs/smolagents/reference/agents#smolagents.CodeAgent) works mostly like classical ReAct agents - the exception being that the LLM engine writes its actions as Python code snippets.
 
-```bash
-smolagent {YOUR_PROMPT_HERE} --model-type "HfApiModel" --model-id "Qwen/Qwen2.5-Coder-32B-Instruct" --imports "pandas numpy" --tools "web_search translation"
-```
+```mermaid
+flowchart TB
+    Task[User Task]
+    Memory[agent.memory]
+    Generate[Generate from agent.model]
+    Execute[Execute Code action - Tool calls are written as functions]
+    Answer[Return the argument given to 'final_answer']
 
-For instance:
-```bash
-smolagent "Plan a trip to Tokyo, Kyoto and Osaka between Mar 28 and Apr 7. Allocate time according to number of public attraction in each, and optimize for distance and travel time. Bring all the public transportation options."
-``` 
+    Task -->|Add task to agent.memory| Memory
+
+    subgraph ReAct[ReAct loop]
+        Memory -->|Memory as chat messages| Generate
+        Generate -->|Parse output to extract code action| Execute
+        Execute -->|No call to 'final_answer' tool => Store execution logs in memory and keep running| Memory
+    end
+    
+    Execute -->|Call to 'final_answer' tool| Answer
 
-## Code agents?
+    %% Styling
+    classDef default fill:#d4b702,stroke:#8b7701,color:#ffffff
+    classDef io fill:#4a5568,stroke:#2d3748,color:#ffffff
+    
+    class Task,Answer io
+```
+
+Actions are now Python code snippets. Hence, tool calls will be performed as Python function calls. For instance, here is how the agent can perform web search over several websites in one single action:
+```py
+requests_to_search = ["gulf of mexico america", "greenland denmark", "tariffs"]
+for request in requests_to_search:
+    print(f"Here are the search results for {request}:", web_search(request))
+```
 
-In our [`CodeAgent`](https://huggingface.co/docs/smolagents/reference/agents#smolagents.CodeAgent),  the LLM engine writes its actions in code. This approach is demonstrated to work better than the current industry practice of letting the LLM output a dictionary of the tools it wants to calls: [uses 30% fewer steps](https://huggingface.co/papers/2402.01030) (thus 30% fewer LLM calls) and [reaches higher performance on difficult benchmarks](https://huggingface.co/papers/2411.01747). Head to [our high-level intro to agents](https://huggingface.co/docs/smolagents/conceptual_guides/intro_agents) to learn more on that.
+Writing actions as code snippets is demonstrated to work better than the current industry practice of letting the LLM output a dictionary of the tools it wants to call: [uses 30% fewer steps](https://huggingface.co/papers/2402.01030) (thus 30% fewer LLM calls) and [reaches higher performance on difficult benchmarks](https://huggingface.co/papers/2411.01747). Head to [our high-level intro to agents](https://huggingface.co/docs/smolagents/conceptual_guides/intro_agents) to learn more on that.
 
 Especially, since code execution can be a security concern (arbitrary code execution!), we provide options at runtime:
   - a secure python interpreter to run code more safely in your environment (more secure than raw code execution but still risky)
@@ -211,34 +226,7 @@ This comparison shows that open-source models can now take on the best closed mo
 
 ## Contribute
 
-To contribute, follow our [contribution guide](https://github.com/huggingface/smolagents/blob/main/CONTRIBUTING.md).
-
-At any moment, feel welcome to open an issue, citing your exact error traces and package versions if it's a bug.
-It's often even better to open a PR with your proposed fixes/changes!
-
-To install dev dependencies, run:
-```
-pip install -e ".[dev]"
-```
-
-When making changes to the codebase, please check that it follows the repo's code quality requirements by running:
-To check code quality of the source code:
-```
-make quality
-```
-
-If the checks fail, you can run the formatter with:
-```
-make style
-```
-
-And commit the changes.
-
-To run tests locally, run this command:
-```bash
-make test
-```
-</details>
+Everyone is welcome to contribute, get started with our [contribution guide](https://github.com/huggingface/smolagents/blob/main/CONTRIBUTING.md).
 
 ## Cite smolagents
 

diff --git a/docs/source/en/_toctree.yml b/docs/source/en/_toctree.yml
@@ -14,6 +14,8 @@
     title: 🛠️ Tools - in-depth guide
   - local: tutorials/secure_code_execution
     title: 🛡️ Secure your code execution with E2B
+  - local: tutorials/memory
+    title: 📚 Manage your agent's memory
 - title: Conceptual guides
   sections:
   - local: conceptual_guides/intro_agents

diff --git a/...urce/en/conceptual_guides/intro_agents.md → ...rce/en/conceptual_guides/intro_agents.mdx b/...urce/en/conceptual_guides/intro_agents.md → ...rce/en/conceptual_guides/intro_agents.mdx
diff --git a/docs/source/en/conceptual_guides/react.md → docs/source/en/conceptual_guides/react.mdx b/docs/source/en/conceptual_guides/react.md → docs/source/en/conceptual_guides/react.mdx
@@ -38,11 +38,6 @@ For a `CodeAgent`, it looks like the figure below.
 
 <div class="flex justify-center">
     <img
-        class="block dark:hidden"
-        src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/smolagents/codeagent_docs.png"
-    />
-    <img
-        class="hidden dark:block"
         src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/smolagents/codeagent_docs.png"
     />
 </div>
@@ -60,14 +55,9 @@ Here is a video overview of how that works:
     />
 </div>
 
-![Framework of a React Agent](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/blog/open-source-llms-as-agents/ReAct.png)
-
 We implement two versions of agents: 
 - [`CodeAgent`] is the preferred type of agent: it generates its tool calls as blobs of code.
 - [`ToolCallingAgent`] generates tool calls as a JSON in its output, as is commonly done in agentic frameworks. We incorporate this option because it can be useful in some narrow cases where you can do fine with only one tool call per step: for instance, for web browsing, you need to wait after each action on the page to monitor how the page changes.
 
 > [!TIP]
-> We also provide an option to run agents in one-shot: just pass `single_step=True` when launching the agent, like `agent.run(your_task, single_step=True)`
-
-> [!TIP]
-> Read [Open-source LLMs as LangChain Agents](https://huggingface.co/blog/open-source-llms-as-agents) blog post to learn more about multi-step agents.
+> Read [Open-source LLMs as LangChain Agents](https://huggingface.co/blog/open-source-llms-as-agents) blog post to learn more about multi-step agents.
diff --git a/docs/source/en/examples/multiagents.md → docs/source/en/examples/multiagents.mdx b/docs/source/en/examples/multiagents.md → docs/source/en/examples/multiagents.mdx
diff --git a/docs/source/en/examples/rag.md → docs/source/en/examples/rag.mdx b/docs/source/en/examples/rag.md → docs/source/en/examples/rag.mdx
diff --git a/docs/source/en/examples/text_to_sql.md → docs/source/en/examples/text_to_sql.mdx b/docs/source/en/examples/text_to_sql.md → docs/source/en/examples/text_to_sql.mdx
diff --git a/docs/source/en/examples/web_browser.md → docs/source/en/examples/web_browser.mdx b/docs/source/en/examples/web_browser.md → docs/source/en/examples/web_browser.mdx
diff --git a/docs/source/en/guided_tour.md → docs/source/en/guided_tour.mdx b/docs/source/en/guided_tour.md → docs/source/en/guided_tour.mdx
@@ -28,10 +28,11 @@ To initialize a minimal agent, you need at least these two arguments:
     - [`HfApiModel`] leverages a `huggingface_hub.InferenceClient` under the hood and supports all Inference Providers on the Hub.
     - [`LiteLLMModel`] similarly lets you call 100+ different models and providers through [LiteLLM](https://docs.litellm.ai/)!
     - [`AzureOpenAIServerModel`] allows you to use OpenAI models deployed in [Azure](https://azure.microsoft.com/en-us/products/ai-services/openai-service).
+    - [`MLXModel`] creates a [mlx-lm](https://pypi.org/project/mlx-lm/) pipeline to run inference on your local machine.
 
 - `tools`, a list of `Tools` that the agent can use to solve the task. It can be an empty list. You can also add the default toolbox on top of your `tools` list by defining the optional argument `add_base_tools=True`.
 
-Once you have these two arguments, `tools` and `model`,  you can create an agent and run it. You can use any LLM you'd like, either through [Inference Providers](https://huggingface.co/blog/inference-providers), [transformers](https://github.com/huggingface/transformers/), [ollama](https://ollama.com/), [LiteLLM](https://www.litellm.ai/), or [Azure OpenAI](https://azure.microsoft.com/en-us/products/ai-services/openai-service).
+Once you have these two arguments, `tools` and `model`,  you can create an agent and run it. You can use any LLM you'd like, either through [Inference Providers](https://huggingface.co/blog/inference-providers), [transformers](https://github.com/huggingface/transformers/), [ollama](https://ollama.com/), [LiteLLM](https://www.litellm.ai/), [Azure OpenAI](https://azure.microsoft.com/en-us/products/ai-services/openai-service), or [mlx-lm](https://pypi.org/project/mlx-lm/).
 
 <hfoptions id="Pick a LLM">
 <hfoption id="HF Inference API">
@@ -148,6 +149,19 @@ agent.run(
 )
 ```
 
+</hfoption>
+<hfoption id="mlx-lm">
+
+```python
+# !pip install smolagents[mlx-lm]
+from smolagents import CodeAgent, MLXModel
+
+mlx_model = MLXModel("mlx-community/Qwen2.5-Coder-32B-Instruct-4bit")
+agent = CodeAgent(model=mlx_model, tools=[], add_base_tools=True)
+
+agent.run("Could you give me the 118th number in the Fibonacci sequence?")
+```
+
 </hfoption>
 </hfoptions>
 
@@ -206,7 +220,7 @@ When the agent is initialized, the tool attributes are used to generate a tool d
 
 ### Default toolbox
 
-Transformers comes with a default toolbox for empowering agents, that you can add to your agent upon initialization with argument `add_base_tools = True`:
+`smolagents` comes with a default toolbox for empowering agents, that you can add to your agent upon initialization with argument `add_base_tools = True`:
 
 - **DuckDuckGo web search***: performs a web search using DuckDuckGo browser.
 - **Python code interpreter**: runs your LLM generated Python code in a secure environment. This tool will only be added to [`ToolCallingAgent`] if you initialize it with `add_base_tools=True`, since code-based agent can already natively execute Python code
@@ -344,24 +358,25 @@ It empirically yields better performance on most benchmarks. The reason for this
 
 You can easily build hierarchical multi-agent systems with `smolagents`.
 
-To create a managed agent, give your `CodeAgent` or `ToolCallingAgent` the attributes `name` and `description` - these are mandatory to make the agent callable by its manager agent. The manager agent will receive the managed agent via its managed_agents argument during initialization.
+To do so, just ensure your agent has `name` and`description` attributes, which will then be embedded in the manager agent's system prompt to let it know how to call this managed agent, as we also do for tools.
+Then you can pass this managed agent in the parameter managed_agents upon initialization of the manager agent.
 
 Here's an example of making an agent that managed a specific web search agent using our [`DuckDuckGoSearchTool`]:
 
 ```py
-from smolagents import CodeAgent, HfApiModel, DuckDuckGoSearchTool, ToolCallingAgent
+from smolagents import CodeAgent, HfApiModel, DuckDuckGoSearchTool
 
 model = HfApiModel()
 
-managed_web_agent = CodeAgent(
+web_agent = CodeAgent(
     tools=[DuckDuckGoSearchTool()],
     model=model,
     name="web_search",
     description="Runs web searches for you. Give it your query as an argument."
 )
 
 manager_agent = CodeAgent(
-    tools=[], model=model, managed_agents=[managed_web_agent]
+    tools=[], model=model, managed_agents=[web_agent]
 )
 
 manager_agent.run("Who is the CEO of Hugging Face?")
@@ -401,6 +416,17 @@ You can also use this `reset=False` argument to keep the conversation going in a
 
 ## Next steps
 
+Finally, when you've configured your agent to your needs, you can share it to the Hub!
+
+```py
+agent.push_to_hub("m-ric/my_agent")
+```
+
+Similarly, to load an agent that has been pushed to hub, if you trust the code from its tools, use:
+```py
+agent.from_hub("m-ric/my_agent", trust_remote_code=True)
+```
+
 For more in-depth usage, you will then want to check out our tutorials:
 - [the explanation of how our code agents work](./tutorials/secure_code_execution)
 - [this guide on how to build good agents](./tutorials/building_good_agents).

diff --git a/docs/source/en/index.md → docs/source/en/index.mdx b/docs/source/en/index.md → docs/source/en/index.mdx
diff --git a/docs/source/en/reference/agents.md → docs/source/en/reference/agents.mdx b/docs/source/en/reference/agents.md → docs/source/en/reference/agents.mdx
@@ -57,3 +57,13 @@ _This class is deprecated since 1.8.0: now you simply need to pass attributes `n
 > You must have `gradio` installed to use the UI. Please run `pip install smolagents[gradio]` if it's not the case.
 
 [[autodoc]] GradioUI
+
+## Prompts
+
+[[autodoc]] smolagents.agents.PromptTemplates
+
+[[autodoc]] smolagents.agents.PlanningPromptTemplate
+
+[[autodoc]] smolagents.agents.ManagedAgentPromptTemplate
+
+[[autodoc]] smolagents.agents.FinalAnswerPromptTemplate