LLM Agent X is an interactive, multi-agent framework designed for executing complex tasks with real-time human supervision. It leverages a powerful InteractiveDAGAgent that models tasks as a Directed Acyclic Graph (DAG), enabling sophisticated, non-linear workflows.
The entire system is containerized with Docker and orchestrated through a message queue, providing a robust, scalable architecture. A web-based "Mission Control" UI allows operators to launch tasks, monitor progress, inspect results, and provide real-time guidance to the agent swarm.
⚠️ Security Warning: This project is a research demonstration. It can be configured to execute arbitrary code generated by a language model, which is inherently dangerous. Use this in a sandboxed environment and only with trusted inputs.
- Interactive DAG Agent: A persistent agent that models tasks as a graph, allowing for complex dependencies and adaptive planning.
- Real-time Mission Control UI: A Next.js frontend for launching, monitoring, and controlling agent execution in real-time.
- Message Queue Architecture: Decoupled Gateway and Worker components communicate via RabbitMQ for resilience and scalability.
- Fully Dockerized: The entire stack (UI, Gateway, Worker, and RabbitMQ) is managed with a single `docker-compose` command.
- Human-in-the-Loop: Operators can pause, resume, cancel, and redirect tasks, or answer questions posed by the agent.
- Extensible Tool Use: Agents can be equipped with tools like web search (`brave_web_search`) and sandboxed code execution (`exec_python`).
```bash
git clone https://github.com/cvaz1306/llm_agent_x.git
cd llm_agent_x
```

Create a `.env` file in the root of the project by copying the example:

```bash
cp .env.example .env
```

Now, edit the `.env` file and add your required API keys:
```bash
# Required for the agent worker to function
OPENAI_API_KEY="your_openai_api_key"
BRAVE_API_KEY="your_brave_search_api_key"
# Other variables are pre-configured for Docker Compose
```

Build and launch all services using Docker Compose:
```bash
docker-compose up --build
```

This command will:
- Build the Docker images for the UI, gateway, worker, and sandbox.
- Start all the necessary services.
- Begin streaming logs from all containers to your terminal.
Once the services are running, open your web browser and navigate to http://localhost:3000.
You can now use the Mission Control UI to launch and monitor your agents.
While the primary interface is the interactive UI, the underlying llm-agent-x Python package can be used directly for custom integrations. See the API Reference for details on the Gateway API and the source code for direct agent usage.
- Running the Application: The primary guide to get the full application running with Docker.
- Interactive Mode: An in-depth look at the architecture and how to control the agent.
- Gateway API Reference: Detailed documentation for the REST and Socket.IO API.
- Python Sandbox: Information on the code execution sandbox.
This project is licensed under the MIT License.
When running in interactive mode, LLM Agent X exposes a Gateway service. This service provides a REST API for high-level control and a Socket.IO endpoint for receiving real-time state updates. This is the primary programmatic interface for the application.
The Gateway's REST API is the entry point for creating tasks and sending commands (directives) to the agent swarm.
- Description: Retrieves a snapshot of the current state of all tasks and documents known to the Gateway. The state is cached and updated in real-time via the worker's broadcasts.
- Method: `GET`
- Response (200 OK):

```json
{
  "tasks": {
    "TASK_ID_1": {
      "id": "TASK_ID_1",
      "desc": "Description of the task...",
      "status": "running",
      "parent": "ROOT_TASK",
      "children": ["CHILD_ID_1"],
      "deps": ["dep_id_1"],
      "result": null
    },
    "DOC_ID_1": {
      "id": "DOC_ID_1",
      "desc": "Document: My Report",
      "status": "complete",
      "task_type": "document",
      "document_state": {
        "content": "The content of the document...",
        "version": 1
      }
    }
  }
}
```
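As an illustration, the snapshot can be fetched and filtered with only the Python standard library. The `/api/state` path below is an assumption (the exact route name is not shown here); adjust it to the route your Gateway actually mounts:

```python
import json
import urllib.request

# Gateway base URL, matching NEXT_PUBLIC_API_URL from .env
GATEWAY = "http://localhost:8000"

def fetch_state(base_url: str = GATEWAY) -> dict:
    """Fetch the cached task/document snapshot from the Gateway.

    NOTE: the "/api/state" path is an assumption, not a documented route.
    """
    with urllib.request.urlopen(f"{base_url}/api/state") as resp:
        return json.load(resp)

def running_tasks(state: dict) -> list:
    """Return the IDs of tasks currently in the 'running' status."""
    return [tid for tid, task in state.get("tasks", {}).items()
            if task.get("status") == "running"]
```

The helper `running_tasks` works on any snapshot shaped like the response above, so it can also be reused on payloads received over Socket.IO.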
- Description: Creates a new root task for the agent to begin working on.
- Method: `POST`
- Request Body:

```json
{
  "desc": "The high-level objective for the agent.",
  "mcp_servers": [] // Optional: configuration for MCP servers
}
```

- Response (200 OK):

```json
{ "status": "new task submitted" }
```
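A root task can also be created from a script. This is a minimal sketch using only the standard library; the `/api/tasks` path is a guess (only the request body is documented above), so verify it against your Gateway's routes:

```python
import json
import urllib.request

GATEWAY = "http://localhost:8000"  # matches NEXT_PUBLIC_API_URL in .env

def new_task_request(desc: str, mcp_servers=None) -> dict:
    """Build the JSON body for task creation (fields from the schema above)."""
    return {"desc": desc, "mcp_servers": mcp_servers or []}

def create_task(desc: str, mcp_servers=None, base_url: str = GATEWAY) -> dict:
    # NOTE: the "/api/tasks" path is an assumption, not a documented route.
    req = urllib.request.Request(
        f"{base_url}/api/tasks",
        data=json.dumps(new_task_request(desc, mcp_servers)).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)  # expected: {"status": "new task submitted"}
```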
- Description: Sends a specific control command (a "directive") to a running task.
- Method: `POST`
- URL Parameter: `task_id` (the ID of the task to control)
- Request Body:

```json
{
  "command": "DIRECTIVE_NAME",
  "payload": "..." // Value depends on the command
}
```

- Response (200 OK):

```json
{ "status": "directive sent" }
```
Directives are the core mechanism for interacting with the agent. They are sent to the `/api/tasks/{task_id}/directive` endpoint.
| Command | Payload | Description |
|---|---|---|
| `PAUSE` | `null` | Pauses the execution of the specified task. The task's state is preserved. |
| `RESUME` | `null` | Resumes a task that was previously paused by a human operator. |
| `ANSWER_QUESTION` | `string` | Provides an answer to a question the agent asked when it entered the `waiting_for_user_response` state. The agent will consume the answer and resume. |
| `REDIRECT` | `string` | Provides a new instruction or clarification to the task. This forces the agent to re-evaluate its approach. |
| `MANUAL_OVERRIDE` | `string` | Forces a task to be marked as complete and sets its result to the provided payload. This is useful for manually completing a step. |
| `CANCEL` | `string` (optional reason) | Marks a task as cancelled. The task is not deleted but is removed from the active execution flow. |
| `PRUNE_TASK` | `string` (optional reason) | Permanently removes a task and all its children from the graph. This is a destructive action used to clean up the task tree. |
| `TERMINATE` | `string` (optional reason) | Forcibly marks a task as failed. |
The Gateway broadcasts state changes from the agent worker via Socket.IO, allowing a UI or other clients to update in real-time.
- Event: `task_update`
- Description: Emitted whenever a task's state changes. This is the primary event for monitoring the system.
- Payload: A JSON object containing the complete, updated state of a single task.
Clients should listen for this event to receive the latest state of any task that has been modified by the agent worker.
```json
{
  "task": {
    "id": "TASK_ID",
    "desc": "Task description",
    "status": "running",
    "result": null,
    "children": ["CHILD_ID_1"]
    // ... and all other fields from the Task model
  }
}
```
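As a sketch of a minimal programmatic listener, the `task_update` stream can be consumed with the `python-socketio` client library (an assumption on our part; the Mission Control UI uses the JavaScript client). The Gateway URL matches `NEXT_PUBLIC_API_URL` from `.env`:

```python
def summarize(task: dict) -> str:
    """Render a one-line summary of a task from a task_update payload."""
    return f"{task.get('id')}: {task.get('status')}"

def main():
    # python-socketio is assumed installed: pip install "python-socketio[client]"
    import socketio

    sio = socketio.Client()

    @sio.on("task_update")
    def on_task_update(data):
        # Each event carries the full updated state of one task.
        print(summarize(data.get("task", {})))

    sio.connect("http://localhost:8000")
    sio.wait()  # block and keep receiving updates

if __name__ == "__main__":
    main()
```

Listening for `task_update` and re-rendering the affected task is exactly the pattern the UI follows, so a custom dashboard or logger needs nothing more than this event.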
Note: This command-line interface is intended for simple, non-interactive tests and demonstrations of the underlying agent library. For the full experience with real-time monitoring, human-in-the-loop control, and DAG visualization, please use the Interactive Mode, which is the recommended way to run LLM Agent X.
The `llm-agent-x` CLI provides a way to run the `RecursiveAgent` or a non-interactive `DAGAgent` directly from your terminal for one-off tasks.
```bash
llm-agent-x <agent_type> "Your task description" [options]
```

| Argument | Description |
|---|---|
| `agent_type` | (Positional) The type of agent to run. Choices: `recursive`, `dag`. |
| `task` | (Positional) The main objective for the agent to execute. |
| Argument | Description | Default Value |
|---|---|---|
| `--model` | The name of the LLM to use (e.g., `gpt-4o-mini`). | `gpt-4o-mini` |
| `--output` | Path to save the final response. | None (prints to console) |
| `--enable-python-execution` | Enable the `exec_python` tool. Requires the Sandbox to be running. | Disabled |
- Basic Research Task:

```bash
llm-agent-x recursive "Research the history of artificial intelligence, focusing on key breakthroughs."
```

- Task with Custom Decomposition Limits:

```bash
llm-agent-x recursive "Write a Python script to fetch today's weather." --task_limit "[2,1,0]" --enable-python-execution
```

- Analysis Task with Initial Documents:

First, create a `documents.json` file:

```json
[
  {"name": "Q2_Revenue", "content": "Q2 2024 Revenue was $50M."},
  {"name": "Q2_Sales", "content": "The new product line accounted for 80% of sales growth."}
]
```

Then, run the `dag` agent:

```bash
llm-agent-x dag "Analyze the Q2 financial performance and create a summary." --dag-documents documents.json --output q2_summary.md
```
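If you prefer to generate `documents.json` programmatically rather than writing it by hand, a few lines of Python will produce the same file used by the `--dag-documents` flag above:

```python
import json

# The same documents shown in the example above.
docs = [
    {"name": "Q2_Revenue", "content": "Q2 2024 Revenue was $50M."},
    {"name": "Q2_Sales", "content": "The new product line accounted for 80% of sales growth."},
]

# Write them in the name/content shape the CLI expects.
with open("documents.json", "w") as f:
    json.dump(docs, f, indent=2)
```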
These examples demonstrate how to use the LLM Agent X application through its primary interface, the Mission Control UI. For instructions on how to run the application, see the Running the Application guide.
This is the most straightforward use case, where the agent must use its tools (like web search) to find information and complete an objective.
- Open the UI: Navigate to `http://localhost:3000` in your web browser.
- Locate the Form: Find the "Launch New Task" form in the bottom-left panel.
- Enter Objective: In the input field, type your high-level goal. For example: `Investigate the supply chain of cobalt, its primary uses, and the ethical implications associated with its mining.`
- Launch: Click the "Send" button (paper airplane icon).
- Observe:
  - A new root task will appear in the "Task Swarm" list and in the "DAG View".
  - The agent will enter a `planning` state, breaking the objective into smaller sub-tasks (e.g., "Research cobalt supply chain," "Identify ethical concerns").
  - These sub-tasks will appear as nodes in the graph, connected to the root task.
  - The agent will then begin executing these tasks, using its web search tool to gather information. You can watch the status of each node change from `pending` to `running` to `complete`.
In this scenario, you provide the agent with all the necessary information upfront, and its job is to analyze and synthesize that data.
- Open the Document Manager: In the UI header, click the "Manage Documents" button (file icon). This will open a side panel.
- Create a New Document: Click the "New Document" button.
- Add Financial Data:
  - Name: `Q2_Revenue_Report`
  - Content: `Q2 2024 Revenue: $50M, a 15% increase year-over-year. Net Profit: $10M.`
  - Click "Save Document".
- Add Another Document:
  - Click the "Back" arrow, then "New Document" again.
  - Name: `Q2_Sales_Analysis`
  - Content: `The new "QuantumLeap" product line, launched in Q2, accounted for 80% of our sales growth and has a customer satisfaction score of 95%.`
  - Click "Save Document".
- Close the Drawer: Click the "X" or press `Esc`.
- Launch the Analysis Task: In the "Launch New Task" form, enter: `Analyze the Q2 financial and sales performance and create a concise summary for the leadership team.`
- Observe: The agent will create a plan that directly references the documents you provided. It will create tasks like "Synthesize Q2 reports" that have dependencies on `Q2_Revenue_Report` and `Q2_Sales_Analysis` instead of using web search.
This example shows how to respond when an agent requires human input to proceed.
- Launch a Vague Task: Start a task that is intentionally ambiguous. For example: `Prepare a marketing brief.`
- Wait for the Question: The agent may determine it lacks critical information. The root task's status will change to `Question` (orange), and it will pause.
- Inspect the Task: Click on the paused task in the Task List or DAG View.
- Read the Question: In the "Task Inspector" panel on the right, you will see a highlighted section under "Agent's Question," such as: `Priority 8/10: What product is the marketing brief for, and who is the target audience?`
- Provide an Answer: In the input box below the question, type your response: `The brief is for the "QuantumLeap" product line. The target audience is enterprise-level CTOs.`
- Submit and Observe: Click "Submit Answer". The agent will consume your response, and the task status will change back to `running` as it continues its work with the new information.
LLM Agent X is an interactive, multi-agent framework for performing complex tasks with real-time human supervision. It uses a message-driven architecture to coordinate between a user interface, a gateway, and one or more agent workers.
This documentation provides a comprehensive guide to running the application and interacting with the agent system.
- Interactive DAG Agent: The core of the system is a persistent agent that models tasks as a graph, allowing for adaptive planning and execution.
- Real-time UI: A web-based "Mission Control" for launching tasks, visualizing the task graph, and providing real-time guidance.
- Dockerized Environment: The entire application stack is containerized for easy setup and deployment with Docker Compose.
- Human-in-the-Loop Control: Operators can pause, resume, cancel, and redirect tasks, or answer questions posed by the agent.
- REST & Socket.IO API: A gateway provides programmatic access for control and real-time state monitoring.
- Running the Application: The primary guide to set up and run the entire LLM Agent X stack using Docker.
- Interactive Mode: Learn about the system's architecture and how to control the agent via the UI and API.
- Gateway API Reference: Detailed documentation for the REST and Socket.IO API.
- Python Sandbox: Learn about the optional sandbox for safe code execution.
- Usage Examples: See practical examples of how to use the Mission Control UI.
This project is licensed under the MIT License. See the LICENSE file for details.
This guide explains how to run the complete LLM Agent X application stack—including the Mission Control UI, the API Gateway, and the Agent Worker—using Docker Compose.
- Docker and Docker Compose (usually included with Docker Desktop).
- Git for cloning the repository.
- An OpenAI API Key.
- A Brave Search API Key (for the web search tool).
Open your terminal and clone the project repository:
```bash
git clone https://github.com/cvaz1306/llm_agent_x.git
cd llm_agent_x
```

The project uses a `.env` file to manage secret keys and configuration. Create one by copying the example file:

```bash
cp .env.example .env
```

Now, open the newly created `.env` file in a text editor and add your API keys. It should look like this:
```bash
# Required: Add your API keys here
OPENAI_API_KEY="sk-..."
BRAVE_API_KEY="..."

# --- Pre-configured for Docker ---
# These variables are already set up for the Docker environment.
# You generally do not need to change them.
RABBITMQ_HOST=rabbitmq
GATEWAY_RELOAD=false
NEXT_PUBLIC_API_URL=http://localhost:8000
```

- `OPENAI_API_KEY`: Essential for the agent worker to use language models.
- `BRAVE_API_KEY`: Required for the `brave_web_search` tool.
From the root directory of the project, run the following command:
```bash
docker-compose up --build
```

This command will:

- Build the Docker images for all services (`ui`, `gateway`, `worker`, `sandbox`) if they don't already exist.
- Create and start containers for all services, including the RabbitMQ message broker.
- Network the containers so they can communicate with each other.
- Stream logs from all containers to your terminal, so you can see what's happening.
The first time you run this, it may take a few minutes to download the base images and build everything. Subsequent launches will be much faster.
Once all the services have started (you'll see log output from the `ui`, `gateway`, and `worker`), open your web browser and navigate to http://localhost:3000.
You should see the Mission Control UI, ready to accept new tasks.
To stop all the running services, press `Ctrl+C` in the terminal where `docker-compose` is running.
LLM Agent X is designed to run as a persistent, interactive service. This mode is ideal for complex, long-running tasks that may require human supervision, intervention, or dynamic goal changes. The system is composed of several containerized services that work together, with a web UI for operator control.
The interactive mode operates on a distributed, microservice-style architecture:
-
Gateway Server:
- A FastAPI web server that acts as the primary entry point for all user interactions.
- Exposes a REST API for creating tasks and sending control directives.
- Hosts a Socket.IO server to broadcast real-time state updates to connected clients (like the Mission Control UI).
- Communicates with the agent worker by publishing directives to a RabbitMQ message queue.
-
Agent Worker:
- A long-running process that runs the `InteractiveDAGAgent`.
- Listens for directives from the Gateway on a RabbitMQ queue.
- Executes the agent's lifecycle, including planning, task execution, and state changes.
- Publishes comprehensive state updates back to a RabbitMQ exchange, which are then picked up by the Gateway and broadcast to all connected clients.
-
Mission Control UI:
- A Next.js web application that provides the main user interface.
- Connects to the Gateway's API and Socket.IO server to display the task graph and allow the operator to send commands.
-
RabbitMQ:
- The message broker that decouples the Gateway and Worker, ensuring reliable communication between them.
This separation ensures that the core agent logic is decoupled from the web interface, allowing for robust, scalable, and resilient operation.
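The Gateway-to-Worker handoff described above follows the standard RabbitMQ consume pattern. The sketch below uses the `pika` client library; the queue name, host, and message shape here are illustrative assumptions, not the project's actual wire format:

```python
import json

def handle_directive(body: bytes) -> dict:
    """Decode a directive message as the worker might receive it.

    The {"command", "payload"} shape mirrors the Gateway's directive API;
    the actual internal message format may differ.
    """
    msg = json.loads(body)
    return {"command": msg.get("command"), "payload": msg.get("payload")}

def consume(queue: str = "agent_directives"):  # queue name is illustrative
    # pika is assumed installed: pip install pika
    import pika

    conn = pika.BlockingConnection(pika.ConnectionParameters(host="rabbitmq"))
    channel = conn.channel()
    channel.queue_declare(queue=queue, durable=True)

    def on_message(ch, method, properties, body):
        directive = handle_directive(body)
        print(f"received {directive['command']}")
        ch.basic_ack(delivery_tag=method.delivery_tag)

    channel.basic_consume(queue=queue, on_message_callback=on_message)
    channel.start_consuming()  # block until interrupted
```

Because the broker persists queued directives, the worker can be restarted without losing commands sent while it was down; this is the resilience the architecture section refers to.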
The entire application stack is managed via Docker Compose. To run the system in interactive mode, please follow the main guide:
The guide covers cloning the repository, setting up your environment variables, and launching all services with a single command.
Once the services are running, open a web browser and navigate to the Mission Control UI, which is typically available at http://localhost:3000.
The primary way to interact with the agent is through the Mission Control UI, which uses the Gateway API behind the scenes. You can:
- Launch New Tasks: Use the form to define a high-level objective.
- Monitor Progress: Watch the DAG View update in real-time as the agent creates and executes tasks.
- Inspect Tasks: Click on any task node to see its details, dependencies, and results in the inspector pane.
- Issue Directives: Use the command palette in the inspector to pause, cancel, redirect, or manually complete tasks.
- Answer Questions: If an agent needs clarification, the task will pause, and an input box will appear in the inspector for you to provide an answer.
For detailed information on the programmatic interface, see the Gateway API Reference.