LLM Agent X is an interactive, multi-agent framework designed for executing complex tasks with real-time human supervision. It leverages a powerful InteractiveDAGAgent that models tasks as a Directed Acyclic Graph (DAG), enabling sophisticated, non-linear workflows.
The entire system is containerized with Docker and orchestrated through a message queue, providing a robust, scalable architecture. A web-based "Mission Control" UI allows operators to launch tasks, monitor progress, inspect results, and provide real-time guidance to the agent swarm.
⚠️ Security Warning: This project is a research demonstration. It can be configured to execute arbitrary code generated by a language model, which is inherently dangerous. Use this in a sandboxed environment and only with trusted inputs.
- Interactive DAG Agent: A persistent agent that models tasks as a graph, allowing for complex dependencies and adaptive planning.
- Real-time Mission Control UI: A Next.js frontend for launching, monitoring, and controlling agent execution in real-time.
- Message Queue Architecture: Decoupled Gateway and Worker components communicate via RabbitMQ for resilience and scalability.
- Fully Dockerized: The entire stack (UI, Gateway, Worker, and RabbitMQ) is managed with a single `docker-compose` command.
- Human-in-the-Loop: Operators can pause, resume, cancel, and redirect tasks, or answer questions posed by the agent.
- Extensible Tool Use: Agents can be equipped with tools like web search (`brave_web_search`) and sandboxed code execution (`exec_python`).
```bash
git clone https://github.com/cvaz1306/llm_agent_x.git
cd llm_agent_x
```

Create a `.env` file in the root of the project by copying the example:

```bash
cp .env.example .env
```

Now, edit the `.env` file and add your required API keys:
```bash
# Required for the agent worker to function
OPENAI_API_KEY="your_openai_api_key"
BRAVE_API_KEY="your_brave_search_api_key"
# Other variables are pre-configured for Docker Compose
```

Build and launch all services using Docker Compose:
```bash
docker-compose up --build
```

This command will:
- Build the Docker images for the UI, gateway, worker, and sandbox.
- Start all the necessary services.
- Begin streaming logs from all containers to your terminal.
Once the services are running, open your web browser and navigate to http://localhost:3000.
You can now use the Mission Control UI to launch and monitor your agents.
While the primary interface is the interactive UI, the underlying llm-agent-x Python package can be used directly for custom integrations. See the API Reference for details on the Gateway API and the source code for direct agent usage.
- Running the Application: The primary guide to get the full application running with Docker.
- Interactive Mode: An in-depth look at the architecture and how to control the agent.
- Gateway API Reference: Detailed documentation for the REST and Socket.IO API.
- Python Sandbox: Information on the code execution sandbox.
This project is licensed under the MIT License.
When running in interactive mode, LLM Agent X exposes a Gateway service. This service provides a REST API for high-level control and a Socket.IO endpoint for receiving real-time state updates. This is the primary programmatic interface for the application.
The Gateway's REST API is the entry point for creating tasks and sending commands (directives) to the agent swarm.
- Description: Retrieves a snapshot of the current state of all tasks and documents known to the Gateway. The state is cached and updated in real-time via the worker's broadcasts.
- Method: `GET`
- Response (200 OK):

```json
{
  "tasks": {
    "TASK_ID_1": {
      "id": "TASK_ID_1",
      "desc": "Description of the task...",
      "status": "running",
      "parent": "ROOT_TASK",
      "children": ["CHILD_ID_1"],
      "deps": ["dep_id_1"],
      "result": null
    },
    "DOC_ID_1": {
      "id": "DOC_ID_1",
      "desc": "Document: My Report",
      "status": "complete",
      "task_type": "document",
      "document_state": {
        "content": "The content of the document...",
        "version": 1
      }
    }
  }
}
```
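As an illustration, the snapshot can be fetched and filtered with only the Python standard library. The `/api/state` path below is an assumption (the exact route name is not shown here); adjust it to the route your Gateway actually mounts:

```python
import json
import urllib.request

# Gateway base URL, matching NEXT_PUBLIC_API_URL from .env
GATEWAY = "http://localhost:8000"

def fetch_state(base_url: str = GATEWAY) -> dict:
    """Fetch the cached task/document snapshot from the Gateway.

    NOTE: the "/api/state" path is an assumption, not a documented route.
    """
    with urllib.request.urlopen(f"{base_url}/api/state") as resp:
        return json.load(resp)

def running_tasks(state: dict) -> list:
    """Return the IDs of tasks currently in the 'running' status."""
    return [tid for tid, task in state.get("tasks", {}).items()
            if task.get("status") == "running"]
```

The helper `running_tasks` works on any snapshot shaped like the response above, so it can also be reused on payloads received over Socket.IO.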
- Description: Creates a new root task for the agent to begin working on.
- Method: `POST`
- Request Body:

```json
{
  "desc": "The high-level objective for the agent.",
  "mcp_servers": [] // Optional: configuration for MCP servers
}
```

- Response (200 OK):

```json
{ "status": "new task submitted" }
```
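A root task can also be created from a script. This is a minimal sketch using only the standard library; the `/api/tasks` path is a guess (only the request body is documented above), so verify it against your Gateway's routes:

```python
import json
import urllib.request

GATEWAY = "http://localhost:8000"  # matches NEXT_PUBLIC_API_URL in .env

def new_task_request(desc: str, mcp_servers=None) -> dict:
    """Build the JSON body for task creation (fields from the schema above)."""
    return {"desc": desc, "mcp_servers": mcp_servers or []}

def create_task(desc: str, mcp_servers=None, base_url: str = GATEWAY) -> dict:
    # NOTE: the "/api/tasks" path is an assumption, not a documented route.
    req = urllib.request.Request(
        f"{base_url}/api/tasks",
        data=json.dumps(new_task_request(desc, mcp_servers)).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)  # expected: {"status": "new task submitted"}
```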
- Description: Sends a specific control command (a "directive") to a running task.
- Method: `POST`
- URL Parameter: `task_id` (the ID of the task to control)
- Request Body:

```json
{
  "command": "DIRECTIVE_NAME",
  "payload": "..." // Value depends on the command
}
```

- Response (200 OK):

```json
{ "status": "directive sent" }
```
Directives are the core mechanism for interacting with the agent. They are sent to the `/api/tasks/{task_id}/directive` endpoint.
| Command | Payload | Description |
|---|---|---|
| `PAUSE` | `null` | Pauses the execution of the specified task. The task's state is preserved. |
| `RESUME` | `null` | Resumes a task that was previously paused by a human operator. |
| `ANSWER_QUESTION` | `string` | Provides an answer to a question the agent asked when it entered the `waiting_for_user_response` state. The agent will consume the answer and resume. |
| `REDIRECT` | `string` | Provides a new instruction or clarification to the task. This forces the agent to re-evaluate its approach. |
| `MANUAL_OVERRIDE` | `string` | Forces a task to be marked as complete and sets its result to the provided payload. This is useful for manually completing a step. |
| `CANCEL` | `string` (optional reason) | Marks a task as cancelled. The task is not deleted but is removed from the active execution flow. |
| `PRUNE_TASK` | `string` (optional reason) | Permanently removes a task and all its children from the graph. This is a destructive action used to clean up the task tree. |
| `TERMINATE` | `string` (optional reason) | Forcibly marks a task as failed. |
The Gateway broadcasts state changes from the agent worker via Socket.IO, allowing a UI or other clients to update in real-time.
- Event: `task_update`
- Description: Emitted whenever a task's state changes. This is the primary event for monitoring the system.
- Payload: A JSON object containing the complete, updated state of a single task.
Clients should listen for this event to receive the latest state of any task that has been modified by the agent worker.
```json
{
  "task": {
    "id": "TASK_ID",
    "desc": "Task description",
    "status": "running",
    "result": null,
    "children": ["CHILD_ID_1"]
    // ... and all other fields from the Task model
  }
}
```
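As a sketch of a minimal programmatic listener, the `task_update` stream can be consumed with the `python-socketio` client library (an assumption on our part; the Mission Control UI uses the JavaScript client). The Gateway URL matches `NEXT_PUBLIC_API_URL` from `.env`:

```python
def summarize(task: dict) -> str:
    """Render a one-line summary of a task from a task_update payload."""
    return f"{task.get('id')}: {task.get('status')}"

def main():
    # python-socketio is assumed installed: pip install "python-socketio[client]"
    import socketio

    sio = socketio.Client()

    @sio.on("task_update")
    def on_task_update(data):
        # Each event carries the full updated state of one task.
        print(summarize(data.get("task", {})))

    sio.connect("http://localhost:8000")
    sio.wait()  # block and keep receiving updates

if __name__ == "__main__":
    main()
```

Listening for `task_update` and re-rendering the affected task is exactly the pattern the UI follows, so a custom dashboard or logger needs nothing more than this event.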
Note: This command-line interface is intended for simple, non-interactive tests and demonstrations of the underlying agent library. For the full experience with real-time monitoring, human-in-the-loop control, and DAG visualization, please use the Interactive Mode, which is the recommended way to run LLM Agent X.
The `llm-agent-x` CLI provides a way to run the `RecursiveAgent` or a non-interactive `DAGAgent` directly from your terminal for one-off tasks.
```bash
llm-agent-x <agent_type> "Your task description" [options]
```

| Argument | Description |
|---|---|
| `agent_type` | (Positional) The type of agent to run. Choices: `recursive`, `dag`. |
| `task` | (Positional) The main objective for the agent to execute. |
| Argument | Description | Default Value |
|---|---|---|
| `--model` | The name of the LLM to use (e.g., `gpt-4o-mini`). | `gpt-4o-mini` |
| `--output` | Path to save the final response. | None (prints to console) |
| `--enable-python-execution` | Enable the `exec_python` tool. Requires the Sandbox to be running. | Disabled |
- Basic Research Task:

```bash
llm-agent-x recursive "Research the history of artificial intelligence, focusing on key breakthroughs."
```

- Task with Custom Decomposition Limits:

```bash
llm-agent-x recursive "Write a Python script to fetch today's weather." --task_limit "[2,1,0]" --enable-python-execution
```

- Analysis Task with Initial Documents:

First, create a `documents.json` file:

```json
[
  {"name": "Q2_Revenue", "content": "Q2 2024 Revenue was $50M."},
  {"name": "Q2_Sales", "content": "The new product line accounted for 80% of sales growth."}
]
```

Then, run the `dag` agent:

```bash
llm-agent-x dag "Analyze the Q2 financial performance and create a summary." --dag-documents documents.json --output q2_summary.md
```
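If you prefer to generate `documents.json` programmatically rather than writing it by hand, a few lines of Python will produce the same file used by the `--dag-documents` flag above:

```python
import json

# The same documents shown in the example above.
docs = [
    {"name": "Q2_Revenue", "content": "Q2 2024 Revenue was $50M."},
    {"name": "Q2_Sales", "content": "The new product line accounted for 80% of sales growth."},
]

# Write them in the name/content shape the CLI expects.
with open("documents.json", "w") as f:
    json.dump(docs, f, indent=2)
```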
These examples demonstrate how to use the LLM Agent X application through its primary interface, the Mission Control UI. For instructions on how to run the application, see the Running the Application guide.
This is the most straightforward use case, where the agent must use its tools (like web search) to find information and complete an objective.
- Open the UI: Navigate to `http://localhost:3000` in your web browser.
- Locate the Form: Find the "Launch New Task" form in the bottom-left panel.
- Enter Objective: In the input field, type your high-level goal. For example: `Investigate the supply chain of cobalt, its primary uses, and the ethical implications associated with its mining.`
- Launch: Click the "Send" button (paper airplane icon).
- Observe:
  - A new root task will appear in the "Task Swarm" list and in the "DAG View".
  - The agent will enter a `planning` state, breaking the objective into smaller sub-tasks (e.g., "Research cobalt supply chain," "Identify ethical concerns").
  - These sub-tasks will appear as nodes in the graph, connected to the root task.
  - The agent will then begin executing these tasks, using its web search tool to gather information. You can watch the status of each node change from `pending` to `running` to `complete`.
In this scenario, you provide the agent with all the necessary information upfront, and its job is to analyze and synthesize that data.
- Open the Document Manager: In the UI header, click the "Manage Documents" button (file icon). This will open a side panel.
- Create a New Document: Click the "New Document" button.
- Add Financial Data:
  - Name: `Q2_Revenue_Report`
  - Content: `Q2 2024 Revenue: $50M, a 15% increase year-over-year. Net Profit: $10M.`
  - Click "Save Document".
- Add Another Document:
  - Click the "Back" arrow, then "New Document" again.
  - Name: `Q2_Sales_Analysis`
  - Content: `The new "QuantumLeap" product line, launched in Q2, accounted for 80% of our sales growth and has a customer satisfaction score of 95%.`
  - Click "Save Document".
- Close the Drawer: Click the "X" or press `Esc`.
- Launch the Analysis Task: In the "Launch New Task" form, enter: `Analyze the Q2 financial and sales performance and create a concise summary for the leadership team.`
- Observe: The agent will create a plan that directly references the documents you provided. It will create tasks like "Synthesize Q2 reports" that have dependencies on `Q2_Revenue_Report` and `Q2_Sales_Analysis` instead of using web search.
This example shows how to respond when an agent requires human input to proceed.
- Launch a Vague Task: Start a task that is intentionally ambiguous. For example: `Prepare a marketing brief.`
- Wait for the Question: The agent may determine it lacks critical information. The root task's status will change to `Question` (orange), and it will pause.
- Inspect the Task: Click on the paused task in the Task List or DAG View.
- Read the Question: In the "Task Inspector" panel on the right, you will see a highlighted section under "Agent's Question," such as: `Priority 8/10: What product is the marketing brief for, and who is the target audience?`
- Provide an Answer: In the input box below the question, type your response: `The brief is for the "QuantumLeap" product line. The target audience is enterprise-level CTOs.`
- Submit and Observe: Click "Submit Answer". The agent will consume your response, and the task status will change back to `running` as it continues its work with the new information.
LLM Agent X is an interactive, multi-agent framework for performing complex tasks with real-time human supervision. It uses a message-driven architecture to coordinate between a user interface, a gateway, and one or more agent workers.
This documentation provides a comprehensive guide to running the application and interacting with the agent system.
- Interactive DAG Agent: The core of the system is a persistent agent that models tasks as a graph, allowing for adaptive planning and execution.
- Real-time UI: A web-based "Mission Control" for launching tasks, visualizing the task graph, and providing real-time guidance.
- Dockerized Environment: The entire application stack is containerized for easy setup and deployment with Docker Compose.
- Human-in-the-Loop Control: Operators can pause, resume, cancel, and redirect tasks, or answer questions posed by the agent.
- REST & Socket.IO API: A gateway provides programmatic access for control and real-time state monitoring.
- Running the Application: The primary guide to set up and run the entire LLM Agent X stack using Docker.
- Interactive Mode: Learn about the system's architecture and how to control the agent via the UI and API.
- Gateway API Reference: Detailed documentation for the REST and Socket.IO API.
- Python Sandbox: Learn about the optional sandbox for safe code execution.
- Usage Examples: See practical examples of how to use the Mission Control UI.
This project is licensed under the MIT License. See the LICENSE file for details.
This guide explains how to run the complete LLM Agent X application stack—including the Mission Control UI, the API Gateway, and the Agent Worker—using Docker Compose.
- Docker and Docker Compose (usually included with Docker Desktop).
- Git for cloning the repository.
- An OpenAI API Key.
- A Brave Search API Key (for the web search tool).
Open your terminal and clone the project repository:
```bash
git clone https://github.com/cvaz1306/llm_agent_x.git
cd llm_agent_x
```

The project uses a `.env` file to manage secret keys and configuration. Create one by copying the example file:

```bash
cp .env.example .env
```

Now, open the newly created `.env` file in a text editor and add your API keys. It should look like this:
```bash
# Required: Add your API keys here
OPENAI_API_KEY="sk-..."
BRAVE_API_KEY="..."

# --- Pre-configured for Docker ---
# These variables are already set up for the Docker environment.
# You generally do not need to change them.
RABBITMQ_HOST=rabbitmq
GATEWAY_RELOAD=false
NEXT_PUBLIC_API_URL=http://localhost:8000
```

- `OPENAI_API_KEY`: Essential for the agent worker to use language models.
- `BRAVE_API_KEY`: Required for the `brave_web_search` tool.
From the root directory of the project, run the following command:
```bash
docker-compose up --build
```

This command will:

- Build the Docker images for all services (`ui`, `gateway`, `worker`, `sandbox`) if they don't already exist.
- Create and start containers for all services, including the RabbitMQ message broker.
- Network the containers so they can communicate with each other.
- Stream logs from all containers to your terminal, so you can see what's happening.
The first time you run this, it may take a few minutes to download the base images and build everything. Subsequent launches will be much faster.
Once all the services have started (you'll see log output from the `ui`, `gateway`, and `worker`), open your web browser and navigate to http://localhost:3000.
You should see the Mission Control UI, ready to accept new tasks.
To stop all the running services, press `Ctrl+C` in the terminal where `docker-compose` is running.
LLM Agent X is designed to run as a persistent, interactive service. This mode is ideal for complex, long-running tasks that may require human supervision, intervention, or dynamic goal changes. The system is composed of several containerized services that work together, with a web UI for operator control.
The interactive mode operates on a distributed, microservice-style architecture:
-
Gateway Server:
- A FastAPI web server that acts as the primary entry point for all user interactions.
- Exposes a REST API for creating tasks and sending control directives.
- Hosts a Socket.IO server to broadcast real-time state updates to connected clients (like the Mission Control UI).
- Communicates with the agent worker by publishing directives to a RabbitMQ message queue.
-
Agent Worker:
- A long-running process that runs the `InteractiveDAGAgent`.
- Listens for directives from the Gateway on a RabbitMQ queue.
- Executes the agent's lifecycle, including planning, task execution, and state changes.
- Publishes comprehensive state updates back to a RabbitMQ exchange, which are then picked up by the Gateway and broadcast to all connected clients.
-
Mission Control UI:
- A Next.js web application that provides the main user interface.
- Connects to the Gateway's API and Socket.IO server to display the task graph and allow the operator to send commands.
-
RabbitMQ:
- The message broker that decouples the Gateway and Worker, ensuring reliable communication between them.
This separation ensures that the core agent logic is decoupled from the web interface, allowing for robust, scalable, and resilient operation.
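The Gateway-to-Worker handoff described above follows the standard RabbitMQ consume pattern. The sketch below uses the `pika` client library; the queue name, host, and message shape here are illustrative assumptions, not the project's actual wire format:

```python
import json

def handle_directive(body: bytes) -> dict:
    """Decode a directive message as the worker might receive it.

    The {"command", "payload"} shape mirrors the Gateway's directive API;
    the actual internal message format may differ.
    """
    msg = json.loads(body)
    return {"command": msg.get("command"), "payload": msg.get("payload")}

def consume(queue: str = "agent_directives"):  # queue name is illustrative
    # pika is assumed installed: pip install pika
    import pika

    conn = pika.BlockingConnection(pika.ConnectionParameters(host="rabbitmq"))
    channel = conn.channel()
    channel.queue_declare(queue=queue, durable=True)

    def on_message(ch, method, properties, body):
        directive = handle_directive(body)
        print(f"received {directive['command']}")
        ch.basic_ack(delivery_tag=method.delivery_tag)

    channel.basic_consume(queue=queue, on_message_callback=on_message)
    channel.start_consuming()  # block until interrupted
```

Because the broker persists queued directives, the worker can be restarted without losing commands sent while it was down; this is the resilience the architecture section refers to.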
The entire application stack is managed via Docker Compose. To run the system in interactive mode, please follow the main guide:
The guide covers cloning the repository, setting up your environment variables, and launching all services with a single command.
Once the services are running, open a web browser and navigate to the Mission Control UI, which is typically available at http://localhost:3000.
The primary way to interact with the agent is through the Mission Control UI, which uses the Gateway API behind the scenes. You can:
- Launch New Tasks: Use the form to define a high-level objective.
- Monitor Progress: Watch the DAG View update in real-time as the agent creates and executes tasks.
- Inspect Tasks: Click on any task node to see its details, dependencies, and results in the inspector pane.
- Issue Directives: Use the command palette in the inspector to pause, cancel, redirect, or manually complete tasks.
- Answer Questions: If an agent needs clarification, the task will pause, and an input box will appear in the inspector for you to provide an answer.
For detailed information on the programmatic interface, see the Gateway API Reference.