A Model Context Protocol (MCP) server that gives AI agents direct access to the host operating system shell. It exposes four tools over the MCP stdio transport so any MCP-compatible client (LangChain, Claude Desktop, etc.) can execute commands, manage background processes, and detect the host OS, all through a standardised interface.
This project was born from hands-on experience. While building my own coding agent, I realized that shell access is one of the most critical capabilities an agent can have, and that getting it right is harder than it looks.
The key challenges I spent a lot of time solving:
- Foreground vs. background execution: some commands need to run and return immediately, others (like dev servers or long builds) need to keep running in the background while the agent continues working.
- Timeouts and process lifecycle: knowing when to wait, when to kill, and how to gracefully handle processes that hang or crash early.
- Reading output from live processes: giving the agent a way to check on background work without blocking.
These are problems every coding agent builder will face. The tool names and functionality here (`execut_command`, `read_output`, `kill_process`, `get_system`) represent what most real-world coding agents need. In your own project you'll likely have tools with the same purpose, adapted to your stack and language, but the architecture and the patterns stay the same.
Think of this project as a reference implementation. A working example of how to structure shell tools for a coding agent, how to handle the tricky parts (background processes, timeouts, graceful termination), and how to expose them over MCP so any client can use them. It's not a framework; it's a blueprint you can learn from and adapt.
| Layer | Technology | Role |
|---|---|---|
| MCP Framework | FastMCP (`mcp[cli]`) | Provides the MCP server skeleton, tool registration via `@mcp.tool()` decorators, and the stdio transport. This is the core that makes the tools discoverable by any MCP client. |
| Python | 3.13+ | The entire server is pure Python with no external runtime dependencies beyond the standard library and FastMCP. |
| asyncio | stdlib | Used for non-blocking foreground command execution (`asyncio.create_subprocess_shell`) and timeout handling (`asyncio.wait_for`). |
| subprocess | stdlib | Powers background process spawning (`subprocess.Popen`) and process inspection (`lsof`, `tasklist`). |
| os / signal | stdlib | Handles cross-platform process lifecycle: existence checks (`os.kill(pid, 0)`), graceful termination (SIGTERM), and force kill (SIGKILL). |
| platform | stdlib | OS detection so the agent knows whether it's talking to Linux, macOS, or Windows. |
| logging | stdlib | All logs go to stderr (never stdout, since MCP uses stdout for JSON-RPC communication). |
| uv | Package manager | Manages dependencies and runs the server (`uv run main.py`). |
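The asyncio row above is the heart of foreground execution. A minimal sketch of what running a foreground command with a timeout can look like, using only the stdlib calls named in the table (the `run_foreground` helper name is mine, not this codebase's API):

```python
import asyncio


async def run_foreground(command: str, timeout: float = 10.0) -> dict:
    # Spawn the command through the shell without blocking the event loop.
    proc = await asyncio.create_subprocess_shell(
        command,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    try:
        # Wait for completion, but give up after `timeout` seconds.
        stdout, stderr = await asyncio.wait_for(proc.communicate(), timeout=timeout)
    except asyncio.TimeoutError:
        proc.kill()  # the command hung: kill it and reap it so it doesn't linger
        await proc.wait()
        return {"timed_out": True, "returncode": None}
    return {
        "timed_out": False,
        "returncode": proc.returncode,
        "stdout": stdout.decode(),
        "stderr": stderr.decode(),
    }


result = asyncio.run(run_foreground("echo hello"))
print(result["stdout"].strip())  # hello
```

The same pattern generalizes: short commands return their full output, while anything that exceeds the timeout is killed instead of blocking the agent forever.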
- **One tool per file** (`tools/` directory): each MCP tool lives in its own module with its own docstring. The `@mcp.tool()` decorator auto-registers it when imported. This keeps things clean and makes it easy to add or remove tools.
- **ShellWrapper singleton**: all shell logic lives in a single class (`ShellWrapper`) instantiated once as `sw`. The tool files are thin wrappers that just call `sw.method()`. This separates the MCP layer from the actual shell logic, so you can test or reuse `ShellWrapper` independently.
- **Foreground with timeout vs. background with PID**: instead of one generic "run" function, the server explicitly separates these two modes. Foreground commands block and return output. Background commands return a PID immediately, and the agent can poll or kill later. This distinction is critical for coding agents that need to start servers, run builds, or do long tasks.
- **Graceful termination with fallback**: `kill_process` first sends SIGTERM, waits up to 1 second, then escalates to SIGKILL. This prevents zombie processes while still giving well-behaved processes a chance to clean up.
- **Logging to stderr only**: MCP communicates over stdout using JSON-RPC. Any stray `print()` would corrupt the protocol. All logging is routed to stderr via Python's `logging` module.
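The SIGTERM-then-SIGKILL escalation can be sketched in pure stdlib Python. The `kill_gracefully` helper below is illustrative, not the project's actual API; note also that a child you spawned yourself stays visible as a zombie until it is reaped, so a real implementation may additionally consult `Popen.poll()`:

```python
import os
import signal
import subprocess
import time


def kill_gracefully(pid: int, grace_period: float = 1.0) -> str:
    """Send SIGTERM, wait briefly, escalate to SIGKILL if the process survives."""
    try:
        os.kill(pid, signal.SIGTERM)  # polite request to exit
    except ProcessLookupError:
        return "already-dead"
    deadline = time.monotonic() + grace_period
    while time.monotonic() < deadline:
        try:
            os.kill(pid, 0)  # signal 0: existence check only, delivers nothing
        except ProcessLookupError:
            return "terminated"  # it exited within the grace period
        time.sleep(0.05)
    try:
        os.kill(pid, signal.SIGKILL)  # force kill after the grace period
    except ProcessLookupError:
        return "terminated"
    return "killed"


proc = subprocess.Popen(["sleep", "30"])
outcome = kill_gracefully(proc.pid)
proc.wait()  # reap the child so it doesn't remain a zombie
print(outcome)
```

The two-step escalation gives a dev server time to flush logs and close sockets before the hard kill lands.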
| Tool | Description |
|---|---|
| `get_system` | Returns the host OS name (Linux, Darwin, Windows) so the agent can pick the right shell syntax. |
| `execut_command` | Runs a shell command in the foreground (with timeout) or in the background (returns a PID). |
| `read_output` | Inspects a running background process and returns its current state and open resources. |
| `kill_process` | Gracefully terminates a background process (SIGTERM, then SIGKILL fallback). |
```
shell_mcp/
├── main.py              # Entry point, starts the MCP server on stdio
├── mcp_engine.py        # FastMCP server instance + instructions
├── shell_wrapper.py     # Core shell logic (run, background, read, kill)
├── logger.py            # Logging config (writes to stderr, not stdout)
├── tools/               # One file per MCP tool
│   ├── __init__.py
│   ├── execut_command.py
│   ├── get_system.py
│   ├── read_output.py
│   └── kill_process.py
├── pyproject.toml
└── use.py               # Example: LangChain agent using the MCP server
```
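The `logger.py` module in the tree above might look something like this sketch (the `get_logger` function name is an assumption on my part; the stderr-only constraint is the part that matters):

```python
import logging
import sys


def get_logger(name: str) -> logging.Logger:
    """All output goes to stderr: stdout is reserved for MCP JSON-RPC frames."""
    handler = logging.StreamHandler(sys.stderr)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(name)s %(levelname)s: %(message)s")
    )
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid stacking duplicate handlers on re-import
        logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger


log = get_logger("shell_mcp")
log.info("server starting")  # appears on stderr, never stdout
```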
- Python 3.13+
- uv, a fast Python package manager
```bash
# Clone the repository
git clone <your-repo-url> shell_mcp
cd shell_mcp

# Install dependencies with uv
uv sync
```

This is a complete, copy-pasteable example. It connects to the Shell MCP server over stdio, creates a LangChain agent, and runs a conversational loop.
Install the extra dependencies:

```bash
uv add langchain langchain-mcp-adapters langchain-openai langgraph-checkpoint-postgres python-dotenv
```

Put your API key in a `.env` file:

```
OPENAI_API_KEY=sk-...
```

```python
import os
import asyncio

from dotenv import load_dotenv

load_dotenv(override=True)

from langchain_mcp_adapters.client import MultiServerMCPClient
from langchain.agents import create_agent
from langchain_core.messages import HumanMessage
from langgraph.checkpoint.postgres.aio import AsyncPostgresSaver

# --- Config ---
DB_URI = "postgresql://postgres:postgres@localhost:5432/postgres"
THREAD_ID = "1"
SHELL_MCP_DIR = "/absolute/path/to/shell_mcp"  # <-- change this


async def main():
    # 1. Connect to Postgres for conversation memory
    async with AsyncPostgresSaver.from_conn_string(DB_URI) as checkpointer:
        await checkpointer.setup()

        # 2. Start the MCP server as a subprocess
        client = MultiServerMCPClient(
            {
                "shell_mcp": {
                    "transport": "stdio",
                    "command": "uv",
                    "args": ["run", "--directory", SHELL_MCP_DIR, "main.py"],
                },
            }
        )

        # 3. Fetch the tools the MCP server exposes
        tools = await client.get_tools()

        # 4. Create a LangChain agent with those tools
        agent = create_agent(
            "openai:gpt-4.1",
            tools,
            checkpointer=checkpointer,
        )
        config = {"configurable": {"thread_id": THREAD_ID}}

        # 5. Chat loop
        print("Chat with Shell MCP (type 'exit' to quit)\n")
        while True:
            user_input = input("You: ").strip()
            if user_input.lower() == "exit":
                break
            response = await agent.ainvoke(
                {"messages": [HumanMessage(content=user_input)]},
                config=config,
            )
            print(f"AI: {response.get('messages')[-1].content}\n")


if __name__ == "__main__":
    asyncio.run(main())
```

Run it:

```bash
uv run app.py
```

If you don't need persistent conversation memory, you can skip Postgres entirely:
```python
import os
import asyncio

from dotenv import load_dotenv

load_dotenv(override=True)

from langchain_mcp_adapters.client import MultiServerMCPClient
from langchain.agents import create_agent
from langchain_core.messages import HumanMessage

SHELL_MCP_DIR = "/absolute/path/to/shell_mcp"  # <-- change this


async def main():
    client = MultiServerMCPClient(
        {
            "shell_mcp": {
                "transport": "stdio",
                "command": "uv",
                "args": ["run", "--directory", SHELL_MCP_DIR, "main.py"],
            },
        }
    )
    tools = await client.get_tools()
    agent = create_agent("openai:gpt-4.1", tools)
    config = {"configurable": {"thread_id": "1"}}

    response = await agent.ainvoke(
        {"messages": [HumanMessage(content="What OS is this running on?")]},
        config=config,
    )
    print(f"AI: {response.get('messages')[-1].content}")


if __name__ == "__main__":
    asyncio.run(main())
```

Add this to your Claude Desktop MCP config (`~/Library/Application Support/Claude/claude_desktop_config.json` on macOS):
```json
{
  "mcpServers": {
    "shell_mcp": {
      "command": "uv",
      "args": ["run", "--directory", "/absolute/path/to/shell_mcp", "main.py"]
    }
  }
}
```

Restart Claude Desktop: the four shell tools will appear automatically.
Once integrated, you can ask your agent things like:
- "What operating system is this?"
- "List all files in /Users/me/projects"
- "Run npm run dev in the background in /Users/me/app, wait a minute, then show me the output"
- "Kill process 12345"
MIT