
[BUG] No mid-loop compaction for custom MCP tools; parallel tool batches skip compaction check #531

@sahasra098

Description


During a tool execution loop, the CLI does not trigger compaction for custom MCP tool results. Built-in tools (Read, Bash) do compact in sequential mode, but parallel batches of any tool type overflow context with no compaction opportunity.

This means:

  1. Custom MCP tools: Context grows unbounded regardless of execution mode (sequential or parallel). Currently succeeds only because the API context limit is high enough — but any reduction in API limits or increase in tool output volume will cause hard failures.
  2. Parallel built-in tools: All tool results land in one batch — no mid-loop compaction check — resulting in "Prompt is too long" error.

Two distinct bugs

  • Bug A — Custom MCP tools never compact: Custom MCP tool results arrive via the MCP protocol and bypass the compaction check path entirely. Context grows unbounded in both sequential and parallel modes.
  • Bug B — Parallel batches skip compaction: When any tools (built-in or custom) execute in parallel, all results land in a single message with no inter-iteration compaction checkpoint.
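The difference can be sketched with a simplified tool loop (hypothetical names — the CLI's actual internals are not public). Sequential execution re-enters the loop once per result, so a per-iteration size check gets many chances to fire; a parallel batch lands as one message, so the check runs at most once, after the context has already grown:

```python
# Simplified sketch of a tool-execution loop with a per-iteration
# compaction check (illustrative only, not the CLI's real code).
COMPACTION_THRESHOLD = 150_000  # chars; arbitrary for the sketch

def run_loop(tool_batches, compact):
    """tool_batches: list of batches, each a list of tool-result strings.
    Sequential execution = one result per batch; parallel = many per batch."""
    context = []
    checks = 0
    for batch in tool_batches:
        # All results of a parallel batch land in a single message...
        context.append("".join(batch))
        # ...so the size check runs once per BATCH, not once per RESULT.
        checks += 1
        if sum(len(m) for m in context) > COMPACTION_THRESHOLD:
            context = [compact(context)]
    return context, checks

def summarize(messages):
    return "<summary>"  # stand-in for real compaction

# 10 results of 46,400 chars each (matching the Read repro below):
results = ["x" * 46_400] * 10
# Sequential: 10 batches -> 10 compaction opportunities.
_, seq_checks = run_loop([[r] for r in results], summarize)
# Parallel: 1 batch -> a single opportunity, after the context already grew.
_, par_checks = run_loop([results], summarize)
print(seq_checks, par_checks)  # 10 1
```

Bug A is worse still: in this picture, custom MCP results never even reach the size check.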

Environment

  • Claude Code CLI: 2.1.23 (also reproduces on 2.0.62)
  • claude-agent-sdk: 0.1.25
  • Platform: macOS (Darwin 25.2.0)
  • Python: 3.13

Reproduction

Parallel built-in Read (fails with "Prompt is too long")

"""Minimal repro: 10 files x ~46KB each, parallel Read."""
import asyncio, shutil, tempfile
from pathlib import Path
from claude_agent_sdk import ClaudeAgentOptions, ClaudeSDKClient
from claude_agent_sdk.types import AssistantMessage, ResultMessage, ToolUseBlock

NUM_FILES = 10
LINES_PER_FILE = 800  # ~46k chars per file

async def run():
    work_dir = Path(tempfile.mkdtemp(prefix="compaction_repro_"))
    # Generate files
    for i in range(1, NUM_FILES + 1):
        content = "".join(
            f"file{i}_line{j:05d}: abcdefghijabcdefghijabcdefghijabcdefghij\n"
            for j in range(LINES_PER_FILE)
        )
        (work_dir / f"file_{i}.txt").write_text(content)

    file_list = "\n".join(str(work_dir / f"file_{i}.txt") for i in range(1, NUM_FILES + 1))
    prompt = f"""Read ALL of these files using the Read tool. Read each file in FULL.

{file_list}

RULES:
- Use ONLY the Read tool
- Read ALL {NUM_FILES} files completely
- After reading all files, say "DONE"
"""

    client = ClaudeSDKClient(ClaudeAgentOptions(
        cwd=str(work_dir),
        permission_mode="bypassPermissions",
        allowed_tools=["Read"],
    ))
    await client.connect()
    await client.query(prompt)

    async for event in client.receive_response():
        if isinstance(event, ResultMessage):
            print(f"Error: {event.is_error}, Result: {event.result[:200] if event.result else ''}")
            break
        elif isinstance(event, AssistantMessage):
            tool_uses = [b for b in event.content if isinstance(b, ToolUseBlock)]
            if tool_uses:
                print(f"Tool calls: {len(tool_uses)}")

    await client.disconnect()
    shutil.rmtree(work_dir, ignore_errors=True)

asyncio.run(run())

Result: Prompt is too long — Claude batches all 10 Read calls in parallel, context exceeds limit in one shot.
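The "~46k chars per file" figure in the script's comment checks out: each generated line is a 17-character prefix (`file1_line00000: `) plus 40 payload characters and a newline, i.e. 58 characters, so one parallel batch delivers roughly 464k characters of tool results at once:

```python
# Verify the per-file size claim from the repro (single-digit file index).
line = "file1_line00000: abcdefghijabcdefghijabcdefghijabcdefghij\n"
assert len(line) == 58
per_file = 58 * 800      # 46,400 chars per file
total = per_file * 10    # 464,000 chars across the parallel batch
print(per_file, total)   # 46400 464000
```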

Sequential custom MCP tool (no compaction, context unbounded)

"""Minimal repro: custom MCP tool returning large payloads, no compaction triggered."""
import asyncio
from mcp.server.fastmcp import FastMCP
from claude_agent_sdk import ClaudeAgentOptions, ClaudeSDKClient, create_sdk_mcp_server
from claude_agent_sdk.types import ResultMessage

mcp = FastMCP("large-payload-server")

@mcp.tool()
def read_large_file(file_id: int) -> str:
    """Read a large file by ID. Returns the full file content."""
    return f"file_{file_id}_content_" + ("x" * 100_000)

async def run():
    server = create_sdk_mcp_server(mcp)
    file_list = "\n".join(f"- file_id={i}" for i in range(1, 21))
    prompt = f"""Call read_large_file for each of these files, one at a time:
{file_list}
After all 20 calls, say DONE."""

    client = ClaudeSDKClient(ClaudeAgentOptions(
        permission_mode="bypassPermissions",
        mcp_servers={"large-payload-server": server},  # mapping of server name -> server
    ))
    await client.connect()
    await client.query(prompt)

    async for event in client.receive_response():
        if isinstance(event, ResultMessage):
            print(f"Error: {event.is_error}, Result: {event.result[:200] if event.result else ''}")
            break

    await client.disconnect()

asyncio.run(run())

Result: All 20 calls succeed with no compaction events (0 SystemMessage pauses). Context grows to ~2M chars unbounded. No error today, but only because the API limit is high enough to absorb it.
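The "~2M chars" figure follows directly from the payload size; at the rough heuristic of ~4 characters per token (the exact ratio depends on the tokenizer), that is on the order of 500k tokens:

```python
# Rough context-growth arithmetic for the MCP repro (heuristic, not a tokenizer).
# Each result is "file_{i}_content_" plus 100,000 'x' characters.
total_chars = sum(len(f"file_{i}_content_") + 100_000 for i in range(1, 21))
approx_tokens = total_chars // 4   # ~4 chars/token heuristic
print(total_chars, approx_tokens)  # 2000311 500077
```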

CLI manual repro (parallel built-in Read)

First, generate test files:

mkdir -p /tmp/compaction_repro
for i in $(seq 1 10); do
  python3 -c "
for j in range(800):
    print(f'file${i}_line{j:05d}: abcdefghijabcdefghijabcdefghijabcdefghij')
" > /tmp/compaction_repro/file_${i}.txt
done

Then paste into a fresh Claude Code session:

Read ALL of these files completely using the Read tool. Read each file in FULL - do not skip content, do not use head/tail/bash. Use ONLY the Read tool.

/tmp/compaction_repro/file_1.txt
/tmp/compaction_repro/file_2.txt
/tmp/compaction_repro/file_3.txt
/tmp/compaction_repro/file_4.txt
/tmp/compaction_repro/file_5.txt
/tmp/compaction_repro/file_6.txt
/tmp/compaction_repro/file_7.txt
/tmp/compaction_repro/file_8.txt
/tmp/compaction_repro/file_9.txt
/tmp/compaction_repro/file_10.txt

RULES:
- Use ONLY the Read tool
- Read ALL 10 files completely
- After reading all files, say "DONE"

Result: "Prompt is too long" after parallel Read batch.

Test matrix

Mode        Tool type      Result                                      Compactions
Sequential  Built-in Read  PASS                                        3 (working correctly)
Parallel    Built-in Read  FAIL ("Prompt is too long")                 0
Sequential  Custom MCP     PASS (no error)                             0 (unbounded growth)
Parallel    Custom MCP     FAIL (expected, same as parallel built-in)  0

Risk: silent success masking unbounded growth

The sequential custom MCP scenario succeeds today without compaction because the API context limit is large enough (~2M chars / ~500k tokens fits). This is a latent failure:

  • If API limits are tightened, these workflows will break with no code changes.
  • Workflows that accumulate more tool results (e.g., 50+ calls, larger payloads) will hit the limit.
  • No compaction means no summarization of earlier context, degrading response quality even when the limit isn't hit.

Root cause

The CLI checks context size and triggers compaction between tool loop iterations for built-in tools, but:

  1. Custom MCP tool results arrive via the MCP protocol and bypass the compaction check path entirely
  2. Parallel tool batches return all results in a single message, so there's no inter-iteration checkpoint

Expected behavior

  1. Compaction should trigger for custom MCP tool results the same way it does for built-in tools
  2. After receiving a large parallel batch of tool results, the CLI should check context size and compact before the next API call
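Both expectations amount to moving the size check to a point that every result path passes through: after appending any batch of tool results — built-in or MCP, one result or many — and before issuing the next API call. A minimal sketch of such a checkpoint (hypothetical names, not the CLI's actual code):

```python
# Sketch of a path-independent compaction checkpoint (hypothetical names).
def append_results_and_maybe_compact(context, results, *, threshold, compact):
    """Append a batch of tool results (built-in OR custom MCP, sequential OR
    parallel), then check context size BEFORE the next API call is issued."""
    context = context + ["".join(results)]
    if sum(len(m) for m in context) > threshold:
        # Summarize older messages; keep the newest batch verbatim.
        context = [compact(context[:-1]), context[-1]]
    return context

def compact(messages):
    return "<summary of %d messages>" % len(messages)  # stand-in

ctx = []
ctx = append_results_and_maybe_compact(ctx, ["a" * 90], threshold=100, compact=compact)
ctx = append_results_and_maybe_compact(ctx, ["b" * 90], threshold=100, compact=compact)
print(len(ctx), ctx[0])  # 2 <summary of 1 messages>
```

Because the check keys off accumulated size rather than tool type or execution mode, it covers Bug A and Bug B with a single mechanism.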
