Description
During a tool execution loop, the CLI does not trigger compaction for custom MCP tool results. Built-in tools (Read, Bash) do compact in sequential mode, but parallel batches of any tool type overflow context with no compaction opportunity.
This means:
- Custom MCP tools: Context grows unbounded regardless of execution mode (sequential or parallel). Currently succeeds only because the API context limit is high enough — but any reduction in API limits or increase in tool output volume will cause hard failures.
- Parallel built-in tools: All tool results land in one batch with no mid-loop compaction check, resulting in a "Prompt is too long" error.
Two distinct bugs
- Bug A — Custom MCP tools never compact: Custom MCP tool results arrive via the MCP protocol and bypass the compaction check path entirely. Context grows unbounded in both sequential and parallel modes.
- Bug B — Parallel batches skip compaction: When any tools (built-in or custom) execute in parallel, all results land in a single message with no inter-iteration compaction checkpoint.
Environment
- Claude Code CLI: 2.1.23 (also reproduces on 2.0.62)
- claude-agent-sdk: 0.1.25
- Platform: macOS (Darwin 25.2.0)
- Python: 3.13
Reproduction
Parallel built-in Read (fails with "Prompt is too long")
"""Minimal repro: 10 files x ~46KB each, parallel Read."""
import asyncio, shutil, tempfile
from pathlib import Path
from claude_agent_sdk import ClaudeAgentOptions, ClaudeSDKClient
from claude_agent_sdk.types import AssistantMessage, ResultMessage, ToolUseBlock
NUM_FILES = 10
LINES_PER_FILE = 800 # ~46k chars per file
async def run():
work_dir = Path(tempfile.mkdtemp(prefix="compaction_repro_"))
# Generate files
for i in range(1, NUM_FILES + 1):
content = "".join(
f"file{i}_line{j:05d}: abcdefghijabcdefghijabcdefghijabcdefghij\n"
for j in range(LINES_PER_FILE)
)
(work_dir / f"file_{i}.txt").write_text(content)
file_list = "\n".join(str(work_dir / f"file_{i}.txt") for i in range(1, NUM_FILES + 1))
prompt = f"""Read ALL of these files using the Read tool. Read each file in FULL.
{file_list}
RULES:
- Use ONLY the Read tool
- Read ALL {NUM_FILES} files completely
- After reading all files, say "DONE"
"""
client = ClaudeSDKClient(ClaudeAgentOptions(
cwd=str(work_dir),
permission_mode="bypassPermissions",
allowed_tools=["Read"],
))
await client.connect()
await client.query(prompt)
async for event in client.receive_response():
if isinstance(event, ResultMessage):
print(f"Error: {event.is_error}, Result: {event.result[:200] if event.result else ''}")
break
elif isinstance(event, AssistantMessage):
tool_uses = [b for b in event.content if isinstance(b, ToolUseBlock)]
if tool_uses:
print(f"Tool calls: {len(tool_uses)}")
await client.disconnect()
shutil.rmtree(work_dir, ignore_errors=True)
asyncio.run(run())Result: Prompt is too long — Claude batches all 10 Read calls in parallel, context exceeds limit in one shot.
Sequential custom MCP tool (no compaction, context unbounded)
"""Minimal repro: custom MCP tool returning large payloads, no compaction triggered."""
import asyncio
from mcp.server.fastmcp import FastMCP
from claude_agent_sdk import ClaudeAgentOptions, ClaudeSDKClient, create_sdk_mcp_server
from claude_agent_sdk.types import ResultMessage
mcp = FastMCP("large-payload-server")
@mcp.tool()
def read_large_file(file_id: int) -> str:
"""Read a large file by ID. Returns the full file content."""
return f"file_{file_id}_content_" + ("x" * 100_000)
async def run():
server = create_sdk_mcp_server(mcp)
file_list = "\n".join(f"- file_id={i}" for i in range(1, 21))
prompt = f"""Call read_large_file for each of these files, one at a time:
{file_list}
After all 20 calls, say DONE."""
client = ClaudeSDKClient(ClaudeAgentOptions(
permission_mode="bypassPermissions",
mcp_servers=[server],
))
await client.connect()
await client.query(prompt)
async for event in client.receive_response():
if isinstance(event, ResultMessage):
print(f"Error: {event.is_error}, Result: {event.result[:200] if event.result else ''}")
break
await client.disconnect()
asyncio.run(run())Result: All 20 calls succeed with no compaction events (0 SystemMessage pauses). Context grows to ~2M chars unbounded. No error today, but only because the API limit is high enough to absorb it.
CLI manual repro (parallel built-in Read)
First, generate test files:
```shell
mkdir -p /tmp/compaction_repro
for i in $(seq 1 10); do
  python3 -c "
for j in range(800):
    print(f'file${i}_line{j:05d}: abcdefghijabcdefghijabcdefghijabcdefghij')
" > /tmp/compaction_repro/file_${i}.txt
done
```
Then paste into a fresh Claude Code session:
Read ALL of these files completely using the Read tool. Read each file in FULL - do not skip content, do not use head/tail/bash. Use ONLY the Read tool.
/tmp/compaction_repro/file_1.txt
/tmp/compaction_repro/file_2.txt
/tmp/compaction_repro/file_3.txt
/tmp/compaction_repro/file_4.txt
/tmp/compaction_repro/file_5.txt
/tmp/compaction_repro/file_6.txt
/tmp/compaction_repro/file_7.txt
/tmp/compaction_repro/file_8.txt
/tmp/compaction_repro/file_9.txt
/tmp/compaction_repro/file_10.txt
RULES:
- Use ONLY the Read tool
- Read ALL 10 files completely
- After reading all files, say "DONE"
Result: "Prompt is too long" after parallel Read batch.
Test matrix
| Mode | Tool Type | Result | Compactions |
|---|---|---|---|
| Sequential | Built-in Read | PASS | 3 (working correctly) |
| Parallel | Built-in Read | FAIL ("Prompt is too long") | 0 |
| Sequential | Custom MCP | PASS (no error) | 0 (unbounded growth) |
| Parallel | Custom MCP | FAIL (expected, same as parallel built-in) | 0 |
Risk: silent success masking unbounded growth
The sequential custom MCP scenario succeeds today without compaction because the API context limit is large enough (~2M chars / ~500k tokens fits). This is a latent failure:
- If API limits are tightened, these workflows will break with no code changes.
- Workflows that accumulate more tool results (e.g., 50+ calls, larger payloads) will hit the limit.
- No compaction means no summarization of earlier context, degrading response quality even when the limit isn't hit.
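A back-of-the-envelope estimate for the sequential custom MCP scenario shows how close this already sits to the limit (the ~4 chars-per-token ratio is a common rough heuristic, not a measured value):

```python
# Rough context-growth estimate for the sequential custom MCP repro.
# Assumption (not measured): ~4 characters per token on average.
CALLS = 20
PAYLOAD_CHARS = 100_000   # each read_large_file result
CHARS_PER_TOKEN = 4

total_chars = CALLS * PAYLOAD_CHARS
total_tokens = total_chars // CHARS_PER_TOKEN
print(f"{total_chars:,} chars ≈ {total_tokens:,} tokens")  # 2,000,000 chars ≈ 500,000 tokens
```

Doubling the call count or payload size pushes this past a 1M-token window with no code changes on the user's side.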
Root cause
The CLI checks context size and triggers compaction between tool loop iterations for built-in tools, but:
- Custom MCP tool results arrive via the MCP protocol and bypass the compaction check path entirely
- Parallel tool batches return all results in a single message, so there's no inter-iteration checkpoint
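For illustration, here is a minimal sketch of the kind of inter-iteration checkpoint the sequential built-in path appears to run; the names (`run_tool_loop`, `estimate_tokens`, `compact`) and the threshold are hypothetical, not CLI internals:

```python
# Hypothetical agent loop showing where a compaction checkpoint sits.
# All names and the threshold are illustrative, not actual CLI code.
COMPACTION_THRESHOLD = 150_000  # illustrative token budget

def run_tool_loop(messages, pending_tool_calls, execute, estimate_tokens, compact):
    for call in pending_tool_calls:           # sequential: one result per iteration
        messages.append(execute(call))
        if estimate_tokens(messages) > COMPACTION_THRESHOLD:
            messages = compact(messages)      # checkpoint fires between iterations
    return messages
```

A parallel batch collapses the loop body into a single append of all results, and custom MCP results never reach the check at all, which is consistent with both observed failures.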
Expected behavior
- Compaction should trigger for custom MCP tool results the same way it does for built-in tools
- After receiving a large parallel batch of tool results, the CLI should check context size and compact before the next API call
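Until the CLI handles this, one client-side mitigation is to cap payload size inside the MCP tool itself; the cap value and helper below are an illustrative sketch, not an SDK feature:

```python
# Illustrative mitigation: trim oversized tool output at the source so a single
# result cannot dominate the context window. Cap value is a made-up example.
MAX_TOOL_RESULT_CHARS = 20_000

def truncate_result(payload: str, limit: int = MAX_TOOL_RESULT_CHARS) -> str:
    """Return payload unchanged if small, else a truncated copy with a marker."""
    if len(payload) <= limit:
        return payload
    kept = payload[:limit]
    return f"{kept}\n[...truncated {len(payload) - limit} of {len(payload)} chars]"
```

Wrapping `read_large_file`'s return value in such a helper keeps the sequential MCP repro bounded, at the cost of losing tail content; it is a workaround, not a substitute for mid-loop compaction.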
Related
- Claude Code issue #23 requests a non-interactive `/compact` — related but different (manual trigger vs. automatic mid-loop compaction)
- Claude Code issue #288 ("Auto Compact on first Query causes async message iterator to hang") reports auto-compact on the first query causing hangs — different scenario