Feature Request: Add /context Command Support to Claude Agent SDK
Summary
Add the /context slash command (currently available only in CLI/UI) to the Claude Agent SDK, allowing programmatic access to context usage breakdown.
Motivation
The /context command is essential for:
- Monitoring token consumption - Understanding where tokens are being spent
- Debugging context issues - Identifying why context fills up quickly
- Cost optimization - Making informed decisions about file loading strategies
- Application development - Building tools that help users manage context efficiently
Current limitation: The SDK only provides cumulative usage data (total input/output tokens) but no breakdown by category.
Current Behavior (CLI/UI Only)
In Claude Code CLI/UI, /context provides detailed breakdown:
Context Usage
Model: claude-sonnet-4-5-20250929
Tokens: 98.2k / 200.0k (49%)
Estimated usage by category
| Category | Tokens | Percentage |
| --- | --- | --- |
| System prompt | 3.2k | 1.6% |
| System tools | 17.6k | 8.8% |
| MCP tools | 17.4k | 8.7% |
| Custom agents | 299 | 0.1% |
| Messages | 61.4k | 30.7% |
| Free space | 55.0k | 27.5% |
| Autocompact buffer | 45.0k | 22.5% |
This breakdown is not accessible through the SDK.
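The percentages in the breakdown above appear to be each category's token count divided by the full 200k window, not by the tokens currently in use. A minimal sketch of that arithmetic, using the rounded counts from the example (the name percent_of_window is made up for illustration):

```python
# Rounded token counts copied from the /context example above.
MAX_TOKENS = 200_000  # context window shown in the same output

categories = {
    "System prompt": 3_200,
    "System tools": 17_600,
    "MCP tools": 17_400,
    "Custom agents": 299,
    "Messages": 61_400,
    "Free space": 55_000,
    "Autocompact buffer": 45_000,
}


def percent_of_window(tokens: int, max_tokens: int = MAX_TOKENS) -> float:
    """Share of the whole context window, rounded to one decimal place."""
    return round(tokens / max_tokens * 100, 1)


for name, tokens in categories.items():
    print(f"{name:<20}{tokens:>8,}{percent_of_window(tokens):>7.1f}%")

# Sum of all categories, including free space and the autocompact buffer.
accounted = sum(categories.values())
print(f"Accounted for: {accounted:,} of {MAX_TOKENS:,} tokens")
```

With these rounded inputs the categories add up to 199,899 tokens, about 100 short of the window, which is plausibly just the k-rounding in the CLI display.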
Proposed Solution
Option 1: Add /context as a Slash Command
Allow the SDK to send /context just like /compact or /clear:
    from claude_agent_sdk import ClaudeAgentOptions, query

    async def check_context():
        async for message in query(
            prompt="/context",
            options=ClaudeAgentOptions(max_turns=1),
        ):
            # Proposed message shape: a system message with subtype "context_info"
            if message.type == "system" and message.subtype == "context_info":
                print("Total tokens:", message.context_data.total_tokens)
                print("Categories:", message.context_data.categories)
                print("Percentage used:", message.context_data.percentage)

Returned data structure:
    {
      "type": "system",
      "subtype": "context_info",
      "context_data": {
        "model": "claude-sonnet-4-5-20250929",
        "total_tokens": 98200,
        "max_tokens": 200000,
        "percentage": 49.1,
        "categories": {
          "system_prompt": 3200,
          "system_tools": 17600,
          "mcp_tools": 17400,
          "custom_agents": 299,
          "messages": 61400,
          "free_space": 55000,
          "autocompact_buffer": 45000
        },
        "mcp_tools_detail": [
          {"server": "ref", "tool": "ref_search_documentation", "tokens": 164},
          {"server": "ref", "tool": "ref_read_url", "tokens": 135}
        ]
      }
    }

Option 2: Add Method to ClaudeSDKClient
Add a dedicated method for context inspection:
    import asyncio
    from claude_agent_sdk import ClaudeSDKClient

    async def main():
        async with ClaudeSDKClient() as client:
            # After some queries...
            context_info = await client.get_context_usage()  # proposed method
            print(f"Using {context_info.percentage}% of context")
            print(f"Messages: {context_info.categories['messages']} tokens")

    asyncio.run(main())

Option 3: Include in Message Stream
Add context info to system messages automatically:
    from claude_agent_sdk import query

    async def watch_context():
        async for message in query(prompt="Read this file"):
            if message.type == "system" and message.subtype == "context_update":
                # Periodic context updates during execution
                print(f"Context: {message.context_usage.percentage}%")

Use Cases
1. Token Usage Monitoring
    # Monitor token growth as files are loaded
    async def monitor_loading(documentation_files):
        async with ClaudeSDKClient() as client:
            for doc_file in documentation_files:
                await client.query(f"Read {doc_file}")
                context = await client.get_context_usage()
                print(f"After loading {doc_file}: {context.categories['messages']} tokens")

2. Intelligent Context Management
    # Auto-compact when approaching limits
    async def analyze_with_auto_compact():
        async with ClaudeSDKClient() as client:
            await client.query("Analyze this codebase")
            context = await client.get_context_usage()
            if context.percentage > 75:
                print("Context high - triggering compaction")
                await client.query("/compact")

3. Cost Optimization
    # Track overhead and optimize file loading strategy
    async def measure_file_overhead(client, file_path):
        context_before = await client.get_context_usage()
        # Load file
        await client.query(f"Read {file_path}")
        context_after = await client.get_context_usage()
        actual_tokens = context_after.categories['messages'] - context_before.categories['messages']
        # count_raw_tokens is a caller-supplied helper that tokenizes the file directly
        raw_file_tokens = count_raw_tokens(file_path)
        overhead = actual_tokens - raw_file_tokens
        print(f"Overhead: {overhead} tokens ({overhead/raw_file_tokens*100:.1f}%)")

4. Building Developer Tools
    # Build a context visualization dashboard
    import asyncio

    class ContextMonitor:
        async def monitor_session(self, client):
            while True:
                context = await client.get_context_usage()
                self.update_dashboard(context)
                await asyncio.sleep(5)  # Update every 5 seconds

Benefits
- Parity with CLI - SDK users get the same visibility as CLI users
- Debugging - Developers can identify context consumption issues
- Optimization - Applications can make smart decisions about context management
- Transparency - Users understand where their tokens are going
- Tooling - Enables building context monitoring and management tools
Related Issues
This feature would have helped identify and diagnose issues like:
- #20223 - File loading adds 70% token overhead (users could measure this programmatically)
- #17959 - Context percentage calculation discrepancies
- #8185 - Premature context compaction
Implementation Notes
- Should work consistently across Python and TypeScript SDKs
- Data should match what /context shows in CLI/UI
- Consider adding to both query() and ClaudeSDKClient
- Optional: Add include_context_updates=True option to get periodic updates
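To keep the Python and TypeScript SDKs aligned, it may help to pin the payload down as a typed structure. The sketch below is one possible Python shape for the context_info message proposed in Option 1. The ContextUsage class, its from_message constructor, and the is_consistent check are illustrative names, not part of any existing SDK.

```python
import json
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ContextUsage:
    """Hypothetical typed view of the proposed context_info payload."""

    model: str
    total_tokens: int
    max_tokens: int
    percentage: float
    categories: dict[str, int]
    mcp_tools_detail: list[dict] = field(default_factory=list)

    @classmethod
    def from_message(cls, message: dict) -> "ContextUsage":
        # Accept only the type/subtype pair proposed in Option 1.
        if message.get("type") != "system" or message.get("subtype") != "context_info":
            raise ValueError("not a context_info system message")
        data = message["context_data"]
        return cls(
            model=data["model"],
            total_tokens=data["total_tokens"],
            max_tokens=data["max_tokens"],
            percentage=data["percentage"],
            categories=dict(data["categories"]),
            mcp_tools_detail=list(data.get("mcp_tools_detail", [])),
        )

    def is_consistent(self, tolerance: float = 0.1) -> bool:
        """True when the reported percentage agrees with total_tokens / max_tokens."""
        expected = self.total_tokens / self.max_tokens * 100
        return abs(expected - self.percentage) <= tolerance


# Trimmed copy of the example payload from Option 1.
raw = json.loads("""
{
  "type": "system",
  "subtype": "context_info",
  "context_data": {
    "model": "claude-sonnet-4-5-20250929",
    "total_tokens": 98200,
    "max_tokens": 200000,
    "percentage": 49.1,
    "categories": {"system_prompt": 3200, "messages": 61400}
  }
}
""")

usage = ContextUsage.from_message(raw)
print(f"{usage.model}: {usage.total_tokens} / {usage.max_tokens} tokens")
print("percentage matches totals:", usage.is_consistent())
```

A check like is_consistent could also serve as a cheap parity test between SDK output and what /context prints in the CLI.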
Alternative: Read-Only Context Inspection
If modifying the SDK is complex, provide a separate utility:
    from claude_agent_sdk.utils import inspect_session_context

    context = inspect_session_context(session_id="abc123")
    print(context.categories)

Priority: Medium-High
Impact: Improves developer experience, enables better tooling, aids debugging
Effort: Low-Medium (slash command already exists in CLI, just needs SDK exposure)