[Feature Request] Add /context command support to SDK for programmatic context inspection #507

@puya

Feature Request: Add /context Command Support to Claude Agent SDK

Summary

Add the /context slash command (currently available only in CLI/UI) to the Claude Agent SDK, allowing programmatic access to context usage breakdown.

Motivation

The /context command is essential for:

  1. Monitoring token consumption - Understanding where tokens are being spent
  2. Debugging context issues - Identifying why context fills up quickly
  3. Cost optimization - Making informed decisions about file loading strategies
  4. Application development - Building tools that help users manage context efficiently

Current limitation: The SDK only provides cumulative usage data (total input/output tokens) but no breakdown by category.

Current Behavior (CLI/UI Only)

In the Claude Code CLI/UI, /context provides a detailed breakdown:

Context Usage
Model: claude-sonnet-4-5-20250929

Tokens: 98.2k / 200.0k (49%)

Estimated usage by category
Category             Tokens    Percentage
System prompt        3.2k      1.6%
System tools         17.6k     8.8%
MCP tools            17.4k     8.7%
Custom agents        299       0.1%
Messages             61.4k     30.7%
Free space           55.0k     27.5%
Autocompact buffer   45.0k     22.5%

This breakdown is not accessible through the SDK.
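
Until the SDK exposes this data, the gap between the CLI text and the structured form requested below can be illustrated with a small parser. This is only a sketch: it assumes the text layout shown above, and parse_token_count and parse_context_table are hypothetical helpers, not part of claude_agent_sdk.

```python
import re

# Hypothetical helpers for illustration; not part of claude_agent_sdk.
_SUFFIXES = {"k": 1_000, "m": 1_000_000}

# Matches rows such as "System prompt     3.2k      1.6%".
_ROW = re.compile(r"^(?P<name>.+?)\s+(?P<tokens>[\d.]+[kKmM]?)\s+(?P<pct>[\d.]+)%$")


def parse_token_count(text: str) -> int:
    """Convert a display value such as '17.6k' or '299' to an integer."""
    text = text.strip().lower()
    if text and text[-1] in _SUFFIXES:
        return round(float(text[:-1]) * _SUFFIXES[text[-1]])
    return round(float(text))


def parse_context_table(output: str) -> dict[str, int]:
    """Map each category row to its token count, keyed in snake_case."""
    categories: dict[str, int] = {}
    for line in output.splitlines():
        match = _ROW.match(line.strip())
        if match:  # header and summary lines do not match and are skipped
            key = match.group("name").strip().lower().replace(" ", "_")
            categories[key] = parse_token_count(match.group("tokens"))
    return categories
```

Scraping display text like this is brittle (rounded values, layout changes), which is part of the case for a structured SDK response.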

Proposed Solution

Option 1: Add /context as a Slash Command

Allow the SDK to send /context just like /compact or /clear:

from claude_agent_sdk import ClaudeAgentOptions, query

async def check_context():
    async for message in query(
        prompt="/context",
        options=ClaudeAgentOptions(max_turns=1),
    ):
        # Proposed: a system message with subtype "context_info" (does not exist yet)
        if message.type == "system" and message.subtype == "context_info":
            print("Total tokens:", message.context_data.total_tokens)
            print("Categories:", message.context_data.categories)
            print("Percentage used:", message.context_data.percentage)

Returned data structure:

{
    "type": "system",
    "subtype": "context_info",
    "context_data": {
        "model": "claude-sonnet-4-5-20250929",
        "total_tokens": 98200,
        "max_tokens": 200000,
        "percentage": 49.1,
        "categories": {
            "system_prompt": 3200,
            "system_tools": 17600,
            "mcp_tools": 17400,
            "custom_agents": 299,
            "messages": 61400,
            "free_space": 55000,
            "autocompact_buffer": 45000
        },
        "mcp_tools_detail": [
            {"server": "ref", "tool": "ref_search_documentation", "tokens": 164},
            {"server": "ref", "tool": "ref_read_url", "tokens": 135}
        ]
    }
}
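
On the consuming side, a payload like this could be wrapped in a small typed container. The sketch below assumes the field names from the structure above; ContextData and its methods are hypothetical and only illustrate how a caller might read the proposed message.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ContextData:
    """Hypothetical mirror of the proposed `context_data` payload."""

    model: str
    total_tokens: int
    max_tokens: int
    percentage: float
    categories: dict[str, int] = field(default_factory=dict)

    @classmethod
    def from_message(cls, message: dict) -> "ContextData":
        """Build from a dict shaped like the proposed system message."""
        data = message["context_data"]
        return cls(
            model=data["model"],
            total_tokens=data["total_tokens"],
            max_tokens=data["max_tokens"],
            percentage=data["percentage"],
            categories=dict(data.get("categories", {})),
        )

    def computed_percentage(self) -> float:
        """Recompute usage from the raw counts, rounded to one decimal."""
        return round(100 * self.total_tokens / self.max_tokens, 1)

    def largest_categories(self, n: int = 3) -> list[tuple[str, int]]:
        """Return the n categories with the most tokens, largest first."""
        ranked = sorted(self.categories.items(), key=lambda kv: kv[1], reverse=True)
        return ranked[:n]
```

Recomputing the percentage from total_tokens and max_tokens gives callers a cheap consistency check against the reported value.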

Option 2: Add Method to ClaudeSDKClient

Add a dedicated method for context inspection:

from claude_agent_sdk import ClaudeSDKClient

async with ClaudeSDKClient() as client:
    # After some queries...
    context_info = await client.get_context_usage()

    print(f"Using {context_info.percentage}% of context")
    print(f"Messages: {context_info.categories['messages']} tokens")

Option 3: Include in Message Stream

Add context info to system messages automatically:

async for message in query(prompt="Read this file"):
    if message.type == "system" and message.subtype == "context_update":
        # Periodic context updates during execution
        print(f"Context: {message.context_usage.percentage}%")

Use Cases

1. Token Usage Monitoring

# Monitor token growth as files are loaded
async with ClaudeSDKClient() as client:
    for doc_file in documentation_files:
        await client.query(f"Read {doc_file}")
        # Drain the response so the turn finishes before measuring
        async for _ in client.receive_response():
            pass
        context = await client.get_context_usage()
        print(f"After loading {doc_file}: {context.categories['messages']} tokens")

2. Intelligent Context Management

# Auto-compact when approaching limits
async with ClaudeSDKClient() as client:
    await client.query("Analyze this codebase")
    # Drain the response so the turn finishes before measuring
    async for _ in client.receive_response():
        pass

    context = await client.get_context_usage()
    if context.percentage > 75:
        print("Context high - triggering compaction")
        await client.query("/compact")

3. Cost Optimization

# Track overhead and optimize file loading strategy
async def measure_file_overhead(client, file_path):
    context_before = await client.get_context_usage()

    # Load the file and wait for the turn to finish
    await client.query(f"Read {file_path}")
    async for _ in client.receive_response():
        pass

    context_after = await client.get_context_usage()

    actual_tokens = context_after.categories['messages'] - context_before.categories['messages']
    # count_raw_tokens is a placeholder for a tokenizer-based count of the file on disk
    raw_file_tokens = count_raw_tokens(file_path)

    overhead = actual_tokens - raw_file_tokens
    print(f"Overhead: {overhead} tokens ({overhead/raw_file_tokens*100:.1f}%)")
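
The arithmetic in this use case can be separated from the SDK calls so it is testable on its own. A minimal sketch, assuming the three token counts have already been obtained; overhead_report is a hypothetical helper name, and the guard covers empty files, where the ratio is undefined.

```python
def overhead_report(tokens_before: int, tokens_after: int, raw_file_tokens: int) -> str:
    """Describe how many extra tokens a file load cost beyond its raw size."""
    actual_tokens = tokens_after - tokens_before
    overhead = actual_tokens - raw_file_tokens
    if raw_file_tokens <= 0:
        # Avoid dividing by zero for empty files.
        return f"Overhead: {overhead} tokens (ratio undefined for an empty file)"
    ratio = overhead / raw_file_tokens * 100
    return f"Overhead: {overhead} tokens ({ratio:.1f}%)"
```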

4. Building Developer Tools

import asyncio

# Build a context visualization dashboard
class ContextMonitor:
    async def monitor_session(self, client):
        while True:
            context = await client.get_context_usage()
            self.update_dashboard(context)  # rendering hook, not shown here
            await asyncio.sleep(5)  # Update every 5 seconds
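
A polling loop like this is easier to test with a stop condition and a stand-in client. The sketch below assumes only that the client exposes the proposed get_context_usage() coroutine returning an object with a percentage attribute; FakeClient and monitor_until are hypothetical names used for illustration.

```python
import asyncio
from types import SimpleNamespace


class FakeClient:
    """Stand-in for a client exposing the proposed get_context_usage()."""

    def __init__(self, readings):
        self._readings = list(readings)
        self._index = 0

    async def get_context_usage(self):
        # Repeat the last reading once the list is exhausted.
        value = self._readings[min(self._index, len(self._readings) - 1)]
        self._index += 1
        return SimpleNamespace(percentage=value)


async def monitor_until(client, threshold: float, interval: float = 0.0,
                        max_polls: int = 100) -> list[float]:
    """Poll until usage reaches `threshold`; return the readings seen."""
    seen: list[float] = []
    for _ in range(max_polls):
        context = await client.get_context_usage()
        seen.append(context.percentage)
        if context.percentage >= threshold:
            break
        await asyncio.sleep(interval)
    return seen
```

For example, asyncio.run(monitor_until(FakeClient([10.0, 40.0, 80.0]), threshold=75.0)) stops at the first reading at or above the threshold; max_polls bounds the loop if the threshold is never reached.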

Benefits

  1. Parity with CLI - SDK users get the same visibility as CLI users
  2. Debugging - Developers can identify context consumption issues
  3. Optimization - Applications can make smart decisions about context management
  4. Transparency - Users understand where their tokens are going
  5. Tooling - Enables building context monitoring and management tools

Related Issues

This feature would have helped identify and diagnose issues like:

  • #20223 - File loading adds 70% token overhead (users could measure this programmatically)
  • #17959 - Context percentage calculation discrepancies
  • #8185 - Premature context compaction

Implementation Notes

  • Should work consistently across Python and TypeScript SDKs
  • Data should match what /context shows in CLI/UI
  • Consider adding to both query() and ClaudeSDKClient
  • Optional: Add include_context_updates=True option to get periodic updates

Alternative: Read-Only Context Inspection

If modifying the SDK is complex, provide a separate utility:

from claude_agent_sdk.utils import inspect_session_context

context = inspect_session_context(session_id="abc123")
print(context.categories)

Priority: Medium-High
Impact: Improves developer experience, enables better tooling, aids debugging
Effort: Low-Medium (slash command already exists in CLI, just needs SDK exposure)
