Skip to content

Fix proxy bugs, improve concurrency, and update SDK options#42

Open
gustavokch wants to merge 8 commits intoRichardAtCT:mainfrom
gustavokch:main
Open

Fix proxy bugs, improve concurrency, and update SDK options#42
gustavokch wants to merge 8 commits intoRichardAtCT:mainfrom
gustavokch:main

Conversation

@gustavokch
Copy link

This PR introduces reliability improvements, full support for concurrent SDK calls, wiring of new Claude API options, and resolutions for several critical proxy bugs.

Features & Enhancements

  • SDK Options Wiring: Full support for reasoning_effort, response_format, thinking, max_budget_usd, and user fields passed directly to the Claude SDK.
  • Concurrency: Removed os.environ mutex (_env_lock) by passing auth via options.env, allowing fully concurrent SDK calls. SessionManager has been refactored to use asyncio.Lock with all session methods converted to async.
  • Token & Reason Mapping: Extracts real token counts directly from the SDK's ResultMessage and properly maps stop_reason to finish_reason (e.g., max_tokenslength).
  • Tool Handling: Changed AnthropicMessagesRequest.enable_tools default to False so simple message requests do not trigger unintended 10-turn loops.

Bug Fixes

  • Session Continuity: Fixed session continuation by correcting continue_session to continue_conversation and replaced list appending with replacement to prevent exponential duplication.
  • Timeouts & Hangs: Wrapped async query() iterations with asyncio.timeout to prevent indefinite hangs when the SDK subprocess stalls.
  • Proxy Reliability:
    • Removed filter_content() from user input which was silently stripping XML-like tags.
    • Secured /v1/auth/status endpoint with the verify_api_key() auth guard.
    • Marked the Bash tool as is_safe=False.
    • Replaced bare except: clauses with except Exception:.

Maintenance & Chores

  • Updated poetry.lock and the test suite for compatibility with pydantic 2.13 and poetry 2.3.
  • Replaced deprecated datetime.utcnow() with datetime.now(timezone.utc).
  • Ignored .worktrees directories in .gitignore.
  • Added diagnostic print statements for /v1/messages and improved the test_message.py script.

…condition, and more

- Bug 1: Remove filter_content() from user input (silently stripped XML-like tags)
- Bug 2: Add asyncio.Lock to serialize os.environ mutation under concurrent requests
- Bug 3: Replace session message append with replace (prevent exponential duplication)
- Bug 4: Replace bare except: with except Exception: in 3 locations
- Bug 5: Use __version__ from src/__init__.py instead of hardcoded "1.0.0"
- Bug 7: Add verify_api_key() auth guard to /v1/auth/status endpoint
- Bug 8: Replace deprecated datetime.utcnow() with datetime.now(timezone.utc)
- Bug 9: Fix GitHub URL in landing page (aaronlippold → RichardAtCT)
- Bug 10: Mark Bash tool as is_safe=False
- Bug 11: Use DEFAULT_MODEL constant in debug endpoint example request
- Wrap async query() iteration with asyncio.timeout(self.timeout) to
  prevent indefinite hangs when the SDK subprocess stalls
- Change AnthropicMessagesRequest.enable_tools default to False so
  simple message requests don't trigger bypassPermissions + 10 turns
- Add diagnostic print() statements in /v1/messages handler to surface
  handler entry and run_completion call in server output
- Improve test_message.py: pipe server output to stderr, add
  DEBUG_MODE=true, reduce client timeout from 120s to 60s
PR 1 — Critical bug fixes:
- Fix continue_session → continue_conversation (sessions now actually continue)
- Wire max_thinking_tokens through to SDK via generic setattr approach
- Extract real token counts from SDK ResultMessage usage field
- Map stop_reason to proper finish_reason (max_tokens → length, etc.)

PR 2 — Concurrency & reliability:
- Remove os.environ mutex (_env_lock) — pass auth via options.env instead,
  allowing fully concurrent SDK calls (no more per-worker serialization)
- Replace threading.Lock with asyncio.Lock in SessionManager to avoid
  blocking the event loop; all session methods converted to async

PR 3 — SDK options wiring:
- Refactor run_completion to accept claude_options dict; apply via setattr
- Add reasoning_effort, response_format, thinking, max_budget_usd fields
- Forward user field to SDK
- Bump version to 2.3.0
Updated repository URL in the README file.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant