Fix proxy bugs, improve concurrency, and update SDK options#42
Open
gustavokch wants to merge 8 commits intoRichardAtCT:mainfrom
Open
Fix proxy bugs, improve concurrency, and update SDK options#42gustavokch wants to merge 8 commits intoRichardAtCT:mainfrom
gustavokch wants to merge 8 commits intoRichardAtCT:mainfrom
Conversation
…condition, and more - Bug 1: Remove filter_content() from user input (silently stripped XML-like tags) - Bug 2: Add asyncio.Lock to serialize os.environ mutation under concurrent requests - Bug 3: Replace session message append with replace (prevent exponential duplication) - Bug 4: Replace bare except: with except Exception: in 3 locations - Bug 5: Use __version__ from src/__init__.py instead of hardcoded "1.0.0" - Bug 7: Add verify_api_key() auth guard to /v1/auth/status endpoint - Bug 8: Replace deprecated datetime.utcnow() with datetime.now(timezone.utc) - Bug 9: Fix GitHub URL in landing page (aaronlippold → RichardAtCT) - Bug 10: Mark Bash tool as is_safe=False - Bug 11: Use DEFAULT_MODEL constant in debug endpoint example request
- Wrap async query() iteration with asyncio.timeout(self.timeout) to prevent indefinite hangs when the SDK subprocess stalls - Change AnthropicMessagesRequest.enable_tools default to False so simple message requests don't trigger bypassPermissions + 10 turns - Add diagnostic print() statements in /v1/messages handler to surface handler entry and run_completion call in server output - Improve test_message.py: pipe server output to stderr, add DEBUG_MODE=true, reduce client timeout from 120s to 60s
PR 1 — Critical bug fixes: - Fix continue_session → continue_conversation (sessions now actually continue) - Wire max_thinking_tokens through to SDK via generic setattr approach - Extract real token counts from SDK ResultMessage usage field - Map stop_reason to proper finish_reason (max_tokens → length, etc.) PR 2 — Concurrency & reliability: - Remove os.environ mutex (_env_lock) — pass auth via options.env instead, allowing fully concurrent SDK calls (no more per-worker serialization) - Replace threading.Lock with asyncio.Lock in SessionManager to avoid blocking the event loop; all session methods converted to async PR 3 — SDK options wiring: - Refactor run_completion to accept claude_options dict; apply via setattr - Add reasoning_effort, response_format, thinking, max_budget_usd fields - Forward user field to SDK - Bump version to 2.3.0
Updated repository URL in the README file.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This PR introduces reliability improvements, full support for concurrent SDK calls, wiring of new Claude API options, and resolutions for several critical proxy bugs.
Features & Enhancements
reasoning_effort,response_format,thinking,max_budget_usd, anduserfields passed directly to the Claude SDK.os.environmutex (_env_lock) by passing auth viaoptions.env, allowing fully concurrent SDK calls.SessionManagerhas been refactored to useasyncio.Lockwith all session methods converted to async.ResultMessageand properly mapsstop_reasontofinish_reason(e.g.,max_tokens→length).AnthropicMessagesRequest.enable_toolsdefault toFalseso simple message requests do not trigger unintended 10-turn loops.Bug Fixes
continue_sessiontocontinue_conversationand replaced list appending with replacement to prevent exponential duplication.query()iterations withasyncio.timeoutto prevent indefinite hangs when the SDK subprocess stalls.filter_content()from user input which was silently stripping XML-like tags./v1/auth/statusendpoint with theverify_api_key()auth guard.is_safe=False.except:clauses withexcept Exception:.Maintenance & Chores
poetry.lockand the test suite for compatibility withpydantic 2.13andpoetry 2.3.datetime.utcnow()withdatetime.now(timezone.utc)..worktreesdirectories in.gitignore./v1/messagesand improved thetest_message.pyscript.