
Conversation

@AlexsanderHamir
Collaborator

Title

[Perf] Alexsander fixes round 3 - Oct 25th

Relevant issues

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

  • I have added testing in the tests/litellm/ directory (adding at least 1 test is a hard requirement - see details)
  • I have added a screenshot of my new test passing locally
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible, it only solves 1 specific problem

Type

🐛 Bug Fix
🧹 Refactoring

Changes

  • Reduced memory usage.
  • Swapped the slow stdlib json serialization for a faster option (orjson).

Replaced json.dumps with orjson.dumps in the HTTP handler to reduce
serialization latency for all LLM provider API calls, and replaced
json.dumps/json.loads with orjson in the streaming hot paths. This saves
~350ms per 1000-chunk streaming response (see the sketch after this list).
- Moved orjson from optional to required dependencies in pyproject.toml.
  This fixes a ModuleNotFoundError when importing litellm core modules,
  since orjson is used in llm_http_handler.py, which is imported by core litellm.
- Optimized jsonify_object() (24 call sites across the codebase).
- Optimized get_request_status() for metadata parsing.
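
For illustration, a minimal sketch of the swap described above, assuming a hypothetical post_to_provider call site (the real change lives in llm_http_handler.py and the streaming paths). orjson.dumps returns UTF-8 bytes, so the body can be handed to the HTTP client directly without a separate encode step:

```python
from typing import Any, Dict

import httpx
import orjson


def serialize_request_body(payload: Dict[str, Any]) -> bytes:
    # orjson.dumps returns UTF-8 bytes directly, skipping the
    # str -> bytes encode step that json.dumps(...).encode() needs.
    return orjson.dumps(payload)


def parse_stream_chunk(raw_chunk: bytes) -> Dict[str, Any]:
    # orjson.loads accepts bytes or str, so streaming chunks can be
    # parsed without decoding first.
    return orjson.loads(raw_chunk)


def post_to_provider(client: httpx.Client, url: str, payload: Dict[str, Any]) -> httpx.Response:
    # Hypothetical call site: pass the pre-serialized bytes to httpx via
    # `content=` instead of re-serializing with the stdlib json module.
    return client.post(
        url,
        content=serialize_request_body(payload),
        headers={"Content-Type": "application/json"},
    )
```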

These serialization changes reduce CPU usage and improve database write latency
(see the sketches after this list):
- serialize_object() now uses orjson for 3-5x faster dict serialization; it is
  used in 15+ files across the proxy and integrations.
- get_prompt_caching_cache_key() skips an encode/decode cycle.
- Cache key compatibility, circular reference detection, and the default=str
  fallback behavior are all preserved.
- Added @lru_cache(1024) to get_cooldown_cache_key() and changed all 4 call
  sites to use the cached method instead of recreating the string each time.
- Replaced f-strings with string concatenation for better performance on these
  hot paths.
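
A rough sketch of the serializer change (the real serialize_object() lives in the proxy utilities and may differ): passing default=str to orjson keeps the stdlib-style fallback of coercing non-serializable values to strings.

```python
from datetime import datetime
from typing import Any

import orjson


def serialize_object_sketch(obj: Any) -> str:
    # default=str mirrors json.dumps(obj, default=str): values orjson cannot
    # encode natively (e.g. Decimal, custom classes) are coerced to their
    # string form instead of raising. orjson returns bytes, so decode once
    # to keep a str-returning contract for callers.
    return orjson.dumps(obj, default=str).decode("utf-8")


# Example: datetime is handled natively by orjson; an arbitrary object
# falls back to str().
print(serialize_object_sketch({"ts": datetime(2025, 10, 25), "model": object()}))
```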
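
And a sketch of the cooldown-key caching; the real get_cooldown_cache_key() signature and key format in the router may differ, so the model_id parameter and key layout here are assumptions.

```python
from functools import lru_cache


@lru_cache(1024)
def get_cooldown_cache_key(model_id: str) -> str:
    # lru_cache memoizes the key per model_id, so repeated calls from the
    # four call sites reuse one cached string instead of rebuilding it on
    # every request; plain concatenation replaces an f-string on this hot
    # path. The "deployment:<id>:cooldown" layout is illustrative only.
    return "deployment:" + model_id + ":cooldown"
```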

Results: get_cooldown_cache_key dropped from 47MB to a few bytes and is no
longer a top consumer. The memory leak is still present.

Next: Optimize heavy memory consumers so memory leaks become more obvious.
@vercel

vercel bot commented Oct 26, 2025

The latest updates on your projects.

Project   Deployment   Preview   Comments   Updated (UTC)
litellm   Ready        Preview   Comment    Nov 1, 2025 8:22pm

