Rate limit /chat/message: per-user + per-IP defence-in-depth #48
Open
vahid-ahmadi wants to merge 1 commit into main
Two layers protect the only expensive endpoint:
- Per user (or per IP if anonymous): 5/min and 60/hour. Authenticated
clients send their user id in `X-User-Id`; the limiter keys by
`user:<id>` so signed-in users aren't blocked by anon flooders on
shared IPs.
- Per IP defence-in-depth: 30/min, regardless of who is sending.
All three limits are env-tunable (`RATE_LIMIT_CHAT_PER_MIN`,
`RATE_LIMIT_CHAT_PER_HOUR`, `RATE_LIMIT_CHAT_IP_PER_MIN`).
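The keying and env-override behaviour described above can be sketched as follows. This is a minimal, stdlib-only illustration and not the PR's actual code: the function names, the `ip:<addr>` key format, the `unknown` fallback, and the plain-mapping signature are assumptions; only the `user:<id>` prefix, the `X-User-Id` header, the env-var names, and the 5/60/30 defaults come from the description.

```python
import os
from typing import Dict, Mapping, Optional


def _env_int(name: str, default: int) -> int:
    """Read a positive integer from the environment, else use the default."""
    raw = os.environ.get(name, "").strip()
    try:
        value = int(raw)
    except ValueError:
        return default
    return value if value > 0 else default


def chat_limits() -> Dict[str, str]:
    """Return the three limits as "N/minute"-style rate strings."""
    return {
        "per_min": f"{_env_int('RATE_LIMIT_CHAT_PER_MIN', 5)}/minute",
        "per_hour": f"{_env_int('RATE_LIMIT_CHAT_PER_HOUR', 60)}/hour",
        "ip_per_min": f"{_env_int('RATE_LIMIT_CHAT_IP_PER_MIN', 30)}/minute",
    }


def chat_rate_limit_key(headers: Mapping[str, str], client_ip: Optional[str]) -> str:
    """Key by user id when X-User-Id is present and non-empty, else by IP."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    user_id = (lowered.get("x-user-id") or "").strip()
    if user_id:
        return f"user:{user_id}"
    # "ip:" prefix and "unknown" fallback are assumptions for this sketch.
    return f"ip:{client_ip or 'unknown'}"
```

In a real slowapi setup the key function would receive the request object rather than a headers mapping; the mapping form is used here only to keep the sketch dependency-free.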
The 429 handler returns JSON with `retry_after_seconds` and a matching
`Retry-After` header. Frontend handles 429 with an inline assistant
message ("you're sending messages a bit fast — please wait ~Ns") rather
than the paywall flow used for 402.
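As a rough illustration of the 429 contract, the sketch below builds a status code, headers, and JSON body that agree on the wait time. The function name, the `error` field, and the rounding rule are assumptions; only `retry_after_seconds` and the `Retry-After` header are taken from the description.

```python
import json
import math
from typing import Dict, Tuple


def build_rate_limit_response(seconds_until_reset: float) -> Tuple[int, Dict[str, str], str]:
    """Return (status, headers, body) for a rate-limited request."""
    # Round up and never advertise zero, so clients always wait long enough.
    retry_after = max(1, math.ceil(seconds_until_reset))
    body = {
        "error": "rate_limited",  # hypothetical field, not from the PR
        "retry_after_seconds": retry_after,
    }
    headers = {
        "Content-Type": "application/json",
        # Retry-After carries the same value as the JSON body, in whole seconds.
        "Retry-After": str(retry_after),
    }
    return 429, headers, json.dumps(body)
```

Keeping both values derived from one variable avoids the header and body drifting apart, which is what the frontend's "please wait ~Ns" message relies on.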
Storage is slowapi's default in-memory backend, so each container keeps
its own counters. With Modal's `max_containers=10` and `concurrent=100`,
an attacker whose requests land on different containers is counted
separately on each, so the effective ceiling can reach roughly 10x the
configured limit. The IP layer is therefore approximate but adequate.
Swap to Redis via `storage_uri` if we need cross-container accuracy later.
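To make the cross-container caveat concrete, here is a small simulation under stated assumptions: a simplified fixed-window counter (not slowapi's implementation), round-robin routing as a worst case, and a made-up client key. It only shows how per-container counters add up within one window.

```python
from itertools import cycle
from typing import Dict


class InMemoryCounter:
    """Simplified per-container counter for a single time window."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.hits: Dict[str, int] = {}

    def allow(self, key: str) -> bool:
        count = self.hits.get(key, 0)
        if count >= self.limit:
            return False
        self.hits[key] = count + 1
        return True


def allowed_in_one_window(containers: int, limit: int, requests: int) -> int:
    """Count requests admitted when one client is routed round-robin."""
    counters = [InMemoryCounter(limit) for _ in range(containers)]
    routed = cycle(counters)  # assumption: evenly spread, worst case
    key = "ip:203.0.113.7"  # hypothetical client key
    return sum(next(routed).allow(key) for _ in range(requests))


single = allowed_in_one_window(containers=1, limit=30, requests=1000)
spread = allowed_in_one_window(containers=10, limit=30, requests=1000)
```

With one container the 30/min cap holds (`single` is 30); with ten containers each admitting 30, the same client gets 300 through (`spread` is 300). A shared store such as Redis would collapse these into one counter.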
Tests: TestRateLimitConfig covers the key function (user/IP precedence,
empty header fallback, env-var overrides). End-to-end limit triggering
isn't tested — it would require precise timing and per-test storage
resets. conftest.py raises test limits well above pytest workload so
the existing chat tests don't trip the production 5/min cap.
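One way the conftest.py override could look is sketched below. The ceiling values and the helper name are hypothetical, and the sketch assumes the app reads these variables when it is imported, so the override has to run before that import.

```python
import os
from typing import MutableMapping

# Hypothetical ceilings, chosen only to sit far above what a test run sends.
TEST_RATE_LIMITS = {
    "RATE_LIMIT_CHAT_PER_MIN": "10000",
    "RATE_LIMIT_CHAT_PER_HOUR": "100000",
    "RATE_LIMIT_CHAT_IP_PER_MIN": "10000",
}


def raise_test_rate_limits(environ: MutableMapping[str, str] = os.environ) -> None:
    """Overwrite the rate-limit env vars with generous test values."""
    for name, value in TEST_RATE_LIMITS.items():
        environ[name] = value


# Run at import time so the values are set before the app module loads.
raise_test_rate_limits()
```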
Closes #46
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Adds two-layer rate limiting on `POST /chat/message` — the only expensive endpoint. Other routes are cheap reads and stay unlimited.
Closes #46.
Implementation notes
Test plan
Out of scope