How to handle rate limits when you can't change the rate limit at the external model provider? #10780
Unanswered
manuel-koch asked this question in Help
Replies: 0 comments
I just started using Claude models with Continue and immediately hit rate limit errors like the following when asking a chat question that involves multiple tool calls (e.g. searching the codebase for a topic or issue).
{"type":"rate_limit_error","message":"This request would exceed your organization's rate limit of 10,000 input tokens per minute (org: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx, model: claude-sonnet-4-6). For details, refer to: https://docs.claude.com/en/api/rate-limits. You can see the response headers for current usage. Please reduce the prompt length or the maximum tokens requested, or try again later. You may also contact sales at https://www.anthropic.com/contact-sales to discuss your options for a rate limit increase."}
Is there a way to "slow down" Continue, i.e. reduce the rate at which it sends requests to the model provider?
I can't change the per-minute rate limit on the Claude account, but getting results more slowly would be fine for me.
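To make "slower is fine" concrete, here is a minimal sketch of the reactive approach: detect the error and wait before retrying. It is standalone Python, not Continue code or configuration. The function names are hypothetical, the body shape is taken from the error quoted above, and it assumes the provider may send a Retry-After style value in seconds; when it does not, the sketch falls back to capped exponential backoff.

```python
import json
from typing import Optional


def is_rate_limit_error(body: str) -> bool:
    """Return True if the body is a JSON object whose type is 'rate_limit_error'.

    Assumes the flat shape shown in the error quoted above.
    """
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    return isinstance(payload, dict) and payload.get("type") == "rate_limit_error"


def retry_delay_seconds(
    attempt: int,
    retry_after: Optional[str] = None,
    base: float = 2.0,
    cap: float = 60.0,
) -> float:
    """Seconds to wait before retry number `attempt` (0-based).

    Prefers a server-provided delay (assumed to be a number of seconds);
    otherwise doubles the delay on each attempt, up to `cap`.
    """
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # Unparseable header value: fall back to backoff.
    return min(cap, base * (2 ** attempt))
```

A caller would sleep for `retry_delay_seconds(attempt, header_value)` whenever `is_rate_limit_error(body)` is true and then resend the same request. The cap of 60 seconds is chosen only because the limit in question is per minute.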
Looking at the recent issues, I saw many that concern rate limits, so I assume this is a general problem for a lot of users.
How do you work around such external limitations?
Is there a configuration option that keeps Continue below a given rate limit while still allowing a chat message to trigger multiple tool calls and automatic follow-up actions by the model?
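The kind of behavior being asked for could be sketched as a client-side token budget that delays each request until it fits under the per-minute limit. This is an illustration only: `TokenBudget` is a hypothetical class, not a Continue setting, the 60-second sliding window is an assumption that may not match how the provider actually counts usage, and the caller is assumed to have some estimate of each request's input tokens.

```python
import time
from collections import deque
from typing import Callable, Deque, Tuple


class TokenBudget:
    """Sliding-window budget of input tokens (hypothetical helper)."""

    def __init__(
        self,
        tokens_per_minute: int = 10_000,
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.limit = tokens_per_minute
        self.window = window_seconds
        self.clock = clock
        self.sent: Deque[Tuple[float, int]] = deque()  # (timestamp, tokens)

    def _expire(self, now: float) -> None:
        # Drop requests that have left the window.
        while self.sent and now - self.sent[0][0] >= self.window:
            self.sent.popleft()

    def wait_time(self, tokens: int) -> float:
        """Seconds to wait until a request of `tokens` fits in the window."""
        if tokens > self.limit:
            raise ValueError("request alone exceeds the per-minute budget")
        now = self.clock()
        self._expire(now)
        used = sum(n for _, n in self.sent)
        if used + tokens <= self.limit:
            return 0.0
        # Find the oldest request whose expiry frees enough budget.
        for timestamp, n in self.sent:
            used -= n
            if used + tokens <= self.limit:
                return max(0.0, timestamp + self.window - now)
        return 0.0  # Not reached, because tokens <= limit.

    def record(self, tokens: int) -> None:
        """Note that a request with `tokens` input tokens was just sent."""
        self.sent.append((self.clock(), tokens))
```

Before each model call in a tool-call loop, the client would run `time.sleep(budget.wait_time(n))` and then `budget.record(n)`. Pacing of this kind only helps when each individual request is smaller than the limit; a single prompt above 10,000 input tokens would still be rejected no matter how long the client waits.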