genai: respect retry_delay from Gemini 429 ResourceExhausted errors in retry logic #946

Open
wants to merge 3 commits into
base: main

SYED-M-HUSSAIN
Contributor

PR Description
This PR fixes the retry logic in the LangChain Google integration so that it respects the retry_delay field the Gemini API returns with ResourceExhausted (429) errors. Previously, the retry mechanism ignored the server-suggested delay, which led to premature retries and further quota errors. The fix introduces a custom wait strategy that uses the server-provided delay when available and falls back to exponential backoff otherwise. This aligns client retry behavior with Gemini’s rate-limiting guidance and is intended to improve reliability and reduce quota exhaustion errors.


Relevant issues


Type
🐛 Bug Fix


Changes (optional)

  • Added wait_with_server_retry_delay custom wait strategy class
  • Updated _create_retry_decorator to use this custom wait handler
  • Retained the fallback to exponential backoff when no retry_delay is present
  • Added logging to show when the server delay is respected
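
For readers who want a picture of the approach, here is a minimal sketch of such a wait strategy, not the code from this PR. It assumes a tenacity-style retry loop that calls the wait object with a retry state exposing `outcome` (a `Future`) and `attempt_number`. `FakeResourceExhausted`, `FakeRetryState`, `failed_state`, the `retry_delay` attribute on the exception, and the constructor parameters are all hypothetical stand-ins so the example runs on the standard library alone. How the real Gemini error exposes the delay may differ.

```python
import logging
from concurrent.futures import Future
from dataclasses import dataclass
from typing import Optional

logger = logging.getLogger(__name__)


class FakeResourceExhausted(Exception):
    """Hypothetical stand-in for a 429 error that may carry a server delay."""

    def __init__(self, message: str, retry_delay: Optional[float] = None):
        super().__init__(message)
        self.retry_delay = retry_delay  # seconds suggested by the server, if any


@dataclass
class FakeRetryState:
    """Hypothetical stand-in with the two fields the wait strategy reads."""

    attempt_number: int
    outcome: Future


class wait_with_server_retry_delay:
    """Prefer the server-suggested delay; otherwise use exponential backoff."""

    def __init__(self, multiplier: float = 1.0, min_seconds: float = 1.0,
                 max_seconds: float = 60.0):
        self.multiplier = multiplier
        self.min_seconds = min_seconds
        self.max_seconds = max_seconds

    def __call__(self, retry_state) -> float:
        outcome = retry_state.outcome
        exc = outcome.exception() if outcome is not None else None
        delay = getattr(exc, "retry_delay", None)  # assumed attribute name
        if delay is not None and delay > 0:
            logger.info("Respecting server retry_delay of %.1fs", delay)
            return float(delay)
        # Fallback: multiplier * 2^(attempt - 1), clamped to [min, max].
        backoff = self.multiplier * 2 ** (retry_state.attempt_number - 1)
        return max(self.min_seconds, min(backoff, self.max_seconds))


def failed_state(attempt_number: int, exc: BaseException) -> FakeRetryState:
    """Build a retry state whose last attempt failed with `exc`."""
    future: Future = Future()
    future.set_exception(exc)
    return FakeRetryState(attempt_number, future)


if __name__ == "__main__":
    wait = wait_with_server_retry_delay()
    print(wait(failed_state(1, FakeResourceExhausted("quota", retry_delay=17.0))))
    print(wait(failed_state(3, FakeResourceExhausted("quota"))))
```

In this sketch the server delay is returned uncapped, on the assumption that the server's hint should win over the client's own maximum; a real implementation might reasonably choose to bound it.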

@SYED-M-HUSSAIN
Contributor Author

Hey @lkuligin, can you please merge this? Thanks.

@lkuligin
Collaborator

@SYED-M-HUSSAIN linter is failing

Development

Successfully merging this pull request may close these issues.

genai: retry_delay is not used to guide retry interval
2 participants