fix(fetchers): honor Retry-After header on 429/503 responses#1670
fix(fetchers): honor Retry-After header on 429/503 responses#1670Sanjays2402 wants to merge 2 commits into
Conversation
LLM and embedding fetchers in memos-local-plugin classified 429 as transient and retried with exponential backoff, but never inspected the Retry-After response header. Under provider rate limiting, retries fired before the upstream-requested cooldown, extending throttling and wasting paid API calls. Parse Retry-After per RFC 7231 §7.1.3 (delta-seconds or HTTP-date), cap to a sane maximum to avoid hostile pinning, and prefer it over exponential backoff on 429/503. Fall back to the existing backoff when the header is absent or unparseable. Fixes MemTensor#1620
There was a problem hiding this comment.
Pull request overview
This PR updates the LLM and embedding HTTP fetchers in apps/memos-local-plugin to honor the upstream Retry-After header on 429/503 responses, preferring the server-provided cooldown over the existing exponential backoff (with a 60s cap), and adds unit coverage for the new behavior.
Changes:
- Added a shared
parseRetryAfterMs(resp, capMs)helper that supports delta-seconds and HTTP-date formats. - Updated both LLM and embedding fetchers to use
Retry-After(when present/parseable) for 429/503 retry sleeps, otherwise falling back to exponential backoff. - Added/extended unit tests to cover parsing behavior and fetcher retry timing behavior.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| apps/memos-local-plugin/core/util/retry-after.ts | New helper to parse and cap Retry-After into a millisecond delay. |
| apps/memos-local-plugin/core/llm/fetcher.ts | Uses Retry-After delay for 429/503 retries; refactors backoff into backoffMs() + sleep(). |
| apps/memos-local-plugin/core/embedding/fetcher.ts | Mirrors LLM fetcher behavior to honor Retry-After for 429/503 retries. |
| apps/memos-local-plugin/tests/unit/util/retry-after.test.ts | Adds unit tests for delta-seconds, HTTP-date, malformed values, and capping behavior. |
| apps/memos-local-plugin/tests/unit/llm/fetcher.test.ts | Adds timing-based tests asserting Retry-After is honored or falls back to backoff. |
| apps/memos-local-plugin/tests/unit/embedding/fetcher.test.ts | Adds timing-based tests asserting Retry-After is honored or falls back to backoff. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| * `null` when the header is absent, malformed, or non-positive — callers | ||
| * should fall back to their existing backoff strategy. |
| import { parseRetryAfterMs } from "../util/retry-after.js"; | ||
| import type { LlmProviderLogger, LlmProviderName } from "./types.js"; | ||
|
|
||
| /** Hard ceiling on any single retry sleep — Retry-After or otherwise. */ |
| /** Hard ceiling on any single retry sleep — Retry-After or otherwise. */ | ||
| const MAX_RETRY_AFTER_MS = 60_000; |
…0 is valid Address Copilot review on MemTensor#1670: - parseRetryAfterMs JSDoc: document that 0 (retry immediately) is a valid value; only negative or unparseable values return null. - LLM and embedding fetchers: cap the final sleep duration with MAX_RETRY_AFTER_MS regardless of whether it came from Retry-After or exponential backoff. Previously backoff could exceed 60s on high attempt counts (config allows up to 10 retries).
|
Thanks for the review — addressed in the latest commit:
Diff is small and behavior-only-on-the-margin (only changes anything when |
|
This PR has been automatically marked as stale due to inactivity. |
|
Automated Test Results: FAILED Cloud test-engine rerun against
Failed cases:
⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯[2/2]⎯ Do not merge until this is fixed and cloud tests pass. |
Summary
The LLM and embedding fetchers in
apps/memos-local-pluginclassified HTTP 429 as transient and retried with exponential backoff, but never read theRetry-Afterheader. Under provider rate limiting, retries fired before the upstream-requested cooldown, extending throttling and wasting paid API calls.This PR parses
Retry-Afterper RFC 7231 §7.1.3 (delta-seconds or HTTP-date), caps the value to a sane maximum, and prefers it over exponential backoff on 429/503. Falls back to the existing backoff when the header is absent or unparseable.Changes
apps/memos-local-plugin/core/util/retry-after.tsexportingparseRetryAfterMs(resp, capMs).apps/memos-local-plugin/core/llm/fetcher.ts: parse Retry-After on 429/503 and use it as the sleep duration; small refactor splitbackoff()intobackoffMs()+sleep()so both code paths share one timer.apps/memos-local-plugin/core/embedding/fetcher.ts: same treatment, mirroring the LLM fetcher.tests/unit/util/retry-after.test.ts— covers integer seconds, HTTP-date, capping, malformed, missing, past-date, and negative inputs.tests/unit/llm/fetcher.test.tsandtests/unit/embedding/fetcher.test.ts— three new cases each: integer Retry-After is honored, HTTP-date is honored, malformed value falls back to backoff.Behavior
Retry-After: 5→ sleep ~5s instead of next backoff step.Retry-After: Wed, 21 Oct 2025 07:28:00 GMT→ sleep ~(date - now), capped.MAX_RETRY_AFTER_MS) to prevent hostile or buggy servers from pinning the client indefinitely.Notes
tsc --noEmitlocally (nonode_modulesin this worktree), so types were verified by inspection. The new helper has explicit return typenumber | nulland matches the import-style (.jsextension,Bundlerresolution) used elsewhere in the codebase.tests/unit/embedding/fetcher.test.ts; sleep upper bounds are generous (≥900 ms forRetry-After: 1) to avoid flakes on slow CI.Fixes #1620