Commit 47ba8b0

fix: Don't retry for non-recoverable server http errors
This specifically addresses the case where a server returning Not Implemented (code 501) would receive two more attempts for the same request, even though there is no reason to expect it to serve the request any better on further attempts.

This patch narrows the set of >=500 codes that are retried to those where there seems to be a chance of recovery on further attempts. These codes are now listed explicitly instead of being matched by a broad >=500 filter.

For all possible server error codes, see e.g. https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Status#server_error_responses

Signed-off-by: Ihar Hrachyshka <[email protected]>
1 parent 6155c31 commit 47ba8b0

1 file changed, +5 -1 lines changed
src/llama_stack_client/_base_client.py

@@ -734,7 +734,11 @@ def _should_retry(self, response: httpx.Response) -> bool:
             return True
 
         # Retry internal errors.
-        if response.status_code >= 500:
+        if response.status_code in (
+            502,  # Bad Gateway
+            503,  # Service Unavailable
+            504,  # Gateway Timeout
+        ):
             log.debug("Retrying due to status code %i", response.status_code)
             return True
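The effect of swapping the broad >=500 filter for an explicit list can be sketched in isolation. This is a minimal standalone illustration, not the client's actual code: the names RETRYABLE_SERVER_ERRORS and is_retryable_server_error are hypothetical, and only the three status codes are taken from the diff. Note that under this list a plain 500 Internal Server Error is no longer retried either, along with 501.

```python
from http import HTTPStatus

# Assumption: this set mirrors the explicit list in the diff; the name is
# hypothetical and does not exist in llama_stack_client.
RETRYABLE_SERVER_ERRORS = frozenset(
    {
        HTTPStatus.BAD_GATEWAY,  # 502
        HTTPStatus.SERVICE_UNAVAILABLE,  # 503
        HTTPStatus.GATEWAY_TIMEOUT,  # 504
    }
)


def is_retryable_server_error(status_code: int) -> bool:
    """Return True only for server errors where a later attempt may succeed."""
    return status_code in RETRYABLE_SERVER_ERRORS


if __name__ == "__main__":
    # Show the decision for each standard code from 500 to 505.
    for code in range(500, 506):
        print(code, HTTPStatus(code).phrase, is_retryable_server_error(code))
```

An explicit allow-list keeps the retry decision conservative: any server error code not named in the set, including ones added to the HTTP registry later, fails fast instead of consuming further attempts.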
