fix(client): retry 429 responses and honour Retry-After #566
RapidPoseidon wants to merge 1 commit into main
`_RETRYABLE_STATUS_CODES` previously included only 502/503/504, so a legitimate 429 (rate limit) from the backend bubbled up as a hard error instead of being retried. The exponential-backoff loop is already built for transient failures, and 429 fits the same shape.

Also: when the server sends a `Retry-After` header (common for 429), prefer that value over the exponential delay, bounded by a 60s cap to keep a hostile/broken server from extending the retry loop indefinitely. Integer-seconds form only; HTTP-date form falls back to exponential backoff.

Mirrored into `openapi/templates/rest.mustache`.

Session: https://session-bc38cc85.poseidon.rapidata.internal/

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: lino <lino@rapidata.ai>
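The delay selection described above can be sketched roughly as follows. This is a minimal illustration, not the SDK's actual code: only `_RETRYABLE_STATUS_CODES` and the 60s cap come from the description, while `should_retry`, `retry_delay`, `_MAX_RETRY_AFTER_SECONDS`, and the 1-second base delay are hypothetical names and values chosen for the example.

```python
from typing import Optional

# The set and the 60s cap come from the PR description.
_RETRYABLE_STATUS_CODES = {429, 502, 503, 504}
_MAX_RETRY_AFTER_SECONDS = 60.0

# Assumed base delay for the exponential fallback; the PR does not state it.
_BASE_DELAY_SECONDS = 1.0


def should_retry(status_code: int) -> bool:
    """Return True when the status code is in the retryable set."""
    return status_code in _RETRYABLE_STATUS_CODES


def retry_delay(attempt: int, retry_after: Optional[str]) -> float:
    """Seconds to wait before retry number `attempt` (0-based).

    Prefers an integer-seconds Retry-After value, capped at 60s.
    Anything else (HTTP-date form, missing header) falls back to
    exponential backoff.
    """
    if retry_after is not None:
        value = retry_after.strip()
        # Accept plain ASCII digits only; an HTTP-date fails this check.
        if value.isascii() and value.isdigit():
            return min(float(value), _MAX_RETRY_AFTER_SECONDS)
    return _BASE_DELAY_SECONDS * (2 ** attempt)
```

Under these assumptions, `retry_delay(0, "999")` is clamped to the cap, while an HTTP-date header on the third attempt (`attempt=2`) yields the exponential value instead.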
Code Review

Overview

This PR adds

Issues

Comment says "five" but there are four status codes
    # 429 = rate limit, 502/503/504 = transient upstream / gateway errors.
    # All five are safe to retry with exponential backoff.
    _RETRYABLE_STATUS_CODES = {429, 502, 503, 504}

The set has four entries, not five. Should read "All four".

Suggestions

Only 3 attempts is likely too few for rate limiting
No jitter on the

When multiple SDK clients hit a rate limit simultaneously they all receive the same

No automated tests

The PR description mentions walking branches manually. The

What's Good
Summary

The core logic is correct and the approach is sound. The main actionable items before merging are: fix the "five"→"four" comment, consider whether 3 attempts is sufficient for rate-limit retries, and ideally add at least a basic unit test for
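The jitter suggestion in this review could look roughly like the sketch below. It assumes the concern is that clients given the same delay all retry at the same instant. `with_jitter` and its `max_fraction` parameter are hypothetical and appear in neither the PR nor the review.

```python
import random
from typing import Optional


def with_jitter(
    delay: float,
    max_fraction: float = 0.25,
    rng: Optional[random.Random] = None,
) -> float:
    """Add a random extra wait of up to `max_fraction` of `delay`.

    The result is never below `delay`, so a server-provided
    Retry-After value is still respected as a minimum.
    """
    source = rng if rng is not None else random.Random()
    return delay + source.uniform(0.0, delay * max_fraction)
```

Adding the jitter on top of the delay, rather than sampling below it, keeps every retry at or after the time the server asked for while still spreading clients apart.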
Summary
`_RETRYABLE_STATUS_CODES = {502, 503, 504}` excluded 429. Any rate-limit response from the backend bubbled up as a hard error even though the SDK already has an exponential-backoff loop perfectly suited to it.

Fix

- Add `429` to `_RETRYABLE_STATUS_CODES`.
- When the server sends a `Retry-After` header (common for 429), prefer that value over the exponential delay. Bounded by 60 s so a hostile or broken server can't extend the retry loop beyond a reasonable ceiling.
- Mirrored into `openapi/templates/rest.mustache`.

Test plan

- `uv run pyright src/rapidata/rapidata_client` → 0 errors
- Branches walked manually: `Retry-After: 5` → 5s, `Retry-After: 999` → 60s cap, `Retry-After: "Wed, ..."` → exponential fallback, no header → exponential.

🔗 Session: https://session-bc38cc85.poseidon.rapidata.internal/