𝗛𝗼𝘄 𝗜 𝗦𝘁𝗼𝗽𝗽𝗲𝗱 𝗟𝗼𝘀𝗶𝗻𝗴 𝗔𝗣𝗜 𝗖𝗮𝗹𝗹𝘀 𝘁𝗼 𝗥𝗮𝘁𝗲 𝗟𝗶𝗺𝗶𝘁𝘀
My app dropped user requests. Logs showed 429 errors. Retry logic made it worse. The whole system stopped.
I used an AI API for text analysis. The API limited requests to 50 per minute. A fixed sleep before each retry failed. It blocked the worker, and the retries still hit the limit.
I tried these:
- Random wait times. They spread retries out but applied no backpressure.
- Tenacity library. It retried each call in isolation. It had no view of the global limit.
- Token bucket. Without a distributed lock, workers could not share one bucket safely.
I found a fix. I used Redis for central coordination. I used asyncio to stop blocking. I split the rate limiter from the retry logic. The limiter tracks the global quota. The retry logic handles small failures.
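A minimal sketch of that split, under stated assumptions. IntervalLimiter is a hypothetical in-process stand-in for the Redis-backed limiter, and TransientError, worker and run_jobs are illustrative names, not code from the original setup. Backoff is left out here to keep the wiring visible: the limiter owns the quota, the loop owns the retry decision, and a queue holds work until a slot is free.

```python
import asyncio


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (e.g. 429 or 5xx)."""


class IntervalLimiter:
    """In-process stand-in for the shared limiter: spaces calls evenly.

    Assumption: in a multi-worker setup this role is played by Redis,
    so that every process sees the same quota.
    """

    def __init__(self, per_minute=50):
        self._interval = 60.0 / per_minute
        self._next_slot = 0.0

    async def acquire(self):
        now = asyncio.get_running_loop().time()
        wait = max(0.0, self._next_slot - now)
        self._next_slot = max(now, self._next_slot) + self._interval
        await asyncio.sleep(wait)  # yields to the event loop; nothing blocks


async def worker(queue, limiter, send, results, max_attempts=3):
    """Pull jobs from the queue; the limiter gates every attempt, retries included."""
    while True:
        job = await queue.get()
        try:
            for attempt in range(1, max_attempts + 1):
                await limiter.acquire()      # quota check, owned by the limiter
                try:
                    results[job] = await send(job)
                    break
                except TransientError:       # retry decision, owned by this loop
                    if attempt == max_attempts:
                        results[job] = None  # give up on this job only
        finally:
            queue.task_done()


async def run_jobs(jobs, send, limiter, workers=4):
    queue = asyncio.Queue()
    for job in jobs:
        queue.put_nowait(job)
    results = {}
    tasks = [
        asyncio.create_task(worker(queue, limiter, send, results))
        for _ in range(workers)
    ]
    await queue.join()                       # wait until every queued job is handled
    for task in tasks:
        task.cancel()
    await asyncio.gather(*tasks, return_exceptions=True)
    return results
```

Because every attempt passes through limiter.acquire(), a retry can never bypass the quota. Swapping the stand-in for a shared limiter should not require touching the retry loop.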
Redis sorted sets track timestamps. If the code hits the limit, it backs off. I added exponential backoff with jitter. This stops all retries from hitting the API at once.
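A minimal sketch of the sorted-set sliding window, not the exact code behind this post. It assumes a client that exposes the redis-py asyncio methods zremrangebyscore, zadd, zcard, zrem and expire (for example redis.asyncio.Redis). The class name, the key name and the add-then-count order are assumptions made for illustration.

```python
import asyncio
import time
import uuid


class SlidingWindowLimiter:
    """Sliding-window log kept in one Redis sorted set (score = timestamp)."""

    def __init__(self, client, key="api:calls", limit=50, window=60.0, clock=time.time):
        self.client = client  # assumed: redis.asyncio.Redis or a compatible object
        self.key = key        # hypothetical key name
        self.limit = limit
        self.window = window
        self.clock = clock

    async def try_acquire(self):
        """Return True if one more call fits in the current window."""
        now = self.clock()
        # Unique member, so two calls with the same timestamp do not collide.
        member = f"{now}:{uuid.uuid4().hex}"
        # Drop entries that have aged out of the window.
        await self.client.zremrangebyscore(self.key, 0, now - self.window)
        # Record the attempt first, then count. Under concurrency this errs
        # on the side of rejecting a call rather than exceeding the limit.
        await self.client.zadd(self.key, {member: now})
        count = await self.client.zcard(self.key)
        # Let an idle key expire instead of living forever.
        await self.client.expire(self.key, int(self.window) + 1)
        if count > self.limit:
            await self.client.zrem(self.key, member)
            return False
        return True

    async def acquire(self, poll=0.25):
        """Wait, without blocking the event loop, until a slot is free."""
        while not await self.try_acquire():
            await asyncio.sleep(poll)
```

The four Redis calls here are separate round trips. Running them in a MULTI/EXEC transaction or a Lua script would make the check atomic and cut the latency. Timestamps come from each worker's clock, so large clock skew between hosts would distort the window.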
Lessons learned:
- Redis adds a small delay. Every limiter check is a network round trip.
- Queue work instead of failing.
- Only retry 429 and 5xx errors.
- Log every attempt.
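Two of these lessons, plus backoff with jitter, can be sketched in a few lines. This is an illustration under assumptions, not a production implementation: ApiError is a hypothetical exception carrying the HTTP status, and call_with_retry, is_retryable and backoff_delay are illustrative names. The jitter used is "full jitter", a uniform draw between zero and the exponential ceiling.

```python
import asyncio
import logging
import random

log = logging.getLogger("api_retry")


class ApiError(Exception):
    """Hypothetical error type carrying the HTTP status of a failed call."""

    def __init__(self, status):
        super().__init__(f"HTTP {status}")
        self.status = status


def is_retryable(status):
    # Retry only rate-limit (429) and server-side (5xx) responses.
    return status == 429 or 500 <= status <= 599


def backoff_delay(attempt, base=1.0, cap=60.0):
    # Exponential backoff with full jitter:
    # uniform in [0, min(cap, base * 2**attempt)], attempt counted from 0.
    return random.uniform(0.0, min(cap, base * 2 ** attempt))


async def call_with_retry(call, max_attempts=5, sleep=asyncio.sleep):
    """Await call() until it succeeds, fails for good, or runs out of attempts."""
    for attempt in range(max_attempts):
        try:
            result = await call()
            log.info("attempt %d succeeded", attempt + 1)
            return result
        except ApiError as exc:
            last_attempt = attempt == max_attempts - 1
            if not is_retryable(exc.status) or last_attempt:
                log.error("attempt %d failed with %d, giving up",
                          attempt + 1, exc.status)
                raise
            delay = backoff_delay(attempt)
            log.warning("attempt %d failed with %d, retrying in %.2fs",
                        attempt + 1, exc.status, delay)
            await sleep(delay)  # asyncio.sleep by default, so the loop stays free
```

A 400 or 404 is raised on the first attempt because repeating the same request would fail the same way. The sleep parameter exists only so tests can swap in a no-op.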
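Use aiolimiter or tenacity for production. aiolimiter rate-limits inside one asyncio process. tenacity handles retries. Neither shares a quota across workers on its own. A circuit breaker is a good next step.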
How do you handle this? Do you use token buckets or sliding windows?
Source: https://dev.to/__c1b9e06dc90a7e0a676b/how-i-stopped-losing-api-calls-to-rate-limits-and-you-can-too-137k