Building a Serverless Proxy for AI APIs
I wanted to add an AI chatbot to my side project.
The goal seemed simple. Take user messages. Send them to an LLM API. Stream the response back to the frontend. Keep my API key safe.
I ran into problems immediately.
First, I tried calling the API directly from the browser. The browser blocked the requests with CORS errors. Even if the calls had gone through, the API key would have shipped in client-side code, where anyone visiting the site could read it.
Next, I built an Express server. It worked, but I did not want to manage a VPS. I did not want to worry about uptime or crashes for a small project.
The best solution was a serverless function.
I used Vercel Edge Functions to act as a lightweight proxy. This approach keeps the API key secure in environment variables. It has no idle costs and requires no server management.
How the setup works:
- The function receives user messages.
- It calls the AI API with streaming enabled.
- It returns the stream directly to your frontend.
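
The three steps above can be sketched roughly as follows. This is a minimal illustration, not the exact code from my project: the upstream URL, the model name, and the request body shape are placeholders I made up for the example, so swap in whatever your provider documents. It relies only on the standard Request/Response/fetch API that edge runtimes expose.

```typescript
// Hypothetical upstream endpoint and model name (assumptions for this sketch).
const UPSTREAM_URL = "https://api.example-llm.com/v1/chat/completions";
const MODEL = "example-model";

type ChatMessage = { role: "user" | "assistant" | "system"; content: string };
type UpstreamInit = { method: string; headers: Record<string, string>; body: string };

// Build the upstream request. The API key is attached here, on the server side only.
function buildUpstreamRequest(
  messages: ChatMessage[],
  apiKey: string
): { url: string; init: UpstreamInit } {
  return {
    url: UPSTREAM_URL,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      // `stream: true` is an assumed flag; check your provider's request format.
      body: JSON.stringify({ model: MODEL, messages, stream: true }),
    },
  };
}

// Receive user messages, call the AI API, and hand the stream back unchanged.
async function handleChat(
  req: Request,
  apiKey: string,
  doFetch: typeof fetch = fetch
): Promise<Response> {
  if (req.method !== "POST") {
    return new Response("Method Not Allowed", { status: 405 });
  }

  let messages: ChatMessage[];
  try {
    const payload = (await req.json()) as { messages: ChatMessage[] };
    messages = payload.messages;
  } catch {
    return new Response("Invalid JSON body", { status: 400 });
  }

  const { url, init } = buildUpstreamRequest(messages, apiKey);
  const upstream = await doFetch(url, init);

  // Pass the upstream body through as-is so tokens reach the browser as they arrive.
  return new Response(upstream.body, {
    status: upstream.status,
    headers: {
      "Content-Type": upstream.headers.get("Content-Type") ?? "text/event-stream",
    },
  });
}
```

In a deployed function you would export a default handler that calls `handleChat` with the key read from an environment variable (for example `process.env.LLM_API_KEY`, a name chosen here for illustration) and opt the file into the edge runtime the way your platform documents. The key never appears in anything sent to the browser.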
This setup works well for MVPs and small projects. However, you should watch out for these four issues:
- Cold starts: You might see a small delay on the first request after the function has been idle.
- Timeouts: Most providers cap how long a function runs, so a long streamed response can be cut off before it finishes.
- Rate limiting: Users can spam your endpoint and increase your bill. I used Upstash to stop this.
- Error handling: You need to format AI errors so the frontend understands them.
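
To make the last two items concrete, here is a small sketch of both ideas. It is not the Upstash setup I used: it is a plain in-memory fixed-window limiter, which only illustrates the logic, because serverless instances do not share memory and a real deployment needs a shared store. The error codes and messages in `toClientError` are also my own invention for the example, not any provider's format.

```typescript
// Fixed-window rate limiter kept in memory (illustration only:
// separate function instances would each have their own counters).
class FixedWindowLimiter {
  private hits = new Map<string, { count: number; windowStart: number }>();
  private limit: number;
  private windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the caller identified by `key` may proceed at time `now`.
  allow(key: string, now: number = Date.now()): boolean {
    const entry = this.hits.get(key);
    // No record yet, or the previous window has ended: start a fresh window.
    if (!entry || now - entry.windowStart >= this.windowMs) {
      this.hits.set(key, { count: 1, windowStart: now });
      return true;
    }
    if (entry.count >= this.limit) return false;
    entry.count += 1;
    return true;
  }
}

// One predictable error shape for the frontend, whatever the upstream returned.
type ClientError = {
  error: { code: string; message: string; retryable: boolean };
};

// Map an upstream HTTP status to the hypothetical client-facing error shape.
function toClientError(status: number): ClientError {
  if (status === 429) {
    return { error: { code: "rate_limited", message: "Too many requests. Try again shortly.", retryable: true } };
  }
  if (status === 401 || status === 403) {
    return { error: { code: "upstream_auth", message: "The AI service rejected the request.", retryable: false } };
  }
  if (status >= 500) {
    return { error: { code: "upstream_unavailable", message: "The AI service is unavailable.", retryable: true } };
  }
  return { error: { code: "bad_request", message: "The request could not be processed.", retryable: false } };
}
```

A handler would call `limiter.allow(clientIp)` before contacting the AI API and return a 429 with `toClientError(429)` when it comes back false. The `retryable` flag lets the frontend decide whether to show a retry button.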
If you build a large production app, consider these upgrades:
- Use a queue like BullMQ to manage heavy traffic.
- Add caching for common questions to save money.
- Use a dedicated worker on Fly.io or Railway for more control.
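
As a rough idea of what caching common questions could look like, here is a minimal sketch. It assumes exact-match caching on a normalized question string with a time-to-live, and it lives in memory, so a real deployment would put the same logic on top of a shared store. The class and function names are made up for the example.

```typescript
// Normalize a question so trivial differences in case and spacing share one cache entry.
function normalizeQuestion(question: string): string {
  return question.trim().toLowerCase().replace(/\s+/g, " ");
}

// In-memory answer cache with a time-to-live (illustration only).
class AnswerCache {
  private store = new Map<string, { answer: string; expiresAt: number }>();
  private ttlMs: number;

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs;
  }

  // Return the cached answer, or undefined if it is missing or expired.
  get(question: string, now: number = Date.now()): string | undefined {
    const key = normalizeQuestion(question);
    const entry = this.store.get(key);
    if (!entry) return undefined;
    if (now >= entry.expiresAt) {
      this.store.delete(key);
      return undefined;
    }
    return entry.answer;
  }

  // Store an answer that stays valid for `ttlMs` milliseconds from `now`.
  set(question: string, answer: string, now: number = Date.now()): void {
    this.store.set(normalizeQuestion(question), {
      answer,
      expiresAt: now + this.ttlMs,
    });
  }
}
```

The proxy would check `cache.get(question)` before calling the AI API and call `cache.set(question, answer)` once a full response has been collected. Exact-match caching only helps when users really do ask the same thing, so it is worth measuring the hit rate before relying on it.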
Use a serverless proxy if you need to prototype quickly or run a small app. Avoid it if you need sub-100ms response times or if you handle millions of requests per day.
How do you handle AI API calls in your production apps?
Source: https://dev.to/__c1b9e06dc90a7e0a676b/building-a-serverless-proxy-for-ai-apis-lessons-learned-34lj