I Built a Streaming AI Chat Client Without Losing My Mind

I wanted to build a chat interface where the AI responds in real-time. I wanted that smooth typewriter effect.

It was harder than I thought. The problem was not the AI. The problem was the pipeline between the API and the browser.

I tried three different ways to solve this.

  1. The Wait Method: I called the API and waited for the full response before showing it. This worked, but the UI appeared frozen for several seconds. Users thought the app was broken and clicked "Send" repeatedly. This was a bad user experience.

  2. The Polling Method: I considered having the server return a job ID, with the client asking for updates every second. This meant tracking job state on the server, and updates arrived in uneven chunks. It was not smooth.

  3. The WebSocket Method: I tried Socket.IO. This added massive complexity. I had to manage reconnections, heartbeats, and state synchronization. For a simple chat app, WebSockets were overkill.

The solution was simpler: Server-Sent Events (SSE).

Many AI APIs can already stream their responses as SSE over plain HTTP. I stopped looking for complex tools and used the native fetch API.

By using response.body.getReader(), I read the stream of bytes directly. I parsed the SSE protocol myself. This approach keeps the UI responsive and uses standard HTTP.
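
Here is a minimal sketch of that approach in TypeScript, not the exact code from my app. The names createSseParser and streamSse are made up for illustration, and the endpoint URL and request body are whatever your API expects. It handles only the "data:" field and treats a blank line as the end of an event, which is the subset of SSE a chat stream typically needs.

```typescript
// Hypothetical helper: feed it decoded text chunks, get back the complete
// `data` payloads seen so far. Incomplete events stay in the buffer.
function createSseParser(): (chunk: string) => string[] {
  let buffer = "";
  return (chunk: string): string[] => {
    // Simplification: CRLF is normalized to LF; a lone CR is not handled.
    buffer = (buffer + chunk).replace(/\r\n/g, "\n");
    const events: string[] = [];
    let end = buffer.indexOf("\n\n");
    while (end !== -1) {
      const rawEvent = buffer.slice(0, end);
      buffer = buffer.slice(end + 2);
      const dataLines: string[] = [];
      for (const line of rawEvent.split("\n")) {
        if (line.startsWith("data:")) {
          // The SSE format strips one optional space after the colon.
          const value = line.slice(5);
          dataLines.push(value.startsWith(" ") ? value.slice(1) : value);
        }
        // Comment lines (starting with ":") and other fields are ignored here.
      }
      if (dataLines.length > 0) events.push(dataLines.join("\n"));
      end = buffer.indexOf("\n\n");
    }
    return events;
  };
}

// Hypothetical wrapper: reads the response body chunk by chunk and calls
// onData once per complete SSE event.
async function streamSse(
  url: string,
  init: RequestInit,
  onData: (data: string) => void
): Promise<void> {
  const response = await fetch(url, init);
  if (!response.ok || !response.body) {
    throw new Error(`Stream request failed with status ${response.status}`);
  }
  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  const parse = createSseParser();
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    // { stream: true } keeps multi-byte characters intact across chunk boundaries.
    for (const data of parse(decoder.decode(value, { stream: true }))) {
      onData(data);
    }
  }
}
```

Each payload arrives as a raw string. If your API sends JSON in every event, parse it inside onData and append the text delta to the message on screen.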

Why this works:

  • No WebSocket server needed.
  • No complex reconnection logic.
  • It works with any API that supports SSE.
  • You can stop the stream easily using AbortController.
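
To make the AbortController point concrete, here is a rough sketch of a "Stop" button backend on the client side. StreamSession and sendMessage are hypothetical names, and "/api/chat" is a placeholder endpoint, not a real API. For brevity it appends the raw decoded text rather than parsing SSE events.

```typescript
// Hypothetical helper: owns the AbortController for the request in flight.
class StreamSession {
  private current: AbortController | null = null;

  // Cancel any stream still running, then return a fresh signal for the next one.
  begin(): AbortSignal {
    this.stop();
    this.current = new AbortController();
    return this.current.signal;
  }

  // Safe to call when nothing is running.
  stop(): void {
    this.current?.abort();
    this.current = null;
  }
}

// Usage sketch: streams a reply and returns whatever text arrived,
// even if the user stopped the stream halfway.
async function sendMessage(session: StreamSession, prompt: string): Promise<string> {
  const signal = session.begin();
  let text = "";
  try {
    const response = await fetch("/api/chat", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ prompt }),
      signal,
    });
    if (!response.body) return text;
    const reader = response.body.getReader();
    const decoder = new TextDecoder();
    while (true) {
      const { done, value } = await reader.read();
      if (done) break;
      text += decoder.decode(value, { stream: true });
    }
  } catch (err) {
    // fetch() and reader.read() reject with an AbortError once the signal fires.
    if (!(err instanceof Error && err.name === "AbortError")) throw err;
  }
  return text;
}
```

Wire session.stop() to the Stop button. Because begin() cancels the previous request first, a second click on "Send" replaces the running stream instead of stacking a new one on top of it.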

There are trade-offs.

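  • The server cannot push updates on its own. Every stream starts with a client request.
  • If the connection drops mid-stream, the rest of the response is lost. With fetch there is no automatic resume, so you have to retry the request.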

If you build a chat app, avoid WebSockets unless you need bidirectional communication. Stick to HTTP streaming. It is simpler and more reliable.

What is your streaming strategy? Do you use WebSockets or SSE? Tell me in the comments.

Source: https://dev.to/__c1b9e06dc90a7e0a676b/i-built-a-streaming-ai-chat-client-without-losing-my-mind-3gi0