𝗕𝘂𝗶𝗹𝗱𝗶𝗻𝗴 𝗮 𝗥𝗔𝗚 𝗣𝗶𝗽𝗲𝗹𝗶𝗻𝗲 𝗙𝗿𝗼𝗺 𝗦𝗰𝗿𝗮𝘁𝗰𝗵

I wanted to add an AI assistant to SmartQueue.

SmartQueue is a task queue I built in Go for IT support tickets. I did not want a generic AI. A generic model does not know your specific password reset rules or your outage runbooks.

I needed Retrieval-Augmented Generation (RAG). This pulls facts from your documents first. Then it gives those facts to the model as context.
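
The retrieve-then-prompt step described above can be sketched as follows. This is a minimal illustration, not the SmartQueue code: the function name `build_prompt`, the instruction wording, and the sample document are all assumptions made for the example.

```python
def build_prompt(question: str, retrieved_docs: list[str]) -> str:
    """Place retrieved passages ahead of the question so the model answers from them."""
    # Number each passage so the model (and a reader) can refer back to it.
    context = "\n\n".join(
        f"[{i}] {doc}" for i, doc in enumerate(retrieved_docs, start=1)
    )
    return (
        "Answer using only the context below. "
        "If the context does not cover the question, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Hypothetical knowledge-base entry, for illustration only.
docs = ["Password resets require manager approval for admin accounts."]
prompt = build_prompt("How do I reset an admin password?", docs)
```

The resulting string is what would be sent to the model. The retrieval step that produces `retrieved_docs` is a separate concern.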

Here is what I learned while building this pipeline.

𝗧𝗵𝗲 𝗗𝗲𝗽𝗹𝗼𝘆𝗺𝗲𝗻𝘁 𝗙𝗮𝗶𝗹𝘂𝗿𝗲

My first version used ChromaDB for vector search. It worked locally. It failed during deployment.

I ran everything in one container on Hugging Face Spaces. This included Redis, a Go API, workers, a FastAPI service, and ChromaDB. Five processes competed for limited memory and CPU. ChromaDB caused startup races and silent failures.

I made a choice. I ripped out the vector database and replaced it with a simple BM25 search.

𝗧𝗵𝗲 𝗦𝗶𝗺𝗽𝗹𝗲 𝗦𝗼𝗹𝘂𝘁𝗶𝗼𝗻

The replacement is 50 lines of Python. It runs no external process and makes no network calls. It uses the Okapi BM25 formula to match keywords in memory.
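
For readers who have not seen BM25 in code, here is a rough sketch of an in-memory index using only the standard library. It is not the author's 50 lines. The class name `BM25Index`, the tokenizer, the parameter defaults `k1=1.5` and `b=0.75`, and the non-negative IDF variant are assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    # Lowercase and split on anything that is not a letter or digit.
    return re.findall(r"[a-z0-9]+", text.lower())


class BM25Index:
    """In-memory Okapi BM25 over a list of strings (illustrative sketch)."""

    def __init__(self, docs: list[str], k1: float = 1.5, b: float = 0.75):
        self.docs = docs
        self.k1 = k1  # term-frequency saturation (assumed default)
        self.b = b    # document-length normalization strength (assumed default)
        self.term_freqs = [Counter(tokenize(d)) for d in docs]
        self.doc_lens = [sum(tf.values()) for tf in self.term_freqs]
        self.avg_len = sum(self.doc_lens) / len(docs) if docs else 0.0
        # Number of documents that contain each term.
        self.doc_freq = Counter()
        for tf in self.term_freqs:
            self.doc_freq.update(tf.keys())

    def idf(self, term: str) -> float:
        # Variant with +1 inside the log, which keeps IDF non-negative.
        n = self.doc_freq.get(term, 0)
        total = len(self.docs)
        return math.log((total - n + 0.5) / (n + 0.5) + 1.0)

    def score(self, query_terms: list[str], i: int) -> float:
        tf = self.term_freqs[i]
        avg = self.avg_len or 1.0  # avoid dividing by zero on empty corpora
        norm = 1.0 - self.b + self.b * self.doc_lens[i] / avg
        total = 0.0
        for term in query_terms:
            f = tf.get(term, 0)
            if f:
                total += self.idf(term) * f * (self.k1 + 1) / (f + self.k1 * norm)
        return total

    def search(self, query: str, k: int = 4) -> list[tuple[str, float]]:
        terms = tokenize(query)
        scored = [(self.score(terms, i), i) for i in range(len(self.docs))]
        # Drop documents that share no term with the query.
        hits = [(s, i) for s, i in scored if s > 0]
        hits.sort(key=lambda pair: (-pair[0], pair[1]))
        return [(self.docs[i], s) for s, i in hits[:k]]


# Hypothetical knowledge-base entries, for illustration only.
kb = [
    "Reset a password from the account settings page.",
    "During an outage, follow the network runbook and page the on-call engineer.",
    "Printers need a driver reinstall after OS upgrades.",
]
index = BM25Index(kb)
results = index.search("password reset")
```

Everything lives in plain Python lists and counters, so there is no server to start and nothing to race against at boot.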

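The trade-off is clear: BM25 matches exact keywords, so it cannot catch paraphrases the way vector search can, but it adds nothing extra to deploy or keep running.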

𝗧𝘂𝗻𝗶𝗻𝗴 𝘁𝗵𝗲 𝗦𝘆𝘀𝘁𝗲𝗺

I tuned several settings to keep the system stable:
• Retrieved docs (k): 4. This provides enough context without hitting token limits.
• Bot temperature: 0.2. Troubleshooting needs literal answers, not creativity.
• Classifier temperature: 0.1. This ensures the JSON output is predictable.
• Session history: Last 10 turns. This provides continuity without using too much memory.
• Rate limits: 30 requests per minute. This protects my API quota.
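
One way these settings could be wired up is sketched below. The values come from the list above. The names `Settings`, `SessionHistory`, and `SlidingWindowLimiter` are hypothetical, as are the choice of a sliding window for rate limiting and the reading of a "turn" as one user/assistant pair.

```python
import time
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Settings:
    retrieved_docs_k: int = 4
    bot_temperature: float = 0.2
    classifier_temperature: float = 0.1
    history_turns: int = 10
    rate_limit_per_minute: int = 30


class SessionHistory:
    """Keeps only the most recent turns; older ones fall off automatically."""

    def __init__(self, max_turns: int):
        self._turns = deque(maxlen=max_turns)

    def add(self, user_msg: str, bot_msg: str) -> None:
        # Assumption: one turn is a (user, assistant) pair.
        self._turns.append((user_msg, bot_msg))

    def turns(self) -> list:
        return list(self._turns)


class SlidingWindowLimiter:
    """Allows at most `limit` calls in any `window_seconds` span."""

    def __init__(self, limit: int, window_seconds: float = 60.0):
        self.limit = limit
        self.window = window_seconds
        self._stamps = deque()

    def allow(self, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        # Forget requests that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.window:
            self._stamps.popleft()
        if len(self._stamps) >= self.limit:
            return False
        self._stamps.append(now)
        return True


settings = Settings()
history = SessionHistory(settings.history_turns)
limiter = SlidingWindowLimiter(settings.rate_limit_per_minute)
```

A `deque` with `maxlen` keeps memory per session bounded without any cleanup code.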

𝗧𝗵𝗲 𝗕𝗲𝘀𝘁 𝗗𝗲𝘀𝗶𝗴𝗻 𝗶𝘀 𝗮 𝗗𝗲𝗴𝗿𝗮𝗱𝗶𝗻𝗴 𝗗𝗲𝘀𝗶𝗴𝗻

I built every endpoint with a non-AI fallback. If the AI service goes down, the system uses keyword matching or rule-based logic. The system degrades instead of failing.
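
The fallback pattern might look roughly like this. It is a sketch under assumptions: the `RULES` table, the reply texts, and the `answer` function are invented for the example, and production code would likely catch narrower exception types than `Exception`.

```python
from typing import Callable

# Hypothetical keyword rules, for illustration only.
RULES = {
    "password": "Use the self-service reset page, then contact IT if it fails.",
    "outage": "Check the status page and follow the outage runbook.",
}

DEFAULT_REPLY = "No matching rule found. A human agent will follow up."


def rule_based_answer(question: str) -> str:
    """Non-AI path: return the reply for the first keyword found in the question."""
    lowered = question.lower()
    for keyword, reply in RULES.items():
        if keyword in lowered:
            return reply
    return DEFAULT_REPLY


def answer(question: str, ai_call: Callable[[str], str]) -> tuple[str, str]:
    """Try the AI service first; on any failure, degrade to the rule-based path.

    Returns (reply, source), where source is "ai" or "fallback".
    """
    try:
        return ai_call(question), "ai"
    except Exception:
        # Broad catch keeps the sketch short; narrow this in real code.
        return rule_based_answer(question), "fallback"


def broken_ai(question: str) -> str:
    # Stand-in for an AI service that is down.
    raise ConnectionError("AI service unavailable")


reply, source = answer("My password expired", broken_ai)
```

Returning the source alongside the reply lets the caller log how often the degraded path is used.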

This is not a complex RAG setup. It has no re-ranking or hybrid search. It is a small, smart tool built for a specific scale.


Source: https://dev.to/ambarish_0221/building-a-rag-pipeline-from-scratch-what-smartqueue-taught-me-about-retrieval-4gdb
