RAG Is a Search Problem, Not an AI Problem
I thought RAG was about Large Language Models. I was wrong.
I spent time on search instead of prompts. RAG is a search problem.
AI does not know everything. It knows only its training data. Ask it about yesterday's news and it might make something up. RAG helps by handing the model relevant, up-to-date text before it answers.
Think of a library. A librarian does not read every book to find one answer. The librarian finds the right book. They open the right page. They read the answer.
RAG works the same way. Search first. Generate second.
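Here is a minimal sketch of that two-step flow. It is an illustration, not a real pipeline. The names `answer_with_rag`, `toy_retrieve`, and `echo_generate` are hypothetical. The retriever uses crude word overlap, and the "model" only echoes its prompt so the example runs without any external service.

```python
from typing import Callable, List


def answer_with_rag(
    question: str,
    retrieve: Callable[[str, int], List[str]],
    generate: Callable[[str], str],
    top_k: int = 3,
) -> str:
    """Search first, generate second."""
    passages = retrieve(question, top_k)  # step 1: search
    context = "\n".join(f"- {p}" for p in passages)
    prompt = (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )
    return generate(prompt)  # step 2: generate


# Hypothetical stand-in data, invented for this example.
DOCS = [
    "The library opens at 9 am on weekdays.",
    "Late returns cost 10 cents per day.",
    "The reading room is on the second floor.",
]


def toy_retrieve(question: str, top_k: int) -> List[str]:
    # Crude word-overlap scoring; a real system would query a search index.
    q_words = set(question.lower().split())
    ranked = sorted(
        DOCS,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )
    return ranked[:top_k]


def echo_generate(prompt: str) -> str:
    # Placeholder for an LLM call: it returns the prompt it was given.
    return prompt


result = answer_with_rag(
    "When does the library open?", toy_retrieve, echo_generate, top_k=1
)
print(result)
```

Notice that the model only sees what the retriever hands it. If the search step returns the wrong passage, no prompt can rescue the answer.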
Keyword search struggles when the wording changes. Embeddings turn text into lists of numbers called vectors. Similar meanings get similar vectors. Now the machine searches by meaning.
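A small sketch of "similar meanings get similar numbers". The three vectors below are hypothetical and hand-written for illustration. A real embedding model would produce vectors with hundreds of dimensions. Cosine similarity is one common way to compare them, assumed here.

```python
import math
from typing import Dict, List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; values near 1.0 mean same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical 3-dimensional embeddings, invented for this example.
TOY_EMBEDDINGS: Dict[str, List[float]] = {
    "car": [0.9, 0.1, 0.0],
    "automobile": [0.85, 0.15, 0.05],
    "banana": [0.0, 0.2, 0.95],
}

sim_synonyms = cosine_similarity(TOY_EMBEDDINGS["car"], TOY_EMBEDDINGS["automobile"])
sim_unrelated = cosine_similarity(TOY_EMBEDDINGS["car"], TOY_EMBEDDINGS["banana"])

print(f"car vs automobile: {sim_synonyms:.3f}")
print(f"car vs banana:     {sim_unrelated:.3f}")
```

"Car" and "automobile" share no letters in common as keywords, yet their vectors point in nearly the same direction.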
Chunking also matters. Break long documents into small pieces. Small pieces make search precise, as long as each piece keeps enough context to make sense on its own.
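One simple way to chunk is a sliding window over words with some overlap, so a sentence cut at a boundary still appears whole in a neighboring chunk. This is a sketch under that assumption. The function name `chunk_words` and the default sizes are hypothetical, and real pipelines often split on sentences or headings instead.

```python
from typing import List


def chunk_words(text: str, chunk_size: int = 50, overlap: int = 10) -> List[str]:
    """Split text into word windows of chunk_size that share `overlap` words."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks


# Twelve placeholder words: w0 ... w11
sample = " ".join(f"w{i}" for i in range(12))
chunks = chunk_words(sample, chunk_size=5, overlap=2)
for c in chunks:
    print(c)
```

Chunk size is a trade-off. Smaller chunks match questions more precisely. Larger chunks carry more context into the prompt.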
Vector databases store these embeddings. They find the stored content closest to your question. Options include:
- Pinecone
- Weaviate
- Qdrant
- Milvus
- pgvector
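To show what these tools do at their core, here is a tiny in-memory stand-in. It is not the API of any product listed above. `TinyVectorStore` is a hypothetical class, and the 2-dimensional vectors are invented. It scans every stored vector, which only works for small collections.

```python
import math
from typing import List, Tuple


def _cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


class TinyVectorStore:
    """Hypothetical in-memory store: brute-force search, no index, no persistence."""

    def __init__(self) -> None:
        self._items: List[Tuple[str, List[float]]] = []

    def add(self, text: str, vector: List[float]) -> None:
        self._items.append((text, vector))

    def search(self, query: List[float], top_k: int = 2) -> List[Tuple[str, float]]:
        # Score every stored vector against the query, best match first.
        scored = [(text, _cosine(query, vec)) for text, vec in self._items]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:top_k]


store = TinyVectorStore()
# Invented vectors; a real system would get these from an embedding model.
store.add("refund policy", [1.0, 0.0])
store.add("shipping times", [0.0, 1.0])
store.add("return an item", [0.9, 0.1])

hits = store.search([1.0, 0.1], top_k=2)
for text, score in hits:
    print(f"{score:.3f}  {text}")
```

Production vector stores typically avoid the full scan by using approximate nearest neighbor indexes, and they add persistence and filtering on top.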
Better retrieval leads to better answers. The model needs the right data. Focus on these:
- Chunking
- Embeddings
- Search quality
- Metadata filtering
- Reranking
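The last two items are easy to skip, so here is a sketch of both. Everything in it is hypothetical: the `Candidate` class, the sample data, and the scores. The `overlap_scorer` is a crude stand-in for a learned reranking model, used only so the example runs on its own.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Candidate:
    text: str
    retrieval_score: float  # score from the first-pass vector search
    meta: Dict[str, str] = field(default_factory=dict)


def filter_by_metadata(candidates: List[Candidate], required: Dict[str, str]) -> List[Candidate]:
    """Keep only candidates whose metadata matches every required key/value."""
    return [c for c in candidates if all(c.meta.get(k) == v for k, v in required.items())]


def rerank(
    question: str,
    candidates: List[Candidate],
    scorer: Callable[[str, str], float],
    top_k: int,
) -> List[Candidate]:
    """Re-score candidates against the question with a second, slower scorer."""
    return sorted(candidates, key=lambda c: scorer(question, c.text), reverse=True)[:top_k]


def overlap_scorer(question: str, text: str) -> float:
    # Stand-in for a learned reranker: fraction of question words found in the text.
    q = set(question.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


# Invented first-pass results; note the off-topic text has a high retrieval score.
candidates = [
    Candidate("Refunds take 5 days in 2023 policy", 0.91, {"year": "2023"}),
    Candidate("Refunds take 3 days under the current policy", 0.88, {"year": "2024"}),
    Candidate("Our office dog is named Biscuit", 0.90, {"year": "2024"}),
]

question = "how many days do refunds take"
fresh = filter_by_metadata(candidates, {"year": "2024"})
best = rerank(question, fresh, overlap_scorer, top_k=1)
print(best[0].text)
```

The filter drops the outdated 2023 passage. The reranker then moves the relevant passage above the off-topic one, even though the first-pass score ranked them the other way.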
Intelligence is not only in the model. Intelligence is finding the right info at the right time. RAG is a search problem using AI.
What did you learn while building RAG? Tell me in the comments.
Source: https://dev.to/threshika_vs/the-day-i-realized-rag-isnt-an-ai-problem-23ac
Optional learning community: https://t.me/GyaanSetuAi