𝗪𝗵𝘆 𝗠𝘆 𝗙𝗶𝗿𝘀𝘁 𝗥𝗔𝗚 𝗦𝘆𝘀𝘁𝗲𝗺 𝗙𝗮𝗶𝗹𝗲𝗱 (𝗮𝗻𝗱 𝗛𝗼𝘄 𝗜 𝗙𝗶𝘅𝗲𝗱 𝗜𝘁)

📅6 days ago⏱1 min read

I built a bot for internal documents. I used a vector database and an LLM. It looked good at first. Then it lied.

My first version had three big problems.

It gave wrong numbers.
It missed steps in long guides.
It found the wrong documents.

I fixed these with two methods.

First, parent-child chunking. I split the documents into small child chunks for searching. When a child chunk matched, I gave the LLM the larger parent section it came from. The LLM saw the full picture instead of a fragment.
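Here is a minimal sketch of parent-child chunking. It assumes each parent is one section of plain text and each child is a fixed-size window of words. The names (ChildChunk, build_chunks, parents_for) and the window size are hypothetical, chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class ChildChunk:
    text: str
    parent_id: int  # index of the parent section this chunk was cut from


def build_chunks(sections: list[str], child_size: int = 40) -> list[ChildChunk]:
    """Split every parent section into small word-window child chunks."""
    children = []
    for parent_id, section in enumerate(sections):
        words = section.split()
        for start in range(0, len(words), child_size):
            window = " ".join(words[start:start + child_size])
            children.append(ChildChunk(window, parent_id))
    return children


def parents_for(hits: list[ChildChunk], sections: list[str]) -> list[str]:
    """Map matched child chunks back to their parent sections (deduplicated, order kept)."""
    seen: list[int] = []
    for hit in hits:
        if hit.parent_id not in seen:
            seen.append(hit.parent_id)
    return [sections[i] for i in seen]


sections = [
    "Reset guide. Step one open the console. Step two choose reset. Step three confirm by email.",
    "Billing guide. Invoices are sent monthly.",
]
children = build_chunks(sections, child_size=5)
# Stand-in for the real search: pick child chunks containing a word.
hits = [c for c in children if "confirm" in c.text]
print(parents_for(hits, sections))
```

The search only sees the small chunks. The LLM gets the whole reset guide, so no step is cut off.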

Second, hybrid search. I combined vector search with keyword matching. Keyword matching finds exact terms, like "admin password", that vector search can miss.
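Here is one way to combine the two result lists, as a sketch. It uses reciprocal rank fusion (RRF), which is an assumption: the post does not say how the scores were merged. The keyword ranker is a toy term counter, and the vector ranking is hard-coded as a stand-in for a real embedding search. All names are hypothetical.

```python
def keyword_rank(query: str, chunks: dict[str, str]) -> list[str]:
    """Rank chunk ids by how many query terms appear as exact words in the chunk."""
    terms = query.lower().split()
    scores = {
        cid: sum(term in text.lower().split() for term in terms)
        for cid, text in chunks.items()
    }
    matched = [cid for cid in chunks if scores[cid] > 0]
    return sorted(matched, key=lambda cid: scores[cid], reverse=True)


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal rank fusion: each list adds 1 / (k + rank) to a chunk's score."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, cid in enumerate(ranking, start=1):
            scores[cid] = scores.get(cid, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)


chunks = {
    "a": "Reset the admin password from the console.",
    "b": "Credentials can be rotated by an administrator.",
    "c": "Password rules: at least twelve characters.",
}
# Hypothetical output of a vector search; a real system would query the vector database.
vector_ranking = ["b", "a", "c"]
keyword_ranking = keyword_rank("admin password", chunks)
print(rrf_fuse([vector_ranking, keyword_ranking]))
```

RRF only needs ranks, so the vector and keyword scores do not have to share a scale. Chunk "a" wins here because both lists rank it well.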

My new pipeline:

User asks a question.
Hybrid search finds child chunks.
System pulls parent sections.
Reranker picks the top 3.
GPT-4 writes the answer.
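The steps above can be sketched as one function. This is an outline under assumptions, not the bot's real code. The search, parent lookup, reranker, and LLM call are passed in as plain functions, and the demo uses stubs in place of a vector database and GPT-4. All names are hypothetical.

```python
from typing import Callable


def answer(
    question: str,
    search: Callable[[str], list[str]],          # hybrid search, returns child chunk ids
    parent_of: Callable[[str], str],             # child chunk id -> parent section text
    rerank_score: Callable[[str, str], float],   # (question, section) -> relevance score
    generate: Callable[[str, list[str]], str],   # LLM call: (question, context) -> answer
    top_n: int = 3,
) -> tuple[str, list[str]]:
    child_hits = search(question)
    # Several children can share a parent; keep each parent once, in first-seen order.
    parents = list(dict.fromkeys(parent_of(cid) for cid in child_hits))
    ranked = sorted(parents, key=lambda p: rerank_score(question, p), reverse=True)
    context = ranked[:top_n]
    return generate(question, context), context


# Stubs for the demo only.
parent_by_child = {
    "c1": "P1 reset steps",
    "c2": "P1 reset steps",
    "c3": "P2 billing",
    "c4": "P3 vpn setup",
    "c5": "P4 holidays",
}
text, context = answer(
    "reset steps",
    search=lambda q: list(parent_by_child),
    parent_of=parent_by_child.__getitem__,
    rerank_score=lambda q, p: len(set(q.split()) & set(p.split())),
    generate=lambda q, ctx: f"Answered using {len(ctx)} sections",
)
print(text)
print(context)
```

Returning the context next to the answer makes it easy to log what the LLM actually saw.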

This stopped the hallucinations. The bot found the right sections. It stopped guessing.

RAG is a system design problem. The embedding model is a small part. How you slice and retrieve the data matters most.

My advice for you:

Create a test set to measure progress.
Monitor retrieval quality in production.
Log the chunks the bot finds.
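A small sketch of the first and third points, assuming the test set is a list of (question, expected section id) pairs. The metric is a simple hit rate at k, which is one common choice and not necessarily the one I used. The retriever is a stub and every name is hypothetical.

```python
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("rag.retrieval")


def hit_rate(test_set, retrieve, k: int = 3) -> float:
    """Fraction of questions whose expected section is among the top k retrieved."""
    hits = 0
    for question, expected_id in test_set:
        retrieved = retrieve(question)[:k]
        # Log what was found so bad answers can be traced back to bad retrieval.
        logger.info("question=%r retrieved=%r expected=%r", question, retrieved, expected_id)
        hits += expected_id in retrieved
    return hits / len(test_set)


test_set = [
    ("how do I reset my password", "reset-guide"),
    ("when are invoices sent", "billing-guide"),
]


def fake_retrieve(question: str) -> list[str]:
    # Stand-in for the real retriever; always returns the same sections.
    return ["reset-guide", "vpn-guide", "holiday-policy"]


print(hit_rate(test_set, fake_retrieve))
```

Run this after every chunking or search change. If the number drops, the change hurt retrieval.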

What is your chunking strategy?

Source: https://dev.to/__c1b9e06dc90a7e0a676b/why-my-first-rag-system-hallucinated-and-how-i-fixed-it-cha
