I Built a RAG App, Then Asked It What Car I Like. It Didn't Know.

GyaanSetu Editorial

I am building a document-chat tool called Kenning. It uses RAG (Retrieval-Augmented Generation) to let users ask questions about uploaded files.

I built the entire pipeline from scratch using:

Java 21 and Spring Boot
Spring AI
PostgreSQL with pgvector
Ollama (running llama3.2:3b and nomic-embed-text)
Docker Compose

The pipeline works like this: Upload file → Extract text → Chunk text → Convert chunks to vectors → Store in pgvector → Search for similar chunks → Send chunks + question to the model → Get answer with sources.
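To make the steps above concrete, here is a minimal in-memory sketch of the same loop in plain Java. It is not Kenning's code: the class `MiniRagPipeline` and all of its method names are hypothetical, the `embed` method is a toy word-hashing stand-in for a real embedding model such as nomic-embed-text, an `ArrayList` stands in for pgvector, and the sample sentence is made up. The final step only builds the prompt; sending it to the model is left out.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

class MiniRagPipeline {
    /** A stored chunk together with its vector. */
    record StoredChunk(String text, double[] vector) {}

    /** A search hit: chunk text plus its similarity to the question. */
    record Hit(String text, double score) {}

    static final int DIMENSIONS = 64; // toy size, chosen only for this sketch

    private final List<StoredChunk> store = new ArrayList<>(); // stand-in for pgvector

    /** Naive chunking: cut the text every maxChars characters. */
    static List<String> chunk(String text, int maxChars) {
        List<String> chunks = new ArrayList<>();
        for (int start = 0; start < text.length(); start += maxChars) {
            chunks.add(text.substring(start, Math.min(text.length(), start + maxChars)));
        }
        return chunks;
    }

    /** Toy embedding: hash each word into a bucket, then scale to unit length. */
    static double[] embed(String text) {
        double[] v = new double[DIMENSIONS];
        for (String word : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!word.isEmpty()) {
                v[Math.floorMod(word.hashCode(), DIMENSIONS)] += 1.0;
            }
        }
        double norm = 0.0;
        for (double x : v) norm += x * x;
        norm = Math.sqrt(norm);
        if (norm > 0) {
            for (int i = 0; i < v.length; i++) v[i] /= norm;
        }
        return v;
    }

    /** Cosine similarity; the inputs are unit length (or all zeros), so the dot product is enough. */
    static double cosine(double[] a, double[] b) {
        double dot = 0.0;
        for (int i = 0; i < a.length; i++) dot += a[i] * b[i];
        return dot;
    }

    /** Chunk the text, embed every chunk, and store the result. */
    void ingest(String text, int maxChars) {
        for (String c : chunk(text, maxChars)) {
            store.add(new StoredChunk(c, embed(c)));
        }
    }

    /** Return the topK stored chunks most similar to the question. */
    List<Hit> search(String question, int topK) {
        double[] q = embed(question);
        return store.stream()
                .map(c -> new Hit(c.text(), cosine(q, c.vector())))
                .sorted(Comparator.comparingDouble(Hit::score).reversed())
                .limit(topK)
                .toList();
    }

    /** Combine the retrieved chunks and the question into one prompt for the model. */
    static String buildPrompt(String question, List<Hit> hits) {
        StringBuilder sb = new StringBuilder("Answer using only this context:\n");
        for (Hit h : hits) sb.append("- ").append(h.text()).append('\n');
        return sb.append("Question: ").append(question).toString();
    }

    public static void main(String[] args) {
        MiniRagPipeline rag = new MiniRagPipeline();
        rag.ingest("Kenning stores chunk vectors in pgvector.", 200); // made-up sample text
        String question = "Where are vectors stored?";
        System.out.println(buildPrompt(question, rag.search(question, 3)));
    }
}
```

In a real setup the embedding and the similarity search would be delegated to the embedding model and the vector store; the sketch only shows how the stages hand data to each other.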

The system worked end to end, but I hit two failures. From the outside they looked the same, but the causes were different.

Failure 1: The model was confused. I asked: "What embedding model does this project use?" The document stated the answer explicitly, and retrieval returned the correct chunk. Even so, the model said it did not know, then repeated the correct model name in the next sentence.

My theory: The 3B model is too small. It was given the right context but could not commit to a confident answer. A larger model would likely fix this.

Failure 2: The model found nothing. I asked: "What car brand do I like?" The document mentioned that I like BMW, but the search returned zero results: the similarity score of the matching chunk was too low to pass my threshold, so the model never saw it.
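A small sketch of how a similarity threshold can produce an empty result even when a relevant chunk exists. The class `ThresholdDebug`, the score 0.41, and the threshold 0.70 are all hypothetical; the post does not report the actual numbers. The point is to log the best raw score before filtering, so that "nothing found" can be told apart from "found, but rejected".

```java
import java.util.List;

class ThresholdDebug {
    /** A candidate chunk with its similarity to the question. */
    record ScoredChunk(String text, double score) {}

    /** Keep only chunks whose similarity reaches the threshold. */
    static List<ScoredChunk> filter(List<ScoredChunk> candidates, double threshold) {
        return candidates.stream()
                .filter(c -> c.score() >= threshold)
                .toList();
    }

    /** Best raw score before filtering, or NaN when there are no candidates at all. */
    static double bestScore(List<ScoredChunk> candidates) {
        return candidates.stream()
                .mapToDouble(ScoredChunk::score)
                .max()
                .orElse(Double.NaN);
    }

    public static void main(String[] args) {
        // Hypothetical values for illustration only.
        List<ScoredChunk> candidates =
                List.of(new ScoredChunk("mixed-topic chunk that mentions BMW", 0.41));
        double threshold = 0.70;

        List<ScoredChunk> kept = filter(candidates, threshold);
        System.out.println("kept=" + kept.size()
                + " bestScore=" + bestScore(candidates)
                + " threshold=" + threshold);
    }
}
```

If the best score sits just under the threshold, the threshold is the first thing to question; if it is far below, the chunking or the embedding is the more likely culprit.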

My theory: Chunk dilution. My test document was short, and it mixed several topics (Spring AI, OAuth2, and my car preference) into a single chunk. The vector for that chunk was spread across all of those topics, so a specific question about cars matched it only weakly. A better chunking strategy would likely fix this.
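The dilution idea can be illustrated with an idealised model. Assume every topic is its own orthogonal axis, the question is purely about one topic, and the chunk covers all topics equally. Under those assumptions the cosine similarity is 1/sqrt(n) for n topics. Real embeddings are not this clean, so the numbers are only indicative, and the class `ChunkDilution` is hypothetical.

```java
import java.util.Arrays;
import java.util.Locale;

class ChunkDilution {
    /** Cosine similarity between a one-topic question and a chunk spread equally over topicCount topics. */
    static double similarityToMixedChunk(int topicCount) {
        double[] question = new double[topicCount];
        question[0] = 1.0;          // the question is only about topic 0 (for example, cars)
        double[] chunk = new double[topicCount];
        Arrays.fill(chunk, 1.0);    // the chunk gives every topic the same weight
        return cosine(question, chunk);
    }

    /** Standard cosine similarity: dot product divided by the product of the lengths. */
    static double cosine(double[] a, double[] b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        for (int topics : new int[] {1, 3, 10}) {
            System.out.println(String.format(Locale.ROOT,
                    "topics=%d similarity=%.3f", topics, similarityToMixedChunk(topics)));
        }
        // prints:
        // topics=1 similarity=1.000
        // topics=3 similarity=0.577
        // topics=10 similarity=0.316
    }
}
```

In this toy model a chunk that covers three topics already loses over 40% of its similarity to a single-topic question, which is enough to fall under a strict threshold.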

Lessons learned:

Small models have reasoning limits.
Naive chunking affects retrieval accuracy.
Debugging the "why" is more important than just fixing the error.
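One possible way to act on the chunking lesson is to split on sentence boundaries, so that unrelated facts do not share a chunk. This is a sketch of one option, not the strategy the author chose (the post does not name one). The class `SentenceChunker` is hypothetical, the sample sentences are made up, and the regex-based splitter will mis-split abbreviations such as "e.g. ".

```java
import java.util.ArrayList;
import java.util.List;

class SentenceChunker {
    /** Split after '.', '!' or '?' followed by whitespace, grouping up to maxSentences per chunk. */
    static List<String> chunk(String text, int maxSentences) {
        if (maxSentences < 1) {
            throw new IllegalArgumentException("maxSentences must be at least 1");
        }
        String[] sentences = text.strip().split("(?<=[.!?])\\s+");
        List<String> chunks = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int count = 0;
        for (String sentence : sentences) {
            if (sentence.isBlank()) continue;
            if (count > 0) current.append(' ');
            current.append(sentence);
            count++;
            if (count == maxSentences) {   // chunk is full: store it and start a new one
                chunks.add(current.toString());
                current.setLength(0);
                count = 0;
            }
        }
        if (count > 0) chunks.add(current.toString()); // leftover sentences
        return chunks;
    }

    public static void main(String[] args) {
        // Made-up text that mixes the topics mentioned in the post.
        String text = "We use Spring AI. Login uses OAuth2. I like BMW.";
        chunk(text, 1).forEach(System.out::println);
    }
}
```

With one sentence per chunk, the car preference gets its own vector instead of sharing one with Spring AI and OAuth2. The trade-off is that very small chunks carry less surrounding context into the prompt.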

The architecture holds up. It is slow and sometimes wrong, but the loop is complete.

Source: https://dev.to/mido-dev/i-built-a-rag-app-then-asked-it-what-car-i-like-it-didnt-know-583n
