๐—›๐—ผ๐˜„ ๐—œ ๐—ฆ๐˜๐—ผ๐—ฝ๐—ฝ๐—ฒ๐—ฑ ๐——๐˜‚๐—บ๐—ฝ๐—ถ๐—ป๐—ด ๐—ฃ๐——๐—™๐˜€ ๐—”๐—ป๐—ฑ ๐—ฆ๐˜๐—ฎ๐—ฟ๐˜๐—ฒ๐—ฑ ๐—–๐—ต๐—ฎ๐˜๐˜๐—ถ๐—ป๐—ด ๐—ช๐—ถ๐˜๐—ต ๐——๐—ผ๐—ฐ๐˜‚๐—บ๐—ฒ๐—ป๐˜๐—ฎ๐˜๐—ถ๐—ผ๐—ป

My team had hundreds of pages of internal guides. Nobody read them. The same questions filled our Slack channels every week.

I tried a basic search index. It failed. People asked about staging databases and received results about production credentials. Context was lost.

I spent two weekends building a RAG system. Here is what I learned from my mistakes.

My first attempt used a simple recipe: PDFs, text splitting, OpenAI embeddings, and Pinecone. It worked for one question. For everything else, it returned junk.
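As a rough illustration of that recipe's shape (not the actual pipeline), here is a self-contained sketch of the chunk, embed, index, and query steps. Everything in it is an assumption made for the example: `embed` is a toy word-count stand-in for an embedding model such as OpenAI's, `InMemoryIndex` is a toy stand-in for a vector database such as Pinecone, and the two sample chunks are invented.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryIndex:
    """Toy stand-in for a vector database (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self.items: list[tuple[Counter, str]] = []

    def add(self, chunk: str) -> None:
        # Store the vector next to the text it came from.
        self.items.append((embed(chunk), chunk))

    def query(self, question: str, top_k: int = 1) -> list[str]:
        # Rank stored chunks by similarity to the question, best first.
        q = embed(question)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[0]), reverse=True)
        return [chunk for _, chunk in ranked[:top_k]]


if __name__ == "__main__":
    index = InMemoryIndex()
    # Invented sample chunks; a real pipeline would produce these from the PDFs.
    index.add("Reset the staging database with the reset script.")
    index.add("Production credentials are stored in the vault.")

    question = "How do I reset the staging database?"
    context = index.query(question, top_k=1)
    # The retrieved chunks would be placed in the prompt sent to the model.
    print("Context:", context)
```

The retrieval step only returns whatever chunks were indexed, so the quality of the answer depends heavily on how the text was split before it reached the index.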

The problem was chunking. I used a fixed 512-token size. This split sentences and code blocks in half. The retriever found text that looked similar but made no sense to the model.
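To make that failure mode concrete, here is a small sketch that compares fixed-size splitting with a splitter that only cuts on blank lines. It is an illustration under stated assumptions, not the chunker from this project: words stand in for tokens, the budget is 16 rather than 512 so the effect shows on a short text, and the sample document, hostname, and command are hypothetical.

```python
def fixed_size_chunks(text: str, size: int) -> list[str]:
    """Split on a fixed word budget, ignoring structure (words stand in for tokens)."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def block_aware_chunks(text: str, size: int) -> list[str]:
    """Pack whole blank-line-separated blocks into chunks of roughly `size` words.

    A block is never split, so an oversized block becomes its own chunk.
    """
    chunks: list[str] = []
    current: list[str] = []
    count = 0
    for block in text.split("\n\n"):
        block = block.strip()
        if not block:
            continue
        n = len(block.split())
        # Start a new chunk when adding this block would exceed the budget.
        if current and count + n > size:
            chunks.append("\n\n".join(current))
            current, count = [], 0
        current.append(block)
        count += n
    if current:
        chunks.append("\n\n".join(current))
    return chunks


# Hypothetical sample document: prose, a command, then a warning.
DOC = """To reset the staging database, run the command below from the repo root.

psql -h staging-db.internal -U admin -c "DROP SCHEMA public CASCADE; CREATE SCHEMA public;"

Never run this against production. Production credentials live in the vault."""

if __name__ == "__main__":
    for name, splitter in [("fixed", fixed_size_chunks), ("block-aware", block_aware_chunks)]:
        print(f"--- {name} ---")
        for i, chunk in enumerate(splitter(DOC, 16)):
            print(i, repr(chunk))
```

With the fixed budget, the command is cut in two and its second half lands in the same chunk as the production warning, which is the kind of text that looks similar to a query but makes no sense on its own. The block-aware version keeps the command whole at the cost of uneven chunk sizes.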

I tried larger chunks and better embedding models. This helped a little, but the model got distracted by too much text.

I eventually settled on a two-layer approach:

This system now serves my team of 20. It handles 50 questions a day and has cut repeated questions in our Slack channels by 70%.

My main takeaways for you:

What chunking strategies work for your technical docs?

Source: https://dev.to/__c1b9e06dc90a7e0a676b/how-i-stopped-dumping-pdfs-and-started-chatting-with-my-documentation-2c8j