I Added a Reranker to My RAG Pipeline: It Broke Everything
I added a reranker to my RAG pipeline. It immediately broke my tests.
Version 2 of the pipeline used hybrid retrieval: FAISS for dense vector search plus BM25 for keyword matching. It passed all 19 of my test questions. Then I added a cross-encoder reranker to improve precision.
The theory is simple:
- Stage 1: Use fast retrieval to get a broad set of candidates.
- Stage 2: Use a smart reranker to pick the best ones.
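A minimal sketch of the two-stage idea, assuming each stage can be expressed as a function that scores a (query, chunk) pair. The names `top_k`, `retrieve_then_rerank`, `cheap_scorer`, and `expensive_scorer` are hypothetical stand-ins for illustration, not the actual FAISS/BM25 or cross-encoder calls:

```python
from typing import Callable, List

# A scorer maps (query, chunk) to a relevance score; higher means more relevant.
Scorer = Callable[[str, str], float]


def top_k(query: str, chunks: List[str], scorer: Scorer, k: int) -> List[str]:
    """Return the k chunks with the highest score for this query."""
    ranked = sorted(chunks, key=lambda chunk: scorer(query, chunk), reverse=True)
    return ranked[:k]


def retrieve_then_rerank(
    query: str,
    chunks: List[str],
    cheap_scorer: Scorer,       # stand-in for stage 1 (e.g. hybrid retrieval)
    expensive_scorer: Scorer,   # stand-in for stage 2 (e.g. a cross-encoder)
    pool_size: int = 20,
    final_k: int = 3,
) -> List[str]:
    # Stage 1: cast a wide net with the fast scorer.
    candidates = top_k(query, chunks, cheap_scorer, pool_size)
    # Stage 2: let the slower scorer choose the final results from that pool.
    return top_k(query, candidates, expensive_scorer, final_k)
```

Note that in this plain version, stage 2 has the final say: anything it scores poorly is dropped, no matter how well stage 1 ranked it.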
It took 20 minutes to implement, and the new pipeline immediately failed 2 of my 19 tests.
The failure came down to data format. My data contains dense, tabular chunks like this: "Company: Zentara Robotics | CEO: Iris Kallas | Employees: 287"
The cross-encoder model was trained on natural language paragraphs. When it saw a table row, it gave it a very low score. It thought the chunk was irrelevant.
Hybrid retrieval found the answer, but the reranker threw it away.
I tried 7 different fixes, including:
- Using a bigger candidate pool.
- Blending scores from the reranker and the retriever.
- Using rank fusion.
None of them worked. The reranker score was so negative that it overpowered everything else. The model was not just ranking the table rows a little lower. It was actively rejecting the table format.
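To see why score blending struggles here, consider a small sketch with made-up numbers. The scores below are illustrative assumptions, not measurements from the pipeline, and `blend_scores` is a hypothetical helper. It assumes the retriever score sits roughly in the 0 to 1 range while the reranker emits an unbounded raw score:

```python
from typing import Dict, List


def blend_scores(
    retriever: Dict[str, float],
    reranker: Dict[str, float],
    alpha: float = 0.5,
) -> List[str]:
    """Rank chunk ids by alpha * reranker score + (1 - alpha) * retriever score."""
    blended = {
        chunk_id: alpha * reranker[chunk_id] + (1 - alpha) * retriever[chunk_id]
        for chunk_id in retriever
    }
    return sorted(blended, key=lambda chunk_id: blended[chunk_id], reverse=True)


# Made-up scores: the retriever likes the table row best,
# while the reranker gives it a large negative score.
retriever_scores = {"table_row": 0.92, "para_a": 0.80, "para_b": 0.75, "para_c": 0.70}
reranker_scores = {"table_row": -9.5, "para_a": 2.1, "para_b": 1.8, "para_c": 1.2}

for alpha in (0.3, 0.1):
    print(alpha, blend_scores(retriever_scores, reranker_scores, alpha))
```

With these numbers the table row lands in last place even when the reranker gets only 10% of the weight, because one score is bounded and the other is not. Normalizing the scores first may help in some setups, but it does not change the fact that the reranker puts the table row at the bottom.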
I stopped trying to fix the math and changed the structure.
Instead of letting the reranker decide everything, I protected my best results. I used a "guaranteed slot" strategy:
- If you want the top 3 results, keep the top 2 from the first stage exactly as they are.
- Use the reranker to pick only the 3rd result.
This ensures the hybrid search results stay in the final list. The reranker only improves the remaining slots.
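A minimal sketch of the guaranteed-slot idea, assuming the candidates arrive already ordered best-first by the first stage. The function name `rerank_with_guaranteed_slots` and the `rerank_score` callable are hypothetical; in a real pipeline `rerank_score` would wrap the cross-encoder:

```python
from typing import Callable, List


def rerank_with_guaranteed_slots(
    query: str,
    candidates: List[str],                     # ordered best-first by stage 1
    rerank_score: Callable[[str, str], float], # stand-in for the cross-encoder
    final_k: int = 3,
    guaranteed: int = 2,
) -> List[str]:
    # The top `guaranteed` stage-1 results are kept no matter what the reranker thinks.
    protected = candidates[:guaranteed]
    remaining = candidates[guaranteed:]

    # The reranker only competes for the slots that are left.
    open_slots = max(final_k - len(protected), 0)
    reranked = sorted(remaining, key=lambda c: rerank_score(query, c), reverse=True)

    return (protected + reranked[:open_slots])[:final_k]
```

With `final_k=3` and `guaranteed=2`, a table row that stage 1 ranked first or second survives even if the reranker gives it the lowest score in the pool. The trade-off is that the reranker can no longer fix a bad result in a protected slot.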
The result: 19/19 tests passed.
Lessons learned:
- Rerankers are not instant upgrades. They can hurt performance on structured or tabular data.
- Your evaluation set is your safety net. Without my 19 tests, I would have shipped a broken system.
- Protect what works. If your first-stage retrieval is good, do not let a reranker override it.
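A small evaluation harness along these lines can be sketched as follows. It assumes each test is a question paired with a piece of text that must appear in at least one retrieved chunk, which is one simple way to define "pass" and not necessarily how the 19 tests above were written. `run_eval` and the sample question are hypothetical:

```python
from typing import Callable, List, Tuple


def run_eval(
    pipeline: Callable[[str], List[str]],  # question -> retrieved chunks
    cases: List[Tuple[str, str]],          # (question, text that must be retrieved)
) -> List[str]:
    """Return the questions whose expected text is missing from the retrieved chunks."""
    failures = []
    for question, expected in cases:
        chunks = pipeline(question)
        if not any(expected in chunk for chunk in chunks):
            failures.append(question)
    return failures


# Hypothetical test case built from the sample chunk above.
cases = [("Who is the CEO of Zentara Robotics?", "Iris Kallas")]
```

Running this after every pipeline change, and refusing to ship while the returned list is non-empty, is what turns a handful of questions into a regression check.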
Build a strong retriever before you reach for a reranker.
Optional learning community: https://t.me/GyaanSetuAi
