𝗧𝗮𝗺𝗶𝗻𝗴 𝗟𝗼𝗻𝗴 𝗗𝗼𝗰𝘂𝗺𝗲𝗻𝘁𝘀 𝘄𝗶𝘁𝗵 𝗟𝗟𝗠𝘀

📅2 weeks ago⏱1 min read

I built a system to answer questions from 100-page technical PDFs.

Simple scripts failed. I fought token limits and high costs for weeks.

My first try used GPT-4 with the full text. This worked for 10 pages. At 100 pages, I hit the token cap. The model forgot details in the middle. Costs were too high.

I tried these methods:

Basic chunking: The model picked the wrong sections. It missed context.
Map-reduce: I lost specific details.
Sliding windows: This was too slow and expensive.
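
To make the cost difference concrete, here is a minimal sketch of basic chunking versus sliding windows. The `chunk_words` helper and its `size` and `overlap` parameters are hypothetical names for illustration, not taken from the original system, and it counts words rather than model tokens.

```python
def chunk_words(words, size, overlap=0):
    """Split a word list into chunks of `size` words.

    Each chunk shares `overlap` words with the previous one.
    overlap=0 gives basic chunking; overlap>0 gives a sliding window.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(words[start:start + size])
        if start + size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


# Made-up sample input: ten placeholder words.
words = [f"w{i}" for i in range(10)]

basic = chunk_words(words, size=4)               # no overlap
sliding = chunk_words(words, size=4, overlap=2)  # 50% overlap

# Words sent to the model in total, a rough proxy for token cost.
basic_cost = sum(len(c) for c in basic)
sliding_cost = sum(len(c) for c in sliding)
```

With overlap, the same text is processed more than once, so the total grows as the overlap grows. That is one plausible reason sliding windows turned out slow and expensive.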

What worked was mimicking how humans read. We skim the table of contents. Then we read specific sections.

Here is the new workflow:

Create a hierarchy. Use an LLM to make a short summary for each chunk.
Store summaries and full text in a vector database.
Use hybrid search. Combine keywords and semantic search.
Retrieve the top 3 summaries first.
Fetch the full text for those summaries.
Feed this context to the LLM.
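
The steps above can be sketched in plain Python. This is a toy version under stated assumptions, not the original implementation: an in-memory list stands in for the vector database, cosine similarity over word counts stands in for real embeddings, the summaries are hand-written instead of LLM-generated, and every name here (`Chunk`, `hybrid_score`, `retrieve`, `build_context`, `alpha`) is hypothetical.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Chunk:
    summary: str    # short summary (an LLM would write this; hand-written here)
    full_text: str  # original text of the chunk


def tokens(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    # Stand-in for semantic search: cosine similarity over word counts.
    # A real system would compare embedding vectors instead.
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def keyword_score(query_tokens, doc_tokens):
    # Fraction of distinct query terms that appear in the text.
    q = set(query_tokens)
    return len(q & set(doc_tokens)) / len(q) if q else 0.0


def hybrid_score(query, text, alpha=0.5):
    # Weighted mix of the "semantic" and keyword scores (alpha is an assumption).
    q, d = tokens(query), tokens(text)
    return alpha * cosine(Counter(q), Counter(d)) + (1 - alpha) * keyword_score(q, d)


def retrieve(query, chunks, top_k=3):
    # Stage 1: rank by summary only, keep the top_k chunks.
    ranked = sorted(chunks, key=lambda c: hybrid_score(query, c.summary), reverse=True)
    return ranked[:top_k]


def build_context(query, chunks, top_k=3):
    # Stage 2: swap the matched summaries for their full text.
    return "\n\n".join(c.full_text for c in retrieve(query, chunks, top_k))


# Made-up sample data for illustration only.
chunks = [
    Chunk("Thermal limits and maximum operating temperature of the unit.",
          "Section 4.2: The unit must not exceed 85 C during operation."),
    Chunk("Installation steps and mounting hardware.",
          "Section 2.1: Mount the unit with four M4 screws."),
    Chunk("Warranty terms and support contacts.",
          "Section 9: The warranty covers two years."),
    Chunk("Power supply requirements and voltage ranges.",
          "Section 3.1: Supply 12 V DC at 2 A."),
]

query = "What is the maximum operating temperature?"
context = build_context(query, chunks)  # this string goes into the final prompt
```

Searching the short summaries keeps the first pass cheap. Only the chunks that survive it contribute full text to the prompt.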

The results:

Costs dropped by 70%.
Technical terms are now accurate.
Accuracy improved.

Tips for your setup:

Use GPT-3.5 for summaries.
Use GPT-4 for the final answer.
Build a test dataset early.
For docs under 20 pages, skip retrieval and put the full text in the prompt.
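
These tips can be combined into one small routing function. This is a sketch, not the original code: the `Plan` type, the `plan_for` function, and the strategy labels are hypothetical, and the model identifier strings are assumed names for GPT-3.5 and GPT-4.

```python
from dataclasses import dataclass
from typing import Optional

SHORT_DOC_PAGES = 20  # threshold from the tip above


@dataclass
class Plan:
    strategy: str                 # hypothetical label for the pipeline to run
    summary_model: Optional[str]  # cheaper model for per-chunk summaries, if any
    answer_model: str             # stronger model for the final answer


def plan_for(page_count):
    # Short documents fit in one prompt, so no summaries or retrieval are needed.
    if page_count < SHORT_DOC_PAGES:
        return Plan("stuff_prompt", None, "gpt-4")
    # Long documents: summarize cheaply, answer with the stronger model.
    return Plan("hierarchical_retrieval", "gpt-3.5-turbo", "gpt-4")


short_plan = plan_for(12)
long_plan = plan_for(100)
```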

Source: https://dev.to/__c1b9e06dc90a7e0a676b/how-i-finally-tamed-long-document-analysis-with-llms-it-wasnt-simple-chunking-5ed3
