AI Web Scraping vs Selectors
I built a price comparison tool. I used CSS selectors. They broke every time a website changed.
I tried a new way. I used an AI model to read the raw HTML.
Here is the process:
- I removed scripts and footers.
- I kept only headings and prices.
- This cut token use by 70 percent.
- I gave the AI five examples of the right output.
- I set the temperature to 0 for consistency.
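
The reduction and few-shot steps above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual code: the names reduce_html and build_prompt, the price pattern, and the prompt wording are all assumptions, and the model call itself is left out because the post does not name a provider. Whatever client you use, temperature would be set to 0 on that call.

```python
import re
from html.parser import HTMLParser

# Assumed choices for this sketch: which tags to drop and what a price looks like.
SKIP_TAGS = {"script", "style", "footer"}
HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}
PRICE_RE = re.compile(r"[$€£]\s?\d+(?:,\d{3})*(?:\.\d+)?")


class HeadingPriceExtractor(HTMLParser):
    """Collects heading text and price-like strings, skipping scripts and footers."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0     # > 0 while inside a skipped tag
        self.heading_depth = 0  # > 0 while inside h1..h6
        self.lines = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1
        elif tag in HEADING_TAGS:
            self.heading_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self.skip_depth:
            self.skip_depth -= 1
        elif tag in HEADING_TAGS and self.heading_depth:
            self.heading_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if not text or self.skip_depth:
            return
        if self.heading_depth:
            self.lines.append(text)                      # keep the full heading
        else:
            self.lines.extend(PRICE_RE.findall(text))    # keep only the prices


def reduce_html(html: str) -> str:
    """Return a compact text version of the page: headings and prices only."""
    parser = HeadingPriceExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.lines)


def build_prompt(reduced: str, examples: list[tuple[str, str]]) -> str:
    """Assemble a few-shot prompt from (input text, expected JSON) pairs."""
    parts = ["Extract each product name and price as JSON."]
    for page_text, expected_json in examples:
        parts.append(f"Input:\n{page_text}\nOutput:\n{expected_json}")
    parts.append(f"Input:\n{reduced}\nOutput:")
    return "\n\n".join(parts)
```

How much this saves depends on the page. The 70 percent figure above is the author's measurement, not something this sketch guarantees.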
The results were reliable. But there are trade-offs:
- Cost: Each page costs a few cents.
- Speed: It takes 1 to 3 seconds per call.
- Errors: AI sometimes makes up data.
I added a step to verify the numbers. This stopped the errors.
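
The post does not say how the check worked. One plausible version, sketched here under that assumption, accepts a price only if the same number literally appears in the text that was sent to the model. The function names and the "price" key are hypothetical.

```python
import re

# Matches 19.99, 1299 and 1,299.00 style numbers.
NUMBER_RE = re.compile(r"\d+(?:,\d{3})*(?:\.\d+)?")


def numbers_in(text: str) -> set[float]:
    """All numeric values that literally appear in the source text."""
    return {float(m.replace(",", "")) for m in NUMBER_RE.findall(text)}


def verify_prices(items: list[dict], source_text: str) -> tuple[list[dict], list[dict]]:
    """Split model output into items whose price is in the source and items whose price is not."""
    seen = numbers_in(source_text)
    accepted, rejected = [], []
    for item in items:
        try:
            ok = float(item["price"]) in seen
        except (KeyError, TypeError, ValueError):
            ok = False  # missing or non-numeric price counts as a failure
        (accepted if ok else rejected).append(item)
    return accepted, rejected
```

A check like this only catches numbers the model invented. It does not catch a real price attached to the wrong product.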
Avoid this method if:
- The site has a stable API.
- You need to scrape millions of pages.
- Your selectors never change.
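
To see why scale matters, here is a back-of-the-envelope estimate. The defaults are assumptions picked from the ranges above (2 cents per page, 2 seconds per call, one call at a time), not measured values.

```python
def estimate(pages: int,
             cost_per_page_usd: float = 0.02,   # assumed: "a few cents"
             seconds_per_call: float = 2.0,     # assumed: middle of "1 to 3 seconds"
             workers: int = 1) -> tuple[float, float]:
    """Return (total cost in USD, total wall-clock hours) for a scraping run."""
    total_cost = pages * cost_per_page_usd
    total_hours = pages * seconds_per_call / workers / 3600
    return total_cost, total_hours


cost, hours = estimate(1_000_000)
print(f"${cost:,.0f} and about {hours / 24:.0f} days of sequential calls")
```

With these assumed numbers, one million pages comes to roughly 20,000 dollars and about 23 days if the calls run one after another. Parallel workers cut the time but not the bill.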
Stop wasting hours on brittle selectors. Use AI to handle the structure.
Optional learning community: https://t.me/GyaanSetuAi