๐—œ ๐—•๐—Ž๐—ถ๐—น๐˜ ๐—” ๐—ช๐—ฒ๐—ฏ ๐—ฃ๐—ฎ๐—ด๐—ฒ ๐—ฆ๐˜‚๐—บ๐—บ๐—ฎ๐—ฟ๐—ถ๐—›๐—ฒ๐—ฟ ๐—ช๐—ถ๐˜๐—ต ๐—”๐—œ I was onboarding a new Python library. The docs were scattered across 12 different HTML pages. I spent three hours clicking back and forth, copying snippets, and trying to piece together how the authentication flow worked. I thought: "There has to be a better way. Why can't I just dump all these pages into an AI and get a clean summary?" So I tried exactly that. And it worked. Sort of.

My first "solution" was manual. I opened each doc page, selected all text, pasted it into a single markdown file, and then fed that into ChatGPT. It worked for one page, but after three pages I wanted to scream. I decided to automate. My plan was simple:

I wrote a Python script using requests, BeautifulSoup, and openai. When I ran this on two doc pages, I got back neat little summaries. But when I fed it five more pages, the problems piled up:

What I learned: