𝗔𝘀𝘆𝗻𝗰 𝗗𝗡𝗦 𝗥𝗲𝘀𝗼𝗹𝘂𝘁𝗶𝗼𝗻 𝘄𝗶𝘁𝗵 𝗮𝗶𝗼𝗱𝗻𝘀

You crawl thousands of video pages per hour. Latency adds up.

We thought bandwidth slowed us down. We were wrong. DNS was the bottleneck.

Python's default DNS resolution is blocking. Under asyncio, each lookup runs getaddrinfo in a thread pool. The pool is small. When you have 500 lookups in flight, the threads run out and the rest queue up. Your code waits.
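To see the ceiling for yourself, here is a small sketch. It does no real DNS. fake_blocking_lookup is a hypothetical stand-in for getaddrinfo that just sleeps, and the 0.05 s delay is an assumed figure. It pushes many blocking calls through the event loop's default thread pool (the one stock asyncio uses for loop.getaddrinfo) and reports how many ever ran at once. On Python 3.8+ the default pool is capped at min(32, cpu_count + 4) workers.

```python
import asyncio
import threading
import time


def fake_blocking_lookup(state: dict) -> None:
    """Hypothetical stand-in for socket.getaddrinfo: blocks a worker thread."""
    with state["lock"]:
        state["active"] += 1
        state["peak"] = max(state["peak"], state["active"])
    time.sleep(0.05)  # assumed resolver latency, for illustration only
    with state["lock"]:
        state["active"] -= 1


async def peak_parallel_lookups(n_lookups: int) -> int:
    """Return the highest number of lookups that were running simultaneously."""
    loop = asyncio.get_running_loop()
    state = {"lock": threading.Lock(), "active": 0, "peak": 0}
    # executor=None selects the loop's default ThreadPoolExecutor, which is
    # also where the stock asyncio loop runs getaddrinfo.
    await asyncio.gather(
        *(
            loop.run_in_executor(None, fake_blocking_lookup, state)
            for _ in range(n_lookups)
        )
    )
    return state["peak"]
```

Call asyncio.run(peak_parallel_lookups(500)) and the number you get back is the pool size, not 500. Everything beyond that sits in a queue.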

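We switched to aiodns. It wraps c-ares, an asynchronous DNS resolver library written in C. Lookups run on the event loop itself, so they are non-blocking and need no worker threads.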

The results:

How to do it:
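A minimal sketch of the switch, not the exact crawler code. The names resolve_many and make_aiodns_lookup, the limit of 100 and the example hostnames are all assumptions for illustration. It relies on aiodns.DNSResolver and its long-standing query(host, "A") call. Check the aiodns docs for the version you install, since newer releases may prefer a different method.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Iterable, List

# Any coroutine function that maps a hostname to a list of IP strings.
Lookup = Callable[[str], Awaitable[List[str]]]


async def resolve_many(
    hosts: Iterable[str], lookup: Lookup, limit: int = 100
) -> Dict[str, List[str]]:
    """Resolve hosts concurrently, with at most `limit` lookups in flight."""
    sem = asyncio.Semaphore(limit)  # hypothetical cap; tune for your resolver

    async def one(host: str):
        async with sem:
            try:
                return host, await lookup(host)
            except Exception:
                # A failed lookup is recorded as "no addresses" rather than
                # aborting the whole batch.
                return host, []

    pairs = await asyncio.gather(*(one(h) for h in hosts))
    return dict(pairs)


def make_aiodns_lookup() -> Lookup:
    """Build a Lookup backed by aiodns. Call this from inside a running loop."""
    # Imported lazily so resolve_many stays usable without aiodns installed.
    import aiodns  # third-party: pip install aiodns

    resolver = aiodns.DNSResolver()

    async def lookup(host: str) -> List[str]:
        answers = await resolver.query(host, "A")  # IPv4 records only
        return [a.host for a in answers]

    return lookup


async def main() -> None:
    # Example hostnames are placeholders, not the crawler's real targets.
    results = await resolve_many(
        ["example.com", "example.org"], make_aiodns_lookup()
    )
    for host, addrs in results.items():
        print(host, addrs)
```

Run it with asyncio.run(main()). The semaphore is there so a burst of lookups does not flood your upstream resolver. Passing the lookup in as a parameter also lets you swap in a cache or a fake for tests.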

If your crawler is slow, profile it with py-spy. Look for getaddrinfo in the top frames. If you see it, DNS is your bottleneck. Move to aiodns.

Source: https://dev.to/ahmet_gedik778845/async-dns-resolution-with-aiodns-for-high-throughput-video-crawlers-2pdm