How to accelerate your web scraping

Why should I care?

If the number of requests you rotate through a single IP are higher than what target websites allow, the website you target will identify your IPs and block or mislead you with false information. It means that your information collecting can be much slower than what you’re used to.

How do I improve the speed of my data harvesting?

Assuming you're running 10 million requests, 1 request per second per IP with 1000 data center IPs, your routine can take about 3 hours. With 10,000,000 residential IPs, your routine can potentially take 1 second.

Guidelines to rotate multiple parallel sessions through Luminati’s residential network:

  • Sign up to Luminati
  • Contact Luminati’s support team and ask for access to the residential IP network
  • Once activated, install the Luminati proxy manager from GitHub. We recommend you watch this guide for a smooth start
  • Go to the ‘proxies’ tab
  • Check the port of your ‘gen’ zone
  • Edit in the port settings ‘preset’ to ‘round-robin (ip) pool’
  • Route your requests to where the portnum is the port of the ‘gen’ zone