Developer Tutorials Blog:
Parallel web scraping in PHP: cURL multi functions
Jul 29, 2008 @ 07:57:00

The Developer Tutorials blog has posted a tutorial about scraping content from other websites in parallel (with their permission, of course) using PHP's cURL extension.

For anyone who's ever tried to fetch multiple resources over HTTP in PHP, the logic is trivial, but one key challenge is ever-present: latency delays. While web servers have perfectly good downstream links, latencies can increase script execution time tenfold just by downloading a few external URLs. But there's a simple solution: parallel cURL operations. In this tutorial, I'll show you how to use the "multi" functions in PHP's cURL library to get around this quickly and easily.

He starts with a basic cURL example, grabbing the content from example.com and putting it into a variable. He then builds on this to run multiple fetches in parallel - creating more than one cURL handle and using the curl_multi_* functions to manage them.
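
To give a rough idea of the approach described above, here is a minimal sketch of a parallel fetch with the curl_multi_* functions. It is not the tutorial's own code: the helper name fetch_all(), its $timeout parameter and the URLs in the usage note are assumptions made for illustration.

```php
<?php
// Hypothetical helper (not from the tutorial): fetch several URLs in
// parallel and return their bodies, keyed the same way as the input array.
function fetch_all(array $urls, int $timeout = 10): array
{
    $multi   = curl_multi_init();
    $handles = [];

    // Create one easy handle per URL and register it with the multi handle.
    foreach ($urls as $key => $url) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
        curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);    // assumed per-request timeout in seconds
        curl_multi_add_handle($multi, $ch);
        $handles[$key] = $ch;
    }

    // Drive all transfers until none are still running.
    $running = 0;
    do {
        $status = curl_multi_exec($multi, $running);
        if ($running > 0 && curl_multi_select($multi, 1.0) === -1) {
            usleep(1000); // select could not wait; pause briefly to avoid a busy loop
        }
    } while ($running > 0 && $status === CURLM_OK);

    // Collect the responses and detach each handle.
    $results = [];
    foreach ($handles as $key => $ch) {
        $results[$key] = curl_multi_getcontent($ch);
        curl_multi_remove_handle($multi, $ch);
    }
    curl_multi_close($multi);

    return $results;
}
```

Calling something like fetch_all(['a' => 'http://example.com/', 'b' => 'http://example.org/']) would return an array with the keys 'a' and 'b'. Because the requests overlap, the total wait should be closer to the slowest single request than to the sum of all of them. Error handling (for example, checking curl_error() on each handle) is left out to keep the sketch short.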

tagged: webscraping curl function multi parallel tutorial