Phil Sturgeonhas posted about some Node.js vs PHP benchmarks that someone linked him to concerning web scraping. The article suggests that Node.js "owns" PHP when it comes to this but, as Phil finds out, there's a bit more to the story than that.
Sometimes people link me to articles and ask for my opinions. This one was a real doozy. Oh goody, a framework versus language post. Let's try and chew through this probable linkbait [where] we're benchmarking NodeJS v PHP. Weird, but I'll go along with it. Well, now we're testing cheerio v PhpQuery which is a bit different, but fine, let's go along with it.
Through a little discovery, Phil noticed phpQuery using file_get_contents, a blocking method for fetching the remote pages to scrape. Node.js instead uses a non-blocking method, meaning multiple files can be fetched at the same time. In answer to this blocking vs non-blocking, he decided to run benchamrks against a few cases - Node.js/Cherrio, PHP/phpQuery and his own, more correct comparison to the Node option - PHP/ReactPHP/phpQuery. He's shared his results, showing a major difference between the straight phpQuery and the React-based version.
It seems likely to me that people just assume PHP can't do this stuff, because by default most people arse around PHP with things like MAMP, or on their shitty web-host where is is hard to install things and as such get used to writing PHP without utilizing many extensions. It is probably exactly this which makes people think PHP just can't do something, when it easily can.