PHPDeveloper: PHP News, Views and Community

Raphael Stolt's Blog:
Scraping websites with Zend_Dom_Query

by Chris Cornutt, Oct 17, 2008 @ 19:31:34

Raphael Stolt has a new blog post today with a tutorial showing how to take the Zend_Dom_Query component out of the Zend Framework and use it to scrape content from another web site.

Today I stumbled upon an interesting and reportable scenario where I had to extract information from the weekly published Drum and Bass charts provided by BBC 1Xtra. As this information currently isn't available in any consumer friendly format like, for example, an RSS feed, I had to go that scraping route but didn't want to hustle with a regex approach. Since version 1.6.0 the Zend_Dom_Query component has been added to the framework mainly to support functional testing of MVC applications, but it also can be used for rolling custom website scrapers in a snap. Woot, perfect match!

He includes the code for the Bbc_DnbCharts_Scraper class he created, showing how the page is fetched (via curl) and loaded into an object to be parsed.
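To give a rough idea of the fetch-then-query approach, here is a minimal sketch. It is not Raphael's Bbc_DnbCharts_Scraper class. To stay runnable without Zend Framework installed, it swaps in PHP's built-in DOMDocument and DOMXPath in place of Zend_Dom_Query; with the framework available, the equivalent step would be along the lines of creating a Zend_Dom_Query from the HTML string and calling its query() method with a CSS selector. The markup, the class names in it, and the scrapeChart() function are all hypothetical, since the structure of the BBC 1Xtra chart page isn't shown here, and the HTML is inlined rather than fetched with curl.

```php
<?php
// Hypothetical markup standing in for a fetched chart page. The real
// BBC 1Xtra page structure is an assumption and will differ.
$html = <<<HTML
<html><body>
<ol class="chart">
  <li><span class="artist">Artist A</span> <span class="title">Track One</span></li>
  <li><span class="artist">Artist B</span> <span class="title">Track Two</span></li>
</ol>
</body></html>
HTML;

/**
 * Extract chart entries from an HTML string (illustrative helper only).
 *
 * @param string $html Raw page markup, e.g. the body of a curl response
 * @return array List of entries with 'position', 'artist' and 'title' keys
 */
function scrapeChart(string $html): array
{
    $doc = new DOMDocument();

    // Scraped pages are rarely valid HTML, so collect libxml warnings
    // internally instead of emitting them, then restore the old setting.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $entries = [];

    // Roughly what the CSS selector "ol.chart li" would match, though this
    // XPath only matches when the class attribute is exactly "chart".
    foreach ($xpath->query('//ol[@class="chart"]/li') as $li) {
        $artist = $xpath->query('.//span[@class="artist"]', $li)->item(0);
        $title  = $xpath->query('.//span[@class="title"]', $li)->item(0);

        $entries[] = [
            'position' => count($entries) + 1,
            'artist'   => $artist ? trim($artist->textContent) : null,
            'title'    => $title ? trim($title->textContent) : null,
        ];
    }

    return $entries;
}

$chart = scrapeChart($html);
```

Querying the parsed document this way avoids the brittle regex matching the quoted post wanted to steer clear of. A CSS selector layer such as Zend_Dom_Query mainly makes the query strings shorter and easier to read than raw XPath.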