On the Zend Developer Zone there's a recent review of Matthew Turland's book, php|architect's Guide to Web Scraping with PHP (recently made available in print), along with a reminder of why you shouldn't judge a book by its cover.
I was really hesitant to commit to reviewing the book because I tend not to review books I don't like and this subject matter just wasn't doing it for me. So with great fear and trepidation, I popped open my review copy (PDF so I could read it on my iPad). I was ever so surprised, and in a very good way.
The reviewer talks about the different parts of the book: the foreword from Ben Ramsey ("expert in all things HTTP") and the book's two halves. The first half deals with accessing the information on remote sites, and the second covers the actual scraping of that information (parsing out the content with things like regular expressions and SimpleXML).