Johannes Schluter recently wrote up a post on searching through archive files (like .tar.gz or .tar.bz2) to perform a full-text search on their contents:
As said the app is about browsing the PHP manual. The manual is provided as tar.gz to the app and I wanted to have a fulltext search. For accessing the tar.gz content I'm using phar. Yes, phar is not only for phar files but can work on different kinds of archives (tar.gz, tar.bz2, zip), too.
In the code he uses Iterators and a PharData instance to open the given archive and search its contents. He also explains how it all works and mentions a few places where it might need a bit of work.
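To give a rough idea of what this approach can look like, here is a minimal sketch. It is not the code from Johannes's post: the function name `searchArchive` and its parameters are hypothetical, and the matching is a plain case-insensitive substring check that reads each file fully into memory. It assumes the phar extension is available (plus zlib or bz2 for compressed archives).

```php
<?php
// Hypothetical helper (not from the original post): returns the phar:// paths
// of all archive entries whose contents contain $needle.
function searchArchive(string $archivePath, string $needle): array
{
    $matches = [];

    // PharData can open tar, tar.gz, tar.bz2 and zip archives, not just .phar files.
    $archive = new PharData($archivePath);

    // PharData is a RecursiveDirectoryIterator, so wrapping it in a
    // RecursiveIteratorIterator walks nested directories and yields only files.
    $iterator = new RecursiveIteratorIterator($archive);

    foreach ($iterator as $file) {
        // getPathname() gives a phar:// URL that the stream wrapper can read.
        $contents = file_get_contents($file->getPathname());

        // Naive full-text match; a real search would likely need something smarter.
        if ($contents !== false && stripos($contents, $needle) !== false) {
            $matches[] = $file->getPathname();
        }
    }

    sort($matches);
    return $matches;
}
```

Calling something like `searchArchive('/path/to/manual.tar.gz', 'array_map')` (the path here is only a placeholder) would return the entries mentioning that string. Scanning every file on each query is slow for large archives, which is the kind of spot where an approach like this might need more work.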