
Maarten Balliauw's Blog:
Indexing Word 2007 (docx) files with Zend_Search_Lucene
February 05, 2008 @ 10:24:00

Maarten Balliauw has written about a method he's developed to convince the Zend_Search_Lucene component of the Zend Framework to index the contents of a Word 2007 document.

Lucene basically is an indexing and search technology, providing an easy-to-use API to create any type of application that has to do with indexing and searching. If you provide the right methods to extract data from any type of document, Lucene can index it. [...] Sounds like a challenge!

He works through a three-step process for getting search working. The key piece is his readDocXContents() function, which goes through the Word file and returns all the text it can find. That text is passed back out so the Zend Framework component can index it and make it searchable (his example searches for the string "Code Access Security").

You can grab the full code here.
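
For readers who want a feel for the approach before downloading, here is a minimal sketch of the idea described above. It is not Maarten's readDocXContents() implementation: the function names extractDocxText() and indexDocx(), the field names 'filename' and 'contents', and the index path are all assumptions made for illustration. It relies on the fact that a .docx file is a zip archive whose main body is stored in word/document.xml, and it assumes PHP 7 or later with the zip and DOM extensions. The indexing function additionally assumes Zend Framework 1 is on the include path and autoloadable.

```php
<?php
// Hypothetical sketch: pull plain text out of a .docx file.
// Only the main document body is read; headers, footers and
// footnotes live in other parts of the archive and are ignored here.
function extractDocxText(string $path): string
{
    $zip = new ZipArchive();
    if ($zip->open($path) !== true) {
        throw new RuntimeException("Cannot open $path as a zip archive");
    }
    $xml = $zip->getFromName('word/document.xml');
    $zip->close();
    if ($xml === false) {
        throw new RuntimeException("No word/document.xml found in $path");
    }

    $dom = new DOMDocument();
    if (!$dom->loadXML($xml)) {
        throw new RuntimeException("word/document.xml is not valid XML");
    }
    $xpath = new DOMXPath($dom);
    $xpath->registerNamespace(
        'w',
        'http://schemas.openxmlformats.org/wordprocessingml/2006/main'
    );

    // Each w:p element is a paragraph; its visible text sits in w:t runs.
    $paragraphs = [];
    foreach ($xpath->query('//w:p') as $p) {
        $text = '';
        foreach ($xpath->query('.//w:t', $p) as $t) {
            $text .= $t->textContent;
        }
        if ($text !== '') {
            $paragraphs[] = $text;
        }
    }
    return implode("\n", $paragraphs);
}

// Hypothetical sketch: add one .docx file to a Zend_Search_Lucene index.
// Requires Zend Framework 1; nothing here runs unless you call it.
function indexDocx(string $indexDir, string $docxPath): void
{
    $index = Zend_Search_Lucene::create($indexDir);

    $doc = new Zend_Search_Lucene_Document();
    // Text fields are stored and indexed, so the name can be shown in results.
    $doc->addField(Zend_Search_Lucene_Field::Text('filename', basename($docxPath)));
    // UnStored fields are indexed for searching but not kept in the index.
    $doc->addField(Zend_Search_Lucene_Field::UnStored('contents', extractDocxText($docxPath)));

    $index->addDocument($doc);
    $index->commit();
}
```

With an index built this way, a phrase search along the lines of `$index->find('"Code Access Security"')` would return hits whose stored `filename` field can be displayed. Keeping the extraction separate from the indexing means the same extractor can be reused for other document types by swapping in a different function.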




All content copyright, 2015 PHPDeveloper.org :: info@phpdeveloper.org - Powered by the Solar PHP Framework