<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>PHPDeveloper.org</title>
    <link>http://www.phpdeveloper.org</link>
    <description>Up-to-the Minute PHP News, views and community</description>
    <language>en-us</language>
    <pubDate>Fri, 16 May 2008 03:29:44 -0500</pubDate>
    <ttl>30</ttl>
    <item>
      <title><![CDATA[Markus Wolff's Blog: Fulltext search as a webservice]]></title>
      <guid>http://www.phpdeveloper.org/news/10134</guid>
      <link>http://www.phpdeveloper.org/news/10134</link>
      <description><![CDATA[<p>
In a <a href="http://blog.wolff-hamburg.de/archives/22-Fulltext-search-as-a-webservice.html">recent blog entry</a>, <i>Markus Wolff</i> describes a fulltext search solution he hacked together in a few hours with Zend_Search_Lucene:
</p>
<blockquote>
While working at some really old code that provided a fulltext search feature, I was at one point incredibly pissed rather unsatisfied due to the fact that said code resisted all attempts to debug it. This lead to the decision to sit down on a rainy weekend to try if I couldn't come up with something more useful, and most importantly, scalable.
</blockquote>
<p>
<a href="http://blog.wolff-hamburg.de/archives/22-Fulltext-search-as-a-webservice.html">His method</a> separates the indexing from the main app, and he describes how he changed some of his approach when he learned that <a href="http://lucene.apache.org/solr/">Solr</a> did something very similar. He also lays out some example XML content and shows how it's handled in his script (via a SimpleXML object).
</p>]]></description>
      <pubDate>Wed, 07 May 2008 12:57:47 -0500</pubDate>
    </item>
    <item>
      <title><![CDATA[Greg Szorc's Blog: Using DTD's and Catalogs for XHTML Validation]]></title>
      <guid>http://www.phpdeveloper.org/news/9949</guid>
      <link>http://www.phpdeveloper.org/news/9949</link>
      <description><![CDATA[<p>
In <a href="http://blog.case.edu/gps10/2008/04/06/using_dtds_and_catalogs_for_xhtml_validation">this entry</a> on his blog, <i>Greg Szorc</i> shows how to use DTDs and catalogs to validate your XHTML pages with a little help from PHP.
</p>
<blockquote>
This [validation from an external site like the W3C validator] approach is a good start, but it is far from ideal because it is based on an honor system of sorts. You often forget to validate each change you make and there is always some corner case that you forget. So, what can be done about it? Well, if you find yourself developing in PHP, you can employ the following solution.
</blockquote>
<p>
The code <a href="http://blog.case.edu/gps10/2008/04/06/using_dtds_and_catalogs_for_xhtml_validation">he includes</a> pulls in the XHTML content from your page (or the output of the framework's view layer) and pushes it into a DOMDocument that's built with the LIBXML_DTDLOAD and LIBXML_DTDATTR options.
</p>]]></description>
      <pubDate>Thu, 10 Apr 2008 11:29:48 -0500</pubDate>
    </item>
    <item>
      <title><![CDATA[Developer Tutorials Blog: Extracting text from Word Documents via PHP and COM]]></title>
      <guid>http://www.phpdeveloper.org/news/9861</guid>
      <link>http://www.phpdeveloper.org/news/9861</link>
      <description><![CDATA[<p>
In a <a href="http://www.developertutorials.com/blog/php/extracting-text-from-word-documents-via-php-and-com-81/">recent blog post</a>, <i>Akash Mehta</i> shows how to reach into a Microsoft Word file and pull out its content via a PHP script.
</p>
<blockquote>
Communicating via COM in PHP is easy as ever; especially for people coming from a VB background where executing complex tasks in MS-applications is a piece of cake, you will feel right at home in PHP. In fact, VB COM calls can be converted to PHP COM calls in just a few simple search and replaces.
</blockquote>
<p>
He shows how to use the COM extension in a (Windows) PHP installation to access the text inside the document and manipulate the contents however you'd like (even writing them back out to another Word file).
</p>]]></description>
      <pubDate>Wed, 26 Mar 2008 12:02:06 -0500</pubDate>
    </item>
    <item>
      <title><![CDATA[Maarten Balliauw's Blog: Indexing Word 2007 (docx) files with Zend_Search_Lucene]]></title>
      <guid>http://www.phpdeveloper.org/news/9569</guid>
      <link>http://www.phpdeveloper.org/news/9569</link>
      <description><![CDATA[<p>
<i>Maarten Balliauw</i> has <a href="http://blog.maartenballiauw.be/post/2008/02/Indexing-Word-2007-(docx)-files-with-Zend_Search_Lucene.aspx">written about</a> a method he's developed to convince the Zend_Search_Lucene component of the <a href="http://framework.zend.com">Zend Framework</a> to index the contents of a Word 2007 document.
</p>
<blockquote>
Lucene basically is an indexing and search technology, providing an easy-to-use API to create any type of application that has to do with indexing and searching. If you provide the right methods to extract data from any type of document, Lucene can index it. [...] Sounds like a challenge!
</blockquote>
<p>
He works through the three-step process for getting the search working, the key being his readDocXContents() function, which goes through the Word file and returns all the text it can find. That text is passed back so the Zend Framework component can index and search it (his example searches for the string "Code Access Security").
</p>
<p>
You can grab <a href="http://examples.maartenballiauw.be/LuceneIndexingDOCX/LuceneIndexingDOCX.zip">the full code here</a>.
</p>]]></description>
      <pubDate>Tue, 05 Feb 2008 10:24:00 -0600</pubDate>
    </item>
    <item>
      <title><![CDATA[Kapustabrothers.com: Indexing PDF Documents with Zend_Search_Lucene]]></title>
      <guid>http://www.phpdeveloper.org/news/9472</guid>
      <link>http://www.phpdeveloper.org/news/9472</link>
      <description><![CDATA[<p>
As <a href="http://devzone.zend.com/article/3000-kapustabrothers.com---Indexing-PDF-Documents-with-Zend_Search_Lucene">mentioned</a> on the Zend Developer Zone, there's a <a href="http://www.kapustabrothers.com/2008/01/20/indexing-pdf-documents-with-zend_search_lucene/">new post</a> on kapustabrothers.com about a method for indexing all of those PDF files your site uses with the help of the Zend Framework's Zend_Search_Lucene component.
</p>
<blockquote>
along with many others have been trying and asking how to index and search PDF files. Once Zend released its Framework, which is a port of Java Lucene to PHP, I decided to jump on board and find a way to index and search PDF files.
</blockquote>
<p>
He uses the <a href="http://www.foolabs.com/xpdf/">XPDF</a> software to parse the PDF files and the ZF component to do the actual indexing and searching. XPDF extracts key information from the PDF and writes it to a new file where Zend_Search_Lucene can read it. Example code is included to show the automatic creation of these details and how to add them to the component's index.
</p>]]></description>
      <pubDate>Wed, 23 Jan 2008 07:58:00 -0600</pubDate>
    </item>
    <item>
      <title><![CDATA[DevShed: Drawing Basic Rectangles in PDF Documents with PHP 5]]></title>
      <guid>http://www.phpdeveloper.org/news/9082</guid>
      <link>http://www.phpdeveloper.org/news/9082</link>
      <description><![CDATA[<p>
DevShed continues its look at working with PDFs in a PHP 5 application with <a href="http://www.devshed.com/c/a/PHP/Drawing-Basic-Rectangles-in-PDF-Documents-with-PHP-5/">this new part</a>, the fourth in the series, which focuses on creating a PDF with basic rectangles drawn in it.
</p>
<blockquote>
All right, now that you know how to include basic blocks of text into a simple PDF file, in addition to incorporating some images, the question that comes up here is: what's the next step to take? Well, in this fourth part of the series I'm going to show you how to draw a few basic shapes, once a PDF document has been opened, like empty and filled rectangles, which can be useful if you want to decorate the document with these kinds of forms.
</blockquote>
<p>
The tutorial walks you through the creation of another sample PDF file (with text and an image) and shows the process for adding rectangles via the rect() function call on their PDFLib class.
</p>]]></description>
      <pubDate>Tue, 20 Nov 2007 12:56:00 -0600</pubDate>
    </item>
  </channel>
</rss>
