<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>PHPDeveloper.org</title>
    <link>http://www.phpdeveloper.org</link>
    <description>Up-to-the Minute PHP News, views and community</description>
    <language>en-us</language>
    <pubDate>Mon, 21 May 2012 09:04:48 -0500</pubDate>
    <ttl>30</ttl>
    <item>
      <title><![CDATA[Joshua Thijssen's Blog: Bloom Filters]]></title>
      <guid>http://www.phpdeveloper.org/news/17792</guid>
      <link>http://www.phpdeveloper.org/news/17792</link>
      <description><![CDATA[<p>
In <a href="http://www.adayinthelifeof.nl/2012/04/09/bloom-filters/">this new post</a> on his blog, <i>Joshua Thijssen</i> describes the bloom filter, a structure that helps when you're processing large amounts of data (like, in his example, the text of a book) and need to find out whether a certain piece of data is in the set.
</p>
<blockquote>
Most of my co-workers never really heard of bloom filters, and I'm continuously need to explain what they are, what their purpose is and why it's a better solution than other ones. So let's do an introduction on bloom filters. [...] Bloom filters have the property of being exceptionally fast AND exceptionally small compared to other structures but it comes with a price: it MIGHT be possible that our bloom filter thinks that an element is inside our set, when it really isn't. Luckily, the reverse is not possible: when a bloom filter says something is NOT in the set, you are 100% sure that it isn't part of the set.
</blockquote>
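<p>
To make the "maybe" versus "definitely not" behavior concrete, here is a minimal sketch in Python. It is not the code from Joshua's post or from any extension: the <code>BloomFilter</code> class, its method names, the bit count and the salted SHA-256 hashing are all assumptions chosen for illustration.
</p>

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: a bit array plus several hash positions per item."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)  # all bits start at 0

    def _positions(self, item: str):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        # (An illustrative choice; real filters tend to use faster hashes.)
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        # Set every bit position derived from the item.
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        # False means "definitely not in the set"; True only means "maybe".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


# Hypothetical sample data standing in for "the text of a book".
words = "the quick brown fox jumps over the lazy dog".split()
bf = BloomFilter(num_bits=1024, num_hashes=3)
for w in words:
    bf.add(w)
```

<p>
Every word that was added makes <code>might_contain()</code> return <code>True</code>, since adding only ever sets bits. A word that was never added usually returns <code>False</code>, but can return <code>True</code> if other words happened to set all of its bits, which is the false positive described in the quote.
</p>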
<p>
He explains how the filter works, noting how it's better for memory consumption and how it's possible for it to give a "maybe" response instead of an absolute "yes" or "no". He also points out <a href="http://pecl.php.net/package/bloomy">a PHP extension, bloomy</a>, that takes the hard work out of it for you.
</p>]]></description>
      <pubDate>Mon, 09 Apr 2012 11:13:32 -0500</pubDate>
    </item>
    <item>
      <title><![CDATA[Andrei Zmievski's Blog: Bloom Filters Quickie]]></title>
      <guid>http://www.phpdeveloper.org/news/12291</guid>
      <link>http://www.phpdeveloper.org/news/12291</link>
      <description><![CDATA[<p>
<i>Andrei Zmievski</i> has <a href="http://gravitonic.com/2009/04/bloom-filters-quickie">written a new post</a> about a new extension he's worked up (out of curiosity about the technology) - the <a href="http://pecl.php.net/package/bloomy">pecl/bloomy extension</a>.
</p>
<blockquote>
A Bloom filter is a probabilistic data structure that can be used to answer a simple question, is the given element a member of a set? Now, this question can be answered via other means, such as hash table or binary search trees. But the thing about Bloom filters is that they are incredibly space-efficient when the number of potential elements in the set is large.
</blockquote>
<p>
The filters allow false positives at a defined error rate: a "no" answer is always accurate, while a "yes" can occasionally be wrong, and you, the developer, decide what error rate is okay for you and your app. Lookups also take the same amount of time no matter how many items are in the set.
</p>
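<p>
As a rough illustration of how an expected element count and an acceptable error rate translate into a filter size, the sketch below applies the standard Bloom filter sizing formulas in Python. The <code>bloom_parameters</code> function and its argument names are hypothetical, and nothing here is taken from the pecl/bloomy API.
</p>

```python
import math


def bloom_parameters(expected_items: int, false_positive_rate: float):
    """Return (num_bits, num_hashes) from the standard sizing formulas."""
    if expected_items <= 0 or not 0.0 < false_positive_rate < 1.0:
        raise ValueError("need expected_items > 0 and 0 < rate < 1")
    # Bits needed: m = -n * ln(p) / (ln 2)^2
    num_bits = math.ceil(
        -expected_items * math.log(false_positive_rate) / math.log(2) ** 2
    )
    # Hash functions that minimize the error rate: k = (m / n) * ln 2
    num_hashes = max(1, round(num_bits / expected_items * math.log(2)))
    return num_bits, num_hashes


# Example: 10,000 expected items with a 1% false positive allowance.
num_bits, num_hashes = bloom_parameters(
    expected_items=10_000, false_positive_rate=0.01
)
```

<p>
For a 1% error rate this works out to roughly 9.6 bits per item and 7 hash functions, whatever the items themselves look like, which is where the space savings over a hash table come from.
</p>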
<p>
He includes an example of the extension in use: defining the number of elements and the false positive allowance, adding and searching for data, and showing how the responses would come back from the checks.
</p>]]></description>
      <pubDate>Tue, 07 Apr 2009 11:13:01 -0500</pubDate>
    </item>
  </channel>
</rss>

