On his site Sherif Ramadan has posted an interesting tutorial covering implementing bloom filters in PHP. Bloom filters are data structures that make it easier to determine if something is a member of a set.
Let's imagine you have built a music app like Spotify. You've finally grown this thing to sizeable amount of users and you have a decent number of titles in your content library. Let's also say this app has social elements to it so your users can connect with their facebook friends or twitter followers. Now, let's say each time your users play a song in your app you want to ask the question Which of this user's friends have NOT listened to this song yet? The intention being that you may recommend that song to them if they haven't listened to it.
One solution to this problem is to use a data structure known as a bloom filter. A bloom filter is basically a very space-efficient hash set with probabilistic tendency. If you aren't familiar with a hash set or sets in general, let's do a quick review of what they mean.
He goes on to explain what a bloom filter is and how it differs from normal sets, hash sets and hash maps. He then introduces some of the basic concepts involved in creating and using bloom filters. To help make things clearer, he provides a "contrived example" using lightbulbs and the probably that they've been turned on. From there he starts to get into something more practical, something more in the world of PHP. He includes a basic
Bloomfilter class example and some of the results (performance) of using it over something like in_array (especially for large data sets).