Paul Meagher sent in word of a new article he posted on the O'Reilly Network today: Calculating Entropy for Data Mining.
Information theory (IT) is a foundational subject for computer scientists, engineers, statisticians, data miners, biologists, and cognitive scientists. Unfortunately, PHP currently lacks software tools that would make it easy to explore and/or use IT concepts and methods to solve data analytic problems. This two-part series aims to remedy this situation by introducing you to foundational information theory concepts, implementing these foundational concepts as classes using PHP and SQL, and using these classes to mine web data.
This first piece focuses on the connections between information theory and database theory, including the central focus of the series: entropy.
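
For readers new to the term, entropy here presumably means Shannon entropy, H = -sum(p_i * log2(p_i)), computed over the relative frequencies of the values in a column of data. The sketch below is a minimal illustration in Python rather than the PHP and SQL classes the article builds; the function name shannon_entropy and the sample browser data are made up for this example and do not come from the article.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Return the Shannon entropy, in bits, of a sequence of values.

    Hypothetical helper for illustration; not taken from the article.
    """
    counts = Counter(values)          # frequency of each distinct value
    total = sum(counts.values())
    if total == 0:
        return 0.0                    # treat an empty column as zero entropy
    # H = -sum(p * log2(p)) over the observed relative frequencies
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Invented sample column: one value at p = 0.5 and two at p = 0.25
browsers = ["firefox", "firefox", "chrome", "safari"]
print(shannon_entropy(browsers))  # 1.5 bits
```

A column holding a single repeated value scores zero, and the score rises as the values spread more evenly across more categories, which is what makes the measure useful for summarizing database columns.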




