On the SitePoint PHP blog there's a new post from editor Bruno Skvorc sharing information about a "gender" extension for PHP that tries to guess the gender of a first name.
Recently, I ventured into a section of the PHP manual which lists extensions that are used to help with Human Language and Character Encoding. I had never looked at them as a whole – while dealing with gettext, for example, I always kind of landed directly on it and ignored the rest. Well, of those others, there’s one that caught my eye – especially in this day and age given the various controversies – the Gender extension.
This extension, in short, tries to guess the gender of first names. As its introduction says: "Gender PHP extension is a port of the gender.c program originally written by Joerg Michael. The main purpose is to find out the gender of firstnames. The current database contains >40000 firstnames from 54 countries."
This is interesting beyond the fact that the author is kinda called George Michael. In fact, there are many aspects of this extension that are quite baffling.
He then walks through installing the extension and putting it to use, evaluating names from different languages and gauging the results. A lookup can return a definite answer (male or female), a relative one (mostly male or mostly female), unisex, a "couple" or, when all else fails, an error or a "not found" result. He also covers the functionality for finding similar names and checking whether a name is a nickname, with code examples and the resulting output.
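For a rough idea of what a lookup looks like, here is a minimal sketch based on the `Gender\Gender` class described in the PHP manual, not on the code from the post itself. The `describeGenderResult()` helper and the sample name are illustrative assumptions, and the script only performs a lookup if the extension is actually loaded.

```php
<?php
use Gender\Gender;

// Hypothetical helper (not part of the extension): turns the integer
// returned by Gender::get() into a human-readable label.
function describeGenderResult(int $result): string
{
    switch ($result) {
        case Gender::IS_FEMALE:        return 'female';
        case Gender::IS_MOSTLY_FEMALE: return 'mostly female';
        case Gender::IS_MALE:          return 'male';
        case Gender::IS_MOSTLY_MALE:   return 'mostly male';
        case Gender::IS_UNISEX_NAME:   return 'unisex';
        case Gender::IS_A_COUPLE:      return 'couple';
        case Gender::NAME_NOT_FOUND:   return 'not found';
        case Gender::ERROR_IN_NAME:    return 'error in name';
        default:                       return 'unrecognized result code';
    }
}

if (extension_loaded('gender')) {
    $gender = new Gender();

    // The second argument narrows the lookup to one country's name data.
    $name   = 'Milene'; // sample name chosen for illustration
    $result = $gender->get($name, Gender::FRANCE);

    echo $name, ': ', describeGenderResult($result), PHP_EOL;
} else {
    echo 'The gender extension is not loaded; install it from PECL first.', PHP_EOL;
}
```

Keeping the label mapping in one helper means the calling code never has to compare against the raw integer codes directly.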