Project: Patchwork-UTF8 - UTF8 Support for PHP
by Chris Cornutt January 27, 2012 @ 11:38:40
Nicolas Grekas has shared another tool that he's pulled out of his "Patchwork" framework and made stand-alone: the Patchwork-UTF8 helper. It provides counterparts to the string functions PHP already has, but ones that handle UTF-8 correctly.
The Patchwork\Utf8 class implements the quasi complete set of string functions that need UTF-8 grapheme clusters awareness. These functions are all static methods of the Patchwork\Utf8 class. The best way to use them is to add a use Patchwork\Utf8 as u; at the beginning of your files, then when UTF-8 awareness is required, prefix by u:: when calling them.
In the README for the tool he talks about the functions included in the current release that match PHP's string functions as well as some additional methods like "isUtf8", "bestFit" and "strtocasefold". It relies on the mbstring, iconv and intl extensions being installed, and if they aren't, it falls back to other functionality (list of those methods included).
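To give a feel for what a check like "isUtf8" has to do, here is a minimal sketch of one common way to test whether a string is valid UTF-8 in plain PHP. The function name looks_like_utf8 is hypothetical, and this is not taken from Patchwork-UTF8's source; it only relies on the fact that PCRE validates the subject string when the /u modifier is used.

```php
<?php
// Hypothetical helper for illustration; not Patchwork-UTF8's implementation.
function looks_like_utf8(string $s): bool
{
    // With the /u modifier PCRE validates the subject as UTF-8 before matching.
    // The empty pattern matches any valid string (result 1); an invalid
    // subject makes preg_match() return false instead.
    return preg_match('//u', $s) === 1;
}

$valid   = "caf\xC3\xA9"; // "café" encoded as UTF-8
$invalid = "caf\xE9";     // "café" encoded as ISO-8859-1, not valid UTF-8

var_dump(looks_like_utf8($valid));
var_dump(looks_like_utf8($invalid));
```

Validity is only the first step; grapheme-cluster awareness, which the library's README emphasizes, goes beyond what this sketch covers.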
utf8 support string patchwork framework helper mbstring iconv intl
Yannick's Blog: mbstring vs iconv benchmarking
by Chris Cornutt October 06, 2008 @ 12:50:20
On his blog, Yannick has recently posted some benchmarks comparing mbstring and iconv in the PHP 5.2.4 release.
Following up on my previous post about the differences between the mbstring and iconv international characters libraries (which resulted in a tentative conclusion that nobody knew anything about those differences), and particularly the comments by Nicola, we have combined forces (mostly efforts from Nicola, actually) to provide you with a little benchmarking, if that can help you decide.
The code for his test script is included (so you can gather your own results), along with a full listing of his results over up to ten executions, which also shows the effects of possible caching. The text file he ran the script on is available for download as well.
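If you'd like to try something similar yourself, a rough timing harness might look like the sketch below. It is not Yannick's script: the helper name time_calls, the sample text and the iteration count are all assumptions made for illustration, and it assumes both the mbstring and iconv extensions are loaded.

```php
<?php
// Hypothetical timing helper; runs $fn repeatedly and returns elapsed seconds.
function time_calls(callable $fn, int $iterations): float
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $fn();
    }
    return microtime(true) - $start;
}

// Sample text (an assumption): "naïve café " repeated, encoded as UTF-8.
$text = str_repeat("na\xC3\xAFve caf\xC3\xA9 ", 1000);

// Sanity check first: both extensions should agree on the character count.
$mbLength    = mb_strlen($text, 'UTF-8');
$iconvLength = iconv_strlen($text, 'UTF-8');

$mbSeconds = time_calls(function () use ($text) {
    mb_strlen($text, 'UTF-8');
}, 1000);

$iconvSeconds = time_calls(function () use ($text) {
    iconv_strlen($text, 'UTF-8');
}, 1000);

printf("mb_strlen:    %d chars, %.4f s\n", $mbLength, $mbSeconds);
printf("iconv_strlen: %d chars, %.4f s\n", $iconvLength, $iconvSeconds);
```

Timings from a loop like this vary from run to run and between PHP builds, so treat the numbers as a rough indication only.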
mbstring iconv benchmark php5 text file statistic
Vinu Thomas' Blog: mbstring Functions by default in PHP
by Chris Cornutt July 18, 2008 @ 07:57:16
In a new post to his blog, Vinu Thomas talks about a set of functions that can make your life easier when handling unicode strings - the mb_* methods of the mbstring extension.
When dealing with multiple languages and internalization in PHP, some of the default functions in PHP end up mangling up the unicode characters in PHP. This is evident when you have a lot of funny looking characters coming up on your web page instead of the actual characters. [...] There is an extensions called mbstring which you can install in PHP which gives you a set of functions which are unicode ( actually multibyte ) ready.
He mentions some of the replacements, like mb_send_mail instead of mail and mb_strlen instead of the usual strlen. Thankfully, there's a simple way to make use of these functions without having to replace a lot of code - a setting in your php.ini (mbstring.func_overload) that tells PHP to seamlessly swap in the multibyte versions behind the scenes.
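The difference between the byte-oriented functions and their mb_* counterparts is easy to see in a small sketch. The sample word is an assumption chosen for illustration. Note that mbstring.func_overload was deprecated in PHP 7.2 and removed in PHP 8.0, so on current versions the mb_* functions have to be called explicitly, as done here.

```php
<?php
// "naïve" encoded as UTF-8: the "ï" takes two bytes (0xC3 0xAF).
$word = "na\xC3\xAFve";

$bytes = strlen($word);             // counts bytes
$chars = mb_strlen($word, 'UTF-8'); // counts characters

echo "bytes: $bytes, characters: $chars\n";

// substr() works on bytes and can cut a multibyte character in half;
// mb_substr() works on characters.
$byteCut = substr($word, 0, 3);             // "na" plus a dangling 0xC3 byte
$charCut = mb_substr($word, 0, 3, 'UTF-8'); // "naï"
```

The dangling byte left by substr() is exactly the kind of thing that shows up as "funny looking characters" on a page.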
mbstring function utf8 unicode multibyte replace
Dokeos Blog: mbstring vs iconv
by Chris Cornutt April 24, 2008 @ 11:18:08
In this post on the Dokeos blog, there's a comparison of the mbstring extension and the iconv library as it pertains to their use on multi-byte strings.
I was wondering today why use mbstring rather than iconv in Dokeos, and honestly I didn't remember exactly why I had chosen mbstring in the past, but finding information about the *differences* between the two. [...] Searching a bit more, I found a PPT presentation from Carlos Hoyos on Google.
Essentially, it boils down to how the library is integrated - mbstring is bundled and iconv is pulled from an external source. So, if you're looking for maximum portability, he recommends mbstring.
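When portability is the concern, one option is to check which extension is available at runtime. The sketch below is only an illustration of that idea: the function name latin1_to_utf8 is hypothetical, and preferring mbstring over iconv simply mirrors the recommendation above.

```php
<?php
// Hypothetical helper: convert ISO-8859-1 text to UTF-8 with whichever
// extension is available, preferring mbstring.
function latin1_to_utf8(string $s): string
{
    if (function_exists('mb_convert_encoding')) {
        // mbstring: target encoding first, then source encoding.
        return mb_convert_encoding($s, 'UTF-8', 'ISO-8859-1');
    }
    if (function_exists('iconv')) {
        // iconv: source encoding first, then target encoding.
        $converted = iconv('ISO-8859-1', 'UTF-8', $s);
        if ($converted !== false) {
            return $converted;
        }
    }
    throw new RuntimeException('Neither mbstring nor iconv could convert the string.');
}

// "café" in ISO-8859-1 is the single byte 0xE9 for "é".
echo bin2hex(latin1_to_utf8("caf\xE9")), "\n";
```

Watch the argument order: the two extensions take the source and target encodings in opposite positions.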
mbstring iconv multibyte character string compare internal external
Alessandro Crugnola's Blog: AMFPHP and mbstring
by Chris Cornutt October 12, 2007 @ 09:23:00
Alessandro Crugnola was struggling with an application he was developing (with Flex and PHP) where his local PHP installation worked just fine but his remote system errored on the same code:
Connecting to the service browser I was receiving the error "Channel.Ping.Failed" error and investigating a bit more in the fault message I discovered that the source error was: "The class {Amf3Broker} could not be found under the class path {/var/htdocs/amfphp/services/amfphp/Amf3Broker.php}" and the Amf3Broker php class does not exist anywhere in amfphp!
Checking some of the default settings didn't help; things still weren't loading correctly. Finally, he found the culprit - mbstring. One server had mbstring's string function overloading enabled and the other didn't, which resulted in corrupted data coming back from the amfphp stream.
amfphp mbstring flex application error
Matthew Weier O'Phinney's Blog: mbstring comes to the rescue
by Chris Cornutt May 17, 2006 @ 05:49:23
Character encodings in PHP, especially when dealing with XML, can be a pain to say the least. Matthew Weier O'Phinney found this out first-hand when a script he was working with had a mixed character set in one of its strings, which gave the XML parser behind SimpleXML problems.
I tried a number of solutions, hoping actually to automate it via mbstring INI settings; these schemes all failed. iconv didn't work properly. The only thing that did work was to convert the encoding to latin1 -- but this wreaked havoc with actual UTF-8 characters.
Then, through a series of trial-and-error, all-or-nothing shots, I stumbled on a simple solution.
The discovery was to detect the encoding of the string itself (not really the content) and convert everything in it to that encoding. How, you might ask? With the handy mb_detect_encoding and mb_convert_encoding functions. Of course, the mbstring extension has to be compiled into PHP, but it's well worth it if it's exactly what you need.
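A minimal sketch of the detect-then-convert idea is shown below. It is not Matthew's code: the function name normalize_to_utf8, the choice of UTF-8 as the target, and the ISO-8859-1 candidate list are all assumptions made for illustration. UTF-8 validity is checked explicitly first, because detection results can differ between PHP versions when a string is valid in more than one candidate encoding.

```php
<?php
// Hypothetical helper; the legacy candidate list is an assumption.
function normalize_to_utf8(string $s, array $legacy = ['ISO-8859-1']): string
{
    // Already valid UTF-8: leave it untouched to avoid double-encoding.
    if (mb_check_encoding($s, 'UTF-8')) {
        return $s;
    }

    // Strict mode only returns an encoding in which the string is valid.
    $detected = mb_detect_encoding($s, $legacy, true);
    if ($detected === false) {
        throw new InvalidArgumentException('Could not detect the string encoding.');
    }

    return mb_convert_encoding($s, 'UTF-8', $detected);
}

$latin1 = "caf\xE9";     // "café" in ISO-8859-1
$utf8   = "caf\xC3\xA9"; // "café" in UTF-8

var_dump(normalize_to_utf8($latin1) === $utf8);
var_dump(normalize_to_utf8($utf8) === $utf8);
```

Once the string is in a single known encoding, it can be handed to an XML parser without the mixed-charset problem described above.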
mbstring xml simplexml encoding utf-8 detect convert
SitePoint PHP Blog: PHP UTF-8 0.1
by Chris Cornutt February 28, 2006 @ 06:54:57
In this post from the SitePoint PHP Blog, Harry Fuecks talks about a new package of software he's worked up to make it possible for PHP to handle UTF-8 encoded strings - PHP UTF-8.
Been messing around with bits of this code for a long time, in fact since first really getting to grips with Dokuwiki, but finally got the first release out.
PHP UTF-8 is intended to make it possible to handle UTF-8 encoded strings in PHP, without requiring the mbstring extension (although it uses mbstring if it's available). In short, it provides versions of PHP's string functions (pretty much everything you'll find on this list), prefixed with utf8_ and aware of UTF-8 encoding (that 1 character >= 1 byte). It also gives you some tools to help check UTF-8 strings for "well formedness", strip bad sequences and some "ASCII helpers".
He continues the post, mentioning where some of the code for it was pulled from and a note about the documentation (there, but scarce). He also includes a warning for the use of it - not to use it "blindly" and only to use it when you need it, not to replace the standard PHP str_* functions.
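To illustrate how a UTF-8 aware string function can work without mbstring, here is a small sketch of a character counter written in plain PHP. The function name utf8_length_no_mbstring is hypothetical and this is not necessarily how PHP UTF-8 implements its own version. It assumes the input is well-formed UTF-8 and counts every byte that is not a continuation byte.

```php
<?php
// Hypothetical helper for illustration; assumes well-formed UTF-8 input.
function utf8_length_no_mbstring(string $s): int
{
    $count = 0;
    $byteLength = strlen($s); // strlen() counts bytes

    for ($i = 0; $i < $byteLength; $i++) {
        // Continuation bytes have the bit pattern 10xxxxxx. Every other
        // byte starts a new character, so count only those.
        if ((ord($s[$i]) & 0xC0) !== 0x80) {
            $count++;
        }
    }

    return $count;
}

echo utf8_length_no_mbstring("na\xC3\xAFve"), "\n"; // "naïve"
echo utf8_length_no_mbstring("\xE2\x82\xAC"), "\n"; // "€", three bytes
```

Walking the string byte by byte in PHP code is slower than the native strlen, which fits Harry's advice to use functions like this only where they are actually needed.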
sitepoint utf-8 mbstring handle string encoded