PHPDeveloper: PHP News, Views and Community

Make Me Pulse Blog:
PHP6, Unicode and TextIterator features

by Chris Cornutt Mar 14, 2008 @ 14:32:34

On the Make Me Pulse blog, there's a look at PHP6's support of Unicode in the SPL (Standard PHP Library) TextIterator handler.

I've just install the last version of PHP6 dev and I've decided to test the famous new feature, the PHP Unicode Support. I will not explain new things about PHP6 or Unicode or TextIterator, it's just my discoveries test on this features.

He steps through the process he followed - enabling Unicode support, testing various output methods (including just an echo and using the TextIterator) as well as some of the manipulation methods (next/first/current) that can be used to get certain characters out of a string.
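
The TextIterator class discussed in the post belonged to the PHP 6 development branch and is not available in released PHP versions. As a rough stand-in, the sketch below uses IntlBreakIterator from the intl extension, which offers similarly named first(), next() and current() methods. The helper names characterBoundaries() and splitCharacters() and the sample string are assumptions made for illustration; this is not the code from the Make Me Pulse post.

```php
<?php
// Sketch only: IntlBreakIterator (intl extension) stands in for the
// unreleased PHP 6 TextIterator. Both helpers below are hypothetical.

// Collect every character boundary in a UTF-8 string.
function characterBoundaries(string $text): array
{
    $it = IntlBreakIterator::createCharacterInstance('en_US');
    $it->setText($text);

    $boundaries = [];
    // first() rewinds to the start; next() advances one character boundary
    // and returns IntlBreakIterator::DONE once the end has been passed.
    for ($pos = $it->first(); $pos !== IntlBreakIterator::DONE; $pos = $it->next()) {
        $boundaries[] = $it->current(); // same offset as $pos
    }
    return $boundaries;
}

// Cut the string at consecutive boundaries (treated as byte offsets).
function splitCharacters(string $text): array
{
    $b = characterBoundaries($text);
    $chars = [];
    for ($i = 0; $i + 1 < count($b); $i++) {
        $chars[] = substr($text, $b[$i], $b[$i + 1] - $b[$i]);
    }
    return $chars;
}

print_r(splitCharacters("h\u{00E9}llo"));
```

The iterator only reports positions; pulling the text between two positions is left to the caller, which is why the second helper pairs each boundary with the next one.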

David Sklar's Blog:
Visiting each character in a string

by Chris Cornutt Apr 26, 2007 @ 12:01:00

In a new post today, David Sklar demonstrates how he solved a simple problem: looping through all of the characters in a string in a UTF-8 enabled environment.

So I've got this string (in PHP) and I need to scan through it character by character. I can't scan byte by byte because it's 2007, our users write in all sorts of languages, and the string is UTF-8.

To remedy the situation, he falls back on an old standby: the mb_* functions mb_substr and mb_strlen. His benchmarks show that, with a 1,500-character string, his sample script manages around 61 scans per second. (The PHP6 version using TextIterator is much faster, at 450 scans per second.)
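
The mb_* approach described above can be sketched roughly as follows. This is a minimal illustration, not David Sklar's actual script: the function name eachCharacter() and the sample string are assumptions made for the example, and it requires the mbstring extension.

```php
<?php
// Hypothetical helper: visit a UTF-8 string one character at a time
// using mb_strlen() for the length and mb_substr() for each character.
function eachCharacter(string $text, string $encoding = 'UTF-8'): array
{
    $chars = [];
    $length = mb_strlen($text, $encoding); // length in characters, not bytes

    for ($i = 0; $i < $length; $i++) {
        // Extract exactly one character starting at character offset $i.
        $chars[] = mb_substr($text, $i, 1, $encoding);
    }
    return $chars;
}

$sample = "na\u{00EF}ve caf\u{00E9}"; // "naïve café"
$chars  = eachCharacter($sample);

echo strlen($sample), " bytes, ", count($chars), " characters\n";
// prints "12 bytes, 10 characters"
```

Each mb_substr() call has to locate the requested character offset inside the multi-byte string, so the cost of a loop like this tends to grow faster than the string length. That is one plausible reason for the modest scan rate quoted above.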

tagged: string loop utf8 mbstrlen mbsubstr benchmark textiterator

Andrei Zmievski's Blog:
All the Little Pieces, or TextIterator in PHP 6

by Chris Cornutt Jul 14, 2006 @ 10:55:15

On Andrei Zmievski's blog today, there's a new post looking at new features of the upcoming PHP6 series - specifically dealing with internationalization and Unicode support.

I have been working on the Unicode support in PHP for quite a while now and I figure that it is time to start talking about Unicode and I18N in general and specifically about some of the new features that PHP 6 will be bringing to the table.

He first covers the new TextIterator class, a "swiss-army knife-like" tool that lets the user work with text units (really their boundaries) in a simple, normalized way. Definitions and code follow to illustrate the point, with examples ranging from interpreting a string to grabbing certain bits of it.

He also introduces the mirror-image twin of the TextIterator class, ReverseTextIterator. Its basic function is to do everything its twin does, only in reverse.

tagged: textiterator reversetextiterator php6 unicode internationalization class
