NYCPHP Meetup

[nycphp-talk] iterating through a multibyte string

Rob Marscher rmarscher at beaffinitive.com
Wed Jan 13 10:02:08 EST 2010


Hi all,

I need to iterate through a multibyte string and process it character by character.  Hopefully this will just work in php6, but as we know, in php5 we need the special multibyte string functions to work with utf-8 characters.  Here's an example that illustrates my dilemma:

<?php
mb_internal_encoding("UTF-8");

$str = "string with utf-8 chars åèö";
$length = mb_strlen($str);
$brokenStr = "";
$preservedStr = "";

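for ($i = 0; $i < $length; $i++) {
  // array notation indexes bytes, so multibyte characters get split apart
  $brokenStr .= $str[$i];
  // mb_substr indexes characters, so each one stays intact
  $preservedStr .= mb_substr($str, $i, 1);
}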
echo "brokenStr = " . $brokenStr . "\n";
echo "preservedStr = " . $preservedStr . "\n";
?>

Array notation, $str[$i], is the normal way to do this with regular strings.  I assume it will work for multibyte strings in php6.
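
To illustrate why the array notation breaks here: in php5, $str[$i] indexes bytes rather than characters, and each of the accented characters above takes two bytes in UTF-8. A minimal sketch, assuming the mbstring extension is loaded (the variable names are only for illustration):

```php
<?php
mb_internal_encoding("UTF-8");

// "åèö" written as explicit UTF-8 bytes so the file's own encoding doesn't matter
$accented = "\xC3\xA5\xC3\xA8\xC3\xB6";

$byteCount = strlen($accented);     // counts bytes
$charCount = mb_strlen($accented);  // counts characters

echo "bytes = " . $byteCount . ", chars = " . $charCount . "\n";

// $accented[0] is only the first byte of "å", not a complete character
$firstByte = $accented[0];
echo "first byte (hex) = " . bin2hex($firstByte) . "\n";
```

So a loop bounded by mb_strlen() but indexed with $str[$i] both cuts characters in half and stops before the end of the string.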

-- Is using mb_substr($str, $i, 1) the only way to get this to work in php5?  That's my question.

It seems like mb_substr is going to be many times slower than array notation, according to some of the comments I've seen on the multibyte functions in the php manual.
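
One alternative that is sometimes suggested is to split the string into an array of characters once, using preg_split() with the /u (UTF-8) modifier, and then iterate over the array instead of calling mb_substr() on every pass. This is only a sketch: it assumes PCRE was built with UTF-8 support, splitCharsUtf8() is a hypothetical helper name, and it has not been benchmarked against mb_substr() here.

```php
<?php
// Hypothetical helper: split a UTF-8 string into an array of characters.
function splitCharsUtf8($str)
{
    // An empty pattern with the /u modifier matches between every character;
    // PREG_SPLIT_NO_EMPTY drops the empty pieces at the start and end.
    $chars = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);
    // preg_split() returns false on failure, e.g. invalid UTF-8 input
    return ($chars === false) ? array() : $chars;
}

// "åèö" written as explicit UTF-8 bytes so the file's own encoding doesn't matter
$str = "utf-8 chars \xC3\xA5\xC3\xA8\xC3\xB6";
$chars = splitCharsUtf8($str);

$rebuilt = "";
foreach ($chars as $char) {
    // each $char is one whole character, possibly several bytes long
    $rebuilt .= $char;
}
echo "rebuilt = " . $rebuilt . "\n";
echo "count = " . count($chars) . "\n";
```

The trade-off is memory: the whole string is held a second time as an array, which may matter for very long strings.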

Thanks!!
-Rob



