[nycphp-talk] iterating through a multibyte string

Rob Marscher rmarscher at beaffinitive.com
Wed Jan 13 10:51:48 EST 2010


On Jan 13, 2010, at 10:28 AM, John Campbell wrote:
> mb_substr is always going to be slow because it always has to
> iterate from the beginning to get the count, so the loop will run in
> O(N^2).
> 
> In theory, it should be much faster if you just pull the first character.
> e.g.:

Good point. Thanks.
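
For anyone reading this in the archive: John's example is not reproduced in this message, so the following is only a rough sketch of what "just pull the first character" could look like, not his actual code. It assumes valid UTF-8 input and the mbstring extension; the function name utf8_each_char is made up for illustration.

```php
<?php
// Hypothetical helper (not from the thread): split a UTF-8 string by
// repeatedly taking the first character and dropping its bytes.
function utf8_each_char($str) {
    $chars = array();
    while ($str !== '') {
        // Reading only position 0 means mb_substr never has to count
        // characters from the start of the string to reach an offset.
        $c = mb_substr($str, 0, 1, 'UTF-8');
        if ($c === '') {
            break; // defensive guard so the loop cannot spin forever
        }
        $chars[] = $c;
        // Byte-wise substr drops exactly the bytes of the character just read.
        // The cast covers older PHP versions, where substr returns false
        // once nothing is left.
        $str = (string) substr($str, strlen($c));
    }
    return $chars;
}
```

Calling utf8_each_char("åèö") should give a three-element array, one character per element. Note that substr still copies the remainder of the string on each pass, so this is probably not strictly linear either; it only avoids the per-character counting that mb_substr($str, $i, 1) has to do.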

On Jan 13, 2010, at 10:23 AM, Dan Cech wrote:
> This might be a bit quicker:
> 
> $str = "string with utf-8 chars åèö";
> $t = preg_split('//u',$str,-1,PREG_SPLIT_NO_EMPTY);
> var_dump($t);

Yeah. This is nice. I think we'll use it. I suppose I could write a benchmark to compare it against the mb_substr loop. I'll post the results if I do.
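
A benchmark along these lines might be a starting point. This is a sketch only: the function names, the sample string, and the run counts are all assumptions, and no timings are claimed here. It compares an index-based mb_substr loop with Dan's preg_split call and assumes the mbstring extension is loaded.

```php
<?php
// Approach A: index-based loop; each mb_substr call may have to count
// characters from the start of the string to find offset $i.
function chars_mb_substr($str) {
    $out = array();
    $n = mb_strlen($str, 'UTF-8');
    for ($i = 0; $i < $n; $i++) {
        $out[] = mb_substr($str, $i, 1, 'UTF-8');
    }
    return $out;
}

// Approach B: Dan's suggestion, a single preg_split in UTF-8 mode.
// preg_split returns false if $str is not valid UTF-8.
function chars_preg_split($str) {
    return preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);
}

// Hypothetical timing helper: run $fn on $str $runs times and
// return the elapsed wall-clock seconds.
function time_it($fn, $str, $runs) {
    $start = microtime(true);
    for ($r = 0; $r < $runs; $r++) {
        $fn($str);
    }
    return microtime(true) - $start;
}

// Arbitrary sample: the test string from this thread, repeated.
$sample = str_repeat("string with utf-8 chars åèö ", 100);

foreach (array('chars_mb_substr', 'chars_preg_split') as $fn) {
    printf("%-18s %.4f s\n", $fn, time_it($fn, $sample, 10));
}
```

Longer sample strings should make any quadratic behaviour in the mb_substr loop easier to see, so it is worth varying the repeat count rather than trusting a single size.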

Thanks a lot to both of you for the quick response.  Much appreciated.

-Rob



