[nycphp-talk] Squashing accented characters

Andrew Yochum andrew at plexpod.com
Mon Oct 25 12:35:00 EDT 2010


On 10/25/10 7:37 AM, Chris Snyder wrote:
> On Fri, Oct 22, 2010 at 2:57 PM, Andrew Yochum<andrew at plexpod.com>  wrote:
>> Hi Paul,
>>
>> You can achieve that with unicode transliteration:
>>      http://cldr.unicode.org/index/cldr-spec/transliteration-guidelines
>> Check out the PHP Iconv extension:
>>      http://us.php.net/manual/en/intro.iconv.php
>>
>> Hope that helps!
> An example would rock this thread, Andrew.
Sure.

I'll admit, it's finicky and convoluted. PHP 5.3 has a new Internationalization extension, but I've yet to play with it and, IIRC, it does not have transliteration yet.
	http://php.net/intl
Andrei Zmievski demoed transliteration in PHP 6 in 2006, but, well, that's another story.
	http://zmievski.org/talks/
So the Iconv solution is still the answer.

<?php
// Find your locale w/ cmd 'locale -a'
// setlocale(LC_CTYPE, 'en_US.UTF-8'); // Works on my mac
setlocale(LC_CTYPE, 'en_US.utf8');  // works on my linux boxen

$text = "Some unicode character text: âäåãá éèêë Düsseldorf";

$from = 'UTF-8';
$to = 'ASCII';

echo 'Original : ', $text, PHP_EOL;
echo 'TRANSLIT : ', iconv($from, $to."//TRANSLIT", $text), PHP_EOL;
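// Note: iconv documents this suffix as "//IGNORE" (two slashes); with a single slash the call may simply fail.
echo 'IGNORE   : ', iconv($from, $to."/IGNORE", $text), PHP_EOL;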
echo 'Plain    : ', iconv($from, $to, $text), PHP_EOL;
?>

Output:
Original : Some unicode character text: âäåãá éèêë Düsseldorf
TRANSLIT : Some unicode character text: aaeaaaa eeee Duesseldorf
IGNORE   :
Plain    : Some unicode character text:

YMMV.
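
Since iconv() returns false when a conversion fails, and echo prints false as an empty string, it is probably worth checking the return value rather than echoing it directly. A minimal sketch of that, assuming a hypothetical helper name squash_accents() and a fallback of handing back the input unchanged (neither is from the script above):

```php
<?php
// Hypothetical helper, for illustration only: wraps iconv() and makes
// the failure case explicit instead of silently printing nothing.
function squash_accents($text, $fallback = null)
{
    // @ hides the notice/warning iconv() raises on bad input;
    // the return value is checked explicitly below instead.
    $ascii = @iconv('UTF-8', 'ASCII//TRANSLIT', $text);

    if ($ascii === false) {
        // Conversion failed: return the caller's fallback, or the
        // original text if no fallback was given.
        return $fallback === null ? $text : $fallback;
    }
    return $ascii;
}

// setlocale() accepts several candidate names and uses the first that works.
setlocale(LC_CTYPE, 'en_US.utf8', 'en_US.UTF-8');

// What comes out for accented input depends on the platform and locale.
echo squash_accents('Düsseldorf'), PHP_EOL;
?>
```

Plain ASCII input should pass through untouched either way, so the wrapper is safe to call on text that may or may not contain accents.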

Also, Derick Rethans has a cool little PECL extension that does transliteration quite simply. I've never used it, but it looks good.
	http://derickrethans.nl/projects.html#translit

Hope that helps... more.

Regards,
Andrew

--
Andrew Yochum
Plexpod
andrew at plexpod.com
office: 718-360-0879
mobile: 347-688-4699
fax:    718-504-6289



