[nycphp-talk] Determine the text language

Carlos A Hoyos cahoyos at us.ibm.com
Fri Nov 9 09:36:36 EST 2007


> How can I use a regular expression to determine the language of a text,
> i.e. whether the selected text is English, Arabic, Hebrew, etc.?

You can't use a regular expression to determine the language, or at least
not a simple one. Each language has its own particularities, such as
characteristic letter combinations, and if you test enough of them
statistically you can get an accurate determination.
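
To make the letter-combination idea concrete, here is a minimal sketch in
Python (not PHP, and not how any particular tool does it). It counts
three-letter sequences and compares them with cosine similarity. The function
names and the two tiny sample texts are made up for illustration; real
profiles would need far more text per language.

```python
from collections import Counter


def trigram_profile(text):
    """Count overlapping 3-character sequences in lowercased text."""
    cleaned = " ".join(text.lower().split())
    padded = f" {cleaned} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def similarity(profile_a, profile_b):
    """Cosine similarity between two trigram count profiles."""
    shared = set(profile_a) & set(profile_b)
    dot = sum(profile_a[g] * profile_b[g] for g in shared)
    norm_a = sum(v * v for v in profile_a.values()) ** 0.5
    norm_b = sum(v * v for v in profile_b.values()) ** 0.5
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy reference samples (hypothetical; far too small for real use).
SAMPLES = {
    "english": "the quick brown fox jumps over the lazy dog and this is "
               "the way that they were there with them",
    "spanish": "el rápido zorro marrón salta sobre el perro perezoso y "
               "esta es la manera que ellos estaban con los otros",
}
PROFILES = {lang: trigram_profile(text) for lang, text in SAMPLES.items()}


def guess_language(text):
    """Return the sample language whose trigram profile is most similar."""
    profile = trigram_profile(text)
    return max(PROFILES, key=lambda lang: similarity(profile, PROFILES[lang]))
```

Calling guess_language("el perro salta con ellos") picks "spanish" with these
samples. The code works on Unicode characters, so Arabic or Hebrew samples
could be added to SAMPLES the same way.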

I'm not aware of any PHP tools (but watch me be corrected on this list ;-).
I suggest you look at the language guessing tool here:
http://languid.cantbedone.org/
It's not in PHP, but you should be able to invoke it from the command line
or rewrite it in PHP.
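
For the command-line route, the general shape would be something like the
sketch below (again in Python, only to show the idea). I don't know what
command or arguments the tool actually takes, so the command is passed in as
a parameter and the helper name is made up; check the tool's own
documentation for the real invocation.

```python
import subprocess


def guess_with_external_tool(command, text, timeout=10):
    """Feed text to an external program on stdin and return its trimmed stdout.

    `command` is a list such as ["some-guesser"]; that name is a placeholder,
    not the real interface of any specific tool.
    """
    result = subprocess.run(
        command,
        input=text,
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # raise CalledProcessError if the tool exits non-zero
    )
    return result.stdout.strip()
```

Passing the command as a list avoids shell quoting problems when the text
comes from user input.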

Carlos Hoyos




