[nycphp-talk] Regular Expressions & Foreign Characters

Jeff jsiegel1 at optonline.net
Wed Sep 17 12:04:55 EDT 2003


Thanks for pointing this out.

Jeff

-----Original Message-----
From: talk-bounces at lists.nyphp.org [mailto:talk-bounces at lists.nyphp.org]
On Behalf Of David Sklar
Sent: Wednesday, September 17, 2003 10:21 AM
To: NYPHP Talk
Subject: RE: [nycphp-talk] Regular Expressions & Foreign Characters


On Wednesday, September 17, 2003 10:59 AM,  wrote:

> If I understand correctly, a regular expression like this:
> ^[a-z0-9\',. -]{1,35}$/I
> will not allow foreign characters, e.g., Ë, because it is
> not part of the regular ASCII set of characters but part of the
> extended set. So...what's a kid to do?

Use a POSIX named character class. These respect locale settings:

preg_match('/[[:alnum:]]/','Ë');

This returns 1 (a match) under a locale like 'en_US' or 'de_DE', provided
the string is in that locale's single-byte encoding (e.g. ISO-8859-1).

Read all about POSIX named character classes in the egrep(1) manpage.

You should probably call setlocale() in your PHP script before
preg_match()ing against special characters, because the default locale
(often "C") may not include these characters in the "alnum" or "alpha"
classes. E.g.:

setlocale(LC_CTYPE,'en_US');

or

setlocale(LC_ALL,'en_US');
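
Putting the two pieces together, here is a rough sketch of how the pattern
from the question might be adapted. The function name is_valid_name() is
made up for illustration, the locale names are guesses at what may be
installed on a given system, and the input is assumed to be in a
single-byte encoding such as ISO-8859-1 (so Ë is the byte "\xCB"):

```php
<?php
// Hypothetical helper: the pattern from the question, with a-z0-9
// swapped for the locale-aware [:alnum:] class. No /i flag is needed,
// since [:alnum:] already covers both cases.
function is_valid_name(string $name): bool
{
    return preg_match('/^[[:alnum:]\',. -]{1,35}$/', $name) === 1;
}

// setlocale() accepts several candidates and returns the first one the
// system supports, or false if none are installed.
$locale = setlocale(LC_CTYPE, 'en_US.ISO-8859-1', 'en_US.ISO8859-1', 'en_US');

if ($locale === false) {
    // Still in the default locale, so only ASCII letters and digits
    // are likely to count as alphanumeric.
    echo "No suitable locale found\n";
} else {
    echo "Using locale: $locale\n";
}

// Whether this matches depends on the locale actually in effect.
var_dump(is_valid_name("Zo\xCB"));
```

Checking the return value of setlocale() is worthwhile, since a locale
that is not installed fails silently and leaves you in the default one.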


David


_______________________________________________
talk mailing list
talk at lists.nyphp.org
http://lists.nyphp.org/mailman/listinfo/talk



