[nycphp-talk] Charsets are still driving me nuts

csnyder chsnyder at
Wed Mar 5 17:56:54 EST 2008

On Wed, Feb 27, 2008 at 8:35 PM, Cliff Hirsch <cliff at> wrote:
>  On Feb 26, 2008, at 5:46 PM, Cliff Hirsch wrote:
>   I have been validating textareas using ctype_print
>  Are you using utf-8 encoding?  What do you need to validate/sanitize with
> the textareas?  That the input is using the correct character set?  Need to
> strip html from it?  I've had pretty good luck with using utf-8 and having
> text cut and pasted from MS Word come through fine.
>  I have been waffling between 8859-1 and utf-8. Regarding validating a text
> area: good question. No stripping necessary. I thought ctype_print looked
> good (any reasonable printable character). Maybe textarea doesn't need to be
> validated/sanitized if everything is escaping properly? I thought the
> browser took care of the charset, but don't know for sure.

Is there a downside to using utf-8?

If you use 8859-1 you're practically making the application
English-only, or at least limiting the ability of people to express
themselves. そうかなあ?
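
To make that concrete, here is a rough sketch (in Python rather than PHP, just because it is quick to run anywhere) of what happens to a few sample strings. The sample strings and both helper names are made up for illustration. The byte-wise check is only an approximation of what a C-locale isprint()-style test such as ctype_print does, under the assumption that it sees the raw UTF-8 bytes; it is not a reimplementation of the PHP function.

```python
# Hypothetical sample inputs, not taken from any real application.
samples = ["plain ASCII text", "café", "そうかなあ?"]


def fits_latin1(text: str) -> bool:
    """Return True if every character can be encoded as ISO-8859-1."""
    try:
        text.encode("iso-8859-1")
        return True
    except UnicodeEncodeError:
        return False


def is_printable_ascii(data: bytes) -> bool:
    """Byte-wise check: every byte must be in the printable ASCII range.

    Assumption: this approximates a C-locale isprint() test applied to
    raw bytes. Note that newlines (0x0A) also fail this check.
    """
    return all(0x20 <= b <= 0x7E for b in data)


for s in samples:
    utf8 = s.encode("utf-8")
    print(repr(s),
          "latin1-ok:", fits_latin1(s),
          "bytewise-printable:", is_printable_ascii(utf8),
          "chars:", len(s), "utf8-bytes:", len(utf8))
```

If the sketch is right, two things fall out of it: the Japanese sample simply cannot be stored as 8859-1, and a byte-oriented printable check rejects any UTF-8 text that goes beyond plain ASCII, which may be why validating textareas this way is painful. A multi-line textarea would also trip on the newline bytes.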

Chris Snyder

