[nycphp-talk] htmlentities charset bug

csnyder chsnyder at gmail.com
Wed Jan 23 18:33:00 EST 2008


On Jan 23, 2008 3:27 PM, John Campbell <jcampbell1 at gmail.com> wrote:
> > Do your pages validate?
> Yes.  The extended HTML entities are not required. Check the source of
> this page: http://www.w3c.de/
>
> > What happens in browsers that don't support
> > the characters you're sending?
>
> I don't develop for browsers that don't support UTF-8... e.g. IE2.  If
> they don't have a glyph for the character, there is nothing you can do
> (html entities or otherwise).  Most browsers replace unknown
> characters with a question mark symbol.
>
> > What happens in systems (such as RSS
> > feed processors) that don't support multibyte characters?
>
> RSS is XML, which requires UTF-8 support.  If they don't support UTF-8,
> it is not a legit feed processor.  I can't think of a single piece of
> software that interprets HTML entities but does not support Unicode.
>

I put together a test page at
http://cs.dots.chxo.com/htmlentities-tests.php

Okay, I think I'm convinced. In those tests htmlspecialchars() is
roughly twice as fast as htmlentities(), and sending the characters
as raw UTF-8 takes less bandwidth than sending them as entities.
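
For anyone who wants to try the comparison locally, here is a rough
sketch. It assumes UTF-8 input; the sample string, the variable names,
and the loop count are made up for illustration and are not taken from
the test page above. Timings will vary by machine and PHP version, so
no numbers are claimed here.

```php
<?php
// Illustrative sample: UTF-8 text with Word-style smart quotes
// (U+201C and U+201D, written as raw bytes) plus some markup characters.
$text = "\xE2\x80\x9CUnicode\xE2\x80\x9D & <b>entities</b>";

// htmlspecialchars() escapes only &, <, >, and quote characters.
$special = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

// htmlentities() also converts every character that has a named entity,
// so the smart quotes become &ldquo; and &rdquo;.
$entities = htmlentities($text, ENT_QUOTES, 'UTF-8');

echo strlen($special), " bytes with htmlspecialchars()\n";
echo strlen($entities), " bytes with htmlentities()\n";

// Crude timing loop (arbitrary iteration count, an assumption).
$runs = 100000;

$start = microtime(true);
for ($i = 0; $i < $runs; $i++) {
    htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}
$specialTime = microtime(true) - $start;

$start = microtime(true);
for ($i = 0; $i < $runs; $i++) {
    htmlentities($text, ENT_QUOTES, 'UTF-8');
}
$entitiesTime = microtime(true) - $start;

printf("htmlspecialchars(): %.4f s\n", $specialTime);
printf("htmlentities():     %.4f s\n", $entitiesTime);
```

The unescaped characters only render correctly if the page is actually
served as UTF-8, for example with a Content-Type header of
text/html; charset=UTF-8.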

I haven't found a browser yet that chokes on the un-escaped Word smart
quotes. Unicode FTW!

Thanks, John.

-- 
Chris Snyder
http://chxo.com/
