[nycphp-talk] PHP UTF8 Conversion to ASCII

Chris Snyder chsnyder at gmail.com
Thu Nov 10 16:55:04 EST 2011


On Thu, Nov 10, 2011 at 4:33 PM, Peter Sawczynec <ps at blu-studio.com> wrote:

> From: talk-bounces at lists.nyphp.org [mailto:talk-bounces at lists.nyphp.org]
> On Behalf Of Chris Snyder
> Sent: Thursday, November 10, 2011 4:16 PM
> To: NYPHP Talk
> Subject: Re: [nycphp-talk] PHP UTF8 Conversion to ASCII
>
> On Thu, Nov 10, 2011 at 4:11 PM, Peter Sawczynec <ps at blu-studio.com>
> wrote:
>
> Recently came across an issue where UTF-8 characters were being
> output into links like so:
>
> http://example.com/ãcenar [<< where the "a" is a special character],
> which a browser can turn into links like so:
>
> http://example.com/ã�cenar
>
> In researching, I have found that browsers do not handle special UTF-8
> characters in URLs very well.
>
> Seems like this is exactly what urlencode() is for, no?
>
> [Peter Sawczynec]
>
> My impression was that urlencode() translates characters that cannot pass
> in a URL into entities that can. But those new entities are gibberish to
> the human eye.
>
> The end result I need is user-friendly, clean, attractive URLs created
> from UTF-8 that will render as human-readable characters in the browser
> address bar.
>
> And browsers must not choke on the link when a user clicks it in a
> web page.
>
> Are you saying that a urlencoded link will be clickable in a web page
> and render human-readable in the browser address bar too?


URLs are made up of ASCII characters. You can fake it and hope that the
browser converts non-ASCII characters to %-encoded sequences correctly, or
you can do it explicitly by using urlencode() when you build href
attributes. I think most browsers will actually do the encoding on the fly,
but you never know, especially on mobile.
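
Here is a minimal sketch of the explicit approach, assuming the slug is
already a UTF-8 string. The $slug and $href variable names are made up for
illustration, and I've used rawurlencode() rather than urlencode() because
the value sits in the path; for this particular string both give the same
result, and they differ mainly in how spaces are encoded.

```php
<?php
// Hypothetical slug with a non-ASCII character, written as UTF-8 bytes
// so the example does not depend on the encoding of this source file.
$slug = "\xC3\xA3cenar"; // "ãcenar"

// Percent-encode each byte of the UTF-8 string for use in the URL path.
$href = 'http://example.com/' . rawurlencode($slug);

// Escape separately for the HTML attribute and for the visible link text.
echo '<a href="' . htmlspecialchars($href, ENT_QUOTES, 'UTF-8') . '">'
   . htmlspecialchars($slug, ENT_QUOTES, 'UTF-8') . "</a>\n";
// prints: <a href="http://example.com/%C3%A3cenar">ãcenar</a>
```

The href is plain ASCII, so nothing is left for the browser to guess, while
the link text on the page stays human-readable.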

If you're also asking about the display of Unicode in an HTML page (seeing
the � character), it could be that you are trying to display UTF-8
characters in a page whose declared character encoding is ISO-8859-1, or
vice versa. You can manually change the page encoding using the View menu
in Firefox to see if that fixes the issue.
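
A rough sketch of how you might pin the encoding down on the PHP side,
assuming the page is meant to be UTF-8 and the mbstring extension is
available. The $utf8 and $latin variables are just illustrative test
strings, not anything from your site.

```php
<?php
// Declare the charset explicitly so the browser does not have to guess.
// header() only works before any output has been sent.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=UTF-8');
}

$utf8  = "\xC3\xA3cenar"; // "ãcenar" encoded as UTF-8
$latin = "\xE3cenar";     // "ãcenar" encoded as ISO-8859-1

// mb_check_encoding() (mbstring) reports whether the bytes are valid UTF-8.
var_dump(mb_check_encoding($utf8, 'UTF-8'));  // bool(true)
var_dump(mb_check_encoding($latin, 'UTF-8')); // bool(false)
```

A string that fails the check is the kind that shows up as � when the page
is served as UTF-8.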