NYPHP.org

[nycphp-talk] question about utf-8 or unicode?

chris feldmann cwf at axlotl.net
Mon Sep 6 12:42:20 EDT 2004


Joel,
I've been using this on and off, and it seems to me to be 
shaping up into a really neat tool. There's a Slashdot 
discussion on extensions today, and I thought of linking to 
History Agent, but then I thought I'd bounce it off you first, 
since posting something to this list isn't the same as 
advertising it to the masses. By reply time, the thread will 
be stale, but surely there'll be another within a week or 
so. So the question, I guess, is: do you want as much 
traffic as your pipe can handle?

-chris
> Subject:
> Re: [nycphp-talk] question about utf-8 or unicode?
> From:
> Joel De Gan <joel at tagword.com>
> Date:
> Sun, 05 Sep 2004 21:46:13 +0000
> To:
> NYPHP Talk <talk at lists.nyphp.org>
> 
> Content-Transfer-Encoding:
> quoted-printable
> Precedence:
> list
> References:
> <1094417231.7498.73.camel at bezel> <20040905211658.GA11519 at panix.com>
> In-Reply-To:
> <20040905211658.GA11519 at panix.com>
> Reply-To:
> NYPHP Talk <talk at lists.nyphp.org>
> Message-ID:
> <1094420773.7502.97.camel at bezel>
> Content-Type:
> text/plain; charset=koi8-r
> MIME-Version:
> 1.0
> Message:
> 3
> 
> 
> Well..
> The main issue is something like this:
> http://historyagent.com/index.php?str=Русскийязык
> Which the browser translates into:
> http://historyagent.com/index.php?str=%D0%A0%D1%83%D1%81%D1%81%D0%BA%D0%B8%D0%B9%D1%8F%D0%B7%D1%8B%D0%BA
> Which, if piped through:
> html_entity_decode(rawurldecode($desc))
> works fine.
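
For reference, a minimal sketch of the decoding step quoted above, applied to the percent-encoded value from that URL. The helper name decode_query_value() is hypothetical, and the type declarations assume PHP 7 or later. Note that PHP already decodes values in $_GET, so an explicit rawurldecode() is only needed on a raw string such as $_SERVER['QUERY_STRING'].

```php
<?php
// Hypothetical helper mirroring the call quoted in the message.
function decode_query_value(string $raw): string
{
    // rawurldecode() turns every %XX escape back into a single byte,
    // so a UTF-8 percent-encoded value comes back as a UTF-8 string.
    $bytes = rawurldecode($raw);

    // html_entity_decode() only changes the result if the input also
    // contains HTML entities such as &amp; or &#1056;.
    return html_entity_decode($bytes, ENT_QUOTES, 'UTF-8');
}

// The value the browser produced for the Cyrillic query string.
$encoded = '%D0%A0%D1%83%D1%81%D1%81%D0%BA%D0%B8%D0%B9%D1%8F%D0%B7%D1%8B%D0%BA';
$text = decode_query_value($encoded);

echo $text, "\n";         // Русскийязык
echo strlen($text), "\n"; // 22 (strlen counts bytes: 11 characters, 2 bytes each)
```
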
> 
> However, when passed through JavaScript's escape(), I get:
> http://historyagent.com/index.php?str=%u0420%u0443%u0441%u0441%u043A%u0438%u0439%20%u044F%u0437%u044B%u043A
> Which PHP has trouble decoding.
> 
> I think I am going to remove the JavaScript escaping.
> -joel
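
If the escape() output has to be accepted anyway, it can be decoded by hand. JavaScript's escape() writes code units above 0xFF as the non-standard %uXXXX form and code units from 0x80 to 0xFF as Latin-1 %XX, neither of which rawurldecode() turns into UTF-8. The following is only a sketch: js_unescape() and utf8_from_code_point() are hypothetical helper names, the code assumes PHP 7 or later, and unpaired surrogates are not handled specially.

```php
<?php
// Encode one Unicode code point as UTF-8 (no mbstring dependency).
function utf8_from_code_point(int $cp): string
{
    if ($cp < 0x80) {
        return chr($cp);
    }
    if ($cp < 0x800) {
        return chr(0xC0 | ($cp >> 6)) . chr(0x80 | ($cp & 0x3F));
    }
    if ($cp < 0x10000) {
        return chr(0xE0 | ($cp >> 12))
            . chr(0x80 | (($cp >> 6) & 0x3F))
            . chr(0x80 | ($cp & 0x3F));
    }
    return chr(0xF0 | ($cp >> 18))
        . chr(0x80 | (($cp >> 12) & 0x3F))
        . chr(0x80 | (($cp >> 6) & 0x3F))
        . chr(0x80 | ($cp & 0x3F));
}

// Hypothetical helper: undo JavaScript escape() and return UTF-8.
function js_unescape(string $s): string
{
    // Alternatives, tried in order: a surrogate pair, a single %uXXXX
    // code unit, or a Latin-1 %XX escape. Other text passes through.
    $pattern = '/%u(D[89AB][0-9A-F]{2})%u(D[C-F][0-9A-F]{2})'
        . '|%u([0-9A-F]{4})|%([0-9A-F]{2})/i';

    return preg_replace_callback(
        $pattern,
        function (array $m): string {
            if ($m[1] !== '') {
                // Combine the high and low surrogate into one code point.
                $hi = (int) hexdec($m[1]) - 0xD800;
                $lo = (int) hexdec($m[2]) - 0xDC00;
                return utf8_from_code_point(0x10000 + ($hi << 10) + $lo);
            }
            $hex = $m[3] !== '' ? $m[3] : $m[4];
            return utf8_from_code_point((int) hexdec($hex));
        },
        $s
    );
}

echo js_unescape('%u0420%u0443%u0441%u0441%u043A%u0438%u0439%20%u044F%u0437%u044B%u043A'), "\n";
// Русский язык
```

Dropping escape() on the JavaScript side, as suggested above, avoids the problem entirely: encodeURIComponent() emits UTF-8 percent-encoding that PHP decodes without extra work.
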
> 
> On Sun, 2004-09-05 at 21:16, Daniel Convissor wrote:
> 
>>Hey Joel:
>>
>>URL encoding is based on RFC 1738, which says the encoding consists 
>>"of the character "%" followed by the two hexadecimal digits" 
>>representing the octet for a "character within the US-ASCII coded 
>>character set."  So, Unicode seems out of the question.
>>
>>See you,
>>
>>--Dan
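
As a small illustration of the "%" plus two hexadecimal digits rule quoted above: the escape is applied per octet, so a string stored as UTF-8 is sent as a run of escaped octets, which is the form the browser produced in the first URL of this thread. This is a sketch only; percent_encoded_octets() is a hypothetical helper name, and the source file is assumed to be saved as UTF-8.

```php
<?php
// Hypothetical helper: list the %XX escapes rawurlencode() produces.
function percent_encoded_octets(string $s): array
{
    // rawurlencode() escapes every byte outside the unreserved set
    // as "%" followed by two uppercase hexadecimal digits.
    preg_match_all('/%[0-9A-F]{2}/', rawurlencode($s), $m);
    return $m[0];
}

// One Cyrillic letter is two octets in UTF-8, hence two escapes.
echo implode(' ', percent_encoded_octets('Р')), "\n"; // %D0 %A0
```
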
