[nycphp-talk] SimpleXML - UTF8

Adrian Noland anoland at indigente.net
Sat Oct 17 17:10:08 EDT 2009


I have this handy function I pulled from somewhere else. Does it help?

Apologies if the actual characters don't come across in the email.

    /**
     * Scrubs additional HTML entities that are not in PHP's
     * get_html_translation_table (currently bug #34577 in the
     * bugs.php.net database).
     * $a1 lists characters that commonly appear unescaped in the
     * listing description.
     * $a2 maps each of them to an accepted format, the correct
     * HTML entity, or an empty string.
     *
     * @param string $string string to scrub
     * @return string $string clean string
     */
    public static function xmlStringScrub($string) {
        $a1 = array("�","�","�","�", "�","�", "�", "�", "�", "�",
                    "�","�","�","�","�", "�", "�");
        $a2 = array(".","-","•","", "'","'", '"', '"', "-", "-", ",",
                    "^",",","","€", "®", "™");
        $string = htmlentities($string, ENT_QUOTES);
        $string = str_replace($a1, $a2, $string);
        $string = utf8_encode($string);
        return $string;
    }
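
In case the literal characters in $a1 get mangled in transit, here is a rough sketch of the same idea with the search characters written as byte escapes. It is not the function above: the name scrubCp1252Punctuation, the assumption that the input is Windows-1252, and the particular bytes chosen are all mine, and it relies on the mbstring extension being available. It also skips the htmlentities() step and only returns UTF-8.

```php
<?php
// Hypothetical helper, not from the original function: replaces a few
// Windows-1252 punctuation bytes with ASCII stand-ins, then converts the
// rest of the string to UTF-8. Assumes the input really is Windows-1252.
function scrubCp1252Punctuation(string $input): string
{
    // Keys are raw Windows-1252 bytes, written as escapes so they
    // survive copy and paste through email.
    $map = array(
        "\x85" => "...", // horizontal ellipsis
        "\x91" => "'",   // left single quotation mark
        "\x92" => "'",   // right single quotation mark
        "\x93" => '"',   // left double quotation mark
        "\x94" => '"',   // right double quotation mark
        "\x96" => "-",   // en dash
        "\x97" => "-",   // em dash
    );

    // Byte-wise replacement is safe here because Windows-1252 is a
    // single-byte encoding.
    $ascii = strtr($input, $map);

    // Convert whatever is left (accented letters and so on) to UTF-8.
    return mb_convert_encoding($ascii, 'UTF-8', 'Windows-1252');
}
```

Calling scrubCp1252Punctuation() on a string before handing it to SimpleXML would at least rule out stray Windows-1252 bytes as the cause. If the data is already UTF-8, this would double-encode it, so check the source encoding first.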