[nycphp-talk] Converting the pesky MS Word quotes and other characters

Joseph Crawford codebowl at gmail.com
Wed Oct 29 15:52:50 EDT 2008


I'm not sure why, but neither suggestion seemed to work. Here is what I
found that does work:

/**
 * Replace common MS Word "smart" punctuation (as UTF-8 byte sequences)
 * with plain ASCII, then escape the result for HTML/XML output.
 *
 * @param string|null $string
 * @return string|false the sanitized string, or false if none was given
 */
function sanitizeString($string = null)
{
    if (is_null($string)) return false;

    //-> Replace all of those weird MS Word quotes and other high characters
    $badwordchars = array(
        "\xe2\x80\x98", // left single quote
        "\xe2\x80\x99", // right single quote
        "\xe2\x80\x9c", // left double quote
        "\xe2\x80\x9d", // right double quote
        "\xe2\x80\x94", // em dash
        "\xe2\x80\xa6"  // ellipsis
    );
    $fixedwordchars = array(
        "'",
        "'",
        '"',
        '"',
        '-',   // plain hyphen, so the em dash really is replaced
        '...'
    );
    return htmlspecialchars(str_replace($badwordchars, $fixedwordchars, $string));
}
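
A guess at why the chr(0x92)-style replacement quoted below did nothing for me (this is my assumption, not something confirmed in the thread): those are the single-byte Windows-1252 codes, while my data contains the UTF-8 byte sequences for the same characters. Here is a minimal sketch that handles either case. The name normalizeWordChars() is made up for illustration, and it assumes that input which is not valid UTF-8 is Windows-1252.

```php
<?php
// Hypothetical helper: maps Word punctuation to ASCII whether the input is
// UTF-8 or (assumed) Windows-1252. It does not escape anything for HTML.
function normalizeWordChars($string)
{
    // UTF-8 byte sequences for the usual Word punctuation
    $utf8Map = array(
        "\xe2\x80\x98" => "'",   // left single quote
        "\xe2\x80\x99" => "'",   // right single quote
        "\xe2\x80\x9c" => '"',   // left double quote
        "\xe2\x80\x9d" => '"',   // right double quote
        "\xe2\x80\x93" => '-',   // en dash
        "\xe2\x80\x94" => '-',   // em dash
        "\xe2\x80\xa6" => '...', // ellipsis
    );
    // The same characters as single Windows-1252 bytes
    $cp1252Map = array(
        "\x91" => "'",
        "\x92" => "'",
        "\x93" => '"',
        "\x94" => '"',
        "\x96" => '-',
        "\x97" => '-',
        "\x85" => '...',
    );
    // preg_match() with the /u modifier returns 1 only for valid UTF-8.
    // The single-byte map must not be applied to UTF-8 data, because bytes
    // such as 0x92 also occur inside other multi-byte characters.
    $isUtf8 = (preg_match('//u', $string) === 1);
    return strtr($string, $isUtf8 ? $utf8Map : $cp1252Map);
}
```

The result can then be passed through htmlspecialchars() as in sanitizeString() above.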


On Oct 29, 2008, at 3:40 PM, Hans Zaunere wrote:

>> Has anyone on here found a viable solution?  Everything seems to work
>> on the www side of things, but as soon as I use the data in an RSS feed
>> it does not seem to like the MS Word characters.
>
> Word/etc always manages to create new and exciting chars, but the
> following usually take care of most of them.
>
> $this->Value = str_replace(array(chr(0x92),chr(0x93),
> chr(0x94),chr(0x96),chr(0x97),chr(0x85)),
> array('\'','"','"','-','-','...'),$this->Value);
>
> H
>
>



