[nycphp-talk] sanitizing user-submitted html

Chris Snyder chris at psydeshow.org
Sat May 31 00:38:26 EDT 2003


James Wetterau wrote:

>This submission breaks it:
>
>Strips some attibutes:<br>
><img src='http://fotola.com/berylium/csnyder/?method=latestimage'
>onmouseover='whatever(whatever="onmouseover='
>alert("gotcha");
>alert("I can do anything in here"); 
>'/>
>
Not anymore. Two things happened there-- I needed to create separate 
patterns for attributes delimited with " and with ', and I didn't 
realize that the dot wasn't matching newline chars. Fixed both of those, 
and thanks for the shakedown!!

http://chxo.com/scripts/safe_html-test.php
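
To make the two fixes concrete, here is a rough sketch of the idea. It is written in Python for illustration and is not the code behind safe_html; the pattern names and strip_event_attributes() are hypothetical. Each quote style gets its own pattern, and the DOTALL flag lets the dot cross newlines (in PHP's preg functions the equivalent is the "s" pattern modifier).

```python
import re

# Hypothetical patterns, NOT the ones used in safe_html.
# One pattern per delimiter, so a ' inside a "-quoted value (or the
# reverse) cannot end the match early.
# re.S (DOTALL) makes "." match newlines; re.I catches ONCLICK etc.
EVENT_ATTR_DQ = re.compile(r'\s+on\w+\s*=\s*".*?"', re.I | re.S)
EVENT_ATTR_SQ = re.compile(r"\s+on\w+\s*=\s*'.*?'", re.I | re.S)


def strip_event_attributes(html):
    """Remove quoted on* attributes (onclick, onmouseover, ...)."""
    html = EVENT_ATTR_DQ.sub("", html)
    html = EVENT_ATTR_SQ.sub("", html)
    return html


# A value that spans two lines is removed as a single attribute.
multiline = '<img src="a.png" onmouseover="alert(1);\nalert(2)">'
cleaned = strip_event_attributes(multiline)
```

Without the DOTALL flag the first pattern would stop at the newline and leave the whole onmouseover attribute in place. This sketch is deliberately naive: it also matches outside of tags and ignores unquoted values, so it shows the two fixes and nothing more.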

It's also closing open tags now, but without any sort of pretense to 
well-formed HTML-- it just tacks the appropriate number of closing tags 
on at the end. My goal is to brute-force protect against people who 
might want to break the page visually, not correct a poster's formatting.
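
A minimal sketch of that brute-force approach, again in Python and again hypothetical (close_open_tags(), the TAG pattern and the VOID_TAGS set are assumptions, not the actual routine): remember every tag that was opened and never closed, then append the missing closers at the very end.

```python
import re

# Assumed list of tags that never take a closing tag.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}

# Groups: (1) "/" if closing tag, (2) tag name, (3) "/" if self-closing.
TAG = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*?(/?)>")


def close_open_tags(html):
    """Append closing tags for anything left open; no nesting repair."""
    unclosed = []
    for match in TAG.finditer(html):
        is_close, name, self_closing = match.group(1), match.group(2).lower(), match.group(3)
        if name in VOID_TAGS or self_closing:
            continue
        if not is_close:
            unclosed.append(name)
        elif name in unclosed:
            # Cancel the most recent still-open tag of the same name.
            index = len(unclosed) - 1 - unclosed[::-1].index(name)
            del unclosed[index]
    # Tack the closers on at the end, most recently opened first.
    return html + "".join("</%s>" % name for name in reversed(unclosed))


fixed = close_open_tags("<b>bold <i>both")
```

Note that misnested input stays misnested: "<b><i>x</b>" only gains a trailing "</i>". That matches the stated goal of keeping a post from breaking the rest of the page, not of producing well-formed HTML.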

>Your program needs to verify that after it strips the HTML it hasn't
>generated unsafe HTML, and it needs a way to avoid getting caught in a
>loop doing that.  This is the sort of programming challenge that I
>like to model with a state machine.
>
I took a crash course in state machines this evening via Google, and I 
must admit that I have no idea what this problem would look like if modeled 
as one. It's true that I would be happier with mathematical proof that 
the routine was unexploitable, but anecdotal proof will be enough for me 
to allow HTML posts in non-critical applications. Thanks again for 
testing it!
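
For anyone wondering what a state-machine version might look like, one possible sketch follows. It is an illustration in Python under my own assumptions (the State names and tokenize() are made up, and real browsers tokenize HTML with more states than this). The input is read one character at a time, and the current state decides whether a ">" really ends a tag or is just part of a quoted attribute value. Because every character is consumed exactly once, the scan cannot get caught in a loop.

```python
from enum import Enum


class State(Enum):
    TEXT = 1     # outside any tag
    TAG = 2      # inside <...>, not inside a quoted value
    DQUOTE = 3   # inside a "-delimited attribute value
    SQUOTE = 4   # inside a '-delimited attribute value


def tokenize(html):
    """Split html into ('text', s), ('tag', s) and ('unterminated', s) tokens."""
    state = State.TEXT
    tokens = []
    buf = []
    for ch in html:
        if state is State.TEXT:
            if ch == "<":
                if buf:
                    tokens.append(("text", "".join(buf)))
                buf = [ch]
                state = State.TAG
            else:
                buf.append(ch)
        elif state is State.TAG:
            buf.append(ch)
            if ch == '"':
                state = State.DQUOTE
            elif ch == "'":
                state = State.SQUOTE
            elif ch == ">":
                tokens.append(("tag", "".join(buf)))
                buf = []
                state = State.TEXT
        elif state is State.DQUOTE:
            buf.append(ch)
            if ch == '"':
                state = State.TAG
        else:  # State.SQUOTE
            buf.append(ch)
            if ch == "'":
                state = State.TAG
    if buf:
        # Input ended inside a tag or quoted value: flag it, do not trust it.
        kind = "text" if state is State.TEXT else "unterminated"
        tokens.append((kind, "".join(buf)))
    return tokens


tokens = tokenize('a<img alt="x>y">b')
```

A sanitizer built on this would inspect each 'tag' token against a whitelist and escape or drop anything 'unterminated', instead of running patterns over the whole string and hoping the result is still safe.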

    chris.




