[nycphp-talk] NEW PHundamentals Question

Joel De Gan joel at tagword.com
Wed Feb 11 00:43:32 EST 2004


On Tue, 2004-02-10 at 20:51, David Mintz wrote:
> Fascinating. Initially I was thinking, oops, bad form, I shoulda googled
> before I asked "what's a Captcha." But WTF, this is too interesting. What
> a demented/delightful cat-and-mouse dynamic.

That is exactly what is going on.
Every time Yahoo introduces a new captcha, someone releases a new
"work-around". All the major sites are dealing with captcha work-around
programs. Small sites and personal sites are obviously not
targets, but a 'good' captcha combined with email verification is a
formidable foe.
There are a lot of other ways around captchas and a lot of other
ways to deal with them.
Font captchas are super easy: training gocr to identify a new font only
requires that you reload a page a few hundred times at most to collect
samples (easy to get around those crazy fonts). Captchas that use
rotated text are the same, except that you rotate the image back and pick
the result that matches. It sounds more difficult than it is.
There are also a lot of C programs that can identify points; you can
have those programs output images of just the points and train gocr to
work with those as well (the 'database' option in the newer gocr is what
I am talking about).
There are ones that have you pick three words from an image that has
multiple words overlapping using alpha blending. That kind has been
broken with 80% accuracy.
The basic idea is that any accuracy greater than 0% lets bots in the
door. They don't care; they can reload all day long.
With some sites you just need to pull legit info: bank sites, advertising
sites, etc.
With sites like Yahoo that are fighting automated spam-drops, any
percentage is a huge issue.
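
To put rough numbers on the "reload all day long" point, here is a minimal
sketch of the arithmetic. It assumes each attempt is independent and succeeds
with the same probability p, which is a simplification; the function names
are made up for illustration and do not come from any real tool.

```python
import math


def success_probability(p: float, attempts: int) -> float:
    """Chance that at least one of `attempts` independent tries succeeds.

    Assumes every try succeeds with the same probability p (a simplification).
    """
    return 1.0 - (1.0 - p) ** attempts


def attempts_needed(p: float, target: float = 0.99) -> int:
    """Smallest number of tries giving at least `target` chance of one success."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    if not 0.0 <= target < 1.0:
        raise ValueError("target must be in [0, 1)")
    if p == 1.0:
        return 1
    # Solve 1 - (1 - p)^n >= target for n, then round up.
    n = math.ceil(math.log(1.0 - target) / math.log(1.0 - p))
    return max(1, n)


if __name__ == "__main__":
    for p in (0.01, 0.05, 0.80):
        print(p, attempts_needed(p), round(success_probability(p, 100), 4))
```

Under these assumptions, even a solver that is right only 1% of the time
gets through with 99% certainty in about 459 reloads, which is why any
non-zero accuracy matters to a site fighting automated sign-ups.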

Anyway, yeah.. more info than you needed to know.. The whole dark
underbelly of what captchas are and why you would want to automate them
and why people don't want them automated..

I just finished a project for bypassing one of our partners' captchas so
we could automate using curl to snag some pages we needed.. So you say
"captcha?" and you got that... :)

Cheers
-Joel

-- 
joeldg - developer, Intercosmos media group.
http://lucifer.intercosmos.net



