[nycphp-talk] can anyone recommend a good captcha?

Joel De Gan joel at tagword.com
Thu Jul 22 12:06:02 EDT 2004


Hey,
Are you talking about how to keep crackbots out, or how to crack one of
these?
I have done a lot of work on cracking captchas (and have published code
for it all over).
The main issues with this one are the random line(s) and the opacity.
The random line (random start/end, thus random angle) often cuts through
letters, which is one of the biggies in captcha hacking. If the lines are
vertical/horizontal/angular but always start and end at an edge, we can
remove them easily while preserving the portions that are letters (it is
a greedy system where aberrations to the side limit pixel removal; it can
leave spots, but those can be smoothed out by despeckling).
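A minimal sketch of that kind of edge-to-edge line removal, in Python
rather than the code from the archives. It assumes the image has already
been thresholded to a grid of 0/1 values (1 = dark), that the noise lines
are straight and one pixel wide, and that a line pixel with any dark
neighbour off the line belongs to a letter and must be kept. The function
names and the min_len parameter are made up for illustration.

```python
def line_points(x0, y0, x1, y1):
    """Bresenham's line: integer pixels from (x0, y0) to (x1, y1)."""
    points = []
    dx, dy = abs(x1 - x0), -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx + dy
    while True:
        points.append((x0, y0))
        if x0 == x1 and y0 == y1:
            return points
        e2 = 2 * err
        if e2 >= dy:
            err += dy
            x0 += sx
        if e2 <= dx:
            err += dx
            y0 += sy


def dark_neighbours(img, x, y, on_line):
    """Count dark 8-neighbours of (x, y) that are not part of the line."""
    h, w = len(img), len(img[0])
    count = 0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            nx, ny = x + dx, y + dy
            if (dx or dy) and 0 <= nx < w and 0 <= ny < h:
                if img[ny][nx] and (nx, ny) not in on_line:
                    count += 1
    return count


def remove_edge_lines(img, min_len=5):
    """Erase fully dark straight runs that join two dark edge pixels."""
    h, w = len(img), len(img[0])
    edges = [(x, y) for y in range(h) for x in range(w)
             if img[y][x] and (x in (0, w - 1) or y in (0, h - 1))]
    out = [row[:] for row in img]
    for i, (x0, y0) in enumerate(edges):
        for x1, y1 in edges[i + 1:]:
            pts = line_points(x0, y0, x1, y1)
            if len(pts) < min_len or not all(img[y][x] for x, y in pts):
                continue
            on_line = set(pts)
            for x, y in pts:
                # Greedy rule: dark pixels to the side mean a letter
                # crosses here, so leave this pixel alone.
                if dark_neighbours(img, x, y, on_line) == 0:
                    out[y][x] = 0
    return out


def despeckle(img):
    """Clear dark pixels that have no dark 8-neighbour at all."""
    return [[1 if img[y][x] and dark_neighbours(img, x, y, ()) else 0
             for x in range(len(img[0]))] for y in range(len(img))]
```

Running despeckle(remove_edge_lines(img)) on a thresholded image gives
the cleaned grid. Thicker or anti-aliased lines would need a wider
neighbourhood test than this one.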
I would venture to say that the only captchas I have not been able to
bot-crack are the ones using insane angular and/or curved specialty
fonts. However, if I could find the font itself, I could build a map of
its characters fairly quickly, so a lot of the issue is training gocr
with various font files and feeding the image through. Once gocr is
trained with the font(s), for the angular or curved captchas you
basically spin the image 180 degrees both ways (360 total, and no captcha
expects people to flip the image upside down), keeping track of the order
in which you find the letters. At that point a spellchecker with
'suggest-a-word', such as pspell, can do the rest for the ones that use
dictionary words.
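The last two steps (keeping the letters in order across rotation passes,
then letting a spellchecker fix the misreads) can be sketched roughly as
follows. This is Python, not the original code: difflib from the standard
library stands in for pspell's suggestions, the OCR step itself is not
shown, and the (x_position, letter) detection format, the function names
and the word list are all assumptions for illustration.

```python
import difflib


def merge_detections(passes):
    """Combine the letters found at each rotation angle.

    Each pass is a list of (x_position, letter) pairs reported for one
    angle. The first reading seen for a position wins, and the result
    is ordered left to right.
    """
    found = {}
    for detections in passes:
        for x, letter in detections:
            found.setdefault(x, letter)
    return "".join(found[x] for x in sorted(found))


def suggest_word(guess, words, cutoff=0.6):
    """Return the dictionary word closest to the OCR guess, or None."""
    matches = difflib.get_close_matches(guess, words, n=1, cutoff=cutoff)
    return matches[0] if matches else None


# Hypothetical OCR output: three rotation passes, one misread letter.
passes = [
    [(0, "s"), (30, "r")],
    [(10, "e"), (20, "o"), (40, "e")],
    [(50, "t")],
]
guess = merge_detections(passes)
word = suggest_word(guess, ["secret", "server", "select"])
```

This only helps against captchas built from dictionary words; a random
character string gives the suggestion step nothing to match against.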

That being said:
Your "best" defense is random characters in strange, intermingled
(multiple) fonts that overlap each other with various opacity, while at
the same time being distorted and curved, with random wiggly or curved
lines slicing through the whole thing.

However, at that point if an actual AOL user could figure it out I would
be surprised. :)

If you want code, poke around in the archives for this list; I had a
large post full of functions for doing this stuff.

-joeldg

On Thu, 2004-07-22 at 15:29, Daniel Convissor wrote:
> By "solves a lot of issues" do you mean those involved with having a 
> computer doing character recognition on the image itself?
> 
> I trust you're familiar with the technique of a cracker bot taking your 
> CAPTCHA and serving it up on some other high-traffic website where some 
> unsuspecting person enters the value which the cracker bot then submits to 
> your site.  It doesn't look like that's addressed here.
> 
> Either way, can you elaborate on the steps you took?
> 
> Thanks,
> 
> --Dan
-- 
joeldg - developer, Intercosmos media group.
http://lucifer.intercosmos.net



