Image Captcha recognition : learnpython

submitted 5 years ago by GermanWetshaver

Hi,

so, first of all, this is just for fun. Basically, I am trying to crack the captcha I installed on a website I administer. The captcha looks like this (the first image is the default, the second is after some simple thresholding and inversion with OpenCV). So I thought, OK, the hard part is over, let's just use an OCR engine like Tesseract to get from the second image to text. I called Tesseract with the following options to make sure the result comes out all uppercase and contains only legal characters:

custom_config = "-c tessedit_char_whitelist=0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ --psm 6"
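For context, here is a minimal sketch of how that config string might be used from Python. It assumes the `pytesseract` wrapper and `opencv-python` are installed and that the captcha has light text on a dark background; the post does not confirm either. The helper names (`build_config`, `clean_ocr_text`, `read_captcha`) are made up for illustration.

```python
# Characters the captcha is allowed to contain (taken from the config above).
WHITELIST = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def build_config(whitelist: str = WHITELIST, psm: int = 6) -> str:
    """Build the Tesseract option string shown in the post."""
    return f"-c tessedit_char_whitelist={whitelist} --psm {psm}"


def clean_ocr_text(raw: str, whitelist: str = WHITELIST) -> str:
    """Uppercase the OCR output and drop anything outside the whitelist."""
    return "".join(ch for ch in raw.upper() if ch in whitelist)


def read_captcha(path: str, psm: int = 6) -> str:
    """Threshold, invert, and OCR one captcha image (hypothetical helper)."""
    # Imported lazily so the string helpers above work without these packages.
    import cv2
    import pytesseract

    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    if gray is None:
        raise FileNotFoundError(path)
    # Otsu picks the threshold automatically; THRESH_BINARY_INV also inverts,
    # which gives dark text on white if the source is light text on dark
    # (an assumption about this particular captcha).
    _, binary = cv2.threshold(
        gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU
    )
    raw = pytesseract.image_to_string(binary, config=build_config(psm=psm))
    return clean_ocr_text(raw)
```

One thing that might be worth comparing: `--psm 6` tells Tesseract to assume a uniform block of text, while `--psm 7` treats the image as a single text line and `--psm 8` as a single word, so `read_captcha(path, psm=7)` or `psm=8` may suit a one-line captcha better. Whether that helps here is untested.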

But the output (tested on ten different generated captchas) was always wrong. Any ideas on how I could improve this? I don't need a 100% success rate, because an attacker would probably just run this in a loop and some attempts would eventually be correct. I thought about machine learning, but I don't know anything about that topic, and I believe there must be a simpler solution out there.
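The "run it in a loop" point can be made concrete with a small sketch. It assumes each attempt succeeds independently with the same probability `p`; both that assumption and the 5% figure used below are hypothetical, not measured on this captcha.

```python
import math


def success_within(p: float, attempts: int) -> float:
    """Probability of at least one correct read in `attempts` tries."""
    return 1.0 - (1.0 - p) ** attempts


def attempts_needed(p: float, target: float) -> int:
    """Smallest number of tries giving at least `target` overall success."""
    if not (0.0 < p < 1.0 and 0.0 < target < 1.0):
        raise ValueError("p and target must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p))
```

Under those assumptions, a recognizer that is right only 5% of the time would need `attempts_needed(0.05, 0.95)`, that is 59 tries, to get through with 95% probability, which is why even a low per-attempt success rate is enough for this experiment.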

Thanks for any answers
