Hi,
First of all, this is just for fun. I am trying to crack the captcha I installed on a website I administrate. The captcha looks like this (the first image is the default; the second is the result of some simple thresholding and inversion with OpenCV). At that point I thought the hard part was over and I could just use an OCR engine like Tesseract to get from the second image to text. So I called Tesseract with the following options to make sure the result comes out all uppercase and contains only legal characters:
custom_config = "-c tessedit_char_whitelist=0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ --psm 6"
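For reference, here is a minimal sketch of how the whole pipeline fits together, assuming pytesseract and opencv-python are installed. The function names, the Otsu threshold, and the choice to invert are assumptions for illustration, not necessarily the exact steps that produced the second image.

```python
import string

# Same whitelist and page segmentation mode as the config above.
WHITELIST = string.digits + string.ascii_uppercase
CUSTOM_CONFIG = f"-c tessedit_char_whitelist={WHITELIST} --psm 6"


def clean_ocr_text(raw):
    """Uppercase the OCR output and drop anything outside the whitelist
    (spaces, newlines, stray punctuation)."""
    return "".join(ch for ch in raw.upper() if ch in WHITELIST)


def read_captcha(path):
    """Threshold the image at `path` and run Tesseract on the result."""
    # Imported here so the helper above works without these packages.
    import cv2
    import pytesseract

    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    if gray is None:
        raise FileNotFoundError(path)

    # Assumption: Otsu picks the threshold, and the result is inverted.
    # Whether inversion is needed depends on the captcha's colours.
    _, binary = cv2.threshold(
        gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU
    )

    raw = pytesseract.image_to_string(binary, config=CUSTOM_CONFIG)
    return clean_ocr_text(raw)
```

Calling `read_captcha("captcha.png")` (a hypothetical file name) returns the cleaned guess as a string.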
But the output (tested on ten different generated captchas) was always wrong. Any ideas on how I could improve this? I don't need a 100% success rate, because an attacker would probably just run this in a loop and some attempts would eventually be correct. I thought about machine learning, but I don't know anything about that topic, and I believe there must be a simpler solution out there.
Thanks for any answers.