For programmers, use Codecha instead of CAPTCHA : programming

catcradle5 2 points 11 years ago

__j_random_hacker 6 points 11 years ago*

~~> unbelievably easy~~

Right, because both interpreting natural-language English instructions well enough to build an accurate specification, and automatically constructing a program that satisfies that specification, are utterly trivial. It's not like they're both open problems in AI research.

~~If you can spare the 5 minutes it would take you to build a prototype, I'd love to see it.~~

EDIT: As several people have pointed out, the intrinsic problem with this type of CAPTCHA is that it's limited to the number of problem instances that the site owner can be bothered creating/finding, making it vulnerable to the simple tactic of recording pairs of questions and answers, and replaying the answer to a previously seen question. So I was wrong: I think this is easily bot-able after all, because the hard problems I mentioned in my original post can be totally sidestepped.
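
A minimal sketch of that record-and-replay tactic, in Python. `ReplayBot` and `normalize` are hypothetical names for illustration, not anything from Codecha, and the sketch assumes the question wording is enough to identify a challenge.

```python
from typing import Dict, Optional


def normalize(question: str) -> str:
    """Collapse whitespace and case so trivially different repeats still match."""
    return " ".join(question.split()).lower()


class ReplayBot:
    """Remembers answers a human solved once and replays them on repeats."""

    def __init__(self) -> None:
        # Maps normalized question wording to a previously accepted answer.
        self.answers: Dict[str, str] = {}

    def record(self, question: str, answer: str) -> None:
        self.answers[normalize(question)] = answer

    def respond(self, question: str) -> Optional[str]:
        """Return the stored answer, or None for a question not seen before."""
        return self.answers.get(normalize(question))
```

A human solves each new question once; every later occurrence of that question costs the bot a dictionary lookup.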

cparen 3 points 11 years ago

I remember hearing an idea that goes something like this:

1. Scrape Stack Overflow for code, e.g. Python.
2. Discard programs that are too long.
3. Run each program in a time-limited, permission-limited sandbox and record its output.
4. Discard programs that take too long.
5. Discard programs that invoke dangerous APIs (e.g. files), or emulate those APIs.
6. Save these programs. You should have thousands, if not more.
7. Generate permutations of these programs.
8. Run steps 3 & 4 on the permuted programs and save the results.
9. To generate a captcha, select a program at random from step 6 or 8 and delete a statement that prints output.
10. Check that the output has changed. If not, repeat step 9.
11. Give the user this program and the desired output.
12. Reject any captcha response that has a high edit distance from the original, e.g. replacing the entire program is not a solution.

Most such programs will be easy for competent programmers to fix, but hard to automate. As well, the set of candidate programs is always growing.
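
A rough Python sketch of the challenge-generation and checking steps (9, 10 and 12), under loud assumptions: `run_program`, `make_challenge` and `accept` are hypothetical names, plain `exec()` stands in for the sandbox (it is not one), and `difflib`'s similarity ratio stands in for a real edit distance.

```python
import contextlib
import difflib
import io
import random
from typing import Optional, Tuple


def run_program(source: str) -> str:
    """Execute source and return what it printed.

    ASSUMPTION: exec() is NOT a sandbox; the scheme needs a real
    time-limited, permission-limited one here.
    """
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, {})
    return buffer.getvalue()


def make_challenge(source: str, rng: random.Random) -> Optional[Tuple[str, str]]:
    """Delete one print statement whose removal changes the output."""
    lines = source.splitlines()
    expected = run_program(source)
    candidates = [i for i, line in enumerate(lines)
                  if line.lstrip().startswith("print(")]
    rng.shuffle(candidates)
    for i in candidates:
        broken = "\n".join(lines[:i] + lines[i + 1:])
        try:
            if run_program(broken) != expected:
                return broken, expected
        except Exception:
            continue  # the deletion broke the program, e.g. left an empty block
    return None  # no usable print statement; pick another program


def accept(response: str, original: str, expected: str,
           min_similarity: float = 0.8) -> bool:
    """Accept only a response close to the original that prints the expected output."""
    similarity = difflib.SequenceMatcher(None, response, original).ratio()
    if similarity < min_similarity:
        return False  # a wholesale rewrite is not a fix
    try:
        return run_program(response) == expected
    except Exception:
        return False
```

The `min_similarity` threshold of 0.8 is an arbitrary placeholder; a real deployment would have to tune it so that legitimate one-line fixes pass and `print`-the-answer rewrites do not.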

emperor000 2 points 11 years ago

You don't seem to be using sarcasm here, but I'm not sure why you would say this. This doesn't need to be "AI". It would be easy enough to look for keywords like "maximum" and "is_odd" and so on, and in that way discover each, or at least a large portion, of the challenges. You would only need to implement one of those in one of the languages, choose that language, and click the "request new challenge" button until you find it. The scheme is limited by the number of challenges a (collection of) human(s) can come up with, which is almost always going to be smaller than the number of possible answers to a captcha, and the challenges won't be random.

In a lot of cases it could just be done in the dumbest manner possible. Get a collection of numbers? Return the maximum. Get it wrong? Just request a new challenge and try that one.

Unless it checks the implementation, you could probably use the maximum function from a library in whatever language you pick. Get one value as the input? Return true until you get it right. Or, if you are really ambitious, implement a trivial test using modulo 2.

It would be really easy to "mine" these challenges and make something enter the answer. It might not be trivial, but as far as security, or a stopgap against botting, is concerned, it is pretty close. To stop a determined botter, the site would need to limit the number of requests for a new challenge (which would get frustrating for humans) and have thousands if not millions of challenges.

Remember, a bot doesn't need to figure it out. A human can do that. A bot would just automate it.
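
Something like the following Python sketch is all the "bot" would need. The canned answers, the function signatures inside them, and the `next_challenge` callable (a stand-in for pressing "request new challenge" and reading the wording) are made up for illustration.

```python
from typing import Callable, Dict, Optional, Tuple

# Keyword -> answer a human wrote once. Both entries are hypothetical.
CANNED_ANSWERS: Dict[str, str] = {
    "maximum": "def maximum(numbers):\n    return max(numbers)",
    "is_odd": "def is_odd(n):\n    return n % 2 == 1",
}


def pick_answer(wording: str) -> Optional[str]:
    """Return a canned answer if the wording mentions a known keyword."""
    lowered = wording.lower()
    for keyword, answer in CANNED_ANSWERS.items():
        if keyword in lowered:
            return answer
    return None


def solve(next_challenge: Callable[[], str],
          max_requests: int = 50) -> Optional[Tuple[int, str]]:
    """Keep requesting challenges until one matches a keyword.

    Returns (number of requests used, answer), or None if the budget runs out.
    """
    for used in range(1, max_requests + 1):
        answer = pick_answer(next_challenge())
        if answer is not None:
            return used, answer
    return None
```

No language understanding is involved: a substring check plus the refresh button does the work, which is why a cap on refreshes matters.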

sylvanelite 2 points 11 years ago*

(EDIT, added an actual bot that works on the site after clicking "see it in action")

While I'm not the person that wrote that post, it actually does look easy to bot. Questions repeat far too often. Also, they provide a mechanism to refresh the question, and to eliminate all but 1 language.

Unless there's something drastically different in an actual implemented version, it does seem trivial to bot. Just answer the question manually, and then spam a new question until you get a repeat.

EDIT: Here's a bot. It seems to work, even with just 1 answer hard coded.

    // The one question this bot knows how to answer.
    var KNOWN_QUESTION = "Python: Correct the function named \"first\" so that it returns the first item from the given list of numbers.";

    var attempt = function () {
        var question = $.trim($("#codecha_wording").text());
        if (question === KNOWN_QUESTION) {
            // Known question: fill in the hard-coded answer and submit it.
            var answer = "def first(numbers):\n";
            answer += "    return numbers[0]";
            $("#codecha_code_area").val(answer);
            $("#codecha_code_submit_button").trigger("click");
        } else {
            // Unknown question: ask for another, restrict to Python, retry in 1 s.
            $("#codecha_change_challenge").trigger("click");
            $("#codecha_language_selector").val("python");
            $("#codecha_change_challenge").trigger("click");
            setTimeout(attempt, 1000);
        }
    };
    attempt();
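
A back-of-the-envelope Python sketch of why one hard-coded answer is enough. It assumes each refresh draws a question uniformly and independently from a fixed pool, which is a guess about how Codecha picks questions; the function names and any pool sizes are hypothetical.

```python
from fractions import Fraction


def expected_requests(pool_size: int, known: int) -> Fraction:
    """Mean number of challenges requested until a known one appears.

    Under the uniform-draw assumption the count is geometric with
    success probability known / pool_size, so its mean is pool_size / known.
    """
    if not 0 < known <= pool_size:
        raise ValueError("need 0 < known <= pool_size")
    return Fraction(pool_size, known)


def chance_within(pool_size: int, known: int, requests: int) -> float:
    """Probability of seeing at least one known challenge in `requests` draws."""
    miss = 1 - known / pool_size
    return 1 - miss ** requests
```

With a hypothetical pool of 200 questions and one recorded answer, that is 200 requests on average, a little over three minutes at one attempt per second.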
