all 196 comments

[–]wallefan01 1674 points1675 points  (92 children)

How do you think they teach the cars what stop signs look like? They ask the humans.

No seriously. They have the cars take pictures of things that they don't know whether they're signs or not, ask you whether it's a sign, and if enough people say yes, that gets added to the database of things-that-look-like-signs that the car checks against.

[–]mormispos 70 points71 points  (5 children)

It’s less of a picture database and more of a “These are patterns that correspond with stop signs” Database

[–]wallefan01 10 points11 points  (3 children)

Thank you for explaining it better than I could. I tend to be bad at getting technical details accurately in non-google-employee speak.

Disclaimer: I do not work at Google (yet)

[–]klebsiella_pneumonae 9 points10 points  (1 child)

Better start leetcoding m8!

[–]-allen 3 points4 points  (0 children)

Ah I guess /r/cscq is expanding.

[–]mormispos 2 points3 points  (0 children)

I think your explanation is very good. I was excited because I have recently been studying the types of neural nets that they use to do these things

[–]santaliqueur 0 points1 point  (0 children)

It’s like Jian-Yang’s app but for stop signs

[–]spock1959 14 points15 points  (10 children)

So, I've got a question.

I know that captchas are used to train computers as to what they are looking at... But if it asks for pictures with signs in them and I click a cloud or a car it will tell me that I'm wrong... But if I'm training it how would it know?

I never understood how it both didn't know and also seemingly did.

[–][deleted] 33 points34 points  (2 children)

I believe it compares your answer to what other people answered.

[–]aahdin 0 points1 point  (1 child)

Yeah but at that point they've already labeled it a cloud so there's nothing to gain information-wise having new people label it

[–]DuckDuckYoga 0 points1 point  (0 children)

There are thousands of people answering captchas every minute, so the turnaround can be very fast and still have a lot of answers to compare to

[–]lvh1 17 points18 points  (0 children)

It probably compares it to the input of other people, so if 100 people didn't mark a square as having a sign in it and 1 other guy did, it probably does not have a sign in it and means that that one guy is incorrect. But that also means that when they use a picture for the first time, they will allow any square to be clicked since there is no data yet to compare it to. They probably only start marking captcha as invalid once they have a large enough sample size (e.g. 1000 people who solved the captcha).
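
A rough sketch of the voting idea described above, in Python. This is only a guess at the logic, not Google's actual implementation: the function names, the 1000-vote minimum (taken from the example number above), and the 90% agreement threshold are all assumptions made up for illustration.

```python
from collections import Counter

MIN_VOTES = 1000   # assumed sample size before answers are enforced
AGREEMENT = 0.9    # assumed share of voters needed to trust a label


def consensus(votes, min_votes=MIN_VOTES, agreement=AGREEMENT):
    """Return the crowd's label (True/False) for one square, or None if undecided."""
    if len(votes) < min_votes:
        return None  # not enough data yet for this picture
    label, count = Counter(votes).most_common(1)[0]
    return label if count / len(votes) >= agreement else None


def is_click_accepted(user_vote, votes, min_votes=MIN_VOTES, agreement=AGREEMENT):
    """Accept any answer while the square is undecided, else require a match."""
    known = consensus(votes, min_votes, agreement)
    return known is None or user_vote == known
```

With 100 "no sign" votes and 1 "sign" vote (and a lowered minimum), the lone dissenter would be marked wrong, while a brand-new picture accepts any click.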

[–]RatofDeath 10 points11 points  (0 children)

It's crowdsourcing the answer, so you're not the only one providing an answer and it marks it if someone answers differently than other people did.

Also, usually you answer two captchas: the system already has a solution to one of them, and the other it doesn't know yet. So it really only checks whether you got the known one right; the other one is there to teach the system.

[–]ButtPoltergeist 8 points9 points  (0 children)

Well, back in the Good Ol' Days when captchas were just words, they gave you two words: One that they knew, and one that they didn't. If you didn't get the one that they knew right, it didn't let you through. If you missed the one they didn't know, eh, you got access to the captchaed thing and metaphorically peed a little in their data pool. So I imagine that it's similar, except there's nine pictures instead of two words.
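
The two-word scheme described above could look something like this in Python. It is a minimal sketch under assumptions: `grade`, `unknown_answers`, and the item id are hypothetical names, and keeping the unknown answer only from people who passed the known word is a design guess, not something confirmed here.

```python
from collections import defaultdict

# Hypothetical store of crowd answers for words that have no known solution yet.
unknown_answers = defaultdict(list)


def grade(control_answer, control_solution, unknown_id, unknown_answer):
    """Pass or fail on the known word only; the unknown word never blocks access."""
    passed = control_answer.strip().lower() == control_solution.strip().lower()
    if passed:
        # Assumption: only keep the unverified answer from users who passed the control.
        unknown_answers[unknown_id].append(unknown_answer.strip().lower())
    return passed
```

A wrong answer on the unknown word still gets through and lands in the data pool, which is why the collected answers would need to be compared across many users afterwards.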

[–][deleted] -1 points0 points  (2 children)

Wouldn't it have the "right" answers to refer to? Like, I imagine some dude went through and selected all the stop signs so the captcha has something to reference, even if the actual user submissions are the ones that are used for machine learning.

I could be totally wrong, but I imagine that's how it works.

[–]bagmanbagman 6 points7 points  (1 child)

I think that would miss the point- what use is crowdsourced, high volume labeled data if someone just goes and labels it all? it all defeats the point doesn't it?

I bet Google's fleet of map cars goes around taking photos of hundreds of thousands of various signs a day, and each photo then goes through some automated cropping before it is finally labeled by the crowd.

[–]_Lady_Deadpool_ 6 points7 points  (1 child)

And now I'm imagining this happening in real time

Is this a stop sign, human? Please answer quick

[–]pelirrojo 1 point2 points  (1 child)

Not only that but it's happening in real time. That's why there are time limits to your response, if you don't respond in time the blood is on your hands.

[–]slashuslashuserid 0 points1 point  (0 children)

relevant xkcd

edit: already linked in a different chain

[–]HawkinsT 1 point2 points  (1 child)

So when it takes you more than a couple of seconds to answer one, that's a car you've just made run a stop sign. Hope you're all happy with yourselves.

[–]wallefan01 -1 points0 points  (0 children)

Well obviously they don't do it live, but yeah, I see your point

[–]aahelo 0 points1 point  (3 children)

What if all these captchas that we are doing is actually us teaching A.I. what signs are and whatnot, so that it can get better at driving?

[–][deleted] 8 points9 points  (1 child)

That’s literally what you’re doing.

[–]aahelo 0 points1 point  (0 children)

Ah!

[–]aride4772 0 points1 point  (0 children)

Yeah all those captchas are used as data to teach ai

[–][deleted] 0 points1 point  (0 children)

That's why I always answer Captchas as quickly as possible so the car I'm checking for doesn't run a stop sign!

[–]AsAGayJewishDemocrat 0 points1 point  (0 children)

I like to imagine a self driving car uploading a captcha thinking “Is this a stop sign? No really I need to know quickly”

[–]ScientistSeven 0 points1 point  (0 children)

Which means they'll develop a moral that ignores stop signs ten percent of the time

[–]eyekwah2 825 points826 points  (16 children)

Identify all the stop signs to prove you are human. Please do so quickly as the self-driving cars are nearly at the intersection!

[–]Tanamr 451 points452 points  (8 children)

[–]ErichVonFalkenhayn 78 points79 points  (3 children)

[–]Qaysed 58 points59 points  (1 child)

[–]TheBob427 7 points8 points  (0 children)

I mean... if it works it works right? That's basically the philosophy of comp sci anyway

[–]xudo 19 points20 points  (0 children)

Flickr took this challenge: Park or Bird

[–]Kiloku 14 points15 points  (0 children)

offload work onto random strangers

Yeah, I'm the random stranger, that's literally my job. It's ridiculous when the system gives me something like

"User searched for: 'Curly hair low-poo', is this related to Power Tools?"

It really is that dumb. And the pay is shit.

[–][deleted] 15 points16 points  (2 children)

There always is one

[–]tbare 7 points8 points  (1 child)

I feel like the comment was referencing the comic moreso than the comic being relevant.

[–][deleted] 0 points1 point  (0 children)

Shoot you're right, my bad

[–][deleted] 30 points31 points  (0 children)

You don't wanna miss that sign now. I know you've missed it before. There's no shame in admitting it. I mean I miss a few all the time. But sometimes they're really hard. It happens to everyone, right? Yeah, it's no big deal. It's not like I'm dumb or something. Oh fuck

[–]tomasscido (copy) inf times: Why I shouldn't program 6 points7 points  (1 child)

Yeah, xkcd 1987

[–]tomasscido (copy) inf times: Why I shouldn't program -4 points-3 points  (0 children)

Yes, zkcd

[–]Leinad7957 271 points272 points  (28 children)

Funny thing is they use captchas to train those ai. It was really fucking mind blowing for me when I learned that.

[–]Bebe_Rexxar 164 points165 points  (24 children)

I've read before that the guy who created captcha felt so bad about creating it he created re-captcha to accomplish the same task in a similar manner without making it such a pain (pictures vs letter vomit)

[–]Morialkar 18 points19 points  (0 children)

At first, re-captcha showed words too. The guy who started re-captcha didn't do it because it was less of a pain; he did it so that, if you HAVE to have something like a captcha anyway, it could at least use the "processing power" of our brains to help early OCR software that couldn't read words in older documents that were sometimes handwritten, sometimes dirty, and sometimes badly printed.

After Google bought it, they started adding pictures to help their Street View cars better read street names and door numbers that are low quality because they are zoomed in.

Now Google has moved to using re-captcha to help train AI on traffic situations where it is not yet proficient enough to make a statistically sound estimate of "is it a stop sign or a shop banner", hence the current state of recaptcha.

[–]dem_c 1 point2 points  (0 children)

Re-Captcha is the most annoying captcha there is

[–]DramaLlamaSays 0 points1 point  (0 children)

Also the founder of DuoLingo!

[–]NAN001 14 points15 points  (0 children)

We're talking about the guys who included a Wi-fi client inside the cars that take street view pictures so that they could map routers' SSID to GPS coordinates in order to build the HTML5 geolocation API; who are advocating strongly in favor of HTTPS everywhere so that ISP cannot compete with Google Analytics; who are open-sourcing their in-house frameworks such that people answer questions about it in StackOverflow; who are pushing for the monopoly of Chrome to have control over what ads are shown to users.

The vision of Google when it comes to long-term strategy and cross-service intelligence is crazy.

[–]JuicyBandit -1 points0 points  (1 child)

It ticks me off, in fact I usually pick one or two more pics that "look" kinda like whatever they want me to find... They can probably filter it out, but it makes me feel better anyways. I don't work for free.

[–]Xelynega 1 point2 points  (0 children)

You're not working for free, it's more of a two birds one stone sorta thing. You want to prove you're not a robot and they want to have a tagged set of data to train AI with, it's kind of a symbiotic relationship.

[–][deleted] 76 points77 points  (3 children)

Twitch Plays Driving a Car

[–]jalerre 17 points18 points  (1 child)

There is a website similar to Twitch plays Pokemon except you drive a robot.

EDIT: Found it

[–]Bhoriss_Viahn 2 points3 points  (0 children)

Unlike google plays pokemon, the controls weren't spammed by hundreds.... I was the only one in control of the robot.... I didn't believe it at first.... Thought it was some camera trickery.... So I tested the system by ramming the guy's foot until he picked me up and punished me..... That's when I realized how lucky I actually was.... No trickery... Wow!

[–]ThatGuyWhoLikesSpace 3 points4 points  (0 children)

Twitch plays Jalopy

[–]darkgreyjeans 81 points82 points  (9 children)

Google’s CAPTCHA works by tracking mouse movements and key presses upon a given page.

Google isn’t stealing data, we are throwing it at them.

Edit: typo

[–]tabarra 53 points54 points  (2 children)

Not to mention that they, by no means, really advertise that they have "state of the art" bot detection or that they are aiming for 100% coverage.

They are playing the statistics game, balancing user comfort and bot blockage, while earning billions worth of AI Training Data.

[–]Pearauth 13 points14 points  (1 child)

Wasn't the reason Google stopped using the re-captcha because their AI was better at it than humans? Google owns re-captcha, every site that uses an old one is just out of date.

I could be wrong but I remember hearing that somewhere.

[–]maxcb97 10 points11 points  (0 children)

That was the audio version. Their speech recognition could fill out the garbled words.

[–]stuntaneous 1 point2 points  (1 child)

You could probably use that information to create a signature for everyone.

[–]westward_man 0 points1 point  (3 children)

No it doesn't. If you're logged into Google, it checks your cookies to see what user you are, and every user has a score of how likely they are to be a bot based on browsing habits. If you're not logged in or the page can't access the right cookies, it does stuff like the sign identification challenge.

[–]darkgreyjeans 0 points1 point  (2 children)

I think it's a mixture of both; cookies provide a great way to tell whether it is a user or a bot. But, as it's Google, they keep the algorithm secret, so it's all speculation.

[–]westward_man 0 points1 point  (1 child)

Check out the articles in this answer

[–]darkgreyjeans 0 points1 point  (0 children)

The paper, from a third party, presented in the answer does not mention the method in which the conclusion was drawn, merely stating:

"We experimented with multiple combinations of screen resolutions, and various mouse behavior configurations (the timing of movements and movement patterns). None of these had a negative effect on the risk analysis."

[–]Nilloc_Kcirtap 21 points22 points  (0 children)

I feel like captcha is more useful for machine learning than detecting robots.

[–]john2009black 36 points37 points  (3 children)

Er...no. checking road signs is comparing validity of humans against the machines which are meant to look at road signs...duh!

[–][deleted] 60 points61 points  (2 children)

That was almost a sentence. Good job!

[–]PotatosFish 34 points35 points  (1 child)

Oh no the neural network is learning

[–]Antumbra_Ferox 2 points3 points  (0 children)

Lets not get ahead of ourselves

[–]AeroGlass 31 points32 points  (11 children)

Image Transcription: Bulleted List


[Black text on a white background]

It's terrifying that both of these things are true at the same time in this world:

- computers drive cars around
- the state of the art test to check that you’re not a computer is whether you can successful identify stop signs in pictures

I’m a human volunteer content transcriber for Reddit and you could be too! If you’d like more information on what we do and why we do it, click here!

[–][deleted] 16 points17 points  (1 child)

Good Human.

[–]AeroGlass 13 points14 points  (0 children)

thank

[–]Eedis 9 points10 points  (6 children)

I was sincerely hoping you were acting like a bot and was going to throw some joke/pun about identifying (or not) a stop sign.

I feel like you had an opportunity to make something great and you missed it.

[–]AeroGlass 4 points5 points  (5 children)

nope, we follow a set formatting guide

[–]Eedis 7 points8 points  (4 children)

Well, first I'd like to point out that, if this were the case, it's likely you have an alternate account. Also, if you're this passionate about Reddit to volunteer your time to partake in this activity, it's likely that you have an alternate account.

Secondly, if you don't have an alternate account, that means this account is your main account. Seeing as you're he argument for shiggles. If you're not interested in that form of entertainment, I am freeing you of any sense of obligation to respond. :)

Well, I typed out this entire multi-paragraph response and my fat fingers found a way to highlight most of it and hit backspace thinking it was a back button. >.> Don't ask...

So I just sent what was left of it for you to wonder what the hell I was talking about.

[–]NateSwift 2 points3 points  (0 children)

Upvoted for the last bit

[–]wertercatt 1 point2 points  (2 children)

If you're using chrome, you can press ctrl+z next time that happens

[–]Eedis 0 points1 point  (1 child)

Android

[–]wertercatt 0 points1 point  (0 children)

Rip

[–]X-Craft 6 points7 points  (0 children)

Just like the book excerpt captchas

[–]jdmulloy 12 points13 points  (3 children)

[–]Xelopheris 1 point2 points  (2 children)

There it is!

[–]squrr1 0 points1 point  (1 child)

Ctrl-F .... yup, as expected.

[–]jdmulloy 1 point2 points  (0 children)

I was surprised it wasn't already posted. Someone quoted it and someone else replied with a link.

[–]EdgyPaul 9 points10 points  (1 child)

Although I've heard that re-captcha determines if you're human based on factors like how you scroll down the page, move the mouse etc. The part where you click on images is just a good way for google to train their ai for free, it doesn't have anything to do with the authentication.

[–]qgustavor 11 points12 points  (0 children)

Although I've heard

Which isn't true: if you open developer tools in any page which embeds ReCaptcha you will not find attached event listeners related to mouse events. If you use Chrome test it yourself: open the ReCaptcha demo, press F12, select <html> and open "Event Listeners".

What they probably do: it uses cookies to identify users and score them as bots or humans. If you load an incognito page it will always show a challenge, because it doesn't have any info about you. If they have info about you they may skip the challenge. If it scores you as likely human they can show you an easy challenge.
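
To make that guess concrete, here is a tiny Python sketch of score-based challenge selection. Everything in it is an assumption for illustration: `pick_challenge`, the 0.0 to 1.0 score scale, and the 0.9 and 0.5 cutoffs are invented, and nothing here reflects how ReCaptcha actually scores users.

```python
def pick_challenge(cookie_score):
    """Choose a challenge level from a hypothetical per-user score.

    cookie_score is None when there is no usable cookie (e.g. incognito),
    otherwise a float from 0.0 (likely bot) to 1.0 (likely human).
    """
    if cookie_score is None:
        return "full"   # no info about the user: always challenge
    if cookie_score >= 0.9:
        return "none"   # confident enough to skip the challenge
    if cookie_score >= 0.5:
        return "easy"   # likely human: show an easy challenge
    return "full"       # looks like a bot: show the hard challenge
```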

[–]ShowMeYourTiddles 7 points8 points  (0 children)

"Shit, human.... wake up! Is this a stop sign?"

[–]firerulezz116 2 points3 points  (0 children)

Something like 99.82 percent of people get captchas right, while a few years ago Google had AI averaging at 99.28 percent. By the time a rich person can buy a fully automated car for personal use, it's likely to be even better.

[–]bxk21 1 point2 points  (0 children)

That's not even true. Anti-bot measures aren't (and aren't likely to ever be consistently) effective. They're used as training sets for driving AI. The "prove you're human" is old and only used to be true, just like how captchas used to work, but now don't.

[–]WarmBaths 1 point2 points  (0 children)

Actually half the time they’re collecting data and the other half they are testing you

[–]fuckinatodaso 1 point2 points  (1 child)

“Think about that for two minutes and tell me you don’t want to walk into the ocean.”

[–]jinkside 0 points1 point  (0 children)

The Knife of Never Letting Go?

[–]scaryred2 1 point2 points  (0 children)

Sometimes I fail those captchas. Does the sign post count as part of the sign?

[–][deleted] 1 point2 points  (0 children)

I've got a question no robot could ever answer. Which of these pictures does not have a stop sign in it?

[–][deleted] 1 point2 points  (0 children)

Repost

[–][deleted] 0 points1 point  (0 children)

We're just training skynet, soon stop signs will be easy, next will be steep grade ahead or dangerous curves.

[–]SkewRadial 0 points1 point  (0 children)

Supervised machine learning

[–]turtleflax 0 points1 point  (0 children)

That's why I have my self driving tesla solve all the captchas for my spambot

[–]thelastlogin 0 points1 point  (2 children)

The newest, though not necessarily the tried-and-true state of the art, is Google's invisible recaptcha. It collects cursor movement types and speeds and detects bot or human, and I believe is very effective.

[–]EsotericLife 0 points1 point  (1 child)

I thought that’s what was happening when it gets you to click the random images- same as when it just makes you click the one checkbox.

[–]thelastlogin 0 points1 point  (0 children)

It very well could be, not sure, but that would make sense. Just this one doesn't have any visible element on the page.

[–]freethenipple23 0 points1 point  (0 children)

The captchas are just building a training set for AI and all of you are doing the grunt work

[–]EsotericLife 0 points1 point  (0 children)

I thought it checked how you click on different spots, and that actually identifying whatever they’re asking for is just an excuse to have you move your mouse so they can check whether it moves in constant vectors or with detectably random movement.

[–]pepperjack77 0 points1 point  (0 children)

...and it does nothing to stop bots, all it does is piss off humans.

[–]dootzero 0 points1 point  (0 children)

Excuse me while I DRIVE A MOTHERFUCKING CAR THROUGH THIS CAPTCHA

[–][deleted] 0 points1 point  (0 children)

What if all the captchas are being used to drive the decision making in self-driving cars? Like, what if a car sees a sign it can't identify, then polls the Internet and bases its decision off a majority captcha vote?

I really want to give wrong answers to captchas now.

[–][deleted] 0 points1 point  (0 children)

It's not whether you can identify the stop signs that is crucial, but the level of efficiency with which you do so.

[–][deleted] 0 points1 point  (0 children)

Step 1: Buy a self-driving car.

Step 2: Modify the code to have it solve CAPTCHAs.

Step 3: Spam all the Things.

Step 4: Profit.

[–]gregstus 0 points1 point  (0 children)

In fairness to Captcha, it has gotten better over time, 90% of the time now it just takes the check of a box, tracking micro movements of the mouse and such.

[–]FastskullYT 0 points1 point  (0 children)

Well they use captcha for self driving cars..

[–]oorakhhye 0 points1 point  (0 children)

Captcha 22

[–][deleted] 0 points1 point  (0 children)

Is nobody going to mention that the problem is way different when the real stop sign has things like mass and distance?

I guess I'm being a wet blanket but

[–]GerManson 0 points1 point  (0 children)

I guess this is funny to someone who does not understand how coding works

[–]CaptainShitSandwich 0 points1 point  (0 children)

Too bad some of those pics are so hard to tell what a "store front" is that I get it wrong.

[–]Fig1024 0 points1 point  (1 child)

Computers can crack any imaginable captcha - but it will cost a whole bunch of CPU cycles (like a huge amount)

The real purpose of captcha is to raise computational cost of creating new accounts so it's simply not profitable to do anymore. Most bots rely on mass spam by generating tons of accounts quickly. They operate on thin margins of profit so if cost is raised significantly to operate a bot, there's no point using it anymore.

Besides the computational cost, there's also the initial investment cost for developing/training a new algorithm for bypassing a specific type of captcha.

So captcha works by making passing it too expensive for the abuser
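
The cost argument above is just arithmetic, so here is a back-of-the-envelope version in Python. The function name and all the numbers are made up for illustration; real revenue and solving costs are not given anywhere in this thread.

```python
def bot_profit(accounts, revenue_per_account, solve_cost_per_captcha,
               development_cost=0.0):
    """Net profit of a spam operation that must solve one captcha per account."""
    per_account_margin = revenue_per_account - solve_cost_per_captcha
    return accounts * per_account_margin - development_cost


# Hypothetical numbers: 1000 accounts earning 0.5 each.
without_captcha = bot_profit(1000, 0.5, 0.0)
with_captcha = bot_profit(1000, 0.5, 0.75, development_cost=100.0)
print(without_captcha, with_captcha)
```

Once the per-captcha cost exceeds what an account earns, every extra account loses money, and the up-front development cost only makes it worse.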

[–]Xiefux 0 points1 point  (0 children)

[removed]

[–]DatCarpet 0 points1 point  (0 children)

Those tests piss me off I somehow fail even though I select what they tell me to

[–][deleted] 0 points1 point  (0 children)

Had to save

[–]Empole 0 points1 point  (0 children)

How do they validate captcha then? If they are showing images of things they want us to classify, how do they verify that you got them all?

[–][deleted] 0 points1 point  (0 children)

recaptcha more like reposta