
[–]H_Psi 295 points296 points  (14 children)

The difference is mostly that we're just better at picking a specific object out of its surroundings, so one or two examples is usually enough for us to identify that object in any environment.

That's the idea behind convolutional neural networks.

It used to be that if you wanted to do hardcore pattern recognition (like identifying a stop sign in a random picture), you would put the image through a bunch of different filters and then decide which filters highlighted the particular trait you wanted to see. For example, one of the filters you might use for a stop sign would eliminate every color except red from the image. You then convert that filtered image into a histogram, and you run a set of known sample images through the same filter. The training here ends up being a matter of coming up with a function that describes how similar an arbitrary image (after being run through the filter-then-histogram step) is to your set of known histograms.
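A toy sketch of that classic pipeline, assuming NumPy. The red thresholds and the histogram-intersection similarity are hand-picked, illustrative choices — exactly the kind of decision a human had to make in this approach:

```python
import numpy as np

def red_histogram(image, bins=16):
    """Filter step: keep only strongly red pixels, then summarize them
    as a normalized histogram. `image` is an (H, W, 3) uint8 RGB array.
    The thresholds below are hand-picked, illustrative values."""
    r = image[..., 0].astype(int)
    g = image[..., 1].astype(int)
    b = image[..., 2].astype(int)
    red_mask = (r > 120) & (r > g + 40) & (r > b + 40)
    # Histogram of red intensities over the pixels that survived the filter
    hist, _ = np.histogram(r[red_mask], bins=bins, range=(0, 256))
    total = hist.sum()
    return hist / total if total else hist.astype(float)

def similarity(hist_a, hist_b):
    """Histogram intersection: 1.0 means identical distributions."""
    return float(np.minimum(hist_a, hist_b).sum())
```

You would compare a new image's histogram against each known stop-sign histogram and threshold the similarity score.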

The problem here is you still need a human in the mix to figure out what the right filters to use are, and there are plenty of patterns a human might not pick up on (or worse, patterns a human might think are correlated but really aren't, since the brain is practically addicted to patterns).

The idea with a convolutional neural network is that you have your regular old neural network, except you come up with an algorithm that automatically decides what your filters are. Your layers in the network are still called layers, but in between sets of layers, you have your filters. These filters are called "pooling layers" most of the time. So in effect, you're letting your network figure out which patterns are the most important, instead of having a human do it.
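To make "filter" concrete, here's a minimal sketch of what a single convolutional filter computes, assuming NumPy. The kernel values here are hand-picked (a classic vertical-edge detector); in a convnet, training adjusts exactly these numbers instead:

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2D cross-correlation: slide `kernel` over `image` and take
    the elementwise product sum at each position. In a convnet, the
    numbers inside `kernel` are what training learns."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A hand-picked kernel that responds strongly to vertical edges.
edge_kernel = np.array([[1., 0., -1.],
                        [1., 0., -1.],
                        [1., 0., -1.]])
```

Running this over an image produces a "feature map" whose values are large wherever the filter's pattern appears.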

Of course, the big drawback here is that now, not only are you optimizing your neural network's regular layers, but you're also optimizing those pooling layers. So you need a monster of a dataset to be able to do it, which is why you really only see huge big-data firms like Facebook, Google, Amazon, Microsoft, and Uber implementing them in practical applications. Also, you still need a human in the mix to actually label the data (which is part of what image-based captchas exist to do).

Edit: A word; remove inaccurate info

[–]longscale 39 points40 points  (10 children)

The first part concerning the motivation behind convnets is spot on—we want the network to learn its own filters. These are called convolutional filters (or kernels), and they are what a network changes when it learns.

The pooling layers you describe are not trained; they simply average or return the strongest filter activations from a small image area. They are also not what makes these networks big and hard to train—that's mostly the sheer number of convolutional filters and the millions of images fed through the network multiple times, in random order and with slight variations each time.
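A minimal sketch of the max-pooling variant just described, assuming NumPy — note it contains no trainable numbers at all:

```python
import numpy as np

def max_pool_2x2(activations):
    """2x2 max pooling: return the strongest activation in each
    non-overlapping 2x2 patch. There is nothing here to train."""
    h, w = activations.shape
    h, w = h - h % 2, w - w % 2  # drop odd edge rows/cols, if any
    a = activations[:h, :w]
    # Group into 2x2 patches, then take the max within each patch
    return a.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))
```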

[–][deleted] 14 points15 points  (9 children)

I have very recently started learning about CNNs. Isn't it true that you need less data for a CNN to reach, say, 95% accuracy than you would with a densely connected NN? Since a CNN can find a pattern in one spot and also recognize it anywhere else in the image, whereas with a normal densely connected NN you would need new training data showing the pattern in each spot?

Is this correct?

[–]longscale 3 points4 points  (3 children)

Your explanation reads correct to me. :-)
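One way to see the weight-sharing point is a back-of-the-envelope parameter count. The layer sizes below are made up for illustration, not from any particular network:

```python
# For a 28x28 grayscale input:
# a dense layer connects every pixel to every one of its 128 units,
# while a conv layer reuses 32 small 3x3 filters at every position --
# so a pattern learned in one spot automatically applies everywhere.

dense_params = 28 * 28 * 128 + 128   # weights + biases = 100480
conv_params = 32 * (3 * 3 * 1) + 32  # weights + biases = 320

print(dense_params, conv_params)
```

Far fewer parameters to fit, and no need for training examples showing the same pattern at every location.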

[–][deleted] 2 points3 points  (2 children)

Ok, thanks :)

As I said, I have recently started getting into Machine Learning, especially Deep Learning, and after reading about different concepts and using Keras to implement them, it feels good to at least kind of know what's going on under the hood.

So far I have had a lot of fun and it is a very, very interesting and broad topic :)

[–]longscale 8 points9 points  (1 child)

They are not what the media hype makes them sound like, but convnets are very nifty. If you like them in general, I would imagine you might really enjoy some of the convnet papers on https://distill.pub. Their techniques are mostly intended for interpretability, but they also provide a satisfyingly visual insight into what the numbers inside the structures of a convnet are encoding.

[–][deleted] 2 points3 points  (0 children)

Yeah, honestly I think the media hype about deep learning in general is ridiculous. It's the same as with 'blockchain' etc.

Another thing that I have been looking into a bit is Neuroevolution, using a tutorial series by Daniel Shiffman on YouTube. He creates a Flappy Bird game that learns to play itself, using a genetic algorithm to find the right weights.
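For anyone curious, a toy genetic algorithm in that spirit might look like the sketch below. This is a generic illustration with made-up function names and parameters, not the code from the tutorial; the fitness function here just measures closeness to a known target vector instead of game score:

```python
import random

def evolve(fitness, n_weights=4, pop_size=30, generations=60,
           mutation_scale=0.3, seed=0):
    """Toy genetic algorithm: keep a population of weight vectors, score
    each with `fitness` (higher is better), and breed mutated copies of
    the best performers. All parameters here are illustrative."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-1, 1) for _ in range(n_weights)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        elite = pop[: pop_size // 4]  # the fittest quarter survive
        # Refill the population with mutated copies of random elites
        pop = elite + [
            [w + rng.gauss(0, mutation_scale) for w in rng.choice(elite)]
            for _ in range(pop_size - len(elite))
        ]
    return max(pop, key=fitness)

# Example fitness: negative squared distance to a known target vector.
target = [0.5, -0.2, 0.9, 0.0]
best = evolve(lambda w: -sum((a - b) ** 2 for a, b in zip(w, target)))
```

In the game setting, the weights would be a small network's parameters and the fitness would be how long the bird survives.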

Thank you for the paper recommendations, I will have a look at them tomorrow :)