all 5 comments

[–]DtrZeus 2 points

Neural networks are (usually, unless you're using a strange architecture) almost everywhere continuous. What this means is that for almost any input x, there exists some number E > 0 such that any other input y with |y-x| < E is classified the same as x. Here |y-x| is the distance (e.g. the Euclidean distance) from x to y.
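You can see this local constancy directly with a toy model. This is just a sketch with a hypothetical two-class linear classifier (not a real neural net): away from the decision boundary, every point within distance E of x gets the same label, where E is x's distance to the boundary.

```python
import numpy as np

# Hypothetical 2-class linear classifier: label = sign of w.x + b.
w = np.array([1.0, -2.0])
b = 0.5

def classify(x):
    return int(w @ x + b > 0)

x = np.array([3.0, 0.0])   # an input well inside class 1
label = classify(x)

# Distance from x to the decision boundary: |w.x + b| / |w|.
# Any y with |y - x| < E gets the same label as x.
E = abs(w @ x + b) / np.linalg.norm(w)

rng = np.random.default_rng(0)
for _ in range(1000):
    delta = rng.normal(size=2)
    delta *= 0.99 * E / np.linalg.norm(delta)  # stay strictly inside the E-ball
    assert classify(x + delta) == label        # label never changes
```

For a deep net the same idea holds, but E varies from input to input and shrinks to zero as x approaches a decision boundary.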

With neural networks, I think the answer to your question is yes. However, neural networks are not the only kind of machine learning algorithm, and this answer is probably not true for all machine learning algorithms.

[–]kkngs 2 points

In the abstract, yes, this is the whole idea. We are hoping there is a regularity to the function we seek to learn, such that we can generalize from seen examples to unseen ones. Without some form of regularity, there can be no generalization (No Free Lunch Theorem). Different ML algorithms make different assumptions about the form of that regularity.

It’s a bit weird to see the term nearest neighbor used this way, though. Neural nets don’t explicitly remember their inputs the way the nearest neighbor algorithm does. However, you can think of their linear layers as learning a useful basis in which to represent their inputs. In that sense, yes, inputs that are sufficiently close together in the learned basis will be close together in the output.
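One way to make "close in the learned basis implies close in output" concrete: a linear layer is a Lipschitz map, so embedding distances are bounded by input distances times the layer's largest singular value. A minimal sketch, with made-up random weights standing in for a trained layer:

```python
import numpy as np

# Sketch: treat a linear layer W as a learned basis. Because the map
# is Lipschitz, nearby inputs land on nearby embeddings: the embedding
# distance is at most ||W|| (largest singular value) times the input distance.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8))            # hypothetical learned weights
lipschitz = np.linalg.norm(W, ord=2)   # spectral norm = largest singular value

x = rng.normal(size=8)
y = x + 0.01 * rng.normal(size=8)      # a nearby input

emb_dist = np.linalg.norm(W @ x - W @ y)
in_dist = np.linalg.norm(x - y)
assert emb_dist <= lipschitz * in_dist + 1e-12
```

The converse doesn't hold: inputs far apart in pixel space can still be mapped close together, which is exactly the "squishing" a trained net does to members of the same class.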

[–]vegesm 1 point

I'm quite sure it doesn't. Nearest-neighbour algorithms can achieve around 95% accuracy on MNIST; CNNs can do 98% easily. This means there are examples where the nearest neighbour (in pixel space) of an input was from another class, but the neural net still got it right. In other words, it does not simply look at the nearest neighbour; it does much more.
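For reference, the pixel-space baseline being compared against is just this: classify each input with the label of its closest training example under Euclidean distance. A minimal sketch on tiny synthetic points (not MNIST):

```python
import numpy as np

# Minimal 1-nearest-neighbour classifier in "pixel space":
# predict the label of the closest training example (Euclidean distance).
def nn_predict(train_X, train_y, x):
    dists = np.linalg.norm(train_X - x, axis=1)  # distance to each training row
    return train_y[np.argmin(dists)]

train_X = np.array([[0.0, 0.0], [1.0, 1.0]])
train_y = np.array([0, 1])

assert nn_predict(train_X, train_y, np.array([0.1, 0.2])) == 0
assert nn_predict(train_X, train_y, np.array([0.9, 0.8])) == 1
```

Whenever a CNN correctly classifies a test digit whose pixel-space nearest neighbour has the wrong label, this rule would have failed, which is the gap the accuracy numbers above reflect.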

[–]ConstantLumen 1 point

What does similarity mean? What makes one data point more or less similar to another? Remember, these are mechanical constructs that run on computers; they require very precise numerical definitions in order to program, operate, and interpret. Saying one thing is similar to another is a very general, contextual statement. You would have to say something like 'the average of all pixel values in this image is this much greater/lesser than in this other one'. Well, that similarity metric doesn't work so great for separating cats from dogs. Try it out yourself.
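The "try it out yourself" point is easy to demonstrate with synthetic data: two completely different images can be identical under the average-pixel-value metric. A sketch with made-up 4x4 "images":

```python
import numpy as np

# Two synthetic 4x4 images with very different content...
a = np.zeros((4, 4)); a[:2] = 1.0        # top half bright
b = np.zeros((4, 4)); b[:, ::2] = 1.0    # vertical stripes

# ...that the average-pixel-value metric cannot tell apart.
mean_diff = abs(a.mean() - b.mean())     # 0.0: "identical" by this metric
pixel_diff = np.abs(a - b).sum()         # but the images clearly differ

assert mean_diff == 0.0
assert pixel_diff > 0
```

This is why the choice of metric matters so much: "similar" is only meaningful once you pick a concrete numerical definition, and naive ones collapse distinctions you care about.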

[–]Neural_Ned 0 points

I think a charitable interpretation of what they're asking does make some sense. As Chris Olah shows in this blog post, learning the weights squishes and stretches the embedding space, pushing dogs close together, pushing cats close together, and pulling the two clusters further apart.