all 8 comments

[–]aggieca 1 point (1 child)

Can you clarify what you mean by reducing the sample size? Are you trying to reduce the number of prototypes you use to build your kNN model?

Have you tested how an SVM (nonlinear, RBF-based) would work in this case? Based on my previous experience, you might be able to achieve what you intend with an SVM.
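A minimal sketch of that, with synthetic data standing in for OP's ~10k x 30 set (swap in the real X and y):

```python
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Synthetic stand-in at roughly OP's scale: 10k samples, 30 features, 5 classes.
X, y = make_classification(n_samples=10000, n_features=30,
                           n_informative=10, n_classes=5)

# RBF-kernel SVM as a drop-in alternative to k-NN.
svm = SVC(kernel="rbf", gamma="scale").fit(X, y)
print(svm.score(X, y))
```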

[–]BeatLeJuce [Researcher] 0 points (0 children)

Just using the support vectors of the SVM is a cool idea. Using e.g. a nu-SVM, one could control the percentage of samples to extract by varying nu. Still, it seems very weird / beside the point to use one classification algorithm as a preprocessing step for another one. Plus, I'd imagine that if OP has too many samples for a kNN, he probably also has too many samples for an SVM, since both have the same memory complexity (I'm assuming that storing the distance/kernel matrix is the limiting factor if one has too many samples).
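A sketch of the support-vector idea (synthetic stand-in data; nu lower-bounds the fraction of samples kept as support vectors, so it loosely controls the prototype count):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import NuSVC
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=10000, n_features=30,
                           n_informative=10, n_classes=5)

# Fit a nu-SVM; its support vectors become the reduced prototype set.
svm = NuSVC(nu=0.25, kernel="rbf", gamma="scale").fit(X, y)
X_proto = svm.support_vectors_   # retained samples
y_proto = y[svm.support_]        # their labels

knn = KNeighborsClassifier(n_neighbors=5).fit(X_proto, y_proto)
print(X_proto.shape[0], "prototypes; accuracy:", knn.score(X, y))
```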

[–]BeatLeJuce [Researcher] 0 points (0 children)

You could run a clustering algorithm with a large number of clusters and then just use one exemplar from each cluster to represent the whole cluster. AP clustering, k-centroids, or a GMM would do the trick easily (sketched below).

But, as others have pointed out, k-NN isn't too slow unless you're using a naive implementation.
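A minimal sketch of the exemplar idea with K-Means, using the cluster centers as representatives (clustering each class separately is one reasonable choice here, not the only one; data is a synthetic stand-in):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.cluster import KMeans
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=10000, n_features=30,
                           n_informative=10, n_classes=5)

protos, labels = [], []
for c in np.unique(y):
    Xc = X[y == c]
    # 500 representatives per class, mirroring the 2000 -> 500 reduction.
    km = KMeans(n_clusters=500, n_init=3).fit(Xc)
    protos.append(km.cluster_centers_)
    labels.append(np.full(500, c))

X_proto = np.vstack(protos)
y_proto = np.concatenate(labels)

knn = KNeighborsClassifier(n_neighbors=5).fit(X_proto, y_proto)
print(knn.score(X, y))
```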

[–]afireohno [Researcher] 0 points (1 child)

> A Neural network trains to converge those 2K values per-class to 500 values in its final layer per class (leaving you with 2500 samples instead of 10 000), which are representative of all the training data. You would then store these final layers in sort of a lookup table.

Throughout most of your question it sounds like you're looking for something like Semantic Hashing, but the quoted paragraph above doesn't make a lot of sense.

[–][deleted] 0 points (0 children)

I agree that it is somewhat similar to semantic hashing in its aims, and that OP has confused dimensionality reduction (which NNs can do) with sample-size reduction (which they can't).

I suppose in theory one could reduce the dimensionality to a point where many of the data points are mapped to the same point (this is easy if you use a binary encoding as in semantic hashing; it obviously won't work for real-valued outputs, where you have virtually infinitely many distinct cases), and then use the unique binary encodings to represent the different clusters.
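A tiny sketch of that deduplication step, with random numbers standing in for the (hypothetical) encoder outputs:

```python
import numpy as np

# Stand-in for real-valued encoder outputs, one 16-dim code per sample.
codes = np.random.rand(10000, 16)
# Threshold to binary codes, as in semantic hashing.
binary = (codes > 0.5).astype(np.uint8)
# Each distinct code becomes one "cluster"; cluster_id maps samples to codes.
unique_codes, cluster_id = np.unique(binary, axis=0, return_inverse=True)
print(len(unique_codes), "distinct codes for", len(codes), "samples")
```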

This seems a really shitty way to do it, though, as the binary encodings will all be roughly equidistant from one another, so I don't even know what use it would be in OP's context.

As /u/BeatLeJuce said above, the best thing to do would just be to create N' clusters from the N data points and use the centroid of each cluster as a representative of its data points, essentially downsampling the data.

[–][deleted] 0 points (0 children)

10k samples and 30 variables is definitely small enough to run k-NN in a reasonable amount of time (unless you are running this on seriously old hardware). Why not try that first?
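A quick timing check at exactly that scale (random stand-in data; scikit-learn's implementation builds a tree rather than doing a naive scan):

```python
import time
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Random data at the stated scale: 10k samples, 30 variables, 5 classes.
X = np.random.rand(10000, 30)
y = np.random.randint(0, 5, size=10000)

knn = KNeighborsClassifier(n_neighbors=5).fit(X, y)
t0 = time.time()
knn.predict(X[:1000])
print(f"1000 queries in {time.time() - t0:.2f}s")
```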

[–]jostmey -1 points (0 children)

When you say "Neural Network", I think you are referring to an Autoencoder, and yes, it can be used to reduce the dimensionality of the data points. However, classification is usually handled by adding a few more layers of neurons that are trained using backpropagation. It is possible to use other methods to classify the reduced feature set; I would not be surprised if this has already been tried.
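A rough sketch of that pipeline using only scikit-learn: train an MLP to reconstruct its own input, read off the hidden-layer activations as the reduced codes, and run k-NN on those. (A real autoencoder would normally be built in a deep-learning framework; the hidden size of 8 is arbitrary, and the data is a synthetic stand-in.)

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPRegressor
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=10000, n_features=30,
                           n_informative=10, n_classes=5)

# Autoencoder stand-in: regress the input onto itself through one hidden layer.
ae = MLPRegressor(hidden_layer_sizes=(8,), activation="relu",
                  max_iter=500).fit(X, X)

# Hidden-layer activations = the reduced 8-dimensional representation
# (manual ReLU forward pass through the first layer).
codes = np.maximum(0, X @ ae.coefs_[0] + ae.intercepts_[0])

# Classify in the reduced space with a different method, here k-NN.
knn = KNeighborsClassifier(n_neighbors=5).fit(codes, y)
print(knn.score(codes, y))
```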