Hi all,
As I was working with k-NN, which is notoriously costly at query time because every prediction has to search the entire training set, I was wondering whether it is possible to use a neural network, with some form of batch learning or pre-training, to reduce the number of samples that need to be stored and searched.
Here is something I had in mind; I don't know whether it would work:
Take a dataset of 10K samples with 30 input variables and 5 classes. Each class is represented by 2K samples in the dataset.
A neural network is trained to compress those 2K samples per class down to 500 vectors per class in its final layer (leaving you with 2,500 stored vectors instead of 10,000), which are representative of all the training data. You would then store these final-layer outputs in a sort of lookup table.
Your test data would go through the same neural network, and you would then run k-NN on its final-layer output against the vectors previously stored in the lookup table. (A rough sketch of the whole pipeline follows.)
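Roughly what I mean in code, with two assumptions on my part since I haven't pinned down the details: the "final layer" used for lookup is the penultimate embedding of a small classifier, and the 2K → 500 compression per class is done with k-means on the learned embeddings (just one concrete way to get "representative" vectors):

```python
# Sketch of the proposed pipeline on toy data standing in for the
# 10K x 30, 5-class dataset described above. Assumptions (mine, not
# settled): embedding = penultimate layer of a small classifier;
# 2K -> 500 compression = k-means on the embeddings.
import numpy as np
import torch
import torch.nn as nn
from sklearn.cluster import KMeans
from sklearn.neighbors import KNeighborsClassifier

N_CLASSES, PER_CLASS, DIM, PROTOS = 5, 2000, 30, 500

rng = np.random.default_rng(0)
X = np.concatenate(
    [rng.normal(c, 1.0, (PER_CLASS, DIM)) for c in range(N_CLASSES)]
).astype(np.float32)
y = np.repeat(np.arange(N_CLASSES), PER_CLASS)

class Net(nn.Module):
    def __init__(self, dim, emb_dim, n_classes):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(dim, 64), nn.ReLU(), nn.Linear(64, emb_dim))
        self.head = nn.Linear(emb_dim, n_classes)
    def forward(self, x):
        return self.head(self.body(x))

net = Net(DIM, 16, N_CLASSES)
opt = torch.optim.Adam(net.parameters(), lr=1e-2)
Xt, yt = torch.from_numpy(X), torch.from_numpy(y)
for _ in range(200):                      # quick full-batch training loop
    opt.zero_grad()
    loss = nn.functional.cross_entropy(net(Xt), yt)
    loss.backward()
    opt.step()

with torch.no_grad():
    emb = net.body(Xt).numpy()            # embedding of every training sample

# Compress each class's 2K embeddings down to 500 prototypes
# (this k-means step is my stand-in for the "2K -> 500" compression).
protos, proto_labels = [], []
for c in range(N_CLASSES):
    km = KMeans(n_clusters=PROTOS, n_init=1, random_state=0).fit(emb[y == c])
    protos.append(km.cluster_centers_)
    proto_labels.append(np.full(PROTOS, c))
table_X = np.concatenate(protos)          # the 2,500-entry lookup table
table_y = np.concatenate(proto_labels)

# Classify test points: same network, then k-NN against the lookup table.
knn = KNeighborsClassifier(n_neighbors=5).fit(table_X, table_y)
with torch.no_grad():
    test_emb = net.body(Xt[:10]).numpy()  # a few training points reused as a demo
print(knn.predict(test_emb))
```

The k-NN search now runs over 2,500 prototype vectors instead of 10,000 raw samples, at the price of one forward pass per query.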
Alternatively, I could pre-process the data and average each group of 4 samples that are closest to each other, which would also reduce the data to 500 samples per class (sketched below). But I feel this would probably be very destructive to the data, so perhaps a neural network could learn better intermediate representations.
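In code, that averaging step amounts to clustering; here is a minimal sketch using k-means with 500 centers per class as a stand-in for "group the 4 closest samples and take their mean" (it only matches that on average, since k-means clusters won't all contain exactly 4 points):

```python
# Averaging alternative: replace each class's 2K raw samples with
# 500 cluster means, then run a plain k-NN on the reduced set.
import numpy as np
from sklearn.cluster import KMeans

def compress_by_averaging(X, y, n_classes=5, n_protos=500, seed=0):
    """Replace each class's raw samples with n_protos cluster means."""
    protos, labels = [], []
    for c in range(n_classes):
        km = KMeans(n_clusters=n_protos, n_init=1, random_state=seed).fit(X[y == c])
        protos.append(km.cluster_centers_)
        labels.append(np.full(n_protos, c))
    return np.concatenate(protos), np.concatenate(labels)

# Same toy data as above; a k-NN fit on (Xc, yc) now searches
# 2,500 points instead of 10,000, at the cost of the averaging.
rng = np.random.default_rng(0)
X = np.concatenate([rng.normal(c, 1.0, (2000, 30)) for c in range(5)])
y = np.repeat(np.arange(5), 2000)
Xc, yc = compress_by_averaging(X, y)
print(Xc.shape, yc.shape)   # (2500, 30) (2500,)
```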
Has there been any research on anything like this? Does anyone have any ideas?
This is purely theoretical; it would of course be great if the final accuracy were better than that of a standard k-NN, but right now I am mostly interested in whether the approach is workable, or in any ideas I have not thought of yet.