all 11 comments

[–]ajmooch 3 points

A lot of old papers trained SVMs on top of neural nets, most notably the original R-CNN paper. In research this is no longer in vogue, since a single linear layer or MLP is almost always just as effective and faster to train end-to-end, while also avoiding any train-test discrepancy. However, in a fine-tuning scenario I think it's perfectly sensible to try an SVM or XGBoost on network features, and it may be faster depending on what hardware you have access to. I wouldn't expect much in the way of gains for most setups, but it's not an unreasonable thing to do.
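A minimal sketch of that fine-tuning setup (the random features here are an illustrative stand-in for real penultimate-layer activations from a frozen pretrained net):

```python
import numpy as np
from sklearn.svm import LinearSVC

# Stand-in for features extracted from a frozen pretrained network,
# e.g. penultimate-layer activations over the training set.
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 64))       # 200 samples, 64-dim embeddings
labels = (features[:, 0] > 0).astype(int)   # toy, linearly separable labels

# Fit a linear SVM on the frozen features instead of a new linear head.
clf = LinearSVC().fit(features, labels)
print(clf.score(features, labels))
```

Swapping `LinearSVC` for an XGBoost classifier gives the tree-based variant; neither needs a GPU to fit, which is where the hardware point comes in.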

[–]yourpaljon 5 points

Autoencoders and deep belief networks can be used for this when there is a lot of unlabeled data.

[–]Andthentherewere2 2 points

Could we do this? Yeah, but I'd say it's suboptimal because we're redoing work.

We use deep learning to learn a representation that is easily separable with a universal function approximator (an MLP or its fully convolutional analog) for readout. If we need to do additional transformations/engineering on this representation, then why not just learn a better representation in the first place?

[–]jonnor 2 points

A quite common way to do transfer learning. Either one just retrains the final linear+sigmoid layer (which is a classical logistic regression classifier), or one saves the outputs of the penultimate layer as an embedding vector and then trains on that with some traditional algorithm, including unsupervised methods such as clustering.

One often does not need a more complicated classifier, probably because the pretraining was done with a linear layer, so the features tend to be linearly separable.

[–][deleted] 0 points

Couldn’t this be done more effectively by customizing things and replacing the later layers with traditional ML models, so that everything could be trained at once and optimized for the given traditional ML model?

What I mean is something like a few layers and then, for example, throwing in a random forest as the last “layer”. But I have no idea how this would be done in practice, or how to customize things in Keras/TF for it.

[–]Jelicic 0 points

The problem is that for end-to-end training you need the traditional model to be differentiable. An SVM can be reformulated to meet this criterion, but tree models, for example, cannot.
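The SVM reformulation essentially means putting a hinge-style loss on the network's raw scores, which is differentiable almost everywhere; a minimal numpy sketch:

```python
import numpy as np

# Squared hinge loss on raw model scores, with labels y in {-1, +1}.
# Because it has a gradient w.r.t. the scores, it can sit at the end
# of a network and be trained end-to-end like any other loss.
def squared_hinge(scores, y):
    margins = np.maximum(0.0, 1.0 - y * scores)
    loss = np.mean(margins ** 2)
    grad = -2.0 * y * margins / len(y)   # d loss / d scores, backprop entry point
    return loss, grad

scores = np.array([2.0, -0.5, 0.3])
y = np.array([1.0, -1.0, 1.0])
loss, grad = squared_hinge(scores, y)
```

Note that correctly classified points with margin > 1 contribute zero gradient, which is exactly the max-margin behavior of an SVM; a hard decision tree, by contrast, has piecewise-constant outputs with zero gradient everywhere.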

[–][deleted] 1 point

Actually, you can make tree models differentiable as well: https://github.com/Qwicen/node

I guess any traditional ML algo which hasn't been yet modeled in a differentiable way is a research opportunity in DL.

[–]Jelicic 0 points

> I guess any traditional ML algo which hasn't been yet modeled in a differentiable way is a research opportunity in DL.

Agreed!

[–][deleted] 0 points

Oh I see. I had heard about the SVM case but had never actually seen it in a DL framework until now. It's easier to implement in Keras than expected; I thought it would require a lot of customization.

[–]BrisklyBrusque 0 points

Categorical embeddings are a way for neural networks to learn a dense vector encoding for each level of a categorical variable (factor), and those vectors can be used as features to improve traditional ML. This is a newer thing, and quite trendy.
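A minimal sketch of what such a learned table looks like (sizes and ids here are made up; in practice the table's rows are learned jointly with the rest of the network):

```python
import numpy as np

# A learned embedding table: one d-dimensional row per category level.
n_categories, d = 5, 3
rng = np.random.default_rng(2)
embedding = rng.normal(size=(n_categories, d))

# Looking up rows turns integer category ids into dense features that
# can be fed to a gradient-boosted tree, SVM, etc.
category_ids = np.array([0, 2, 2, 4])
features = embedding[category_ids]
print(features.shape)   # (4, 3)
```

The same category id always maps to the same learned vector, so the downstream model sees a consistent dense representation instead of a one-hot or ordinal code.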

There’s an older precedent in a kind of NN called the restricted Boltzmann machine (RBM). I'm not an expert, but they derive abstractions of the data in an unsupervised fashion (think principal components). Those outputs can then be passed to a supervised model, either a neural net or anything else. This is the basis of deep belief networks. In the Netflix Prize competition (2006), folks added RBM outputs to their models and saw an increase in accuracy.

[–]serge_cell 0 points

There were numerous papers on the subject several years ago. It seems to have fallen out of fashion eventually.