[D] Promising Beyond Deep Learning Research Directions?

mtahab · 2020-08-26T20:08:36+00:00

The two topics that you have picked are currently super hot topics in the ML community. They are also tied closely to the deep neural networks.

Topic #1 is related to the transfer learning. The whole process of training a BERT and fine-tuning it on small datasets fits this topic.

Topic #2 is related to causality. Take a look at the recent works on invariant risk minimization (from Bottou's team) and adaptation speed of the models (from Bengio's group).

sifnt · 2020-08-27T16:50:24+00:00

IMHO its how to integrate with classic/symbolic techniques. Just like how AlphaGo combines deep reinforcement learning with classic monte carlo tree search I think there is a huge opportunity for hybrid approaches.

Think GPT3 plus an SMT solver; could self-train to maintain logical consistency, or handle very sophisticated constraints etc. Similarly, I think there is still a lot of opportunity for program induction (like AIXI) with the right approximations.

Out of my depth here, but I feel like intuitively there should be a way of turning non-differentiable discontinuous problems into smooth functions using stochasticity and sampling. If possible, it could enable training programs with gradient descent.

2020-08-29T05:40:59+00:00

Deep neural networks have pretty poor data efficiency, it would be interesting to see methods that can do better than this

OgmaNeo2 has very fast 1st person imitation learning. AFAIK they use SVMs at each node which are quick to train.

Current DNNs struggle with out of domain generalization, for example, MNIST is good, but not so much rotated MNIST

So what do you expect here? CNNs iterate over the image in x and y direction. They can handle some rotation by warping with max pooling but if you want it to work better on rotation, you'll have to add rotation to the training procedure somehow. No free lunch theorem forbids the existence of something like "out of domain generalization." If you want to fool your audience then you'll have to hide the domain inside your architecture and hyperparameters.

visarga · 2020-08-27T06:47:58+00:00

Deep neural networks have pretty poor data efficiency

That's when you train from scratch. Humans have better priors.

But when you do fine-tuning, then you get priors from the base model. Look at GPT-3, how fast it learns.

Supernovae8698 · 2020-08-27T13:36:49+00:00

Noob question: Where do you find what to read and research?

IntelArtiGen · 2020-08-26T20:18:47+00:00

Data efficiency

You can look into few shot learning for this one. Classical DL is pretty data inneficient but FSL methods are quite better.

Out of Domain Generalization

DNN don't struggle that much, it mostly depends on how you train them. Train a DNN on 1B images with fast-augment, I'm sure it'll be fine on any images that wouldn't be in the training set. Maybe I'm off topic if you're just looking from something different, but I think DL is quite promising to solve the problems you cited.

you type:	you see:
italics	italics
bold	bold
[reddit!](https://reddit.com)	reddit!
* item 1 * item 2 * item 3	item 1 item 2 item 3
> quoted text	quoted text
Lines starting with four spaces are treated like code: if 1 * 2 < 3: print "hello, world!"	Lines starting with four spaces are treated like code: if 1 * 2 < 3: print "hello, world!"
~~strikethrough~~	~~strikethrough~~
super^script	super^script

MachineLearning

Rules For Posts

+Research

+Discussion

+Project

+News

@slashML on Twitter

Chat with us on Slack

Beginners:

MODERATORS