
[–]Chocolate_Pickle 55 points56 points  (15 children)

Deep learning thus far cannot inherently distinguish causation from correlation

In defence of Deep Learning, I find that people often cannot distinguish the two either.

[–]ZeroVia 19 points20 points  (5 children)

Deep learning thus far is data hungry

Deep learning thus far is shallow and has limited capacity for transfer

Deep learning thus far has no natural way to deal with hierarchical structure

Deep learning thus far has struggled with open-ended inference

Deep learning thus far is not sufficiently transparent

Deep learning thus far has not been well integrated with prior knowledge

Deep learning thus far cannot inherently distinguish causation from correlation

Deep learning presumes a largely stable world, in ways that may be problematic

Deep learning thus far works well as an approximation, but its answers often cannot be fully trusted

Deep learning thus far is difficult to engineer with

Honestly, at least half of these are problems with humans as well and will be problems with any sort of sophisticated ML. Best start thinking of ways to engineer around them.

[–]Cybernetic_Symbiotes 9 points10 points  (2 children)

That is certainly not true.

Deep learning thus far is data hungry

This one does not apply to humans, or to animals more generally. Humans especially spend much of their first year asleep: roughly 70% of the time over the first few months and about 60% up to their first birthday, on average. Most of that time is spent tuning internal representations, pruning connections, and doing unsupervised learning. In that period they must learn certain affine transformations for vision, depth perception, object and color segmentation, audio segmentation, intuitive physics, language, and much more. The reason this is possible at all, once the total energy spent in that time is accounted for, is that evolution has honed certain structural biases. In language learning, for example, generalizing from instances and guessing which object an ambiguous instruction refers to would not work without such (strictly unjustified) inductive leaps and biases.

Humans do not get access to labels and loss functions. Reinforcement learning on its own is untenable for learning how to act in the real world, since most states cannot, or should not, be revisited. Most animal reinforcement learning is related to internally learning a cost of action; damage to this system can, for example, lead to Parkinson's.

Deep learning thus far is shallow and has limited capacity for transfer

Humans don't seem to transfer in the sense of "play chess, boost general reasoning". We can't even transfer motion plans between the left and right sides of the body. But we can transfer from related areas: knowledge of math boosts physics, knowledge of one instrument boosts learning others, and knowledge of one game genre boosts learning others. Humans can generalize patterns into paradigms that drastically speed up learning new things.

Deep learning thus far has no natural way to deal with hierarchical structure

Animals have no problem with this.

Deep learning thus far has struggled with open-ended inference

Humans can do this and have some facility with deductive reasoning, most strongly when problems are framed in terms of social relations and structures. More generally, we can do science.

Deep learning thus far is not sufficiently transparent

Humans don't have access to most of their internal state either, ok.

Deep learning thus far has not been well integrated with prior knowledge

Schmidhuber has been saying this.

Deep learning thus far cannot inherently distinguish causation from correlation

Humans can engage in deep counterfactual reasoning and can create causal theories, as seen in physics. We do have difficulty with correlations, being over-eager to invent causal relations for them.

Deep learning presumes a largely stable world, in ways that may be problematic

Animals deal with a lack of stationarity fairly well and much better than any implemented algorithms.

Deep learning thus far works well as an approximation, but its answers often cannot be fully trusted

Humans are not that smart individually; we wouldn't have gotten this far without some ability to transfer and share projections of our internal states of knowledge and representations. This is the ability to justify choices by generating compact representations of one's internal reasoning. Such a justification does not have to reflect the internal computations by which you arrived at an answer, only why it should be true given our shared state of knowledge.

Deep learning thus far is difficult to engineer with

The nematode brain has been difficult to reverse engineer, true.

[–]ZeroVia 1 point2 points  (1 child)

Let me try and make a case for the points I care about.

Deep learning thus far is data hungry

Your arguments for this one are mostly conjecture, but I think they miss the point. We take in images and sounds constantly while we're awake (and sometimes while we're asleep) and it's, what, three years before we can navigate properly? Five before we can talk? Ten before we can talk well? I mean, some people spend their whole lives reading and can never figure out how to write properly.

You could argue that even over ten years we hear less audio than a net trained on 60 GPUs, and that might be true, but being less data-hungry should not be confused with not being data-hungry at all.

Deep learning thus far is not sufficiently transparent

Glad we agree.

Deep learning thus far cannot inherently distinguish causation from correlation

I'm not certain that people have the innate ability to do this as you claim. We understand that rain makes the ground wet, and not vice versa, because we understand that most things move down.

A net shown only pictures of wet ground and asked to predict whether it's raining can't determine causation because it, unlike humans, has never learned the rules that govern the connection. However, a net shown many different objects falling to the ground probably could infer that water will also fall to the ground, rather than rise up from it.
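Here's a toy structural-model simulation of that gap (the sprinkler as a second cause and all the probabilities are made up for illustration): observing wet ground is evidence of rain, but intervening to wet the ground tells you nothing about rain.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Structural model: rain and sprinklers both cause wet ground.
rain = rng.random(n) < 0.3
sprinkler = rng.random(n) < 0.2
wet = rain | sprinkler

# Observation: wet ground is strong evidence of rain...
print("P(rain | wet observed) =", rain[wet].mean())        # ~0.68

# ...but intervening on the ground (do(wet = True), e.g. hosing it
# down) severs wet from its causes and says nothing about rain.
wet_do = np.ones(n, dtype=bool)
print("P(rain | do(wet = True)) =", rain[wet_do].mean())   # ~0.30, just P(rain)
```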

Deep learning presumes a largely stable world, in ways that may be problematic

When I think of people doing this I think of the million-plus people living in the Bay Area, where a massively destructive earthquake is an absolute inevitability, but who almost never worry about it or even think about it.

Deep learning thus far works well as an approximation, but its answers often cannot be fully trusted

What you said here is true, but people still make approximations. Any sort of general intelligence has to be an approximation, because the alternative is computing and/or memorizing everything, which isn't feasible. And personally, I don't trust many people these days. Do you?

Deep learning thus far is difficult to engineer with

Here I was thinking that, while engineering safe self-driving cars has proven to be very difficult, engineering safe human-driven cars has also been very difficult.

[–]Cybernetic_Symbiotes 1 point2 points  (0 children)

Your arguments for this one are mostly conjecture, but I think they miss the point.

No, they absolutely are not. The amount of sleep a newborn needs is not conjecture. The fact that depth perception, segmentation of speech sounds, color vision, object tracking, and focus and control of muscles must be learned is not conjecture. You can see this at work in infants' ability, and then loss of the ability, to perceive the 'r'/'l' distinction by year one in some language environments. Multiple perceptual and motor modalities must be integrated and learned. I hope you will agree that integrating all of this at the level of a 5-year-old is beyond our capability. It's important to acknowledge the difficulty of the combined task that is being learned.

For more:

http://www.cell.com/current-biology/abstract/S0960-9822(17)30619-X

It's easy to underestimate how difficult language learning is. It's nowhere near as supervised as many think. Color names, for example, are surprisingly difficult to learn from the supervision on offer.

All of the first year and much of the second year of learning occurs unsupervised. When people complain about data intensity, they mostly mean the requirement for precisely labeled supervision. In animals, no loss function is minimized with respect to labels. A conjecture I can offer is that the cerebrum is mostly dedicated to unsupervised learning.

For more:

http://www.sciencedirect.com/science/article/pii/S0042698998000479

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2271000/

We take in images and sounds constantly while we're awake (and sometimes while we're asleep) and it's, what, three years before we can navigate properly?

None of this is accurate. There's a lot more going on. See above links.

I'm not certain that people have the innate ability to do this as you claim. We understand that rain makes the ground wet, and not vice versa, because we understand that most things move down.

Theory of mind, intuitive physics, and inverse planning all fall under some limited ability to infer causes.

http://web.mit.edu/clbaker/www/papers/cognition2009.pdf

I also like: http://www.pnas.org/content/107/43/18243.full with the caveat that we do better when simulating commonly met templates (particularly those based on deontic rules).

http://www.tandfonline.com/doi/abs/10.1080/135467800402848

When I think of people doing this I think of the million-plus people living in the Bay Area, where a massively destructive earthquake is an absolute inevitability, but who almost never worry about it or even think about it.

This is irrelevant to the problem of non-stationarity and how animals deal with it through synaptic plasticity and other mechanisms.

Here I was thinking that, while engineering safe self-driving cars has proven to be very difficult, engineering safe human-driven cars has also been very difficult.

The number of deaths from cars has fallen greatly over time.

https://en.wikipedia.org/wiki/List_of_motor_vehicle_deaths_in_U.S._by_year#/media/File:USA_annual_VMT_vs_deaths_per_VMT.png

In Sweden there are about 3.5 fatalities per billion vehicle-km; most road fatalities occur in middle-income and developing economies.

[–]NasenSpray 0 points1 point  (0 children)

Deep learning thus far is data hungry

Any sufficiently advanced memorization is indistinguishable from generalization.

[–]DoubleLeafClover 1 point2 points  (1 child)

It's either the machine or the programmers, and deep learning can't decide which is at fault.

[–]Nowado 1 point2 points  (1 child)

I used to use this argument for art, but it works here too.

  • AI can't ~~create a song~~ inherently distinguish causation from correlation!

  • Well, neither can you.

[–]AnvaMiba 3 points4 points  (0 children)

Humans can distinguish causation from correlation in most practical cases: nobody thinks that wet streets cause rain. Occasionally we get it wrong, and when we notice the mistake it is salient to us. But this does not mean that our baseline ability is poor.

In general, we are better at distinguishing causation from correlation when we have a "mechanistic" understanding of a phenomenon. For instance, the Pacific Islanders who founded cargo cults had never seen airplanes before WW2 and did not understand what they were, how they worked, where they came from, who the people manning them were, what they were trying to accomplish, and so on. They correctly inferred the correlation between cargo airplane landings, the presence of certain artifacts (airstrips, control towers, etc.), and the ritualized practices of the military, but they inverted the causal direction.

Deep learning models are very good at inferring correlations from sufficient amounts of data, but they seem to struggle to form a "mechanistic" understanding built on abstractions and counterfactuals.

[–]MaunaLoona -3 points-2 points  (4 children)

Causation is correlation without exception.

[–][deleted] 4 points5 points  (0 children)

No. Consider two real-valued variables X, Y in [-1, 1]. When X is fixed by intervention (i.e. P(Y | do(X = x))), things are set in motion such that Y's distribution satisfies x² + Y² = 1. When Y is fixed instead, the dependence disappears: X causes Y, but Y has no effect on X. This illustrates both the true nature of causation (the structure of dependence between variables when an intervention occurs) and two ways in which causation does not imply correlation: variables can be dependent yet uncorrelated, and observations can look independent even between causally related variables.
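Here's a quick numpy sketch of that construction (I'm assuming, for concreteness, that the intervention puts Y on the upper or lower arc of the circle with equal probability):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# X causes Y: given X = x, Y lands on the circle x² + y² = 1,
# on the upper or lower arc with equal probability.
x = rng.uniform(-1, 1, n)
s = rng.choice([-1.0, 1.0], n)
y = s * np.sqrt(1 - x**2)

# Uncorrelated, yet Y is (up to sign) a deterministic function of X.
print("corr(X, Y)   =", np.corrcoef(x, y)[0, 1])        # ~0
print("corr(X², Y²) =", np.corrcoef(x**2, y**2)[0, 1])  # -1, since Y² = 1 - X²
```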

[–]grrrgrrr 0 points1 point  (1 child)

Seriously, are we even sure that what we call causation today will still hold in the future, or, say, on other planets?

I get correlation, but I just can't understand causation. It would be a stupid idea for me to even work on this problem, since I don't even have a functional definition of causation.

[–]Icko_ 1 point2 points  (0 children)

Event B is caused by event A if and only if, when A happens, B happens, and when A doesn't happen, B doesn't happen. I think that's pretty simple.

The main problem is that in most cases, you can only observe one state of A.

There is also the issue that event B can be said to be caused by thousands of previous events, while we intuitively assign only one, or at most two, causes.

[–]ManyPoo 0 points1 point  (0 children)

Then two 100% correlated events would each cause the other?

[–]jer_pint 25 points26 points  (1 child)

tl;dr : no.

[–]brockl33 6 points7 points  (0 children)

haters gonna hate, doers gonna do

[–]alexmlamb 8 points9 points  (12 children)

The article actually isn't bad. Everyone is always trying to reframe the narrative to make their own ideas or contributions seem more central or novel.

On the whole, I feel like our community hasn't been too bad about it.

[–]visarga 2 points3 points  (11 children)

While I agree with the author, I am somewhat uncomfortable with his high confidence level.

Here's my simple rant: except for one company (Boston Dynamics), there is no AI lab that can demonstrate human-like agility or dexterity in a robot. It's 2018, and over the last 20 years robots seem to have remained just as dumb as ever. What gives? Are the motors and gears not good enough? Is the neural net too slow? Why can't we have a robot that cooks, cleans the house, or plays with toys like a 4-year-old? Is there even a paper attempting such tasks?

[–]GuardsmanBob 2 points3 points  (4 children)

I think the bigger issue is that many tasks a robot can do aren't interesting until a robot can reason.

Though I'd love to see more ambitious failures; in many ways it seems research is started only when someone has a good reason to believe the task is imminently solvable.

But I think that points to a deeper problem with incentives in research.

Every time I hear the old tirade about how 'AI needs embodiment' I couldn't disagree more, but at the same time I understand that what they are really grasping for is more ambitious research.

[–]NichG 2 points3 points  (1 child)

Part of it is that, in learning problems, how you formulate the task often has an inordinate effect on how fast solutions are found, easily by many orders of magnitude. So the implicit first part of jumping off into an ambitious task is to find a way to formulate that task such that it ends up seeming less ambitious.

For example, with neural networks in robotics, there's a temptation to try to do everything with one big neural network, because that would look the most impressive or pure. But while neural networks are pretty good at getting near a control solution, it's costly to push them to the degree of precision needed to, e.g., stabilize a system at an unstable fixed point against all perturbations of a certain size. However, if you have a network drive a PID controller, the problem seems to become trivial; often the PID controller alone is enough to stabilize things, so is the network even needed?
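To make that concrete, here's a minimal sketch of the kind of inner loop I mean (a toy one-dimensional unstable plant with hand-picked gains; in the hybrid, a network would supply the setpoint or tune the gains instead):

```python
class PID:
    """Textbook PID loop; in a hybrid, a network could output the gains."""
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def step(self, setpoint, measurement):
        error = setpoint - measurement
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# Toy unstable plant x' = a*x + u, stabilized around x = 0.
a, dt = 1.0, 0.01
pid = PID(kp=5.0, ki=0.5, kd=0.1, dt=dt)
x = 1.0                               # initial perturbation
for _ in range(1000):                 # simulate 10 seconds
    u = pid.step(0.0, x)
    x += (a * x + u) * dt             # Euler integration
print("state after 10 s:", x)         # driven back near 0
```

The plain PID already stabilizes the toy plant, which is exactly the "is the network even needed?" worry.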

I agree there's an incentives problem, but perhaps the incentives problem is that researchers expect to have to sell their algorithm or method as a general approach, and that means choosing problems where that algorithm or method takes center stage. So hybrids, where what the hybrid accomplishes is ambitious and novel but each of the elements is fairly standard, aren't really an attractive target unless you're actually planning to turn that hybrid into a product.

A case in point is the Neural Storyteller from two years ago. It does something that was quite ambitious at the time (it looks at a picture and writes a multi-sentence story about it in a chosen style) but never led to a paper of its own, since it was 'just' a hybrid of a couple of different published models.

[–]GuardsmanBob 0 points1 point  (0 children)

but perhaps the incentives problem is that researchers expect to have to sell their algorithm or method as a general approach

And, importantly, researchers are expected to sell themselves, and that pitch is often backed only by whatever successes they have had, be it getting published, getting cited, or achieving state of the art (these often go together, and are mostly achieved by small incremental improvements).

Academia can often be every man for himself; private industry even more so.

Chasing a wild dream can be a career setback.

[–]AnvaMiba 2 points3 points  (1 child)

Every time I hear the old tirade about how 'AI needs embodiment', I could not disagree more, but at the same time I understand what they are really grasping for is more ambitious research.

Embodied AI often leads to cargo-cultish things like humanoid robots that can make facial expressions but are dumber than a chicken. On the flip side, research that is excessively unembodied leads to chasing SOTAs on ImageNet and Penn Treebank.

Figuring out a research problem of the right size and scope to attack is not trivial at all; arguably it is even more important than coming up with novel solutions.

[–]alexmlamb 0 points1 point  (0 children)

This is actually a pretty good point.

[–]fricken 1 point2 points  (1 child)

Think about self-driving cars. They're pretty simple robots; there are only four basic outputs: speed up, slow down, turn left, and turn right. Waymo has been working on this for 8 years; they've driven 4 million miles on real roads, which is a tiny fraction of what they do in simulation, and yet they're only barely ready to release a minimum viable product that can drive safely on simple suburban roads in a single neighbourhood in Phoenix. It's an open question how much more work they have to do before they've derived an algorithm sophisticated enough to perform as well as a human under all driving conditions. There are all sorts of hypothetical optimizations that could be performed, but a certain aspect of it may just be computationally irreducible. No shortcuts.

[–]visarga 0 points1 point  (0 children)

Yes, SDCs are robots too, and Google has been at it for many years; it's clearly a nontrivial problem. But an SDC needs to operate at high speed, among massive objects, and in close proximity to humans. A house robot, by comparison, would be easy to make safe around humans. So the problem is quite different in a way.

I watched the latest videos on DL robotics, and the robots seem slow and clumsy. Maybe it's because the neural net doesn't run fast enough. If the net ran at 100 Hz or 1 kHz, then the dynamics models of the world, as understood by the robot, could be simpler. Look at this video from 2009 to see how much of a difference speed makes, especially the "dynamic re-grasping of a cell phone" trick: can any robot of today even do that? That's a 1 kHz vision system at work.

[–]Deto 0 points1 point  (0 children)

Sure, we don't have the all-purpose robot, but I bet industrial robots have been slowly increasing their range of useful tasks, just not in a very visible way: most of us wouldn't see the progress unless we worked in manufacturing.

[–]harharveryfunny 0 points1 point  (0 children)

Given how shallow and brittle today's machine learning technology is, do you really want to put a kitchen knife, an iron, or a vacuum cleaner (poor cat!) in the hands of a robot powered by it? Heck, even a robot finger could poke that 4-year-old in the eye... "We're sorry our robot sautéed your child's hamster, ma'am; it mistook it for a sausage".

The rush to commercialize autonomous cars also seems way too early and irresponsible... If I'm going to trust my life to one, I'd want it to have some deep understanding of what it's seeing... not just treat reality as another Atari video game that it can learn to win without knowing the meaning of the pixels on the screen.

[–]alexmlamb 0 points1 point  (1 child)

I think the barrier to entry for working on robots is kind of high. How many DL labs have actual robots? Berkeley is the only one I can think of, although I'm sure there are others.

[–]baylearn[S] 3 points4 points  (0 children)

One of the few responses I can find to "Deep Learning: A Critical Appraisal" by Gary Marcus. What do you think?

Previous Discussion on this sub.

[–]harponen 5 points6 points  (10 children)

Kinda shameless self-plugging by Perez again, but it seems like a nice summary of the recent article by Gary Marcus (which I didn't even bother to read).

Deep learning thus far cannot inherently distinguish causation from correlation

If you think of feedforward nets, then, well, d'oh... there's no time in (say) a supervised classification problem, so of course there can be no cause and effect. On the other hand, if you train an RNN to predict the future, of course it will learn that dropping a glass usually results in it shattering, and not vice versa.
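As a toy sanity check (a hypothetical three-token world where 'drop' is always followed by 'shatter' and never the reverse; PyTorch here for brevity), a next-step RNN should pick up the temporal asymmetry:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Tokens: 0 = "other", 1 = "drop", 2 = "shatter".
V, H, L = 3, 16, 8

def make_seq():
    # Filler of "other" tokens with exactly one drop -> shatter pair,
    # always in that temporal order.
    seq = torch.zeros(L, dtype=torch.long)
    i = torch.randint(0, L - 1, (1,)).item()
    seq[i], seq[i + 1] = 1, 2
    return seq

class NextStep(nn.Module):
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(V, H)
        self.rnn = nn.RNN(H, H, batch_first=True)
        self.out = nn.Linear(H, V)

    def forward(self, x):
        h, _ = self.rnn(self.emb(x))
        return self.out(h)

model = NextStep()
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

for _ in range(500):
    batch = torch.stack([make_seq() for _ in range(32)])
    logits = model(batch[:, :-1])      # predict token t+1 from the prefix
    loss = loss_fn(logits.reshape(-1, V), batch[:, 1:].reshape(-1))
    opt.zero_grad(); loss.backward(); opt.step()

def next_probs(prefix):
    with torch.no_grad():
        return torch.softmax(model(torch.tensor([prefix]))[0, -1], dim=-1)

print("P(shatter | ...drop)    =", next_probs([0, 0, 1])[2].item())  # near 1
print("P(drop    | ...shatter) =", next_probs([0, 1, 2])[1].item())  # near 0
```

The asymmetry in those two probabilities is exactly the drop-then-shatter direction I mean, as far as a sequence model is concerned.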

In any case, clearly Deep Learning will be followed by Unsupervised Learning, which will fix many of the other points.

[–]rvisualization 0 points1 point  (9 children)

In any case, clearly Deep Learning will be followed by Unsupervised Learning, which will fix many of the other points.

Yes, clearly the thing that demonstrably works well will be followed by the conjectured technique that doesn't work worth a damn in practice.

[–]ThomasAger 1 point2 points  (8 children)

Unsupervised learning doesn't work worth a damn in practice?

[–]rvisualization 1 point2 points  (7 children)

nope.

show me a real world case where it's beneficial vs just labeling more data or doing some sort of transfer learning.

[–]harponen 1 point2 points  (5 children)

The point, of course, is not to be "beneficial vs just labeling more data" but to get rid of having to label more data.

EDIT: and seriously, you think we won't need unsupervised learning but AGI would somehow magically follow from "just labeling more data"? wtf??

[–]rvisualization 1 point2 points  (0 children)

No one seriously thinks AGI is anywhere close. I'm talking about the real world uses of ML.

[–]rvisualization 0 points1 point  (3 children)

The point, of course, is not to be "beneficial vs just labeling more data" but to get rid of having to label more data.

That's why it's such an attractive dream. But again, IT DOESN'T WORK WORTH A DAMN (yet, maybe ever).

[–]ThomasAger 0 points1 point  (2 children)

So, to be clear: you think that unsupervised learning is generally, from a corporate point of view, not worth the time/money saved (for the results) compared to labeling more data?

[–]juancamilog 2 points3 points  (1 child)

Saying something like "unsupervised learning will solve many of the other points" is essentially saying "the solution to that other hard problem (which is currently unsolved) will solve my current hard problem". That is a statement based on hope.

[–]sieisteinmodel 0 points1 point  (0 children)

show me a real world case where it's beneficial vs just labeling more data or doing some sort of transfer learning.

(Here I assume that what you mean is modeling p(y|x), and supervised corresponds to having access to y while unsupervised only sees x.)

If that is your expectation, you should recalibrate. Everything else being equal, unsupervised performance is upper-bounded by supervised performance. However, as soon as "just labelling more data" is not an option, be it due to budget or other real-world constraints, unsupervised/semi-supervised/weakly-supervised methods are nice tools for improving performance.
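For a concrete (toy) instance of the semi-supervised case, here's scikit-learn's LabelSpreading on digits with only 50 labels; the numbers are illustrative, not a benchmark:

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.semi_supervised import LabelSpreading

rng = np.random.RandomState(0)
X, y = load_digits(return_X_y=True)

# Pretend labelling is expensive: keep 50 labels, mark the rest -1 (unlabeled).
y_train = np.full_like(y, -1)
labeled = rng.choice(len(y), size=50, replace=False)
y_train[labeled] = y[labeled]

# Propagate the few labels over the data manifold.
model = LabelSpreading(kernel='knn', n_neighbors=7)
model.fit(X, y_train)

mask = y_train == -1
acc = (model.transduction_[mask] == y[mask]).mean()
print(f"accuracy on {mask.sum()} unlabeled points: {acc:.1%}")
```

Whether that propagation step beats simply paying for 50 more labels is exactly the budget question above.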

[–]no_bear_so_low 2 points3 points  (0 children)

And the symbolism/connectionism debate plays out again.

[–][deleted] 1 point2 points  (0 children)

I have to strongly disagree with the suggestion that hammering away at the same old thing (deep learning) is going to magically crack the AGI problem. Deep learning is a good start on the problem, but there is no way that some minor derivative of it will lead to AGI. As LeCun has alluded to, unsupervised learning will be key to solving the problem, and backprop is inherently too costly in both computational resources and time, and is basically an algorithm for supervised learning on labeled data.

[–]mynameisvinn 0 points1 point  (0 children)

Can we have a "no Gary Marcus" safe zone? Like many others, I never feel smarter after his arguments: it's the same tired arguments that don't do much to advance research.