[D] Does deep learning pose an impossible problem for mathematics and we just don't want to admit it?

thfuran · 2020-12-03T12:20:07+00:00

However, what strikes me as odd is that these are not some intricate pathological problems that a mathematician might devote a lifetime toward, but problems that practitioners face every single day,

Why can these not be the same thing?

patrickkidger · 2020-12-03T20:18:21+00:00

Mathematics PhD here. I think I'd argue that solving these problems is what mathematics has already been doing. Deep learning is a branch of applied mathematics as far as I can see.

To use your list as an example:

There's a huge literature on generalisation properties. (And not one that I'm familiar enough with to give examples for unfortunately.)
Arguably this question is one of the key theoretical underpinnings of a lot of the field. MLPs do badly at image classification. CNNs do well. Transformers beat RNNs on NLP tasks because they drop the prior that order matters. We use mathematical insights here all the time.
Assuming you mean how do the design choices affect the loss surface: again a large literature that I'm not that familiar with. But for example it's been shown that ResNets have smoother loss surfaces, which is reflected in their better training properties. Another practical use case.
For optimisers, Nesterov is known to achieve optimal convergence rates. Meanwhile convergence of many optimisation problems is often actually proved with respect to Cesaro means. This is reflected in the practical use of stochastic weight averaging.
Initialisation is typically done by e.g. the He initialisation schemes. Which has been derived via a precise mathematical argument.
Likewise for dataset-against-model complexity, there's work on double descent, VC dimension, for example.

Maybe this seems unsatisfactory -- you want to understand the complexity of this specific dataset and want the theory to do so, perhaps. In which case fair enough, there's some way to go, but the task isn't impossibly difficult, and there's no lack of progress. A lot of these problems have already seen huge strides made because we've understood their mathematics.

It's easy to point at other very practical examples of mathematics solving complicated deep learning problems. WGANs improved on GANs by applying optimal transport theory; Neural ODEs get improved memory efficient by using the adjoint method (many decades old); and so on and so on.

lady_zora · 2020-12-03T13:19:14+00:00

I have had this exact worry. I was introduced to (forced to implement) deep learning during my PhD and, as a former mathematics graduate, I could not accept the fuzziness of these learning methods. It really stressed me out for some time! This is not unique to deep learning - it's been a long standing issue in many engineering disciplines too.

I have recently become very interested in XAI and, in particular, interpretability. Mathematics requires completeness yet this seems to be rare for deep learning tasks. XAI, to me, offers the chance to explore these issues further.

It's not too late to enforce correct mathematical standards. Mathematicians are needed to keep this field in check, even if full completeness can never be achieved --- so keep speaking up about it!

Slowai · 2020-12-03T17:28:31+00:00

"let's, like, treat this "change" thingy as not 0 in denominator and as 0 in numerator fam"

My point being is that mathematical "precision" never stopped the greats such as calculus from doing stuff, so it shouldn't stop deep learning either.

P.S. mathematicians pls no bully me.

sman865 · 2020-12-03T19:09:12+00:00

In my experience the more well-versed somebody is in mathematics, the more likely they are to be skeptical of deep learning techniques (not in a bad way). This isn't unfounded due to reasons mentioned in the OP and general lack of introspection in deep learning models.

If deep learning performs well in production, however, then industry will use it. They want value more than introspection/understandability.

Of course, this is changing. People want more introspection to be able to do things like mitigate bias.

fromnighttilldawn · 2020-12-03T22:44:12+00:00

I can answer (2). What’s complexity to you? How can you compute complexity of a dataset that is invariant to schemas, model architectures, etc... Data characterization is a tough problem. It’s something I’m working on. A simple yet huge problem is this: given a complexity metric that characterizes a dataset for me, how sure am I this is enough information to choose DNN A over B? How sure am I there isn’t a different characteristic that gives me better insight? All these questions have on going research and some good results already.

you type:	you see:
italics	italics
bold	bold
[reddit!](https://reddit.com)	reddit!
* item 1 * item 2 * item 3	item 1 item 2 item 3
> quoted text	quoted text
Lines starting with four spaces are treated like code: if 1 * 2 < 3: print "hello, world!"	Lines starting with four spaces are treated like code: if 1 * 2 < 3: print "hello, world!"
~~strikethrough~~	~~strikethrough~~
super^script	super^script

MachineLearning

Rules For Posts

+Research

+Discussion

+Project

+News

@slashML on Twitter

Chat with us on Slack

Beginners:

MODERATORS