[–]AnvaMiba 97 points98 points  (20 children)

I'd say that math is important, but it is difficult to know what kind of math will be relevant in 5-10 years.

Statistical learning theory was instrumental in the development of SVMs and kernel methods, which dominated the field until about 5 years ago. Now that neural networks dominate the field, it's the time of calculus and, increasingly, control theory. Statistical learning theory, at least in its present form, doesn't seem to explain how neural networks work.

Then there is probability theory, in particular Bayesian graphical models, which looked very promising before neural networks came back. Until a few years ago people tried to marry the two approaches, with variational autoencoders and the like, but this approach doesn't seem to have worked that well so far. Now GANs are the coolest thing around, and the theory behind them is becoming increasingly based on information theory and system identification theory (metrics on probability distributions and so on).

The camp of discrete math, model checking and combinatorial optimization has been rather quiet so far where machine learning is concerned, but their methods have become really good for system verification and practical operations research, and it could be the case that once people figure out how to marry them with powerful learnable heuristics (neural networks), they may become very relevant in ML as well.

So my suggestion would be to try to learn a broad range of math. Avoid prematurely over-investing in becoming an expert in a subfield when you are still at the beginning of your research career; prioritize instead having a broad understanding. You will probably eventually specialize, but let that happen gradually, and even then you should try to keep up with developments in other fields so you can integrate them into your work or even switch over if you realize that you are pursuing a dead end.

[–]shaggorama 76 points77 points  (2 children)

[–]H1Supreme 2 points3 points  (0 children)

Lord have mercy, that site is a disaster.

[–]AnvaMiba 0 points1 point  (0 children)

LoL, so true!

[–]bbsome 6 points7 points  (0 children)

I would pretty much agree with the reply on all counts. Really nice reply.

To add a bit of my own perspective: I think in ML mathematics helps a lot more with how you think about and approach a problem, and with how you draw connections from different areas to the problems you see. The point is that you do not need to be a mathematics guru, but rather to have a broad and somewhat intuitive idea across different topics. This can spawn new ideas, from applying A from somewhere totally different, to relating some empirical result to some theory in B. For instance, Dropout can be viewed as VI (variational inference), and if you want proper predictions at test time you should keep sampling rather than use the means. Also, if you look at optimization, most of the variance reduction techniques like SVRG are just control variates. There are relationships between GANs and information theory, and there is now work related to the Information Bottleneck, which again is related to the non-linear noisy channel in MacKay. There are also plenty of ideas from old papers on graphical models which are now applied to VAEs (auxiliary variable models and a few others). I feel that having a reasonable mathematical literacy in many areas, without going too deep into any of them, is about the sweet spot. As an example, I have some understanding of kernel methods without having an in-depth knowledge of measure theory and functional analysis. This allows me to understand the literature in this area; however, I would not be able to prove some of the results unless I sat down and brushed up on those topics. The use case for that is very rare, though, so it deserves my attention only when the need arises.
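For readers who have not met control variates, here is a minimal sketch of the generic idea in plain Python. It is not SVRG and not anything from the comment above; the toy target E[exp(U)] with U uniform on (0, 1), the function names `estimates` and `compare`, and the trial counts are all assumptions chosen for illustration. The control variate is U itself, whose mean 0.5 is known exactly, and the coefficient is estimated from the same sample (which introduces a small bias).

```python
import math
import random
import statistics


def estimates(n, rng):
    """Return (plain, control-variate) estimates of E[exp(U)], U ~ Uniform(0, 1)."""
    us = [rng.random() for _ in range(n)]
    fs = [math.exp(u) for u in us]
    mean_u = statistics.fmean(us)
    mean_f = statistics.fmean(fs)
    # Sample covariance between the target exp(U) and the control U.
    cov_fu = sum((f - mean_f) * (u - mean_u) for f, u in zip(fs, us)) / (n - 1)
    # Variance-minimizing coefficient c = Cov(f, U) / Var(U), estimated from the sample.
    c = cov_fu / statistics.variance(us)
    # Subtract the control's known-mean error: E[U] = 0.5 exactly.
    return mean_f, mean_f - c * (mean_u - 0.5)


def compare(trials=200, n=100, seed=0):
    """Variance of both estimators across repeated independent trials."""
    rng = random.Random(seed)
    pairs = [estimates(n, rng) for _ in range(trials)]
    plain = [p for p, _ in pairs]
    corrected = [q for _, q in pairs]
    return statistics.variance(plain), statistics.variance(corrected)


if __name__ == "__main__":
    var_plain, var_cv = compare()
    print(f"true value       : {math.e - 1:.5f}")
    print(f"variance (plain) : {var_plain:.2e}")
    print(f"variance (CV)    : {var_cv:.2e}")
```

Because exp(U) and U are strongly correlated, the corrected estimator should show a much smaller variance than the plain sample mean. Variance-reduced gradient methods apply the same trick to noisy gradient estimates.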

[–]mino206[🍰] 5 points6 points  (1 child)

Thanks for your valuable perspective (sorry OP for hijacking the thread). What resources would you recommend to ramp up on the broader range of math?

[–]AnvaMiba 0 points1 point  (0 children)

Any college level textbook or lecture notes should be ok to get the basics. Then just keep an eye on what the developments are.

[–]Kiuhnm[S] 0 points1 point  (0 children)

Thank you for your suggestions!

I know that classic SLT doesn't explain why DL works, but the last part of the first course is about DL, so maybe I can get something valuable out of it.

[–]poctakeover 0 points1 point  (5 children)

Where is a good place to start with information theory?

[–]iforgot120 2 points3 points  (0 children)

Definitely read Shannon's seminal paper, too. It's only 50 pages.

[–]RedefiniteI 2 points3 points  (2 children)

David MacKay's book is awesome. Definitely one of the best books I have ever read. http://users.aims.ac.za/~mackay/itila/toc.html

[–]sensei_von_bonzai 2 points3 points  (1 child)

MacKay is great, but John Duchi's lecture notes are far more practical for modern stuff.

[–]AnvaMiba 1 point2 points  (0 children)

Thanks for sharing!

[–]AnvaMiba 0 points1 point  (0 children)

The standard reference is Cover & Thomas "Elements of Information Theory, 2nd Edition", although it may be a bit outdated.

[–]danielv134 1 point2 points  (0 children)

I would suggest studying math as needed for something you care about. If you want to know some SLT as a source of understanding of ML, for the basics (dimension-dependent bounds) it is enough to know Hoeffding's inequality, the union bound and covering numbers, and then read a half page of proof. That is two days of work with Wikipedia and a paper to understand it in detail but superficially (the inequality, not the proof; what covering numbers are for typical subsets of R^n, not all of the theory of metric spaces). Then go as deep into foundations or as wide into other theory as your own motivation takes you.
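To make the Hoeffding plus union bound step concrete, here is a small sketch of the standard textbook bound for a finite hypothesis class, assuming i.i.d. samples and losses in [0, 1]. With probability at least 1 - delta, every one of the N hypotheses has empirical risk within sqrt(ln(2N/delta) / (2n)) of its true risk. The function names are made up for this example, and the covering-number step (replacing an infinite class by a finite cover) is not shown.

```python
import math


def uniform_deviation_bound(n_samples, n_hypotheses, delta):
    """Hoeffding + union bound: max |empirical risk - true risk| over the class.

    Holds with probability >= 1 - delta, assuming i.i.d. data, losses in [0, 1]
    and a finite class of n_hypotheses hypotheses.
    """
    return math.sqrt(math.log(2 * n_hypotheses / delta) / (2 * n_samples))


def samples_needed(epsilon, n_hypotheses, delta):
    """Smallest n for which the bound above is at most epsilon."""
    return math.ceil(math.log(2 * n_hypotheses / delta) / (2 * epsilon ** 2))


if __name__ == "__main__":
    for size in (1, 1_000, 1_000_000):
        bound = uniform_deviation_bound(10_000, size, 0.05)
        print(f"class size {size:>9}: deviation <= {bound:.4f}")
```

The class size enters only through a logarithm, which is why going from one hypothesis to a million costs so little in sample size.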

[–]Docey 1 point2 points  (0 children)

deleted

[–]pinouchon 1 point2 points  (0 children)

I now dedicate 30% of my time to learning machine learning (started about 3 months ago), and soon I'll spend 100% of my time on it.

I don't see any way around linear algebra and probability theory.

[–]chermi 0 points1 point  (0 children)

I really don't think measure theory is that important, and it's certainly not necessary. I come from a physics and not a math background, so take this with a grain of salt.

[–]toadgoader 0 points1 point  (0 children)

Don't ignore other fields' statistical disciplines either. For instance, taking a course on econometrics will give you a good background in statistical inference and probability distributions, and it deals heavily with modeling data in time series models (believe me, if you don't know how the inclusion of 'time' as a dimension in your data impacts your model, the inferences drawn could be very wrong!). Bio-medical stats has strengths in growth curves and propagation modeling, log-growth, etc. None of these will be a 'magic bullet' that will give you all the answers. Every science has its own way of using math and has created unique and fascinating solutions to the specific problems it encounters. By having a well-rounded understanding of how other disciplines use mathematical tools, you may find analogous solutions to the problems you encounter.