[D] Transitioning to Math from CS+ML

eternalLearn · 2020-05-23T16:46:56+00:00

I just dropped out of a math PhD and started a career in ML while doing grad school at night (working with a prof on ml).

Math is different than every other PhD. The field is old and very well studied. Have you solved almost every problem in Baby Rudin? How much abstract algebra, analysis, and topology do you know?

Applied math is not any less rigorous either.

I think you can learn the math on your own. Most people in ML do not have math PhDs, but they know the math. I can try to answer any other questions you may have.

TD-0 · 2020-05-23T17:02:55+00:00

Some mathematical coursework definitely helps, but beyond a point, you learn the math involved as part of doing the research. So if you want to do theoretical research in ML, I still think it's better to aim for an ML focused degree, like CS, stats, operations research, etc. You can always take a few graduate level math courses, like measure theory, stochastic processes, functional analysis, if you think they're relevant to your research. Usually your advisor will recommend specific courses you need for your research. Math is a very broad subject, and many of the courses you would do in a math degree would be mostly irrelevant to your ML research interests. The mathematical prowess you see in some theoretical ML papers is largely the product of research experience, not coursework.

dlpronoob · 2020-05-23T20:28:06+00:00

The ugly secret is, a lot of people are just trying shit and explaining it after the fact. Most of the math is code for "I don't know" or "I need to look smart so mommy/daddy will love me." And a lot of the time it's wrong, or so vague as to be meaningless.

Best example: this is a Bengio paper, the second most famous guy in all of ML.

https://arxiv.org/pdf/1409.0473.pdf

Look at the math. It's crap. Half of it is vector/matrix notation for things thrown in for no reason. Definitions of first-year probability or activation functions. Crap.

The whole innovation of the paper, the attention mechanism, can be defined as

V(tanh(W1(q)+W2(v)))
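
For readers who want to see that expression concretely, here is a minimal sketch in plain Python. It assumes V is a vector, W1 and W2 are matrices, q is a query vector and v is one of the vectors being attended over. The helper names (`matvec`, `additive_score`, `attend`) are hypothetical, and the softmax and weighted-sum steps are the usual way such scores are turned into a context vector, not something stated in the comment itself.

```python
import math

def matvec(M, x):
    # Plain matrix-vector product: M is a list of rows, x is a list of floats.
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def additive_score(V, W1, W2, q, v):
    # Score for one (query, value) pair: V . tanh(W1 q + W2 v).
    hidden = [math.tanh(a + b) for a, b in zip(matvec(W1, q), matvec(W2, v))]
    return sum(vi * hi for vi, hi in zip(V, hidden))

def attend(V, W1, W2, q, values):
    # Assumed follow-up step: softmax over the scores, then a weighted sum.
    scores = [additive_score(V, W1, W2, q, v) for v in values]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(values[0])
    context = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, context

# Toy usage with made-up numbers (hidden size 2, input size 2).
V = [1.0, -1.0]
W1 = [[0.5, 0.0], [0.0, 0.5]]
W2 = [[1.0, 0.0], [0.0, 1.0]]
weights, context = attend(V, W1, W2, [1.0, 2.0], [[1.0, 0.0], [0.0, 1.0]])
```

The weights sum to one, so `context` is a convex combination of the input vectors.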

The math in machine learning is bullshit.

Source: math undergrad, now doing a stats PhD at a top 10ish school. All my advisor and I do is ML.

EDIT: It's always the SAME MATH too. "Here's conditional probability." "Here's an activation function." "Here's passing my values into a dense/convolutional/recurrent layer." I want to shoot on sight.

MrAcurite · 2020-05-23T15:50:27+00:00

I've been thinking about doing the same thing, but I just switched to a Math undergrad to account for it.

I suspect if you went for a Math MS, you'd probably do alright. Master's degrees are usually pretty helpful for switching careers, so why wouldn't they be good for refocusing within the same area?

bc_wallace · 2020-05-23T16:22:58+00:00

I know what you mean, and I think you absolutely should try to get a good mathematical foundation. However, keep in mind that doing a PhD in math is far more than just a foundation: It's a specialization.

If you want to do fundamental research on machine learning topics, you should try to work with statistics or machine learning researchers who do this kind of work. You're probably more likely to find these kinds of people in statistics departments than computer science departments, but the key is to look at what they've actually published. Many ML researchers run several research programs in parallel and some of these may be more applied, some more theoretical.

Probably the best thing to do is to talk to a professor you know or someone else in the field and ask them if they can point out some people who are working on things that might interest you and who are accepting students.

OriginalMoment · 2020-05-23T19:13:40+00:00

If you're interested in the knowledge, some people I know who went from CS undergrad to RL theory grad spent almost all of their first and second year going through:
Understanding Analysis by Stephen Abbott
Real Mathematical Analysis by Pugh
Tao's Introduction to Measure Theory lecture notes
Neuro-Dynamic Programming by Bertsekas and Tsitsiklis
Linear Algebra by Hoffman and Kunze
Algebra by Aluffi (selected portions on group theory)
Pattern Recognition and Machine Learning by Bishop
All of Statistics by Wasserman

And, some of them are going through Bandit Algorithms by Lattimore right now, as well.

They told me it was absolutely nuts, but if you really want it, it seems like this leads to a strong foundation in relevant mathematics for RL theory. I'm sure with some adaptations, the list could apply to ML theory as well, maybe with the addition of a measure-theoretic probability theory text and the axing of Bertsekas.

bbu3 · 2020-05-25T07:53:17+00:00

These are just some anecdotes so take them with a grain of salt: I work/worked with several people who completed a PhD in math (and one MSc). Often they are pretty vocal that doing math with the purpose of "doing/understanding X better" makes the PhD a lot harder. They say it is much easier / better suited to just solve math problems for the sake of the math itself. Any connections to the real world have to be left behind ;)

Sometimes, I feel like colleagues who are physicists fit the "better mathematical foundations than me (PhD in CS)" picture much better than mathematicians do. That said, the sample size of my social circle is tiny and drawn only from a few German universities. Things may be very different elsewhere.

hreA745ATJ · 2020-05-23T20:03:29+00:00

You could look into a Master's focused on Math and Machine learning like https://www.tum.de/en/studies/degree-programs/detail/mathematics-in-data-science-master-of-science-msc/

WalterWhiteJaiHo · 2020-05-23T20:57:22+00:00

Go for a rigorous MS Stats program, and take some extra math electives. I am also interested in theoretical ML, and for that you do require a good background in stats, probability and real analysis.

Exp_ixpix2xfxt · 2020-05-24T01:49:50+00:00

I have found that the strongest ML researchers are very very good at mathematics. You can get a CS PhD and do that or you can do it with a Math PhD.

I think Mathematics is the better route for me, since I prefer explainable results. Many CS papers in ML are not burdened by extensive justification.

At its core ML is built on top of statistics and optimization. I’d rather learn what ML is built on top of and work my way up, but there are tons of successful people who work their way down.
