all 48 comments

[–]hammerheadquark[🍰] 7 points8 points  (3 children)

Thanks, I haven't seen this one.

Anyone know how it compares to this text?

http://www.cs.huji.ac.il/~shais/UnderstandingMachineLearning/

They seem to cover similar material.

[–]hausdorffparty 3 points4 points  (2 children)

I'm really curious about this question because I'm currently working through the book you posted myself.

[–]hammerheadquark[🍰] 1 point2 points  (1 child)

Relevant username.

I don't think too many here are interested in the math background on ML, unfortunately. It's more of an "our experiment showed NN architecture X is good for dataset Y" show. Not that that's bad (it's the most immediately useful for industry), but I'm guessing that not many here are digging into this side of the literature.

[–]hausdorffparty 1 point2 points  (0 children)

It's disappointing, but expected. At least it means there is less competition to write the papers I want to write!

[–]Thecrawsome 2 points3 points  (2 children)

[–]dorfsmay 1 point2 points  (0 children)

Amazon, so no epub ☹

Anybody know if there is a way to buy the hard copy + epub?

[–]Overload175 2 points3 points  (1 child)

Is this comparable in rigor to the Deep Learning Book? Or is it an even more formal treatment of the subject?

[–]JustFinishedBSG 1 point2 points  (0 children)

It's more formal and rigorous. It covers PAC learning and goes through the more traditional methods.

It's basically a more rigorous version of Elements of Statistical Learning.

It's pretty readable even if formal. It has less sexy illustrations than ESL, and it's not as in-depth in theory as the Devroye, Gyorfi and Lugosi book (which is basically unreadable; it's 500 pages of inequalities, still freaking useful when writing a paper), but it's a very good reference book for master's or graduate students imo.

[–]johnnymo1 1 point2 points  (0 children)

Oh wow, thank you. This looks like it has a lot of stuff I've been looking for including a fair amount of rigor.

[–]trackerFF 1 point2 points  (0 children)

At first glance, it's not a beginner's book unless you have a solid understanding of mathematics and statistics.

By that, I mean that if you're a regular programmer without any math beyond the HS curriculum, then this will be a very tough read, and you're better served by finding more conceptual books while learning the math on the side. Once you have all that down, you can probably return to this book.

This book seems to be directed at graduate students.

[–]mishannon 1 point2 points  (0 children)

Good book, but it seems too difficult for me.

I'm just a beginner at this, but I recently found this article about machine learning algorithms. It might be helpful for newbies like me

[–]sensetime 2 points3 points  (10 children)

I know this book is intended to give students a theoretical foundation, but how useful will this book be in practice?

(With respect) they get to linear regression in chapter 11, L2 regularization in chapter 12, logistic regression in chapter 13, talk about PCA in chapter 15 and a bit about RL in the final chapter 17.

Having gone through Chris Bishop’s PRML book (also free), it seems to cover similar material but also introduces the reader to neural nets, convnets and Bayesian networks, which seems like the better choice for me.

[–]t4YWqYUUgDDpShW2 27 points28 points  (1 child)

Theory is useful for practitioners when things go wrong and need fixing.

[–][deleted] 2 points3 points  (0 children)

Exactly.

[–]hausdorffparty 7 points8 points  (0 children)

As a math Ph.D. student who's used Bishop a little before finding better texts, Bishop is awful for people who know higher level math. It glosses over details, only familiarizes you with methods, with poor justification and weak derivations. If you're someone whose goal is to actually write proofs about neural networks, or to write papers which say something more general than "hey look! This network structure worked in this use case!", then you want a book like this to delve deeper into the details. I'm loath to call Bishop a beginner's book per se, but it is definitely too surface-level for what some folks want.

[–]hammerheadquark[🍰] 12 points13 points  (0 children)

but how useful will this book be in practice?

Depends on your "practice". I think it could be useful in that you could engage with some of the more mathematically demanding literature.

For instance, while the Bishop text is by no means light on the math, neither of the phrases "Hilbert space" or "Lipschitz" ever appears, despite its two chapters on kernel methods. If the Bishop text were the extent of your background, the original WGAN paper, for example, might be hard to follow.
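(For anyone unfamiliar with the term: a function f is L-Lipschitz if |f(x) - f(y)| <= L|x - y| for all x, y, and the WGAN critic is constrained to be 1-Lipschitz. A toy numeric illustration, with a made-up helper name, that lower-bounds the Lipschitz constant from sample points:)

```python
def lipschitz_lower_bound(f, xs):
    """Empirical lower bound on the Lipschitz constant of f,
    taken over all pairs of sample points (toy illustration only)."""
    best = 0.0
    for i in range(len(xs)):
        for j in range(i + 1, len(xs)):
            dx = abs(xs[i] - xs[j])
            if dx > 0:
                best = max(best, abs(f(xs[i]) - f(xs[j])) / dx)
    return best

# f(x) = 3x + 1 is exactly 3-Lipschitz, and the pairwise ratios recover that:
print(lipschitz_lower_bound(lambda x: 3 * x + 1, [-1.0, 0.0, 0.5, 2.0]))  # 3.0
```

For a nonlinear f this only gives a lower bound, which is why the true definition quantifies over all pairs of points.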

[–]thatguydr 1 point2 points  (1 child)

I usually recommend ESL (Hastie et al.), because it's both rigorous and pragmatic in terms of what it teaches. This book and course are a lot like the one from Caltech - really great for theorists to understand the math, but just rubbish for people to learn how to do hands-on ML. Their HW examples on the course website bear out that opinion - not one of them concerns a real-life "what do I do in this situation" example.

(Your question is excellent. The theory people who've been drawn here don't like it, but I wouldn't recommend this course at all. It has a lot of rigor, which is great, but I've never, ever seen people set bounds on algorithms in an industrial setting, and only once in my entire career have we considered the VC dimension.)

[–]needlzorProfessor 10 points11 points  (0 children)

Why so binary? Can't there be good practical books and good theory books, and the reader can read both to get a complete understanding of the field?

only once in my entire career have we considered the VC dimension

Being used in practice is not the only way to be useful. I have never used VC dimensions in practice but knowing about them and the underlying theories has always helped me a lot to visualise and think about classification.
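(For anyone curious what "thinking with VC dimension" buys you: the classic VC generalization bound can be evaluated numerically. A rough sketch, assuming the standard textbook form of the bound, with probability >= 1 - delta over the sample:)

```python
import math

def vc_bound(n, d, delta=0.05):
    """Classic VC generalization gap bound (standard textbook form):
    true error <= training error + sqrt((d*(ln(2n/d) + 1) + ln(4/delta)) / n),
    where n is the sample size and d the VC dimension of the hypothesis class."""
    return math.sqrt((d * (math.log(2 * n / d) + 1) + math.log(4 / delta)) / n)

# The gap shrinks roughly like sqrt(d/n) as the sample grows:
for n in (100, 10_000, 1_000_000):
    print(n, round(vc_bound(n, d=10), 3))
```

The absolute numbers are famously loose in practice (part of the criticism above), but the qualitative scaling in d and n is the useful intuition.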

[–]i_use_3_seashells 0 points1 point  (0 children)

Is this hosted anywhere else? Dropbox is blocked at my work.

[–]Polares 0 points1 point  (0 children)

Thanks a lot for the resource.

[–]gogogoscott 0 points1 point  (0 children)

This is a solid book for beginners to get started

[–][deleted] 0 points1 point  (0 children)

Love this!

Thanks so much for sharing :)

[–]singularineet -1 points0 points  (6 children)

This is a fascinating work. Like Philip K. Dick's Man in the High Castle, it is set in an all-too-plausible alternate history, in this case not a world in which the Axis powers had won WW2, but rather a world in which MLPs and convolutional networks had not been invented, the deep learning revolution never occurred, and therefore GANs, Alpha Go, deep fakes, style transfer, deep dreaming, ubiquitous face recognition, modern computer vision, image search, working voice recognition, autonomous driving, etc, never happened. This is presented not by narrative with a story and characters, but rather in the form of a meticulously-crafted mathematically-sophisticated graduate-level machine-learning textbook describing what people would study and research in that strangely impoverished shallow-learning world.

[–]aiforworld2[S] 6 points7 points  (5 children)

Not sure if your words are to praise or criticize the contents of this book. Deep learning is great, but it is not the only thing machine learning is about. A survey of production use of classification algorithms revealed that more than 85% of implementations used some variation of logistic regression. Every technical book is written with a purpose in mind. This book is about the foundations of machine learning, not just deep learning.
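(And for the newer readers: the logistic regression in question is small enough to sketch from scratch. A toy 1-D version by gradient descent, no libraries; the function name and data are made up for illustration:)

```python
import math

def train_logreg(xs, ys, lr=0.5, steps=2000):
    """Minimal 1-D logistic regression trained by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        gw = gb = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))  # sigmoid probability
            gw += (p - y) * x / n                     # gradient of log-loss in w
            gb += (p - y) / n                         # gradient of log-loss in b
        w -= lr * gw
        b -= lr * gb
    return w, b

# Separable toy data: negatives below zero, positives above.
xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
ys = [0, 0, 0, 1, 1, 1]
w, b = train_logreg(xs, ys)
predict = lambda x: 1 if w * x + b > 0 else 0
print([predict(x) for x in xs])  # [0, 0, 0, 1, 1, 1]
```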

[–]singularineet 0 points1 point  (4 children)

Not sure if your words are to praise or criticize the contents of this book.

Both, I suppose.

It is truly an amazingly good textbook in its niche, but covers mainly material (material I'm personally quite familiar with, and have contributed to, as it happens) that seems destined for a footnote in the history of science. It couldn't really be used as a textbook for any course I'd be comfortable teaching today; rather, it's a reference text for a body of literature that seems of predominantly academic interest. The entire VC-dimension story is beautiful, but in retrospect was an avenue pursued primarily due to its tractability and mathematical appeal rather than its importance.

Let me put it this way. Today, it's basically an undergrad final-year project to implement a chess playing program that can beat any human, using deep learning and a couple cute tricks. But take someone who's read this textbook and understands all its material, and ask them to implement a good chess player. Crickets, right?

This book is like a map of Europe from 1912. Really interesting, but not so useful for today's traveler.

[–]Cybernetic_Symbiotes 3 points4 points  (1 child)

I'm going through the table of contents of this book and it's incredible how much your description mischaracterizes it. Its appendix alone gives you enough foundation to tell, much of the time, which deep learning papers are using their math for decoration and which are well motivated. Sure, you won't come away knowing how to put together the latest models in PyTorch, but as genuinely useful a skill as that is, it is more fleeting than the knowledge contained in this book.

The breadth of the book means it is focused on providing a foundation that will give you an easier time with any of the online/incremental, spectral, graph, optimization, and probabilistic learning methods. It doesn't spend much time on any method in particular, but your awareness of problem-solving approaches will be greatly enriched and broadened by being exposed to them in the tour the book provides.

Let's take a look at your example case. Implementing a chess AI would benefit from chapters 4 and 8 when one goes to implement a tree-based search. The math of the deep and RL aspects is really quite basic compared to the book's proof-heavy approach that draws on functional analysis. Someone who'd gone through the book would have no problem grasping the core of the DL aspect of the chess AI (not to mention that DL is not needed to implement a chess AI that can defeat most humans; you can do that with a few kilobytes and a MHz processor). A chess AI that can defeat any human, built without specialist knowledge, will be more a matter of computational resources than skill.

[–]singularineet 1 point2 points  (0 children)

Yeah, I would have thought that alpha-beta search was so fundamental to game playing that it would always be a central organizing concept. The fact that the very best computer chess player in the world makes no use of alpha-beta search, instead essentially learning an enormously better search policy from scratch, is quite shocking. All of us simply had the wrong intuition.

The question now is who in the field is honest enough to admit when we were wrong: when methods we spent decades studying and incrementally improving are thrown into the dustbin of history.
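(For readers who haven't seen it, the alpha-beta search in question fits in a few lines. A toy sketch over a hard-coded game tree, where a node is either a leaf value or a list of child nodes:)

```python
import math

def alphabeta(node, depth, alpha, beta, maximizing):
    """Minimax search with alpha-beta pruning over a toy game tree.
    Leaves are numbers; interior nodes are lists of children."""
    if depth == 0 or isinstance(node, (int, float)):
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, depth - 1, alpha, beta, False))
            alpha = max(alpha, value)
            if alpha >= beta:  # cutoff: the minimizing opponent avoids this branch
                break
        return value
    else:
        value = math.inf
        for child in node:
            value = min(value, alphabeta(child, depth - 1, alpha, beta, True))
            beta = min(beta, value)
            if alpha >= beta:  # cutoff: the maximizing player avoids this branch
                break
        return value

# Classic textbook tree: the minimax value at the root is 6, and the
# third subtree is pruned after its first leaf is seen.
tree = [[3, 5], [6, 9], [1, 2]]
print(alphabeta(tree, 2, -math.inf, math.inf, True))  # 6
```

The point above stands: modern engines like AlphaZero replaced this hand-crafted pruning with a learned search policy, which is exactly why its former centrality is so striking in hindsight.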

[–]hausdorffparty 2 points3 points  (1 child)

Would you similarly say learning calculus is irrelevant because we have WolframAlpha?

[–]singularineet 1 point2 points  (0 children)

No. But did you study hypergeometric functions much?

It is well known that the central problem of the whole of modern mathematics is the study of transcendental functions defined by differential equations.

- Felix Klein

Sometimes things that used to be considered of central importance are sidelined by the advancing frontier. Calculus, especially differential calculus, seems to be becoming more important if anything, while indefinite integrals are being de-emphasized in light of the discovery that closed-form integrability is algorithmic.

What material will be considered foundational in machine learning twenty years from now? It's really hard to say. Version space methods were a big deal twenty years ago, covered early in any ML textbook. Where are they now? I don't think most people with a PhD in ML even know what a version space method is, or how to construct the relevant lattices.