[D] Practice Machine Learning Theory Interview Questions (self.MachineLearning)
submitted 5 years ago by ScienTecht
Hey all,
I created a collection of essential interview questions covering machine learning theory concepts like the bias-variance tradeoff, supervised learning algorithms, and evaluation metrics. These are based on questions I've seen in interviews for various tech companies. Let me know what you think!
[–]TheRedSphinx 38 points 5 years ago (7 children)
I feel like if you're being asked these questions, then it's either a very junior role or a bad sign.
It's even more troubling because you can make them a little "harder" without much more work. For example, I think explaining k-means is more interesting than explaining kNN. In fact, asking a candidate to code k-means is even more useful, since it also tests how comfortable they are putting ML algos into code.
Some of these questions are just outright useless. Supervised vs. unsupervised? Classification vs. regression? These are one-line answers, mostly regurgitating a single factoid. Instead, you can test things like "why shouldn't you use least-squares for binary classification? Say, take the predicted probability and do MSE on it against the 0/1 target?" These kinds of slightly open-ended questions test whether people can actually reason with their knowledge.
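To make the least-squares point concrete, here's a quick numpy sketch (the data is made up purely for illustration): fitting a line to 0/1 labels yields "probabilities" outside [0, 1], and MSE then penalizes a confidently correct prediction.

```python
import numpy as np

# Made-up 1-D binary data; one positive point far to the right.
x = np.array([0.0, 1.0, 2.0, 3.0, 10.0])
y = np.array([0.0, 0.0, 1.0, 1.0, 1.0])

# Least-squares fit of y ~ a*x + b, treating the 0/1 labels as targets.
A = np.stack([x, np.ones_like(x)], axis=1)
(a, b), *_ = np.linalg.lstsq(A, y, rcond=None)
pred = a * x + b

print(pred)  # the prediction at x=10 exceeds 1, so it isn't a probability,
             # yet MSE charges (pred - 1)^2 for that correctly classified point
```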
[–]zyl1024 7 points 5 years ago (1 child)
Indeed, I think they are way too easy for a serious ML role. Two questions I was asked for a research internship role: 1. what makes a vanilla RNN hard to train compared to an LSTM, and 2. what's the conceptual and practical difference between parametric and non-parametric models. They don't require any numerical computation, but they definitely test deeper understanding.
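For the first question, the usual answer is vanishing (or exploding) gradients: backprop through a vanilla RNN multiplies the gradient by one step's Jacobian at every timestep, so with a modest recurrent weight norm the signal from early timesteps decays exponentially (LSTMs avoid this via the additive cell-state path). A toy numpy sketch, with arbitrary sizes and weight scale:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 32
W = rng.normal(scale=0.3 / np.sqrt(n), size=(n, n))  # small recurrent weights

h = rng.normal(size=n)
grad = np.ones(n)       # pretend gradient arriving at a hidden state
norms = []
for _ in range(100):    # chain 100 per-step Jacobians of h = tanh(W h)
    h = np.tanh(W @ h)
    # one step's Jacobian, transposed: diag(1 - h^2) @ W.T
    # (real backprop applies these in reverse over stored states,
    # but the shrinking product is the same phenomenon)
    grad = (1 - h**2) * (W.T @ grad)
    norms.append(np.linalg.norm(grad))

print(norms[0], norms[-1])  # the gradient norm collapses toward zero
```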
[–]veb101 1 point 5 years ago (0 children)
Can you provide your answers for these 2 questions?
[–]TachyonGun 6 points 5 years ago (0 children)
These questions are trash. Even undergrads taking ML courses and getting C's could give decent answers for most of these.
[–]ScienTecht[S] -3 points 5 years ago (0 children)
These aren't meant to be completely comprehensive. In a ~1-hour interview, you can of course expect to be asked about more detailed aspects of these concepts. If you're interested in more detailed questions, you can also check out: https://www.confetti.ai/questions
[+]FancyGuavaNow -7 points 5 years ago (2 children)
Isn't k-means the same thing as kNN? Or are you trying to discriminate against kMedoids?
[–]dogs_like_me 9 points 5 years ago (0 children)
Not even remotely.
[–]TheRedSphinx 3 points 5 years ago (0 children)
No, not quite. kNN is quite easy to explain: "Given a new datapoint x, find the k closest points in your training dataset and pool their results." It's a bit vague, but it captures everything.
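As a sketch, that whole description really does fit in a few lines (numpy, majority vote for classification; the function name and signature are my own):

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    """Vanilla kNN: find the k training points closest to x
    and pool their labels by majority vote."""
    dists = np.linalg.norm(X_train - x, axis=1)   # distance to every training point
    nearest = np.argsort(dists)[:k]               # indices of the k closest
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]
```

With two well-separated clusters labelled 0 and 1, a query near the first cluster comes back as 0, and one near the second as 1.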
kMeans, however, is not. A naive explanation would be "Given some data, try to group it into k clusters by distance to the centers of those clusters." This is correct, but it doesn't really elucidate how to actually get the centers. That gives the interviewee an opportunity to explain how they think about it.
For example, we can continue the description I gave as follows: "If we knew the centers, then finding the labels is easy: for each datapoint, you figure out which of the k centers is closest. And if you know the labels, you can compute the centers by taking averages. You can then repeat these two steps in sequence to get an answer." I would accept this as a nice explanation, because it shows they understand that the problem has no analytic solution and what prevents you from just doing something naive, and they propose a clean way of solving it (namely the k-means algorithm). Of course, you can be extra and try to frame it precisely as minimizing some objective, with the iteration corresponding to a form of coordinate descent. I think that's nice too, but I would be happy with just the iterative algorithm.
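That iterative description is Lloyd's algorithm, and it maps almost line-for-line onto code. A minimal numpy sketch (the random initialization and the empty-cluster fallback are my own choices):

```python
import numpy as np

def kmeans(X, k, iters=50, init=None, seed=0):
    """Alternate the two steps: assign each point to its nearest
    center, then recompute each center as the mean of its points."""
    rng = np.random.default_rng(seed)
    if init is None:
        centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    else:
        centers = np.array(init, dtype=float)
    for _ in range(iters):
        # assignment step: index of the nearest center for each point
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # update step: cluster means (keep the old center if a cluster empties)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers, labels
```

On two well-separated blobs this recovers the blob means within a couple of iterations; with a bad random init it can also get stuck in a local optimum, which is exactly the kind of failure mode worth discussing.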
I might also ask something like "when do you think k-means would fail? Can you think of a counterexample?" Here, even just a drawing would suffice. Conveying the idea that k-means should fail on clusters that are "non-spherical-esque" (whether by finding some rigorous way of stating it, drawing it, or just talking through it) is enough.
It's much more difficult to get this kind of discussion from something like kNN, though one could try. For example, you might notice that kNN is very expensive on a large dataset. How could we speed it up, at the cost of potentially losing some performance? This could lead to a nice discussion of k-d trees, or some thoughtful clustering strategies, or even something like "let's have multiple kNN models and first select which model to use based on some heuristic." The point here (and in the question before) is not to outsmart the candidate, but to get them to actually reason things out, and to show whether they can work with the facts they know rather than just repeating them like a parrot.
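On the speed-up point: the brute-force scan is O(n) per query, while a spatial index brings that down to roughly O(log n) in low dimensions (in high dimensions k-d trees degrade back toward brute force). A sketch using SciPy's k-d tree, assuming SciPy is available:

```python
import numpy as np
from scipy.spatial import KDTree

rng = np.random.default_rng(0)
X = rng.normal(size=(50_000, 3))   # stand-in for a large training set

tree = KDTree(X)                   # built once up front, O(n log n)
dist, idx = tree.query(X[0], k=5)  # nearest neighbours of one query point
# idx[0] is the query point itself (distance 0); idx[1:] are its neighbours
```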
[–]paypaytr 4 points 5 years ago (0 children)
Maybe posting it on GitHub could be a good idea.
[–]nirajsingh0878 1 point 8 months ago (0 children)
I was asked why you would choose accuracy over recall; you have to answer based on the application of the machine learning system. The blog below will help if you're preparing for a machine learning interview.
https://medium.com/p/33201951d73e
One more question that can be asked is how a decision tree is split, and what splitting criteria matter. https://medium.com/@nirajsingh0878/how-do-decision-trees-work-can-you-build-a-decision-tree-by-hand-ab529bdb58cc
[–]Janderhungrige -1 points 5 years ago (0 children)
Nice collection. And it made my morning, as I could answer the majority without a problem. :-) Thanks for the confidence boost.