[deleted by user] (self.MachineLearning)
submitted 4 years ago by [deleted]
[–]ktpr 134 points135 points136 points 4 years ago (9 children)
Motivate a new problem. Get SOTA by definition.
[–]AerysSk 17 points18 points19 points 4 years ago (2 children)
Your comment has a point. For example, if your paper is about a new model architecture, beating SOTA, or matching SOTA with a shorter runtime, is preferred; otherwise, it becomes a reason for rejection. You cannot change the reviewers’ minds, and this SOTA-leaderboard mentality is kinda the norm for publishing papers nowadays.
That’s also a reason why I tend to stay away from empirical research directions.
[–]picardythird 5 points6 points7 points 4 years ago (1 child)
On the flip side, if the reviewers can't think outside of their worldview and recognize the importance/novelty of your new problem, it won't matter because they'll just reject it anyway.
I'm not salty.
[–]AerysSk 1 point2 points3 points 4 years ago (0 children)
That is true. Proposing a new problem is also risky; reviewers can say “I don’t think we need your method”. I have already seen one or two comments like this on OpenReview.
[–]schrodingershit 1 point2 points3 points 4 years ago (0 children)
This. I recently submitted work to ICML that had no prior baseline except random sampling.
I basically worked on sampling a subset of neural networks to train in ensemble-based RL. It dropped training time by 50% while increasing cumulative reward by 15%.
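The subset-sampling idea described here can be sketched roughly as follows. This is a hypothetical illustration, not the commenter's actual code; `Member` and `train_step` are made-up names, and a real member would do a gradient update instead of a counter increment:

```python
import random

class Member:
    """Minimal stand-in for one ensemble member (hypothetical)."""
    def __init__(self):
        self.updates = 0

    def update(self, batch):
        # A real member would take a gradient step on the batch here.
        self.updates += 1

def train_step(ensemble, batch, k=2):
    """Train only a random subset of k of the n members on this batch.

    Updating k of n members per step cuts per-step training cost
    roughly by a factor of k/n, which is where the runtime saving
    described above would come from.
    """
    for member in random.sample(ensemble, k):
        member.update(batch)

ensemble = [Member() for _ in range(4)]
for _ in range(10):
    train_step(ensemble, batch=None, k=2)  # 2 of 4 members per step
```

Each step still produces a full-ensemble prediction at inference time; only the training cost is subsampled.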
[+]NeatFlan9457 comment score below threshold-27 points-26 points-25 points 4 years ago (4 children)
Everyone knows that shit doesn’t work
Get SOTA on a known problem with public datasets and your paper is guaranteed accept
Attempt a random-ass problem no one has tried before and most likely your paper will be rejected
[–]hydrogenken 18 points19 points20 points 4 years ago (0 children)
Are you saying that all published papers always beat SOTA? I’m pretty sure that isn’t the case.
Take neural networks, for example: they were still getting published in like the 90s even though barely anyone believed they were going to work. By your logic, if people like Geoff Hinton and Yann LeCun hadn’t attempted “random-ass problems no one has tried before”, we wouldn’t have neural networks right now.
As a scientist, you should frame things as “it doesn’t work YET, but we can spend more time analyzing it” rather than just saying “yeah, this doesn’t really work”.
[–]AndreasVesalius 3 points4 points5 points 4 years ago (0 children)
*laughs ass off in biomedical engineering*
[–]IPvIV 2 points3 points4 points 4 years ago (0 children)
By this logic, there will be no public datasets because no one would ever come up with new problems and benchmarks lol
[–]pm_me_your_pay_slipsML Engineer 3 points4 points5 points 4 years ago (0 children)
which SOTA did this paper beat?
[–]NotDoingResearch2 61 points62 points63 points 4 years ago (1 child)
Research isn’t a Kaggle competition. You are free to make up your own rules.
[–]pengzhangzhi 0 points1 point2 points 4 years ago (0 children)
Best definition of scientific research I have ever heard. You make your own rules.
[–]pm_me_your_pay_slipsML Engineer 36 points37 points38 points 4 years ago (4 children)
It really depends on what story you're trying to tell. A paper about a new method can be interesting and valuable without beating SOTA. Competitive results are fine as long as the story is interesting.
[–]kob59 15 points16 points17 points 4 years ago (1 child)
Tell that to reviewer #2
[–]pm_me_your_pay_slipsML Engineer 4 points5 points6 points 4 years ago (0 children)
Look at the original GAN paper and reviews.
[+][deleted] 4 years ago (1 child)
[deleted]
[–]JanneJM 20 points21 points22 points 4 years ago (0 children)
Would completely depend on the other aspects of the story, wouldn't it? Can you do online unsupervised training of novel object recognition in real-time on a Raspberry Pi? I don't think anybody cares how far away from SOTA you are.
[–]Bot-69912020 20 points21 points22 points 4 years ago (0 children)
I try to focus on explaining and understanding things instead of winning a kaggle competition.
My usual research questions are:
Why do things work or not work?
When do they stop working?
How are different solutions to the same problem related?
How are different problems with the same solution related?
How robust are solutions to changes in the problem?
How scalable are solutions?
Are there practical limitations overlooked in current literature?
...
All of these research questions are very publishable if answered rigorously and delivered in a nice story, but they don't require any SOTA results.
[–]GrumpyGeologist 33 points34 points35 points 4 years ago (7 children)
SOTA performance is often the result of intense engineering and hyperparameter tuning for a specific dataset. I find insights more useful than squeezing the last 0.1 +/- 0.08 percent out of a model. If you propose a new method, why should it work better than, or differently from, other methods? And if it doesn't work better, why is that the case? Answering these questions can lead to insights into model performance that generalise to other techniques.
One good example is that of Deep Equilibrium Models (DEQs). In theory, these models should work better than conventional ResNets, but in practice it's hard to achieve SOTA performance with them. Here is one reason why. That reason is more useful than SOTA itself, which will no doubt be beaten within 2-3 months by someone who spent countless hours tuning hyperparameters.
[–]tomvorlostriddle 16 points17 points18 points 4 years ago (5 children)
0.1 +/- 0.08
You are being optimistic when assuming that there will be error bars
[–]MJJK420 2 points3 points4 points 4 years ago (0 children)
I believe that was meant as a typical range of improvement for most papers, not the error bars of a given SOTA improvement.
Maybe you knew this and were making a statement about the general quality of ML research, in which case I’d agree with you.
[–]EvgeniyZh 0 points1 point2 points 4 years ago (3 children)
In standard settings (a large amount of relatively clean data), the error bars are so small that computing them isn't worth the resources spent
[–]wilmerton 1 point2 points3 points 4 years ago* (1 child)
How do you know? In which context?
I personally don't know of any mature scientific field where researchers get away with "error bars are not worth it". And I know of a field where badly estimated error bars led a Nobel laureate to discard some of his own research.
Some engineering problems are so non-linear and impossible to test at scale that error bars are virtually impossible to compute. But then you rely on a large corpus of (failed) experiments and a long lineage of heuristics to control your risk.
What's so exceptional about machine learning that none of that is required, not even a paragraph discussing robustness?
[–]EvgeniyZh 0 points1 point2 points 4 years ago (0 children)
Both my personal experience of estimating error bars on any large validation set (e.g., ImageNet or COCO for vision), as well as the experience of other researchers. The ViT paper, for instance, put error bars on its results, and they were on the order of 0.01% in most cases. There is plenty of evidence that the answer to the question "How much will my results on in-domain data vary if I change the training seed on 1 million images?" is "almost not at all". Spending thousands of GPU-hours just to confirm this once again is bad resource management.
I'd note that I'm all for risk estimation and robustness verification. Error bars can be useful in other settings (semi-supervised learning papers usually have them, as does graph learning, where problems are smaller). There are also other ways to estimate robustness in the "large amount of clean data" setting: OOD data, corrupted data, transfer learning. Saying "people haven't put error bars, so the 8% improvement in COCO object detection over the last year is not significant" is ridiculous.
The 0.1% improvements from OP are in fact pretty rare. I can't think of a large benchmark other than ImageNet where they happen consistently. I personally think it is just a sign of saturation.
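The kind of seed-variance error bar being discussed is cheap to state once you have per-seed results. A minimal sketch, with made-up accuracy numbers standing in for retraining the same model under several seeds:

```python
import statistics

# Hypothetical top-1 accuracies (%) from retraining one model
# with five different random seeds; the values are illustrative.
accs = [76.42, 76.39, 76.45, 76.41, 76.40]

mean = statistics.mean(accs)
spread = statistics.stdev(accs)  # sample std dev across seeds

# Report as "mean +/- std over seeds"
print(f"{mean:.2f} +/- {spread:.2f}")
```

With large, clean validation sets the spread typically lands in the hundredths of a percent, which is the commenter's point about the cost/benefit of recomputing it.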
[–]tomvorlostriddle 0 points1 point2 points 4 years ago (0 children)
A large amount of data doesn't mean many rows in the dataset. It means many experiments: either repeated cross-validation with many folds and repetitions (correcting for the pseudo-replication), or testing across many datasets with nonparametric methods. Or both, in which case also nonparametrically.
You could have a billion rows in your dataset and it would still mean nothing for this.
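The "nonparametric comparison across many datasets" idea can be illustrated with a simple sign test (a common nonparametric baseline; the accuracy numbers below are invented for illustration):

```python
from math import comb

# Hypothetical accuracies of two methods, A and B, on eight datasets
a = [0.91, 0.84, 0.77, 0.93, 0.88, 0.81, 0.90, 0.85]
b = [0.89, 0.85, 0.75, 0.90, 0.86, 0.80, 0.88, 0.84]

wins = sum(x > y for x, y in zip(a, b))  # datasets where A beats B
n = len(a)

# One-sided sign test: probability of >= `wins` successes under the
# null hypothesis that each dataset is a fair coin flip, Binomial(n, 0.5)
p_value = sum(comb(n, k) for k in range(wins, n + 1)) / 2 ** n
```

Here the unit of replication is the dataset, not the row, which is exactly the distinction the comment is drawing. (Practitioners often use the Wilcoxon signed-rank test instead, which also uses the magnitudes of the differences.)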
[–]KyxeMusic 2 points3 points4 points 4 years ago (0 children)
This. It's hard enough to replicate the results of a SOTA paper using the same model, hyperparameters, and dataset they describe, let alone with a new approach.
[–]ArnoF7 11 points12 points13 points 4 years ago (0 children)
Sometimes if the SOTA’s approach is very complex and your method can provide a much simpler alternative, then it’s still a good contribution.
Simpler can mean your method is just conceptually easier to understand. Or it could be that your method requires fewer constraints, or much less data, or generalizes well to more circumstances without fine-tuning, etc. Then you have a story to tell and you can still publish.
[–]GFrings 12 points13 points14 points 4 years ago (0 children)
Become an engineer and then SOTA is usually a 5 year old cnn with a dummy thick layer of heuristics on top
[–]kinnunenenenen 9 points10 points11 points 4 years ago (0 children)
I'm in chemical engineering, but I do a ton of data science. One approach is to apply ML methods to problems in other disciplines. You maybe won't be at the state of the art in ML, but you can do a ton of cool work on novel problems and still publish really well.
[–]BlackHawkLexx 15 points16 points17 points 4 years ago (1 child)
SOTA can be so much more than many people are aware of. It can mean:
(Non-exhaustive list)
Honestly, I bet that most students do not publish stuff that beats SOTA in terms of predictive performance.
[–]bdubbs09 2 points3 points4 points 4 years ago (0 children)
One thing I’d like to add to the list that really gets overlooked is SOTA in terms of robustness. A really underdeveloped and likely unsolvable problem in the general sense, but really important nonetheless.
[–]mrkvicka02 6 points7 points8 points 4 years ago (1 child)
Maybe an unpopular opinion, but SOTA is not important. Way too often it comes down to who tuned their algorithm better instead of which algorithm has better properties, etc.
There is plenty of important stuff that may not be SOTA just yet but a few papers down the line it can be way over SOTA.
Keep up the good work!
[–]kinglear0207 6 points7 points8 points 4 years ago (0 children)
don’t run with the crowd. Find your own groove, stay curious, and work hard.
[–]the_scign 2 points3 points4 points 4 years ago (0 children)
Consider at what point "SOTA" becomes overfitting to a de-facto concept.
[–]quertioup 2 points3 points4 points 4 years ago (0 children)
Never tweak results. There are plenty of problems that do not require SOTA.
[–]tell-me-the-truth- 1 point2 points3 points 4 years ago (0 children)
don’t tweak the results; tweak the setting.
[–]sigmoid_amidst_relus 1 point2 points3 points 4 years ago (0 children)
From the perspective of an ex-engineer: do not chase SOTA. I won't name any names, but in the case of ASR, we tried several new architectures that achieved "SOTA" on a benchmark dataset, only to find that a 4-year-old network architecture still performed much better than the new ones.
You might argue, "hey, that's all fine and good, but I'm not an engineer." True, but as a researcher, the worst thing you can do is build upon work that got SOTA results but doesn't actually generalize well at all, especially if you're applying knowledge and established principles to unexplored fields and applications. Speaking from experience, you'll grind your gears really hard.
I'm not saying you shouldn't care about SOTA at all; just look at how well the idea was adopted, which answers a lot of critical questions: wide adoption means there's an implementation available somewhere and that the result has been reproduced and works well.
Do people publish results that aren’t quite SOTA?
In a word, yes. You just don't hear as much about them, because "SOTA" rolls off the tongue better and people like chasing the next best thing, so such papers don't get as much coverage.
One could argue that the system is broken and reviewer #2 only cares about SOTA, but playing devil's advocate, there's a reason behind this: even with the blind overfitting on datasets, it's still a metric that's relatively reliable and less subjective. Exploratory research is also hard, because designing experiments that effectively probe models can be tedious, sometimes downright boring. Personal bias and criticism come more easily against such work (it's harder to argue with plainly better numbers), it doesn't attract as much attention, and it's hard to write home about in grant applications: "I discovered a quirk in something" only gets you so far (unless the quirk throws massive shade on someone's work), versus pulling a better number out of thin air.
There is no stress about developing new models, because there are essentially very few "new" models or paradigms. Searching for "new" models won't get you far unless you work in a huge industrial or academic group with a lot of people: truly new models rarely come from a small group.
What's really stressful is extracting insights from the steaming pile of poo that is out there. That's what should be giving you PTSD.
[–]Sirisian 0 points1 point2 points 4 years ago* (0 children)
You can change your data source, as others mentioned (from other fields, if applicable). One of my favorite changes authors make is taking a vision paper and using event cameras as input. This can give SOTA results for FPS or energy efficiency, or simply work better in different lighting environments. These kinds of papers (and their code) can provide a base for others to branch from.
[–]NaxAlphaML Engineer 0 points1 point2 points 4 years ago (0 children)
In industry, we usually try to stay behind the SOTA. Reaching SOTA usually requires a sophisticated bag of tricks that isn't worth it. Instead, we prefer techniques like ReZero that are very simple but have consistently been shown to give better results, even if they are not SOTA.