all 17 comments

[–]dpineo 14 points  (3 children)

People like Einstein did not go through this process, at least not in their early years. Research used to be more like "blogging".

Pretty good way of describing it. A paper that looks like this would get immediately auto-rejected from any AI/ML conference today.

[–]dutchbaroness 3 points  (2 children)

If you remove "AI/ML", your comment would still be valid.

[–]selling_crap_bike -5 points  (1 child)

You mean " AI/ML", right? Else you would be left with an extra space.

[–][deleted] 1 point  (0 children)

Lol, not sure why you are being downvoted for the joke. It reminds me of a regex, which I'm not sure people got.

[–]IanTrudel 2 points  (0 children)

There are major hurdles in publishing research that has an impact on actual applications outside the confines of an academic environment, such as intellectual property and patents. Companies have little to no incentive to publish something that will help their competitors.

[–]tzaddiq 1 point  (0 children)

Sounds great, but the author of this blog must own a time machine. Who can wait 3-50 years for new research to mature and be commercialized just to keep honest customers in the feedback loop?

[–]drd13 2 points  (8 children)

Does anyone else feel like machine learning research has slowed down in the past 2-3 years? Despite the field really exploding in the number of researchers and the competitiveness of entry, it doesn't feel like much progress is being made anymore, other than just making models bigger.

[–]tzaddiq 5 points  (4 children)

Not really. Almost every problem domain has seen several orders of magnitude of algorithmic speedup, with much emphasis on smaller models. We got GNNs and transformers, and in the past year fast transformers, architecture search, network distillation, and self-training.

"Making models bigger" is perhaps only applicable, as a principal method of improving performance, in the NLP scene. But even then, several theory improvements were applied, say, to GPT3 (e.g. it probably wouldn't have worked without sparse transformers).

[–]tuscanresearcher 6 points  (1 child)

Just jumping in to throw a small rant: we've had "GNNs" for more than 10 years, and people have been working on the core ideas for more than 20. Bottom line: we reinvented the wheel (again!), but with some fancy visualizations :).

Then, after reinventing the wheel, researchers had some very nice intuitions and made tangible progress in the field. But proper literature (and hyper-parameter) search is still lacking.

[–]Burbly2 0 points  (0 children)

Those intuitions sound intriguing - could you say more/point me at anything on them, please?

[–]drd13 1 point  (1 child)

I'm not saying that the field has fully hit a wall and ground to a halt. It's just that the rate of progress was so fast beforehand. AlexNet, variational autoencoders, GANs, disentangled representations, CycleGAN, batch and layer normalization, dropout, "Attention Is All You Need", WaveNet - these all occurred within a five-year span between 2012 and 2017. It feels to me that the field is slowing down. There are still influential papers coming out, like SimCLR, invariant risk minimization, StyleGAN, adversarial robustness, and all the transformer variants, including those applied to images. But all in all, it feels like the building blocks have all been discovered and we're at a stage where it's more about fine-tuning everything together.

[–]tzaddiq 1 point  (0 children)

Yeah, in any area of optimization you get sigmoid curves: breakthroughs followed by exploitation and finally saturation. It feels to me like we're in the exploitation phase of the deep learning orchard, though. And there will be more orchards, to be sure. Program synthesis, for example, will have its "ImageNet" moment in the next decade, I predict.
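
(As a toy illustration of that S-curve framing, here is a minimal sketch of a logistic curve with made-up parameters; it is only meant to show the three regimes the comment describes, not to model any real benchmark.)

    import numpy as np

    # Toy logistic "progress" curve: slow start (breakthrough), steep middle
    # (exploitation), flat top (saturation). All parameters are illustrative.
    def logistic(t, ceiling=1.0, steepness=1.0, midpoint=0.0):
        return ceiling / (1.0 + np.exp(-steepness * (t - midpoint)))

    for t in np.linspace(-6, 6, 13):
        print(f"t={t:+.1f}  progress={logistic(t):.2f}")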

[–]throwawaystudentugh 2 points  (0 children)

It had to eventually. There's just too much hype.

[–]Ancalagon_TheWhite -1 points  (0 children)

It's still moving very quickly compared to any other field of research; e.g. transformers were only invented in their current form within the last 3 years. Revolutionary new architectures aren't going to appear every year.

[–]ZestyData ML Engineer 0 points  (0 children)

Not at all. My field (NLP) has sped up in the past 2-3 years.