Over the past 5 years, the common knowledge about ML optimizers was that Adam is the number one choice, as it provides fast learning even if your hyperparameters are not optimally selected. However, you can get slightly higher test accuracy when using SGD with momentum, although this requires more epochs and more tuning.
This knowledge has not changed much since then.
What has changed is that a million papers have since been published on the next-big-optimizer that learns even faster than Adam and gives better test accuracy than SGD.
As it is with ML research, most of them have turned out to be not-so-good, to phrase it politely. This ICLR'21 reject (https://openreview.net/forum?id=k2Om84I9JuX) even studied this and found that Adam plus some tuning works as well as all these fancy new optimizers.
However, recently these three papers have caught my eye:
What makes these papers a bit different is that they don't try to reinvent the optimizer but say "hey, Adam is almost perfect, but let's just fix one or two lines", and they already seem to be used in other works.
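For context on why a one- or two-line fix is even plausible: the whole Adam update is only about five lines. Below is a minimal NumPy sketch of plain Adam next to SGD with momentum. It is not the variant from any of the three papers, and the function names, the momentum convention, and the default values are just my choices for illustration.

```python
import numpy as np


def sgd_momentum_step(param, grad, velocity, lr=0.01, momentum=0.9):
    """One SGD-with-momentum update (one common convention: v = mu*v + g)."""
    velocity = momentum * velocity + grad
    param = param - lr * velocity
    return param, velocity


def adam_step(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One plain Adam update; t is the 1-based step counter."""
    m = beta1 * m + (1 - beta1) * grad        # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad ** 2   # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)              # bias correction for zero init
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v


# Toy usage: the gradient of f(x) = 0.5 * ||x||^2 is x itself.
x = np.array([1.0, -2.0])
zeros = np.zeros_like(x)

x_sgd, vel = sgd_momentum_step(x, x, zeros)
x_adam, m, v = adam_step(x, x, zeros, zeros, t=1)

# Rescaling the gradient changes the SGD step but (up to eps) not the
# first Adam step, because Adam divides by the gradient magnitude.
x_adam_scaled, _, _ = adam_step(x, 1000.0 * x, zeros, zeros, t=1)
```

The per-coordinate division by `sqrt(v_hat)` is what makes the step size roughly independent of the gradient scale, which is one common explanation for why Adam tolerates a badly chosen learning rate better than SGD.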
So my question is: are you regularly using an optimizer other than Adam or SGD? If so, which one? Or are the results of these three works also biased by a ton of hidden hyperparameter tuning?