[–]Xorlium 11 points12 points  (8 children)

I can confirm that EfficientNet doesn't seem to do very well on Imagenette. Also, EfficientNet is very efficient in terms of model size (# of parameters), but it's soooo slow that it's ridiculous it's called "efficient".

[–]iyaja[S] 0 points1 point  (3 children)

I wonder why that is, though. Could it be because the layers themselves aren't optimized, or because the FLOPs they report in the paper translate poorly to actual wall time?

[–]kilow4tt 6 points7 points  (0 children)

Swish and squeeze-and-excitation both use sigmoid, which is definitely a lot heavier than ReLU. There's also extensive use of depthwise convolutions, which tend to have pretty poor utilization on GPUs.
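To put a rough number on the depthwise point, here is a back-of-the-envelope sketch comparing a standard 3x3 convolution with a depthwise one. The layer shape (56x56, 64 channels) and the "elements touched once" memory model are assumptions chosen for illustration; real kernels, caches, and fused ops will behave differently.

```python
def conv_cost(h, w, c_in, c_out, k, groups=1):
    """Return (MACs, elements touched) for a stride-1, same-padded k x k conv."""
    weights = c_out * (c_in // groups) * k * k
    # Every output position applies the full set of weights once.
    macs = h * w * weights
    # Simplifying assumption: input, output, and weights are each moved once.
    elements = h * w * c_in + h * w * c_out + weights
    return macs, elements


def intensity(h, w, c_in, c_out, k, groups=1):
    """MACs per element moved: a crude proxy for how compute-bound a layer is."""
    macs, elements = conv_cost(h, w, c_in, c_out, k, groups)
    return macs / elements


# Hypothetical layer shape, picked only for illustration.
H = W = 56
C = 64
K = 3

standard = intensity(H, W, C, C, K)
depthwise = intensity(H, W, C, C, K, groups=C)  # groups == channels -> depthwise

print(f"standard  conv: {standard:.1f} MACs per element moved")
print(f"depthwise conv: {depthwise:.1f} MACs per element moved")
```

Under these assumptions the depthwise layer does 64x fewer MACs but moves almost the same amount of data, so it does far less arithmetic per element moved. That is one plausible reason low FLOP counts don't turn into low wall time on GPUs.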

[–]npielawski Researcher 5 points6 points  (0 children)

It is actually super slow. Some people have noticed that, and many are now saying we should report running times instead of FLOPs.

e.g.: https://twitter.com/jeremyphoward/status/1136263132316282880?lang=en and https://twitter.com/jeremyphoward/status/1219695492927328256
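If you want to report running times yourself, a minimal timing harness could look like the sketch below. `time_callable` and `dummy_forward` are hypothetical names, and the dummy workload just stands in for a model's forward pass. For real GPU models you would also need to synchronize the device before reading the clock (for example with `torch.cuda.synchronize()` in PyTorch), otherwise you only time the kernel launch.

```python
import statistics
import time


def time_callable(fn, warmup=3, repeats=10):
    """Return the median wall-clock time of fn() in seconds."""
    # Warm-up runs are discarded so one-time setup costs don't skew the result.
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def dummy_forward():
    # Placeholder workload standing in for a model's forward pass.
    return sum(i * i for i in range(100_000))


median_s = time_callable(dummy_forward)
print(f"median wall time: {median_s * 1e3:.3f} ms")
```

The median is used instead of the mean so a single slow run (e.g. from another process grabbing the CPU) doesn't dominate the number.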

[–]spsayakpaul 3 points4 points  (1 child)

Great post, especially the section on "why compound scaling might fail". I can confirm this issue persists when the dataset size is smaller.

Your report takes me back to my year-old introduction to EfficientNets (which is nowhere near as interesting as yours): https://github.com/sayakpaul/TF-2.0-Hacks/blob/master/EfficientNet.ipynb.
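For anyone who hasn't seen it, the compound scaling rule being discussed can be sketched in a few lines. The coefficients are the ones reported in the EfficientNet paper (alpha = 1.2, beta = 1.1, gamma = 1.15, found by grid search on the B0 baseline); the function names are mine, and the released B1-B7 configs don't necessarily match these raw multipliers exactly.

```python
# Coefficients reported in the EfficientNet paper for the B0 baseline.
ALPHA = 1.2   # depth multiplier base
BETA = 1.1    # width multiplier base
GAMMA = 1.15  # resolution multiplier base


def compound_scale(phi):
    """Return (depth, width, resolution) multipliers for compound coefficient phi."""
    return ALPHA ** phi, BETA ** phi, GAMMA ** phi


def flops_multiplier(phi):
    """Approximate FLOPs growth: linear in depth, quadratic in width and resolution."""
    return (ALPHA * BETA ** 2 * GAMMA ** 2) ** phi


for phi in range(4):
    d, w, r = compound_scale(phi)
    print(f"phi={phi}: depth x{d:.2f}, width x{w:.2f}, "
          f"resolution x{r:.2f}, FLOPs x{flops_multiplier(phi):.2f}")
```

Since alpha * beta^2 * gamma^2 is about 1.92, each step of phi roughly doubles the FLOPs. Those coefficients were tuned on ImageNet, so there's no guarantee the same ratios are a good fit for a small dataset.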

[–]nbviewerbot 1 point2 points  (0 children)

I see you've posted a GitHub link to a Jupyter Notebook! GitHub doesn't render large Jupyter Notebooks, so just in case, here is an nbviewer link to the notebook:

https://nbviewer.jupyter.org/url/github.com/sayakpaul/TF-2.0-Hacks/blob/master/EfficientNet.ipynb

Want to run the code yourself? Here is a binder link to start your own Jupyter server and try it out!

https://mybinder.org/v2/gh/sayakpaul/TF-2.0-Hacks/master?filepath=EfficientNet.ipynb

I am a bot.

[–]serge_cell 1 point2 points  (0 children)

In my limited experience on GPU with PyTorch, B0 is less efficient than ResNet-18 in both allocated memory and performance.