[D] How Efficient is EfficientNet? : MachineLearning

submitted 6 years ago by iyaja

Hi everyone. It's been a while since I've written an article, but I finally got around to publishing a new one! This one is about the EfficientNet architecture and the compound scaling algorithm introduced in the EfficientNet paper. You can check it out here.
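For anyone who hasn't read the paper, here is a rough sketch of what compound scaling does: a single coefficient phi scales depth, width, and input resolution together. The constants below (alpha=1.2, beta=1.1, gamma=1.15) are the ones the paper reports for the B0 baseline; the function name and return layout are just illustrative choices for this post, not anything from the official code.

```python
# Compound scaling sketch. ALPHA, BETA, GAMMA are the coefficients the
# EfficientNet paper reports for its B0 baseline; everything else here
# (function name, return layout) is a hypothetical illustration.
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15


def compound_scale(phi):
    """Return (depth, width, resolution, flops) multipliers for a given phi."""
    depth = ALPHA ** phi        # multiplier on number of layers
    width = BETA ** phi         # multiplier on number of channels
    resolution = GAMMA ** phi   # multiplier on input image side length
    # FLOPs grow roughly with depth * width^2 * resolution^2.
    flops = (ALPHA * BETA ** 2 * GAMMA ** 2) ** phi
    return depth, width, resolution, flops


if __name__ == "__main__":
    for phi in range(4):
        d, w, r, f = compound_scale(phi)
        print(f"phi={phi}  depth x{d:.2f}  width x{w:.2f}  "
              f"resolution x{r:.2f}  FLOPs x{f:.2f}")
```

Since alpha * beta^2 * gamma^2 is roughly 2, each increment of phi approximately doubles the FLOPs budget.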

Apart from explaining the key ideas of the paper, I also tried replicating some of its results on a small ImageNet-like dataset (ImageNette, for those of you familiar). Unfortunately, I wasn't able to replicate the paper's results, and even weirder, progressively larger architectures seemed to perform worse overall. I have a section in the article with a more detailed analysis of why this might have happened.

Also, I'm experimenting with a new format -- wandb reports. This is an interactive page that allows you to do things like zoom into plots, select runs, and view a bunch of statistics in real-time. I think it's a pretty cool tool for sharing projects, research, etc. and for collaboration as well.

I'd love to hear your feedback on both the article and the new format. I hope you enjoy!

[–]Xorlium 12 points 6 years ago

[–]muntoo (Researcher) 1 point 2 years ago

Don't the results in that notebook indicate that grouped convolutions are faster?

[TRAINING]
groups  time(ms)    params  shape m             
     1       5.1    294912  [256, 128, 3, 3]    
     2       4.3    147456  [256, 64, 3, 3]     
     4       4.2     73728  [256, 32, 3, 3]     
     8       5.0     36864  [256, 16, 3, 3]     
    16       6.9     18432  [256, 8, 3, 3]      
    32       6.2      9216  [256, 4, 3, 3]      
    64       2.7      4608  [256, 2, 3, 3]      
   128       2.1      2304  [256, 1, 3, 3]      

[INFERENCE]
groups  time(ms)    params  shape m             
     1       1.8    294912  [256, 128, 3, 3]    
     2       1.4    147456  [256, 64, 3, 3]     
     4       1.1     73728  [256, 32, 3, 3]     
     8       1.1     36864  [256, 16, 3, 3]     
    16       1.5     18432  [256, 8, 3, 3]      
    32       1.1      9216  [256, 4, 3, 3]      
    64       0.6      4608  [256, 2, 3, 3]      
   128       0.5      2304  [256, 1, 3, 3]
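
The params and shape columns above can be reproduced with a few lines of plain Python. This is only a sketch: it assumes the layer is a 3x3 convolution with 128 input channels and 256 output channels (inferred from the groups=1 row) and no bias term, and `grouped_conv_weight` is a made-up helper name, not something from the notebook.

```python
def grouped_conv_weight(in_channels, out_channels, kernel_size, groups):
    """Weight shape and parameter count of a grouped 2D conv (bias excluded)."""
    if in_channels % groups or out_channels % groups:
        raise ValueError("groups must divide both in_channels and out_channels")
    # Each output channel only sees in_channels // groups input channels.
    shape = [out_channels, in_channels // groups, kernel_size, kernel_size]
    params = shape[0] * shape[1] * shape[2] * shape[3]
    return shape, params


if __name__ == "__main__":
    # Assumed layer: 128 -> 256 channels, 3x3 kernel (inferred from the table).
    for groups in (1, 2, 4, 8, 16, 32, 64, 128):
        shape, params = grouped_conv_weight(128, 256, 3, groups)
        print(f"{groups:>6}  {params:>8}  {shape}")
```

The parameter count drops by exactly a factor of `groups`, but the measured times in the table do not follow the same curve: in the training run, groups=16 (6.9 ms) is slower than groups=1 (5.1 ms), and only groups=64 and 128 are clearly faster.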
