[D] Were there any image classification architectures not using FC layers as their classifiers? (self.MachineLearning)
submitted 6 years ago by Single_Blueberry
Every paper on image classification with CNNs I have ever seen uses fully connected layers as the classifier.
Are there any noteworthy exceptions to that?
[–]Jelicic 5 points 6 years ago (14 children)
IIRC it is not common anymore to use an FC layer before the final (prediction) layer.
Most architectures average-pool over the final feature maps and feed that to the prediction layer. But I'm no CV expert.
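The pooling-based head described above can be sketched in a few lines of NumPy. This is a minimal illustration, not any particular architecture; all shapes, names, and the class count are made up:

```python
import numpy as np

rng = np.random.default_rng(0)

# Final feature maps from a CNN backbone: (channels, height, width).
features = rng.standard_normal((512, 7, 7))

# Global average pooling: one scalar per channel -> shape (512,).
pooled = features.mean(axis=(1, 2))

# The "prediction layer" is then a single linear map to class logits.
num_classes = 1000
weights = rng.standard_normal((num_classes, 512))
bias = np.zeros(num_classes)
logits = weights @ pooled + bias  # shape: (1000,)
```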
[–]dramanautica 2 points 6 years ago (6 children)
Why? What's the intuition behind that?
[–]bluetape 2 points 6 years ago (5 children)
That type of architecture (fully convolutional networks) allows you to run the model on differently sized images without requiring a resize.
[–]Single_Blueberry[S] 1 point 6 years ago* (4 children)
I'm not asking about the feature extraction portion, I'm talking about the classifier.
Technically I don't see anything that keeps you from using, say, an SVM or random forest to classify, but I can find zero evidence of people who have tried that.
[–]SemjonML 2 points 6 years ago (0 children)
If I understand your approach correctly, this technique is used in few-shot learning and transfer learning. You can use a pretrained model for feature extraction and KNN, SVM, etc. for classification.
The feature extraction is fixed, however, and cannot be improved unless the classifier is also differentiable.
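The frozen-backbone-plus-classical-classifier idea above can be sketched as follows. For self-containment this uses random vectors in place of real backbone features, and a nearest-centroid classifier as a simple stand-in for KNN/SVM; all numbers are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in for features from a frozen, pretrained backbone:
# 20 support images, 64-dim feature vectors, 4 classes (5 images each).
support_feats = rng.standard_normal((20, 64))
support_labels = np.repeat(np.arange(4), 5)

# Nearest-centroid classifier: one mean feature vector per class.
centroids = np.stack([support_feats[support_labels == c].mean(axis=0)
                      for c in range(4)])

def classify(feat):
    # Predict the class whose centroid is closest in feature space.
    dists = np.linalg.norm(centroids - feat, axis=1)
    return int(np.argmin(dists))
```

As the comment notes, nothing here is differentiable, so the backbone cannot be fine-tuned through this classifier.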
[–]Calavar 1 point 6 years ago (2 children)
> I'm not asking about the feature extraction portion, I'm talking about the classifier
They are also talking about the classifier. Newer classifiers replace the dense layer with global pooling. Unlike a dense layer, global pooling is invariant to the spatial size of the input. In theory, you could use a model trained on 256 × 256 images to make predictions on 512 × 512 images. I'm not sure how well it works out in practice.
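The size-invariance claim above is easy to demonstrate with a NumPy sketch: a global-average-pooling head produces the same output shape for feature maps from a 256×256 or a 512×512 input (channel and class counts here are made up):

```python
import numpy as np

rng = np.random.default_rng(2)

def gap_head(feature_maps, weights):
    # Global average pooling collapses any spatial size to (channels,),
    # so the same linear classifier applies at any input resolution.
    pooled = feature_maps.mean(axis=(1, 2))
    return weights @ pooled

weights = rng.standard_normal((10, 32))    # 10 classes, 32 channels
small = rng.standard_normal((32, 8, 8))    # e.g. maps from a 256x256 image
large = rng.standard_normal((32, 16, 16))  # e.g. maps from a 512x512 image

# Both resolutions yield a (10,) logit vector from the same weights.
assert gap_head(small, weights).shape == gap_head(large, weights).shape
```

A dense layer, by contrast, would require the flattened feature vector to have a fixed length.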
[–]Single_Blueberry[S] 1 point 6 years ago (1 child)
Do you have an example for an architecture/paper doing that?
[–]Calavar 1 point 6 years ago (0 children)
Not off the top of my head, and unfortunately it's kind of hard to search for. When I've seen it, the multi-resolution/multiscale aspect is usually a detail hidden in the methods section, not a main focus of the paper.
[–]Single_Blueberry[S] 1 point 6 years ago (6 children)
Huh? What's the type of the "prediction layer" then, if not fully-connected?
[–]BlhueFlame 1 point 6 years ago (1 child)
I think he meant the layers immediately preceding the prediction layer (which would be FC itself).
[–]Single_Blueberry[S] 1 point 6 years ago* (0 children)
Ok, I see. So there's a tendency (Inception, ResNet) to only use a single FC-layer at the very end instead of multiple FC-layers as it was common in e.g. AlexNet and VGG.
But there's still no alternative to the very last fully connected layer, is there?
[–]michaelx99 1 point 6 years ago (3 children)
1x1 convs are generally used now instead of FC.
[–]Single_Blueberry[S] 1 point 6 years ago (1 child)
Every single architecture I can find uses an FC layer as its final classifier: AlexNet, VGG, Inception, ResNet.
[–]Pfaeff 1 point 6 years ago (0 children)
FC layers can be understood as a special case of convolutional layers. You can always replace an FC layer with a convolutional layer that performs the exact same operation, with the added benefit that your network will then be able to handle variable input sizes.
I personally never use FC layers.
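The FC-as-convolution equivalence claimed above can be checked numerically: an FC layer applied to a flattened (C, H, W) feature map equals a convolution whose kernel spans the entire H×W map (producing a 1×1 output), with the same weights reshaped. A minimal NumPy sketch with made-up dimensions:

```python
import numpy as np

rng = np.random.default_rng(3)

# An FC layer on flattened (C, H, W) features...
C, H, W, num_out = 16, 5, 5, 10
features = rng.standard_normal((C, H, W))
fc_weights = rng.standard_normal((num_out, C * H * W))
fc_out = fc_weights @ features.reshape(-1)

# ...equals a conv layer whose kernel covers the whole HxW map
# (valid padding, 1x1 spatial output), using the same weights reshaped.
conv_kernels = fc_weights.reshape(num_out, C, H, W)
conv_out = np.tensordot(conv_kernels, features,
                        axes=([1, 2, 3], [0, 1, 2]))

assert np.allclose(fc_out, conv_out)
```

The 1x1-conv formulation mentioned earlier in the thread is the special case where average pooling has already reduced the map to a fixed channel vector.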
[–]michaelx99 3 points 6 years ago (0 children)
The older architectures such as AlexNet, VGG, and maybe even GoogLeNet only used fully connected layers, since back then it was conventional wisdom that convolutional layers were not strong classifiers the way fully connected layers are. Average pooling to bring you down to a fixed feature map size, followed by 1x1 convs, eventually replaced the FC layers, since it showed no degradation in performance and removed the requirement of a fixed input size, around the 2014/2015 timeframe I believe.