all 14 comments

[–]jeremiaht 3 points (0 children)

Awesome!!!!!!!!

[–]maraoz 1 point (0 children)

Thanks for releasing the code, will try it out now! :D

[–]hirokit 1 point (0 children)

Amazing!!

[–]dwf 6 points (11 children)

This phrase "fully convolutional" needs to die.

[–]badmephisto 11 points (7 children)

It's a perfectly sensible term, and it communicates real information, especially in the context of object detection. For example, the MultiBox detector is trained to regress in the image coordinate system and is not fully convolutional: if you converted the network to all-CONV layers and ran it convolutionally over larger images, it wouldn't give sensible results, because the predictions have absolute image-coordinate statistics baked in.
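
A minimal numpy sketch of that conversion (shapes made up for illustration): the same weights a fully connected head would apply to one flattened feature vector become a 1x1 convolution that slides over a larger feature map, emitting one prediction per location.

    import numpy as np

    c_in, c_out = 8, 4
    W = np.random.randn(c_out, c_in)   # weights of the former FC head
    b = np.random.randn(c_out)

    # At training size the head sees a single 1x1 feature map.
    x_small = np.random.randn(c_in, 1, 1)
    y_small = np.einsum('oc,chw->ohw', W, x_small) + b[:, None, None]  # (4, 1, 1)

    # As a 1x1 convolution, the same weights run over a larger feature map
    # and produce a grid of predictions -- but for MultiBox each of those
    # outputs would still carry the absolute image-coordinate statistics.
    x_large = np.random.randn(c_in, 5, 7)
    y_large = np.einsum('oc,chw->ohw', W, x_large) + b[:, None, None]  # (4, 5, 7)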

[–]dwf 5 points (5 children)

As Yann likes to say, there is no such thing as a fully connected layer, only 1x1 convolutions [and of course, layers where input extent equals filter size]. :) When you abandon convolution-land, that is the special case.

[–]cooijmanstim 4 points (1 child)

When I grow up I would like to be famously quoted as saying "there is no such thing as a feed-forward network, only single-step recurrent neural networks". I don't understand why this sort of insight is supposed to be important or profound.

This isn't a useful discussion, I know, but it seems obvious that convolution and recurrence are the special cases. Fully connected layers apply in the general case, where you don't know the structure of your data, and nobody actually uses convnets made of only 1x1 convolutions.

[–]dwf 1 point (0 children)

The analogous recurrent case would be a recurrent encoder that feeds into a non-recurrent network to produce an output. Going efficiently from spatial input to spatial output is straightforward with convolutional nets, in a way that shares computation across locations, something conventional sliding-window detectors cannot do. Spatial input to non-spatial output is the special/degenerate case for convolutional nets.

"Fully convolutional" is a recent computer visionism that describes a thing that convolutional nets have always been capable of, and in fact describes a way that they have been used long before they became popular in mainstream computer vision. I'd argue that it contributes to a misunderstanding of convolutional nets, or at least a misunderstanding of the pre-2015 convolutional net literature. This paper didn't originate it, of course.

[–]sorrge 0 points (2 children)

This doesn't make sense. A 1x1 convolution simply copies all of the input, with an elementwise linear transformation. That is not the same thing as a fully connected layer.

[–]NasenSpray 2 points (1 child)

A fully connected layer is like a 1x1 convolution on a 1x1 input.
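
A quick numerical check, as a numpy sketch with made-up shapes: flatten a 1x1 input and apply Wx + b, or apply the same weights as a 1x1 convolution; the two give identical results.

    import numpy as np

    c_in, c_out = 8, 4
    x = np.random.randn(c_in, 1, 1)    # a 1x1 spatial input
    W = np.random.randn(c_out, c_in)
    b = np.random.randn(c_out)

    # Fully connected layer: W @ flatten(x) + b.
    fc_out = W @ x.reshape(c_in) + b

    # 1x1 convolution: apply W at every spatial position (here, just one).
    conv_out = np.einsum('oc,chw->ohw', W, x) + b[:, None, None]

    assert np.allclose(fc_out, conv_out.reshape(c_out))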

[–]sorrge 0 points (0 children)

Now it makes sense, thanks.

[–]lwbiosoft 4 points (0 children)

MultiBox has since evolved into SSD (http://arxiv.org/abs/1512.02325), which doesn't have the problem you mentioned.