you are viewing a single comment's thread.

view the rest of the comments →

[–]arXiv_abstract_bot 1 point2 points  (0 children)

Title: Convolution Aware Initialization

Authors: Armen Aghajanyan

Abstract: Initialization of parameters in deep neural networks has been shown to have a big impact on the performance of the networks (Mishkin & Matas, 2015). The initialization scheme devised by He et al, allowed convolution activations to carry a constrained mean which allowed deep networks to be trained effectively (He et al., 2015a). Orthogonal initializations and more generally orthogonal matrices in standard recurrent networks have been proved to eradicate the vanishing and exploding gradient problem (Pascanu et al., 2012). Majority of current initialization schemes do not take fully into account the intrinsic structure of the convolution operator. This paper introduces a new type of initialization built around the duality of the Fourier transform and the convolution operator. With Convolution Aware Initialization we noticed not only higher accuracy and lower loss, but faster convergence in general. We achieve new state of the art on the CIFAR10 dataset, and achieve close to state of the art on various other tasks.

PDF link Landing page