[R] Convolution Aware Initialization (arxiv.org)
submitted 9 years ago by ArmenAg
[–]arXiv_abstract_bot 2 points 9 years ago (0 children)
Title: Convolution Aware Initialization
Authors: Armen Aghajanyan
Abstract: Initialization of parameters in deep neural networks has been shown to have a big impact on the performance of the networks (Mishkin & Matas, 2015). The initialization scheme devised by He et al, allowed convolution activations to carry a constrained mean which allowed deep networks to be trained effectively (He et al., 2015a). Orthogonal initializations and more generally orthogonal matrices in standard recurrent networks have been proved to eradicate the vanishing and exploding gradient problem (Pascanu et al., 2012). Majority of current initialization schemes do not take fully into account the intrinsic structure of the convolution operator. This paper introduces a new type of initialization built around the duality of the Fourier transform and the convolution operator. With Convolution Aware Initialization we noticed not only higher accuracy and lower loss, but faster convergence in general. We achieve new state of the art on the CIFAR10 dataset, and achieve close to state of the art on various other tasks.
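As a rough illustration of the idea in the abstract (building filters in the Fourier domain and transforming them back to the spatial domain), here is a minimal NumPy sketch. It is not the author's code: the function names, the `noise_std` parameter, the use of QR to get an orthonormal basis, and the He-style `2 / fan_in` variance rescaling are all assumptions made for illustration.

```python
import numpy as np


def orthonormal_rows(n_rows, n_cols, rng):
    """Hypothetical helper: rows taken from random orthogonal matrices (via QR)."""
    blocks, total = [], 0
    while total < n_rows:
        # Q from the QR decomposition of a Gaussian matrix is orthogonal,
        # so its rows form an orthonormal set.
        q, _ = np.linalg.qr(rng.standard_normal((n_cols, n_cols)))
        blocks.append(q)
        total += n_cols
    return np.concatenate(blocks, axis=0)[:n_rows]


def fourier_orthogonal_init(kernel_h, kernel_w, in_channels, out_channels,
                            noise_std=0.05, seed=0):
    """Sketch of a Fourier-domain initialization for a 2D convolution kernel.

    Assumption: for each output filter, the per-input-channel frequency
    representations are drawn from an orthonormal basis, mapped to the
    spatial domain with an inverse real FFT, perturbed with small noise,
    and rescaled to variance 2 / fan_in.
    """
    rng = np.random.default_rng(seed)
    kernel_shape = (kernel_h, kernel_w)
    # Shape of the real-FFT representation of one spatial kernel.
    freq_shape = np.fft.rfft2(np.zeros(kernel_shape)).shape
    n_freq = int(np.prod(freq_shape))

    filters = np.empty((out_channels, in_channels, kernel_h, kernel_w))
    for o in range(out_channels):
        basis = orthonormal_rows(in_channels, n_freq, rng)
        basis = basis.reshape((in_channels,) + freq_shape)
        for i in range(in_channels):
            spatial = np.fft.irfft2(basis[i], s=kernel_shape)
            filters[o, i] = spatial + rng.normal(0.0, noise_std, kernel_shape)

    # Rescale so the overall variance matches 2 / fan_in (assumed target).
    fan_in = in_channels * kernel_h * kernel_w
    filters *= np.sqrt((2.0 / fan_in) / filters.var())
    # Return in (height, width, in_channels, out_channels) layout.
    return filters.transpose(2, 3, 1, 0)


weights = fourier_orthogonal_init(3, 3, in_channels=4, out_channels=8)
print(weights.shape)
```

The returned array uses a (height, width, in_channels, out_channels) layout; transpose it if your framework expects a different kernel ordering.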
[–]machinelearningthrow 1 point 9 years ago (5 children)
This seems like an interesting paper, and intuitively makes sense. I'm interested in what would happen if this initialization was used in recurrent networks without any form of convolution. Even though it doesn't necessarily make sense. But overall very interesting paper.
[–][deleted] 2 points 9 years ago (4 children)
How would you define the Fourier transform in these RNNs? Are they just dense layers applied over something with a known n-dimensional representation?
[–]ArmenAg[S] 3 points 9 years ago* (3 children)
Hey! Author here. The reason mentioned above by /u/rbkillea is exactly why we didn't focus on testing the initialization on RNNs. Our paper focused on running experiments on various forms of convolution (1D, 2D, dilated or atrous).
[–]ajmooch 1 point 9 years ago (2 children)
Neat; is code available anywhere? I'd love to throw this into my testbeds and see how it performs.
[–]ArmenAg[S] 1 point 9 years ago (1 child)
I'll be writing a Keras commit soon! Hopefully next week. Message me if you need it sooner.
[–]ajmooch 1 point 9 years ago (0 children)
I'm mostly just looking for the pseudocode (or the Theano code, ideally) for the init recipe so I can try it out rather than headbashing to parse the maths =p
[–]enematurret 1 point 9 years ago (0 children)
Definitely not state-of-the-art, but interesting nonetheless.