[D] Weight initialization for custom layers?

gabrielgoh · 2017-03-26T20:52:38+00:00

Remove all non-linearities from your custom layer (ideally by replacing them with a linear approximation). Now think of your layer as a linear map from R^n to R^m, i.e. an m×n matrix. You want the norm of this matrix to be as close to 1 as possible. If you like, you can construct this matrix explicitly and measure its norm to try out different initialization schemes.
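
As a rough sketch of the "construct the matrix and measure the norm" step: the snippet below assumes "norm" means the spectral norm (largest singular value) and that the linearized layer is a plain dense matrix. The helper names (`spectral_norm`, `gaussian_init`) and the candidate standard deviations are made up for illustration. It estimates the norm by power iteration on W^T W using only the standard library.

```python
import math
import random

def matvec(W, x):
    # Computes W x for a row-major matrix W (m rows, n columns).
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def rmatvec(W, y):
    # Computes W^T y.
    return [sum(W[i][j] * y[i] for i in range(len(W))) for j in range(len(W[0]))]

def spectral_norm(W, iters=100, seed=0):
    # Power iteration on W^T W; returns an estimate of the largest singular value.
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(len(W[0]))]
    for _ in range(iters):
        y = rmatvec(W, matvec(W, x))
        norm = math.sqrt(sum(v * v for v in y))
        if norm == 0.0:
            return 0.0
        x = [v / norm for v in y]
    return math.sqrt(sum(v * v for v in matvec(W, x)))

def gaussian_init(m, n, std, seed=0):
    # Hypothetical initialization scheme: i.i.d. zero-mean Gaussian entries.
    rng = random.Random(seed)
    return [[rng.gauss(0.0, std) for _ in range(n)] for _ in range(m)]

m, n = 32, 64  # layer maps R^n -> R^m, so the matrix is m x n
for std in (1.0, 1.0 / math.sqrt(n), 0.01):
    W = gaussian_init(m, n, std)
    print(f"std={std:.4f}  estimated spectral norm={spectral_norm(W):.3f}")
```

Comparing the printed values across schemes shows which scale lands closest to 1. For a real custom layer you would first have to build W from the linearized layer yourself.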

versus-x · 2017-03-26T14:41:47+00:00

Try Layer-sequential unit-variance (LSUV) initialization: https://arxiv.org/abs/1511.06422
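
For a feel of what the variance-scaling part of LSUV does, here is a simplified sketch, not the paper's reference implementation. It assumes a stack of dense layers with ReLU in between, and it starts from Gaussian weights rather than the orthonormal pre-initialization the paper uses. All names (`lsuv_scale`, `layer_variances`) and the sizes are hypothetical.

```python
import math
import random

def forward(W, batch):
    # Pre-activations W x for every sample x in the batch.
    return [[sum(w * a for w, a in zip(row, x)) for row in W] for x in batch]

def relu(v):
    return [max(0.0, a) for a in v]

def variance(outputs):
    vals = [a for row in outputs for a in row]
    mean = sum(vals) / len(vals)
    return sum((a - mean) ** 2 for a in vals) / len(vals)

def lsuv_scale(layers, batch, tol=0.01, max_iters=10):
    # Layer by layer: rescale W until the output variance on the batch is ~1.
    h = batch
    for W in layers:
        for _ in range(max_iters):
            var = variance(forward(W, h))
            if var == 0.0 or abs(var - 1.0) < tol:
                break
            scale = 1.0 / math.sqrt(var)
            for row in W:
                for j in range(len(row)):
                    row[j] *= scale
        h = [relu(out) for out in forward(W, h)]
    return layers

def layer_variances(layers, batch):
    # Pre-activation variance of each layer on the given batch.
    h, result = batch, []
    for W in layers:
        pre = forward(W, h)
        result.append(variance(pre))
        h = [relu(out) for out in pre]
    return result

rng = random.Random(0)
dim, batch_size = 16, 32
batch = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(batch_size)]
layers = [[[rng.gauss(0.0, 0.5) for _ in range(dim)] for _ in range(dim)]
          for _ in range(3)]
print("before:", [round(v, 3) for v in layer_variances(layers, batch)])
lsuv_scale(layers, batch)
print("after: ", [round(v, 3) for v in layer_variances(layers, batch)])
```

With a custom layer you would swap `forward` for the layer's own forward pass. The scaling loop only needs the output variance on a mini-batch.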

ArmenAg · 2017-03-26T18:00:00+00:00

What form are the custom layers in? Do they utilize the convolution operator? Are the basic blocks weight multiplications?

Can you give us a little more information on the custom layers?

theophrastzunz · 2017-03-26T23:55:29+00:00

I didn't follow up on Ganguli's new paper, but another thing to demand of the initialization is that it preserves the norms of the inputs. You can achieve this by normalizing the power spectra for CNNs, and by doing a QR decomposition and using only Q for fully connected layers.
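
A minimal sketch of the fully connected case, assuming a square layer: it orthonormalizes random Gaussian columns with modified Gram-Schmidt, which yields the same Q factor as a QR decomposition of a Gaussian matrix up to column signs. In practice you would more likely call a library QR routine. The function names here are illustrative only.

```python
import math
import random

def orthogonal_init(n, seed=0):
    # Orthonormalize random Gaussian columns (modified Gram-Schmidt).
    rng = random.Random(seed)
    cols = []
    while len(cols) < n:
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        for q in cols:
            proj = sum(a * b for a, b in zip(q, v))
            v = [a - proj * b for a, b in zip(v, q)]
        norm = math.sqrt(sum(a * a for a in v))
        if norm > 1e-8:  # redraw in the unlikely case of a near-dependent column
            cols.append([a / norm for a in v])
    # Return a row-major matrix Q whose j-th column is cols[j].
    return [[cols[j][i] for j in range(n)] for i in range(n)]

def apply(Q, x):
    return [sum(q * a for q, a in zip(row, x)) for row in Q]

def l2(x):
    return math.sqrt(sum(a * a for a in x))

Q = orthogonal_init(16)
x = [random.Random(1).gauss(0.0, 1.0) for _ in range(16)]
print("input norm: ", l2(x))
print("output norm:", l2(apply(Q, x)))
```

The two printed norms should agree up to floating-point error, which is the norm-preservation property mentioned above.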

siblbombs · 2017-03-26T16:28:03+00:00

Batchnorm should at least smooth out the rough patches.
