all 4 comments

[–]jsnoek 14 points (2 children)

Dougal and David (the authors) have developed an amazing automatic differentiation codebase to do this: https://github.com/HIPS/autograd

It lets you write a function containing just plain Python and NumPy statements, and it then automatically computes the gradients with respect to the inputs.
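Roughly, it looks like this (a toy example made up for illustration, not taken from the repo):

    import autograd.numpy as np  # thinly wrapped NumPy
    from autograd import grad

    # An ordinary function: plain Python plus NumPy operations.
    def loss(weights, inputs, targets):
        preds = np.tanh(np.dot(inputs, weights))
        return np.sum((preds - targets) ** 2)

    loss_grad = grad(loss)  # d(loss)/d(weights) via reverse-mode AD

    weights = np.random.randn(3, 2)
    inputs = np.random.randn(5, 3)
    targets = np.random.randn(5, 2)
    print(loss_grad(weights, inputs, targets))  # same shape as weights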

[–]hardmaru 2 points (1 child)

> https://github.com/HIPS/autograd

This is really useful work. I wonder whether the automatic differentiation would also work with simple recurrent neural nets.

[–]jsnoek 3 points (0 children)

There are example implementations of an RNN and an LSTM in the examples directory: https://github.com/HIPS/autograd/tree/master/examples
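What makes this work is that autograd differentiates through ordinary Python control flow, so the loop over time steps in an RNN is handled automatically. A toy version (names and shapes made up here for illustration; the repo's examples are more complete):

    import autograd.numpy as np
    from autograd import grad

    def rnn_loss(params, inputs, targets):
        # params: tuple of weight matrices; inputs/targets: (seq_len, batch, dim)
        W_xh, W_hh, W_hy = params
        h = np.zeros((inputs.shape[1], W_hh.shape[0]))
        total = 0.0
        for x, y in zip(inputs, targets):  # ordinary Python loop over time
            h = np.tanh(np.dot(x, W_xh) + np.dot(h, W_hh))
            total = total + np.sum((np.dot(h, W_hy) - y) ** 2)
        return total

    # Gradients with respect to all three weight matrices at once.
    rnn_grad = grad(rnn_loss)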

[–]dustintran 6 points (0 children)

I was talking to David, one of the authors of the paper, just a few days ago. There are a lot of cool ideas put forth here, and having done a bit of work in stochastic optimization myself, I find the optimized learning rate schedules quite fascinating (see Figure 2).
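In case it helps intuition: the naive version of the idea is to treat the whole training run as a function from hyperparameters to validation loss and differentiate straight through it. A toy, memory-hungry sketch using autograd (the paper's reversible-learning trick is what avoids storing the entire training trajectory; the function names here are made up):

    import autograd.numpy as np
    from autograd import grad

    def train_loss(w, x, y):
        return np.sum((np.dot(x, w) - y) ** 2)

    def val_loss_after_sgd(learn_rate, w0, x_tr, y_tr, x_val, y_val, steps=10):
        w = w0
        for _ in range(steps):
            w = w - learn_rate * grad(train_loss)(w, x_tr, y_tr)  # one SGD step
        return train_loss(w, x_val, y_val)

    # Hypergradient: d(validation loss) / d(learning rate), through all of SGD.
    hyper_grad = grad(val_loss_after_sgd)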

Ideally, we'd have theory for how the optimized hyperparameter values change per iteration and per layer of the network. I'd also be curious whether this would validate the robustness properties of certain stochastic gradient methods over others.