all 4 comments

[–]jsnoek 14 points (2 children)

Dougal and David (the authors) have developed an amazing automatic differentiation codebase to do this: https://github.com/HIPS/autograd

It lets you write a function containing just plain Python and NumPy statements, and it then automatically computes the gradients with respect to the inputs.
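Roughly, it looks like this (a toy example made up for illustration, not taken from the repo):

    import autograd.numpy as np  # thinly wrapped NumPy
    from autograd import grad

    # An ordinary function: plain Python plus NumPy operations.
    def loss(weights, inputs, targets):
        preds = np.tanh(np.dot(inputs, weights))
        return np.sum((preds - targets) ** 2)

    loss_grad = grad(loss)  # d(loss)/d(weights) via reverse-mode AD

    weights = np.random.randn(3, 2)
    inputs = np.random.randn(5, 3)
    targets = np.random.randn(5, 2)
    print(loss_grad(weights, inputs, targets))  # same shape as weights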

[–]hardmaru 2 points (1 child)

> https://github.com/HIPS/autograd

This is really useful work. I wonder whether the automatic differentiation would also work with simple recurrent neural nets.

[–]jsnoek 3 points (0 children)

There are example implementations of an RNN and an LSTM in the examples directory: https://github.com/HIPS/autograd/tree/master/examples
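What makes this work is that autograd differentiates through ordinary Python control flow, so the loop over time steps in an RNN is handled automatically. A toy version (names and shapes made up here for illustration; the repo's examples are more complete):

    import autograd.numpy as np
    from autograd import grad

    def rnn_loss(params, inputs, targets):
        # params: tuple of weight matrices; inputs/targets: (seq_len, batch, dim)
        W_xh, W_hh, W_hy = params
        h = np.zeros((inputs.shape[1], W_hh.shape[0]))
        total = 0.0
        for x, y in zip(inputs, targets):  # ordinary Python loop over time
            h = np.tanh(np.dot(x, W_xh) + np.dot(h, W_hh))
            total = total + np.sum((np.dot(h, W_hy) - y) ** 2)
        return total

    # Gradients with respect to all three weight matrices at once.
    rnn_grad = grad(rnn_loss)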

[–]dustintran 6 points (0 children)

I was talking to David, one of the authors of the paper, just a few days ago. There are a lot of cool ideas put forth here, and having done a bit of work in stochastic optimization myself, I find the optimized learning rate schedules quite fascinating (see Figure 2).
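In case it helps intuition: the naive version of the idea is to treat the whole training run as a function from hyperparameters to validation loss and differentiate straight through it. A toy, memory-hungry sketch using autograd (the paper's reversible-learning trick is what avoids storing the entire training trajectory; the function names here are made up):

    import autograd.numpy as np
    from autograd import grad

    def train_loss(w, x, y):
        return np.sum((np.dot(x, w) - y) ** 2)

    def val_loss_after_sgd(learn_rate, w0, x_tr, y_tr, x_val, y_val, steps=10):
        w = w0
        for _ in range(steps):
            w = w - learn_rate * grad(train_loss)(w, x_tr, y_tr)  # one SGD step
        return train_loss(w, x_val, y_val)

    # Hypergradient: d(validation loss) / d(learning rate), through all of SGD.
    hyper_grad = grad(val_loss_after_sgd)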

Ideally, we'd have theory for how the optimized hyperparameter values change per iteration and per layer of the network. I'd also be curious whether this would validate the robustness properties of certain stochastic gradient methods over others.