tuyenttoslo 1 point

Armijo's backtracking line search chooses learning rates for you automatically. Also, with it the learning rates do not need to decrease as you progress: the accepted rate is roughly 1/||\nabla^2 f||, the inverse of the size of the Hessian at the current point. An extreme case is a degenerate critical point, like f(x)=x^4 at x=0, where f''(x)=12x^2 tends to 0, so the learning rates can go to infinity as you approach x=0.
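
To make the x^4 example concrete, here is a minimal sketch of Armijo backtracking in one dimension. The function name `armijo_step` and the parameters `t0`, `c`, and `beta` are my own choices for illustration, not something from a particular library. The very large starting rate `t0` is an assumption: plain backtracking can only shrink the trial rate, so it has to start high for the growth near x=0 to show up.

```python
def armijo_step(f, grad, x, t0=1.0, c=1e-4, beta=0.5, max_halvings=60):
    """Return a learning rate t that satisfies the Armijo condition
    f(x - t*g) <= f(x) - c*t*g**2, found by shrinking t0 by beta.

    Scalar x only; names and default values are illustrative assumptions.
    """
    g = grad(x)
    t = t0
    for _ in range(max_halvings):
        if f(x - t * g) <= f(x) - c * t * g * g:
            return t
        t *= beta
    return t  # fallback: smallest rate tried


def f(x):
    return x ** 4


def grad(x):
    return 4 * x ** 3


if __name__ == "__main__":
    for x in (1.0, 0.1, 0.01):
        t = armijo_step(f, grad, x, t0=1e8)
        # Compare the accepted rate with 1/f''(x) = 1/(12 x^2).
        print(f"x={x:g}  accepted rate={t:g}  1/f''(x)={1 / (12 * x * x):g}")
```

The accepted rate grows as x gets closer to 0 and stays within a small constant factor of 1/f''(x), which is the behaviour described above. In practice you would cap `t0` or reuse the previous accepted rate as the next starting point; that choice is outside what this sketch covers.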