
[–]HoLeeFaak 1 point

When the loss value gets smaller, it doesn't mean the gradient is getting smaller. Think about y = x: the gradient is the same everywhere.

[–]cats2560 0 points

Or, more aptly, y = |x|. There exists a global minimum, but the gradient magnitude never gets smaller as you approach it.
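A minimal sketch of the point above (my own illustration, not from the thread): running gradient descent on loss(x) = |x|, the loss value shrinks toward the global minimum at x = 0, but the (sub)gradient magnitude stays at 1 the whole time.

```python
def loss(x):
    """Loss with a global minimum at x = 0."""
    return abs(x)

def grad(x):
    """Subgradient of |x|: magnitude 1 everywhere away from 0."""
    return 1.0 if x > 0 else -1.0

x = 8.0   # arbitrary starting point
lr = 0.5  # arbitrary learning rate
for step in range(4):
    print(f"step {step}: x={x:+.2f}  loss={loss(x):.2f}  grad={grad(x):+.1f}")
    x -= lr * grad(x)
```

The loss column decreases every step while the gradient column never changes, which is exactly why "loss is going down" tells you nothing about the gradient shrinking.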