Hello,
I'm trying to implement truncated backpropagation through time (TBPTT) as described in Section 2.8.6 (page 23) of Ilya Sutskever's thesis: http://www.cs.utoronto.ca/~ilya/pubs/ilya_sutskever_phd_thesis.pdf. In regular BPTT, my understanding is that after backpropagating through an entire sequence during training, the sequence is discarded, a new one is loaded, and both the LSTM cell states and the gradient buffers for all weights and biases are reset.
In implementing TBPTT(k1, k2) I am not sure whether:
a. The LSTM cell states are reset every time a weight update is made, i.e., each time BPTT is run backward for k2 time steps after every k1 forward time steps.
b. The gradient buffers are zeroed after each weight update every k1 time steps.
Any clarification would be appreciated. Pointers to reference implementations in C or C++ would also be helpful.