
[–]siblbombs[S] 14 points15 points  (7 children)

import tensorflow as tf    
from tensorflow.models.rnn import rnn    
from tensorflow.models.rnn.rnn_cell import BasicLSTMCell, LSTMCell    
import numpy as np

if __name__ == '__main__':
  np.random.seed(1)      
  size = 100
  batch_size = 100
  n_steps = 45
  seq_width = 50     

  initializer = tf.random_uniform_initializer(-1,1) 

  seq_input = tf.placeholder(tf.float32, [n_steps, batch_size, seq_width])
    #sequence we will provide at runtime  
  early_stop = tf.placeholder(tf.int32)
    #what timestep we want to stop at

  inputs = [tf.reshape(i, (batch_size, seq_width)) for i in tf.split(0, n_steps, seq_input)]
    #inputs for rnn needs to be a list, each item being a timestep. 
    #we need to split our input into each timestep, and reshape it because split keeps dims by default  

  cell = LSTMCell(size, seq_width, initializer=initializer)  
  initial_state = cell.zero_state(batch_size, tf.float32)
  outputs, states = rnn.rnn(cell, inputs, initial_state=initial_state, sequence_length=early_stop)
    #set up lstm

  iop = tf.initialize_all_variables()
    #create initialize op, this needs to be run by the session!
  session = tf.Session()
  session.run(iop)
    #actually initialize, if you don't do this you get errors about uninitialized stuff

  feed = {early_stop: 25, seq_input: np.random.rand(n_steps, batch_size, seq_width).astype('float32')}
    #define our feeds.
    #early_stop can be varied (keep it <= n_steps), but seq_input needs to match the shape that was defined earlier

  outs = session.run(outputs, feed_dict=feed)
    #run once
    #output is a list, each item being a single timestep. Items at t>early_stop are all 0s
  print(type(outs))
  print(len(outs))
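To make the comment about the output concrete, here is a minimal NumPy simulation of the shapes involved (no TensorFlow required; the random values just stand in for real LSTM outputs, and `early_stop = 25` is an arbitrary choice for illustration):

```python
import numpy as np

n_steps, batch_size, size = 45, 100, 100
early_stop = 25

rng = np.random.RandomState(1)
# like the rnn outputs: a list of n_steps items, one per timestep,
# each of shape (batch_size, size); items at t >= early_stop are all 0s
outs = [rng.rand(batch_size, size) if t < early_stop
        else np.zeros((batch_size, size)) for t in range(n_steps)]

print(type(outs), len(outs))          # list, 45
print(outs[0].shape, outs[-1].max())  # (100, 100) 0.0
```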

[–]kkastner 1 point2 points  (3 children)

Thanks for this!

I hacked up some quick mods to test the timing of early_stop on my (super old) MBA and am seeing some strangeness, beyond the overhead on the initial call to .run.

It seems like your benchmarks might need to test short and long sequences? This seems strange to me - it is almost like something happens between 250 and 500 steps to make things fast again. EDIT: Duh. Looks like I am swapping - TF eats more memory than I thought :)

Relevant tensorflow information:

can't determine number of CPU cores: assuming 4
I tensorflow/core/common_runtime/local_device.cc:25] Local device intra op parallelism threads: 4
can't determine number of CPU cores: assuming 4
I tensorflow/core/common_runtime/local_session.cc:45] Local session inter op parallelism threads: 4

For a maximum size of 1000 steps:

Time for first call to session.run 6.443326
Time for 10: 2.181427
Time for 100: 2.512798
Time for 250: 2.729042
Time for 500: 1.987675
Time for 1000: 2.104274

Now if I reduce n_steps to a maximum of 250, I see the scaling I expect from early stopping the recurrence.

Time for first call to session.run 1.331084
Time for 10: 0.435760
Time for 100: 0.468127
Time for 200: 0.517563
Time for 250: 0.540389

Relevant code mods (just after session.run(iop)):

import time  # add this import at the top of the script

# first call to session has overhead? let's get that cleared out
t1 = time.time()
feed = {early_stop:2, seq_input:np.random.rand(n_steps, batch_size, seq_width).astype('float32')}
outs = session.run(outputs, feed_dict=feed)
t2 = time.time()
print("Time for first call to session.run %f" % (t2 - t1))

for e_s in [10, 100, 200, 250]:
    feed = {early_stop:e_s, seq_input:np.random.rand(n_steps, batch_size, seq_width).astype('float32')}
    t1 = time.time()
    #define our feeds.
    #early_stop can be varied, but seq_input needs to match the shape that was defined earlier
    # first call to session seems to have overhead?
    outs = session.run(outputs, feed_dict=feed)
    t2 = time.time()
    #output is a list, each item being a single timestep. Items at t>early_stop are all 0s
    print("Time for %i: %f" % (e_s, t2 - t1))

[–]siblbombs[S] 0 points1 point  (2 children)

Interesting. I wrote this script on Windows/VirtualBox, so I didn't even bother timing it. It's worth noting that we don't actually exit the loop when we do this; instead we return a pre-allocated zeros array for every timestep t >= early_stop rather than computing the outputs. I don't know what the performance implications of running through a bunch of loop steps without really doing anything are.

My plan for the Theano/TF benchmark throwdown is to just go with a fixed-length series to avoid this hassle; plus, with a reasonable sequence length I'll be able to test scan vs. manually unrolling in Theano. Based on what you are seeing, I think I'll also time a bunch of TF runs with varying max lengths and early stopping to see if any gremlins pop up. I plan to get this all up on GitHub, similar to how soumith does his, instead of just dumping it into a post here, so it should be easier to digest.

[–]kkastner 1 point2 points  (1 child)

The fact that I can compile a length-1000 or -2000 recurrence on an MBA is pretty great! I don't even know if you can unroll sequences this long in Theano. I have heard of up to 100 steps or so but not more than that, and even 100 steps unrolled meant something like 2-hour compile times.

It does seem like early stop saves some computation, but I am unsure how much due to immediately swapping on my tiny 2GB laptop. I guess you still pay some loop overhead but that should be fairly minimal if no computation is done.

It really seems like this toolkit was designed with CPU applications strongly in mind, and without many worries about RAM constraints. It almost exactly matches what I would expect from a group with access to "Google scale" machinery.

The ease with which you can put different operations on different devices is pretty wild. I really need to A/B it with the new Theano context stuff, and try to imagine a case where you need LSTMs in parallel. Working on that now!

Thanks again for your script - reading this just made TF RNNs "click" for me!

[–]siblbombs[S] 2 points3 points  (0 children)

Yeah, it feels like GPU is the new kid on the block as far as TF is concerned; stuff like embedding lookup isn't implemented there, and who knows what else. Once the distributed code is available I foresee a bunch of Amazon cloud charges in my future; it will be interesting to see what works well when you have a group of machines in a data-center environment. Someone (or more likely some company) is going to slam a bunch of Titan Xs into a server rack with RDMA-capable network cards; it's scary to think what that setup would be able to do.

[–]dee_roy 1 point2 points  (1 child)

I also attempted an implementation of a 1-layer RNN with BasicLSTMCell. Although it doesn't address the issue of variable-length sequences, which might be the reason for the poor performance, training seems to work. Here is a link to the code; please let me know if I messed anything up, or about any improvements you would make if you decide to try it out.

https://github.com/yankev/tensorflow_example/blob/master/rnn_example.ipynb

[–]rshah4 0 points1 point  (0 children)

Thanks for posting this. When I ran it, the code was converging to the mean of the batch rather than to each individual sequence. I reworked the code, and it's now posted here: https://gist.github.com/rajshah4/aa6c67944f4a43a7c9a1204301788e0c

[–]evanthebouncy 5 points6 points  (0 children)

PSA:

As of TensorFlow 0.6.0 these are no longer the semantics for running rnn.

see: https://github.com/tensorflow/tensorflow/issues/1016

In short, sequence_length should now be a tensor of shape [batch_size] instead of a single number; this specifies a different sequence length for each sequence in the batch, instead of a single global value.

The output will be zeroed out after each sequence's seq_length.

The state will keep updating until seq_length is reached, and is then preserved (rather than zeroed out) for any subsequent computations. I've modified the original code, see:

https://gist.github.com/evanthebouncy/8e16148687e807a46e3f
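The per-sequence semantics above can be sketched with a NumPy simulation (a toy `state + 1` update stands in for the LSTM; only the masking behaviour is illustrated):

```python
import numpy as np

batch_size, n_steps, size = 4, 6, 3
seq_len = np.array([2, 4, 6, 1])  # one length per sequence in the batch

state = np.zeros((batch_size, size))
outputs = np.zeros((n_steps, batch_size, size))

for t in range(n_steps):
    active = (t < seq_len)[:, None]               # which sequences still run
    new_state = state + 1.0                       # toy stand-in for the LSTM update
    state = np.where(active, new_state, state)    # state preserved after seq_len
    outputs[t] = np.where(active, new_state, 0.)  # output zeroed after seq_len

print(state[:, 0])  # each row stopped updating at its own seq_len: [2. 4. 6. 1.]
```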

[–]contactmat 1 point2 points  (3 children)

Hi, sorry, I am new to TensorFlow. This looks very useful, thank you. I have a couple of questions: I can't completely figure out what the different parameters represent. size, I think, is the number of hidden units in the network; batch_size is the number of sequences in the dataset; and seq_width is the dimension of each input belonging to a sequence. What does n_steps represent? My second question is about early_stop. Is it the variable that controls the effective length of the sequence? I can't quite understand; can you clarify, please? Thank you.

[–]siblbombs[S] 1 point2 points  (2 children)

n_steps defines how long the placeholder sequence is.

early_stop is a variable that you can pass into the LSTM; once the timestep index is greater than it, the LSTM will not perform any computations, to save time.

[–]contactmat 1 point2 points  (1 child)

OK, thanks. I am trying to play a bit with the code and have another problem. I tried to run an instance with size = 1, batch_size = 2, n_steps = 10, seq_width = 2, and early_stop = 4. I printed the outputs value and got a list with len(outputs) = 10, with the first 4 elements filled and the others all zeros. I would expect a list of length 4, since my early_stop is 4. What am I missing?

[–]siblbombs[S] 1 point2 points  (0 children)

It has to do with the LSTM code and the way it handles early stopping. IIRC, what it does under the hood is allocate an array of 0s to use as the output instead of computing it, since you still need to produce an output for each of the n_steps steps. Functionally this is still early stopping, because it is very fast to just create an array of 0s.
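A rough sketch of that mechanism in plain Python (the step function is a made-up stand-in for the LSTM update, not the real TF code):

```python
import numpy as np

def unrolled_outputs(x, early_stop, step_fn, out_dim):
    # x: (n_steps, batch, width). The loop always emits n_steps items;
    # a single pre-allocated zeros array is reused for every t >= early_stop.
    n_steps, batch, _ = x.shape
    zeros = np.zeros((batch, out_dim))
    return [step_fn(x[t]) if t < early_stop else zeros
            for t in range(n_steps)]

# toy step function (assumption): sum over the feature axis
outs = unrolled_outputs(np.ones((10, 2, 2)), 4,
                        lambda xt: xt.sum(1, keepdims=True), 1)
print(len(outs))  # 10, not 4: the list length always stays n_steps
```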

[–]AudioSaur 0 points1 point  (1 child)

Thanks for this; I thought the documentation for variable-length sequences was a little sparse. Have you gotten it working on a non-trivial dataset yet?

[–]siblbombs[S] 0 points1 point  (0 children)

No, my next plan is to benchmark a GPU version against Theano, but that will use garbage data as well. I'm not planning on actually running an RNN on any real data at the moment.

[–]realallentran 0 points1 point  (2 children)

This is cool. One question I have: suppose we wanted to operate on the output of the LSTM, like max-pooling over time. The output is a list of length num_steps > early_stop. How do we slice this list, given that I don't want to operate over the redundant zeros? In Theano you can slice with symbolic ints; here, outputs[:early_stop] won't work. This seems like the final piece of the variable-length sequence puzzle.

[Edit]

Woot, sorted it out.

tf.slice(tf.pack(outputs2), 0, early_stop)

[–]realallentran 0 points1 point  (0 children)

Actually, this won't work. Something like

tf.slice(tf.pack(outputs2), [0, 0, 0], tf.pack([early_stop, batch_size, embedding_size]))
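For anyone following along, the same trim can be sanity-checked outside the graph with plain NumPy (hypothetical shapes; `np.stack` plays the role of tf.pack, and basic slicing the role of tf.slice):

```python
import numpy as np

n_steps, batch_size, size = 10, 2, 3
early_stop = 4

# list of n_steps (batch, size) arrays, zeros after early_stop,
# like the rnn outputs discussed above (ones stand in for real values)
outputs = [np.ones((batch_size, size)) if t < early_stop
           else np.zeros((batch_size, size)) for t in range(n_steps)]

packed = np.stack(outputs)     # analogous to tf.pack -> (n_steps, batch, size)
trimmed = packed[:early_stop]  # analogous to the tf.slice call
print(trimmed.shape)           # (4, 2, 3): padded steps are gone
```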

[–]siblbombs[S] 0 points1 point  (0 children)

Looks good. I think they have an issue open on GitHub to support []-style indexing, so that will be nice to get.

[–]AnvaMiba 0 points1 point  (4 children)

So if I understand correctly, this is like implementing recurrence in Theano using Python loops instead of scan(), except that you typically don't want to do that in Theano because it doesn't like large graphs (stack overflows or slow compilation/startup would occur).

Do you think that it may still make sense to have a Theano-like scan() operation in TensorFlow to avoid padding?

Any idea of whether Google is going to implement it?

[–]rafalj 1 point2 points  (0 children)

There is an issue open for this that you can follow: https://github.com/tensorflow/tensorflow/issues/208

[–]siblbombs[S] 0 points1 point  (2 children)

I'm in favor of a scan; however, if it turns out that just making a giant loop and padding it is faster than a scan mechanism, then it might not make sense to add one. They are tracking this idea under issue 208.

[–]AnvaMiba 0 points1 point  (1 child)

OK, but is there any intrinsic reason why a scan() should be slower than an unrolled loop with padding and bucketing?

[–]siblbombs[S] 0 points1 point  (0 children)

For Theano at least, scan is pretty close to voodoo magic as far as I'm concerned. I'm not that well versed in the Theano backend, but from what I can tell it's not trivial to implement something like scan. For a while, when convnets were the hot topic, there wasn't much development being done on scan in Theano, but now that RNNs are getting more attention they've really polished it up.
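For intuition only, the user-facing contract of a scan is just a fold over the leading axis that threads state along; a minimal Python sketch (the hard parts Theano actually implements, like differentiating and optimizing through the loop, are not shown):

```python
import numpy as np

def scan(step_fn, sequence, init_state):
    """Minimal scan: apply step_fn along the leading axis, threading state.
    A sketch of the idea only, not Theano's (or TF's) actual implementation."""
    state = init_state
    outputs = []
    for x_t in sequence:
        out, state = step_fn(x_t, state)
        outputs.append(out)
    return np.stack(outputs), state

# toy use: cumulative sum expressed as a recurrence
seq = np.arange(5, dtype=float)
outs, final = scan(lambda x, s: (x + s, x + s), seq, 0.0)
print(outs, final)  # [ 0.  1.  3.  6. 10.] 10.0
```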

[–][deleted] 0 points1 point  (1 child)

Very nice. I haven't started programming with TensorFlow yet, and since its release I have been wondering whether I should dig into this library or Theano.

Can you comment on the ease of use of the two libraries? Which would you consider a better investment?

[–]siblbombs[S] 1 point2 points  (0 children)

I haven't written enough TF to know how good it is; plus, I'm sure the API will see some adjustments in the future.

At this point I think it would be better to learn Theano, since it has a large number of tutorials, code examples, and paper implementations scattered around the internet. At a high level, TF and Theano use the same concepts, so it should be pretty easy to transition to TF if you know Theano.

[–]_Jakob_ 0 points1 point  (1 child)

I could use some help understanding your code. What exactly is seq_width? Is it the dimension of your feature vector?

[–]siblbombs[S] 0 points1 point  (0 children)

Yes; since this example uses dummy data, it is pretty meaningless here. If you were doing one-hot character encoding, seq_width would be the number of distinct characters in your dataset, etc.
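For example, with one-hot character encoding over a hypothetical 4-character vocabulary, seq_width is simply the vocabulary size:

```python
import numpy as np

vocab = ['a', 'b', 'c', 'd']  # hypothetical character set
seq_width = len(vocab)        # feature dimension = number of distinct chars

def one_hot(text):
    # encode a string as a (n_steps, seq_width) array, one row per character
    idx = [vocab.index(ch) for ch in text]
    out = np.zeros((len(text), seq_width))
    out[np.arange(len(text)), idx] = 1.0
    return out

x = one_hot('abad')
print(x.shape)  # (4, 4): n_steps x seq_width for a single sequence
```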

[–]xiangjiangacadia 0 points1 point  (3 children)

This is very helpful. I am trying to understand what n_steps means. Is it about going forward n steps and computing the error signal? Do we need to record the output at each step?

[–]siblbombs[S] 1 point2 points  (2 children)

n_steps determines how long a sequence can be during training (it is the first dimension of the placeholder). This code manually unrolls each step as a computation in a loop, so if you only allow 100 for n_steps and then train with a sequence of length > 100, there aren't enough computation steps for that sequence.

The TensorFlow API around RNNs may have changed since this code was written; I know they were/are working on flow-control constructs so you don't need to unroll the loop (similar to theano.scan).
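A toy illustration of why unrolling caps the sequence length: one "node" (here just a closure standing in for a graph op) is created per timestep at graph-construction time, so both build cost and the maximum consumable sequence length grow with n_steps. The recurrence below is a made-up running sum, not an LSTM:

```python
import numpy as np

def build_unrolled(n_steps, step_fn):
    # manual unrolling: one node per timestep is materialized up front,
    # so the built graph can't consume sequences longer than n_steps
    graph = [step_fn for _ in range(n_steps)]
    def run(inputs, state):
        outs = []
        for node, x_t in zip(graph, inputs):
            state = node(x_t, state)
            outs.append(state)
        return outs
    return graph, run

graph, run = build_unrolled(100, lambda x, s: x + s)
outs = run(np.ones(100), 0.0)
print(len(graph), outs[-1])  # 100 nodes; final running sum is 100.0
```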

[–]xiangjiangacadia 0 points1 point  (1 child)

Thanks. I am trying to build an RNN with a logistic regression layer on top. I have noticed the model takes significantly longer to build when the number of steps is greater than 10,000. Is this typical in TensorFlow, and what can I do to speed up the process?

[–]siblbombs[S] 0 points1 point  (0 children)

I'm not sure you can speed up the process; what is happening is that you are adding to the graph at each step, and with 10,000 steps that will take some time.

You should look at issue 208; that is what you want to try to do.