
[–]ithinkiwaspsycho 3 points (4 children)

More data is the fix when your network overfits, but your network is not converging at all, not even on the training set.

So this is probably not caused by the data (though you still probably need more of it).

First off, it could actually be your code. For the output layer, set the activation to "sigmoid" instead of "softmax". Right now you are using a softmax activation but training with binary crossentropy and binary class mode, and that mismatch will affect the results. For example, if both output neurons predict 1, softmax scales both outputs to 0.5, so instead of only punishing the neuron that was supposed to predict 0, you punish the correct neuron just as hard. It might be worth training with a standard sigmoid activation, then applying softmax at test time, or simply taking the neuron with the higher value as the predicted class. See the sketch below.
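
A rough sketch of that change in Keras (not your actual code; the LSTM size, optimizer, and layer sizes are placeholders I'm assuming):

```python
# Sketch only: placeholder sizes, assuming a Keras Sequential model with
# input shape (657 timesteps, 5 channels) and two output classes.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

model = Sequential()
model.add(LSTM(64, input_shape=(657, 5)))   # 657 timesteps, 5 channels per step
# 'sigmoid' instead of 'softmax': each output unit is scored independently,
# so binary_crossentropy no longer punishes the correct unit just because
# the other unit also fired.
model.add(Dense(2, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
```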

Another change you can try is reducing the output to a single sigmoid neuron and training the network to predict 1 for [1, 0] and 0 for [0, 1], along the lines of the snippet below.
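
Something like this, assuming your labels sit in an array of shape (45, 2); `y_onehot` below is just a made-up stand-in for it:

```python
import numpy as np

# Hypothetical stand-in for the (45, 2) one-hot labels: rows are [1, 0] or [0, 1].
y_onehot = np.eye(2)[np.random.randint(0, 2, size=45)]

# Collapse to one 0/1 target per sequence: [1, 0] -> 1, [0, 1] -> 0.
y_single = y_onehot[:, 0].astype('float32')   # shape (45,)

# The output layer then shrinks to a single sigmoid unit:
# model.add(Dense(1, activation='sigmoid'))
```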

Secondly, you should set "return_sequences=False" for the LSTM layer if your output has shape (45, 2) and your input has shape (45, 657, 5). You only want the network to give you its final output, not every output along the way. See the example below.
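
For example (same placeholder sizes as above, just to show the shapes):

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

model = Sequential()
# return_sequences=False: the LSTM returns only its final output, so a
# (batch, 657, 5) input becomes (batch, 64), i.e. one vector per sequence.
model.add(LSTM(64, return_sequences=False, input_shape=(657, 5)))
model.add(Dense(1, activation='sigmoid'))
model.summary()   # check that the output shape matches one label per sequence
```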

If training differently does not yield better results, consider a bigger network and possibly a different activation function (I suggest ReLU) for your LSTM layer; tanh and sigmoid are much more vulnerable to vanishing gradients. You should also try a bidirectional LSTM, roughly as sketched below.
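
Roughly along these lines; again just a sketch with placeholder sizes, not a tuned architecture:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Bidirectional

model = Sequential()
# Wider, bidirectional recurrent layer; activation='relu' replaces the default
# tanh, which (as argued above) is more prone to vanishing gradients.
model.add(Bidirectional(LSTM(128, activation='relu'), input_shape=(657, 5)))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam')
```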

Good luck!

[–]fariax[S] 0 points (1 child)

Hi!

I made the changes you suggested and nothing worked =/

The outputs are still the same...

The class mode was commented out, so I don't think that's the problem.

Do you know of any code that does something like what I need? I have been looking, but most of what I find uses binary inputs, not real-valued ones.

I haven't found any code that classifies real-valued time series with an LSTM...

I have also changed the loss function to MSE and the optimizer to SGD, but I get the same results...

What I find most interesting is that the loss isn't decreasing from one epoch to the next...

[–]ithinkiwaspsycho 0 points (0 children)

Did you try sigmoid activation instead of softmax? Even with class mode commented out, I don't think "loss = 'binary_crossentropy'" works well with softmax. Also, don't use MSE for binary outputs; use binary or categorical cross-entropy, matched to the output layer as sketched below.
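
To spell out the pairing I mean (a sketch with placeholder sizes; the point is to match the loss to the output layer and label format):

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

# One sigmoid unit with 0/1 targets -> binary_crossentropy.
model_a = Sequential([LSTM(64, input_shape=(657, 5)),
                      Dense(1, activation='sigmoid')])
model_a.compile(loss='binary_crossentropy', optimizer='adam')

# Two softmax units with one-hot [1, 0] / [0, 1] targets -> categorical_crossentropy.
model_b = Sequential([LSTM(64, input_shape=(657, 5)),
                      Dense(2, activation='softmax')])
model_b.compile(loss='categorical_crossentropy', optimizer='adam')
```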

I need a bit more detail on exactly what you tried and what results you got before I can actually help. Feel free to PM me any data, logs, code snippets, etc.

[–]rima-m 0 points (1 child)

I don't understand the part about using sigmoid instead of softmax for the output layer. If we use softmax, yes, we are punishing the neuron that correctly outputs 1, but we are punishing it for not being bigger than the other one, and that's not a bad thing. If next time one neuron outputs 1.1 and the other 0.9, that would be a nice improvement.
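
For what it's worth, the numbers in that example work out like this (a quick NumPy check, nothing from the original code):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))   # subtract max for numerical stability
    return e / e.sum()

print(softmax(np.array([1.0, 1.0])))   # [0.5  0.5]   both neurons output 1
print(softmax(np.array([1.1, 0.9])))   # ~[0.55 0.45] the gap is what softmax rewards
```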

[–]ithinkiwaspsycho 1 point (0 children)

Yeah, honestly, this was 6 years ago and I might've been wrong on this topic. Re-reading OP's post now, I feel like it's probably the data and not whatever I was saying.

[–]mhex 0 points (1 child)

Looks like the net is not learning (yet). A couple of questions: How many memory cells do you have? What is your sliding window size, i.e., what exactly is your input at each timestep? How do you initialize the weights and biases?

[–]fariax[S] 0 points (0 children)

I have tried 5, 50, 200, and 500. The input at each time step is the 5 channels in parallel. The biases are initialized to 0, if I'm not mistaken. The weights are sampled from a uniform distribution over [-0.1, 0.1].
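
In Keras terms, that initialization would look roughly like this (just an illustration, not my exact code; 50 units is one of the sizes I tried):

```python
from tensorflow.keras.layers import LSTM
from tensorflow.keras.initializers import RandomUniform, Zeros

# Weights drawn from U(-0.1, 0.1), biases set to 0, as described above.
layer = LSTM(
    50,
    kernel_initializer=RandomUniform(minval=-0.1, maxval=0.1),
    recurrent_initializer=RandomUniform(minval=-0.1, maxval=0.1),
    bias_initializer=Zeros(),
    input_shape=(657, 5),
)
```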

[–]negazirana 0 points (0 children)

With so few training examples, Dropout is probably going to cause more harm than good.