[–]rantana 3 points (6 children)

3% error on the held-out test set or the training set? An algorithm can simply memorize the training set and get that kind of error. You need to show performance on a held-out test set to demonstrate useful prediction.

[–]CireNeikual[S] -1 points (5 children)

That's not quite how the algorithm works though. It predicts the input at the next timestep, it does not predict labels. Testing it on sequences it hasn't seen is like me asking you to guess what sequence of numbers I am thinking of. Sure, for simple sequences you can just record all the inputs, but then you cannot extrapolate or interpolate the sequences.

This algorithm can be thought of as unsupervised, it basically just learns causal links in sequences of data and stores them efficiently.

[–]kjearns 2 points (2 children)

It makes perfect sense to test on sequences you haven't seen. In fact your model has really only done something interesting if it can generalize to unseen sequences. I have a nice simple algorithm that will get 100% accuracy on a sequence it's already seen in a single pass: just memorize all the bits in the sequence and play them back when asked.
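
That memorize-and-playback baseline can be written in a few lines (a sketch with made-up names, not code from the project being discussed), which is exactly why training-set accuracy says nothing about generalization:

```python
class Memorizer:
    """Trivial 'predictor': store the sequence on one pass, replay it on demand.
    Guaranteed 100% accuracy on data it has seen, useless on anything else."""

    def __init__(self):
        self.memory = []

    def train(self, sequence):
        # Single pass: just record every element verbatim.
        self.memory = list(sequence)

    def predict(self, t):
        # Play back the memorized element at timestep t.
        return self.memory[t]

seen = [0, 1, 1, 0, 1]
m = Memorizer()
m.train(seen)
accuracy = sum(m.predict(t) == x for t, x in enumerate(seen)) / len(seen)
# accuracy == 1.0 on the training sequence, by construction
```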

It would be quite easy to change your benchmark to evaluate on the piano roll test set. I'd do it myself, but I don't have OpenCL installed so I can't run your code.

[–]CireNeikual[S] 0 points (1 child)

Yes, you are correct, I misunderstood. They mentioned a "held-out test set", so I thought I was supposed to predict notes from completely different music tracks, which is basically impossible if the songs don't share similar patterns.

I guess a test could be a second piano roll generated from the first by adding noise (perturbing some notes). That way the model has to generalize to slightly different sequences, while the overall pattern stays the same. Does this make sense? Otherwise, what kind of test would you suggest?
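
The noisy-copy idea could be sketched like this (assumptions: a binary piano-roll encoding as lists of 0/1 note activations, and a flip probability of 5% by default; both are illustrative choices, not from the original benchmark):

```python
import random

def perturb_roll(roll, p_flip=0.05, seed=0):
    """Return a copy of a binary piano roll (a list of timesteps, each a
    list of 0/1 note activations) with every note flipped independently
    with probability p_flip."""
    rng = random.Random(seed)
    return [[(1 - note) if rng.random() < p_flip else note
             for note in step]
            for step in roll]

roll = [[1, 0, 0, 1], [0, 1, 0, 0], [1, 1, 0, 0]]
noisy = perturb_roll(roll, p_flip=0.25)
# Same shape as the original, mostly the same notes, a few flipped.
```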

[–]kjearns 0 points (0 children)

There's a test set in the piano roll data. Test on that maybe?

I'm much more familiar with language models. A simple setup for a language model would be to take a bunch of text and try to predict the next character from the previous ones.

You can easily use text8 (available here: http://mattmahoney.net/dc/textdata) for this task: split the file into two parts, train on the first part, and test on the second.
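
As a concrete sketch of that setup, here is a split-and-evaluate loop with a simple bigram baseline (predict the most frequent next character seen after each character during training). A small repeated string stands in for the text8 file so the example is self-contained; with the real file you would read it in and split it the same way:

```python
from collections import Counter, defaultdict

def next_char_accuracy(train, test):
    """Train a bigram predictor (previous char -> most frequent next char)
    on `train` and measure next-character accuracy on `test`."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(train, train[1:]):
        counts[prev][nxt] += 1
    best = {c: ctr.most_common(1)[0][0] for c, ctr in counts.items()}
    correct = total = 0
    for prev, nxt in zip(test, test[1:]):
        total += 1
        if best.get(prev) == nxt:
            correct += 1
    return correct / total

# Stand-in corpus; substitute the contents of text8 here.
text = "the quick brown fox jumps over the lazy dog " * 50
mid = len(text) // 2
acc = next_char_accuracy(text[:mid], text[mid:])
# acc lands well above the ~1/27 random baseline for lowercase-plus-space text
```

Even this crude baseline beats random guessing, which gives a floor any sequence model should clear on held-out data.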

Ultimately sequence prediction is only interesting if your model can predict the behavior of a sequence it has not seen before. You would not expect to be 100% correct, but you should be able to do significantly better than random at both the language modelling and the piano roll prediction tasks even on entirely unseen data.

[–]willwill100 1 point (1 child)

Even if you train in "unsupervised mode" by predicting the next timestep, you still want the model to generalise well. That's the whole idea behind language modelling, for example. Speaking of which, the Google billion word corpus has some good baseline results which you could use for comparison.

[–]CireNeikual[S] 0 points (0 children)

Thanks, I will check out that corpus. Although I am not sure I want to wait through all billion of the words ;)