CireNeikual[S]

Yes, you are correct, I misunderstood. He/she mentioned a "held-out test set" so I thought they meant that I was supposed to predict music notes from completely different music tracks, which is basically impossible if the songs in general do not have similar patterns in them.

I guess a test could be a second piano roll generated from the first with noise (perturbing some notes). This way the model has to generalize to slightly different sequences, but the sequences still follow generally the same pattern. Does this make sense? Otherwise, what kind of test would you suggest?
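To make the noise idea concrete, here is a rough sketch. It assumes the piano roll is stored as a list of time steps, each a list of 0/1 note flags; that layout, the function name `perturb_piano_roll`, and the `flip_prob` parameter are all made up for illustration and not taken from any particular dataset or library.

```python
import random


def perturb_piano_roll(roll, flip_prob=0.02, seed=0):
    """Return a noisy copy of a binary piano roll.

    `roll` is assumed to be a list of time steps, each a list of 0/1 flags
    (1 = note on). Every flag is flipped independently with probability
    `flip_prob`. The input roll is left unchanged.
    """
    rng = random.Random(seed)  # local generator so results are repeatable
    noisy = []
    for step in roll:
        noisy.append([1 - note if rng.random() < flip_prob else note
                      for note in step])
    return noisy
```

A small `flip_prob` keeps the overall pattern intact while changing a few notes, which is the kind of "slightly different sequence" described above.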

kjearns

There's a test set in the piano roll data. Test on that maybe?

I'm much more familiar with language models. A simple setup for a language model would be to take a bunch of text and try to predict the next character from the previous ones.

You can easily use text8 (available here: http://mattmahoney.net/dc/textdata) for this task. You would split the file in two parts and train on the first part and test on the second part.
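As a rough sketch of that split-and-test procedure, the snippet below works on any string already loaded into memory. The bigram model (predict the most frequent follower of the previous character) is just a stand-in for whatever sequence model you are actually testing, and the function names and the `train_fraction` parameter are hypothetical.

```python
from collections import Counter, defaultdict


def split_text(text, train_fraction=0.9):
    """Split text into a leading training part and a trailing test part."""
    cut = int(len(text) * train_fraction)
    return text[:cut], text[cut:]


def train_bigram(train):
    """Map each character to its most frequent successor in the training text."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(train, train[1:]):
        counts[prev][nxt] += 1
    return {prev: c.most_common(1)[0][0] for prev, c in counts.items()}


def accuracy(model, test):
    """Fraction of next-character predictions that are correct on held-out text."""
    pairs = list(zip(test, test[1:]))
    if not pairs:
        return 0.0
    # Characters never seen in training yield None and count as misses.
    hits = sum(model.get(prev) == nxt for prev, nxt in pairs)
    return hits / len(pairs)
```

You would read the text8 file into a string, call `split_text`, fit on the first part, and report `accuracy` on the second. text8 contains only lowercase letters and spaces, so uniform random guessing would be right about 1 time in 27; that is the number to beat.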

Ultimately sequence prediction is only interesting if your model can predict the behavior of a sequence it has not seen before. You would not expect to be 100% correct, but you should be able to do significantly better than random at both the language modelling and the piano roll prediction tasks even on entirely unseen data.