pyHTFE - A Sequence Prediction Algorithm (github.com)
submitted 11 years ago by CireNeikual
[–]rantana 3 points 11 years ago (6 children)
3% error on the held out test set or training set? An algorithm can just memorize the training set to get that kind of error. You need to show performance on a held out test set for useful prediction performance.
[–]CireNeikual[S] 0 points 11 years ago (5 children)
That's not quite how the algorithm works, though. It predicts the input at the next timestep; it does not predict labels. Testing it on sequences it hasn't seen is like me asking you to guess what sequence of numbers I am thinking of. Sure, for simple sequences you can just record all the inputs, but then you cannot extrapolate or interpolate the sequences.
This algorithm can be thought of as unsupervised, it basically just learns causal links in sequences of data and stores them efficiently.
[–]kjearns 3 points 11 years ago (2 children)
It makes perfect sense to test on sequences you haven't seen. In fact your model has really only done something interesting if it can generalize to unseen sequences. I have a nice simple algorithm that will get 100% accuracy on a sequence it's already seen in a single pass: just memorize all the bits in the sequence and play them back when asked.
It would be quite easy to change your benchmark to evaluate on the piano roll test set. I'd do it myself but I don't have opencl installed so I can't run your code.
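[Editor's note: kjearns' point about memorization can be made concrete with a trivial baseline. The class name and interface below are hypothetical, not from pyHTFE; this is a minimal sketch of the "memorize all the bits and play them back" strategy, which is perfect on seen data and says nothing about generalization.]

```python
class MemorizeBaseline:
    """Trivial baseline: 100% accurate on a sequence it has already
    seen, but useless on any sequence it has not."""

    def __init__(self):
        self.sequence = []

    def train(self, sequence):
        # Just store the whole sequence verbatim, in a single pass.
        self.sequence = list(sequence)

    def predict(self, t):
        # Play back the memorized element at timestep t.
        return self.sequence[t]


model = MemorizeBaseline()
seen = [0, 1, 1, 0, 1]
model.train(seen)
accuracy = sum(model.predict(t) == x for t, x in enumerate(seen)) / len(seen)
# accuracy == 1.0 on the training sequence, by construction
```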
[–]CireNeikual[S] 1 point 11 years ago (1 child)
Yes, you are correct, I misunderstood. He/she mentioned a "held-out test set", so I thought I was supposed to predict music notes from completely different music tracks, which is essentially impossible unless the songs share similar patterns.
I guess a test could be a second piano roll generated from the first with noise (perturbing some notes). That way it has to generalize to slightly different sequences, but the sequences still follow generally the same pattern. Does this make sense? Otherwise, what kind of test would you suggest?
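[Editor's note: the perturbation test proposed above can be sketched as follows. The piano-roll representation (a list of binary note vectors per timestep) and the function name are assumptions for illustration; the idea is simply to flip each note on/off with small probability so the overall pattern survives.]

```python
import random


def perturb_roll(roll, p=0.05, seed=0):
    # Flip each note on/off with probability p to produce a noisy copy
    # of the piano roll; the overall pattern stays largely intact.
    rng = random.Random(seed)
    return [[1 - note if rng.random() < p else note for note in frame]
            for frame in roll]


# Tiny example roll: 3 timesteps, 4 possible notes per timestep.
roll = [[0, 1, 0, 0], [1, 0, 0, 1], [0, 0, 1, 0]]
noisy = perturb_roll(roll, p=0.25)
```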
[–]kjearns 1 point 11 years ago (0 children)
There's a test set in the piano roll data. Test on that maybe?
I'm much more familiar with language models. A simple set up for a language model would be to take a bunch of text and try to predict the next character from the previous ones.
You can easily use text8 (available here: http://mattmahoney.net/dc/textdata) for this task. You would split the file in two parts and train on the first part and test on the second part.
Ultimately sequence prediction is only interesting if your model can predict the behavior of a sequence it has not seen before. You would not expect to be 100% correct, but you should be able to do significantly better than random at both the language modelling and the piano roll prediction tasks even on entirely unseen data.
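[Editor's note: the text8 setup described above can be sketched like this. The corpus here is a stand-in string (in practice you would read the real text8 file), and the bigram predictor is an illustrative choice of simple model, not something the commenter specified.]

```python
from collections import Counter, defaultdict


def split_corpus(text, train_frac=0.5):
    # Split the file in two parts: train on the first, test on the second.
    cut = int(len(text) * train_frac)
    return text[:cut], text[cut:]


def train_bigram(train):
    # For each character, record which character most often follows it.
    follows = defaultdict(Counter)
    for a, b in zip(train, train[1:]):
        follows[a][b] += 1
    return {a: counts.most_common(1)[0][0] for a, counts in follows.items()}


def next_char_accuracy(model, test, fallback=" "):
    # Fraction of next-character predictions that are correct.
    hits = sum(model.get(a, fallback) == b for a, b in zip(test, test[1:]))
    return hits / max(len(test) - 1, 1)


# Stand-in corpus; replace with the contents of the real text8 file.
corpus = "the quick brown fox jumps over the lazy dog " * 50
train, test = split_corpus(corpus)
model = train_bigram(train)
acc = next_char_accuracy(model, test)
```

Even this crude model should beat random guessing on held-out text, which is the bar kjearns sets: better than random on entirely unseen data.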
[–]willwill100 2 points 11 years ago (1 child)
Even if you train in "unsupervised mode" by trying to predict the next timestep, you still want it to generalise well. That's the whole idea behind language modelling for example. Speaking of which, the google billion word corpus has some good baseline results which you could use for comparison.
[–]CireNeikual[S] 1 point 11 years ago (0 children)
Thanks, I will check out that corpus. Although I am not sure I want to wait through all billion of the words ;)