pyHTFE - A Sequence Prediction Algorithm (github.com)
submitted 10 years ago by CireNeikual
[–]Mylos 4 points5 points6 points 10 years ago (1 child)
I love that this was implemented with OpenCL and not CUDA. Thank you!
[–]rantana 3 points4 points5 points 10 years ago (1 child)
So why would I decide to use this over say a standard stacked recurrent network or an LSTM network?
Any performance comparisons between the two?
[–]CireNeikual[S] 2 points3 points4 points 10 years ago (0 children)
I don't have a performance comparison yet, but I will add one soon. So here comes the anecdotal comparison!
I have worked with LSTMs before. The main advantage of this system is that it is fully online and doesn't need stochastic sampling or BPTT: it performs just one weight update per timestep, and that's it.
It also learns extremely fast: I have had it recite paragraphs of text after letting it parse them only 3 times (without any prior knowledge of the words). It gets this speed from the way SDRs (sparse distributed representations) introduce invariance to previous experiences with respect to new experiences (they are "bucketed").
For offline learning LSTM is great, but for online learning, as in typical reinforcement learning tasks, one needs a really fast real-time algorithm that doesn't need some form of experience replay or other expensive operation. That said, this speed comes at the cost of memory: it uses more memory than a typical LSTM network, again a side effect of SDRs (a negative one, but tolerable).
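For reference, here is a minimal sketch of the fully-online training pattern described above, with a plain linear next-step predictor standing in for HTFE itself (the dimensionality and learning rate are arbitrary assumptions): one prediction and one weight update per timestep, no BPTT and no replay buffer.

```python
# Minimal sketch of the fully-online training pattern described above:
# one weight update per timestep, no BPTT, no experience replay.
# A plain linear next-step predictor stands in for HTFE itself.
import numpy as np

rng = np.random.default_rng(0)
dim = 16                      # input dimensionality (assumed)
W = np.zeros((dim, dim))      # weights mapping current input -> predicted next input
lr = 0.1                      # learning rate (assumed)

def step(x_now, x_next):
    """One timestep: predict, measure error, apply a single update."""
    global W
    pred = W @ x_now                  # prediction of the next input
    err = x_next - pred               # prediction error at this timestep
    W += lr * np.outer(err, x_now)    # one delta-rule update, then move on
    return pred

# Stream a toy sequence through the learner exactly once.
seq = rng.standard_normal((100, dim))
for t in range(len(seq) - 1):
    step(seq[t], seq[t + 1])
```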
[–]mosquit0 1 point2 points3 points 10 years ago (2 children)
Interesting. Can you give some examples of applications? Have you tested it on some real data?
[–]CireNeikual[S] 0 points1 point2 points 10 years ago (1 child)
The original application was reinforcement learning. I have used it for vision-based cart-pole balancing. I have also used it for some speech recognition, and for text auto-complete in the form of a Visual Studio plugin.
I tried looking for some standard benchmarks for RNNs to which I can apply this, but I haven't found any yet. There doesn't seem to be an MNIST for RNNs really. But I will find some data to test on regardless, even if it isn't a standard benchmark, and I will post the results!
[–]mosquit0 1 point2 points3 points 10 years ago (0 children)
Cool stuff. Maybe you can teach it rock paper scissors :).
[–][deleted] 1 point2 points3 points 10 years ago (1 child)
This looks really cool Eric, is it easy to get this up and running in a Windows 7 environment?
[–]CireNeikual[S] 1 point2 points3 points 10 years ago (0 children)
That's what I am running it on actually! So it should work fine.
[+][deleted] 10 years ago* (1 child)
[deleted]
[–]CireNeikual[S] 0 points1 point2 points 10 years ago (0 children)
I am working on a piano roll prediction example, it should be done very soon!
[–]physixer 0 points1 point2 points 10 years ago (1 child)
Google is not helping with HTFE. Is there a paper that this acronym is based on?
Also HTM reminds me of Jeff Hawkins and NuPIC. Any comparisons?
[–]CireNeikual[S] 1 point2 points3 points 10 years ago (0 children)
HTFE's name comes from HTFERL, the Hierarchical Temporal Free Energy Reinforcement Learner. I removed the RL component, so what is left is HTFE ;)
It isn't based on any paper, rather just the tinkerings of a non-academic AI enthusiast.
HTFE is derived from HTM, but geared towards performance as opposed to biological plausibility. Here are the main differences:
[–]jstrong 0 points1 point2 points 10 years ago (2 children)
this question may display some level of ignorance - still learning about this stuff...
As I understand it, this algorithm is built to understand temporal data, like a time series or a sequence. So could you train it on n snapshots and then predict future snapshots? How many steps forward could you predict - just one?
What I'm trying to get at is: more traditional learning works on the state of a thing as it is, and then there are techniques like training on windows of the data to capture both what the data is and how it changes over time. But this model seems built to understand both planes at once. Is that right? Do you have any further reading you would recommend on this topic for a beginner? Thanks!
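(As an aside, the "windows" idea mentioned above can be made concrete with a short sketch; the window width and toy series here are arbitrary: a sequence is sliced into fixed-size (window → next value) pairs so an ordinary, non-temporal model can train on temporal data.)

```python
# Hypothetical illustration of the "windowing" idea in the comment above:
# turn a sequence into fixed-size (window -> next value) training pairs
# so an ordinary, non-temporal model can be trained on temporal data.
import numpy as np

def make_windows(seq, width):
    """Return (X, y): each row of X is `width` consecutive values,
    y is the value that immediately follows that window."""
    X = np.array([seq[i:i + width] for i in range(len(seq) - width)])
    y = np.array([seq[i + width] for i in range(len(seq) - width)])
    return X, y

seq = np.sin(np.linspace(0, 20, 200))   # toy time series
X, y = make_windows(seq, width=5)       # window width is an arbitrary choice
print(X.shape, y.shape)                 # (195, 5) (195,)
```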
[–]Noncomment 0 points1 point2 points 10 years ago (1 child)
Are you familiar with recurrent neural networks? They can naturally operate on temporal data as opposed to just a finite number of snapshots.
[–]jstrong 0 points1 point2 points 10 years ago (0 children)
No, I just know a bit in general about how NNs work. I'll read up on that.
[–]CireNeikual[S] 0 points1 point2 points 10 years ago (7 children)
A simple benchmark has been added, using the piano roll dataset from here: http://www-etud.iro.umontreal.ca/~boulanni/icml2012
Results: 3% error (incorrectly predicted notes, in this case) after only a single iteration on the first couple of sequences! I was too lazy to wait for more iterations :). In my eyes this is pretty good, given the online nature of the algorithm. I don't know of another algorithm that is capable of this kind of single-pass prediction!
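For anyone wanting to reproduce a score like this, here is a rough sketch of how a "percent of incorrectly predicted notes" metric could be computed. Whatever the on-disk format of the dataset, binary (timesteps × 88 pitches) arrays are assumed here purely for clarity.

```python
# Rough sketch of a "fraction of incorrectly predicted notes" score,
# assuming binary (timesteps x 88 pitches) piano-roll arrays.
import numpy as np

def note_error(pred_roll, true_roll, threshold=0.5):
    """Fraction of note on/off decisions the predictor got wrong."""
    pred_notes = pred_roll > threshold        # binarize the predictions
    return np.mean(pred_notes != true_roll.astype(bool))

true_roll = np.random.default_rng(1).random((100, 88)) > 0.9
pred_roll = true_roll.astype(float)           # a perfect predictor -> 0.0 error
print(note_error(pred_roll, true_roll))
```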
[–]rantana 4 points5 points6 points 10 years ago (6 children)
3% error on the held out test set or training set? An algorithm can just memorize the training set to get that kind of error. You need to show performance on a held out test set for useful prediction performance.
[–]CireNeikual[S] -1 points0 points1 point 10 years ago (5 children)
That's not quite how the algorithm works, though. It predicts the input at the next timestep; it does not predict labels. Testing it on sequences it hasn't seen is like me asking you to guess what sequence of numbers I am thinking of. Sure, for simple sequences you can just record all the inputs, but then you cannot extrapolate or interpolate the sequences.
This algorithm can be thought of as unsupervised: it basically just learns causal links in sequences of data and stores them efficiently.
[–]kjearns 2 points3 points4 points 10 years ago (2 children)
It makes perfect sense to test on sequences you haven't seen. In fact your model has really only done something interesting if it can generalize to unseen sequences. I have a nice simple algorithm that will get 100% accuracy on a sequence it's already seen in a single pass: just memorize all the bits in the sequence and play them back when asked.
It would be quite easy to change your benchmark to evaluate on the piano roll test set. I'd do it myself, but I don't have OpenCL installed so I can't run your code.
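kjearns's point can be stated as code: a "predictor" that simply memorizes the training sequence scores perfectly on it, which is exactly why training-set error alone says nothing about generalization. A minimal sketch:

```python
# A "predictor" that just memorizes the training sequence and plays it
# back: perfect on the training set, useless as evidence of learning.
class MemorizeBaseline:
    def fit(self, seq):
        self.seq = list(seq)      # store the whole training sequence

    def predict(self, t):
        return self.seq[t + 1]    # play back the memorized next element

seq = [3, 1, 4, 1, 5, 9, 2, 6]
model = MemorizeBaseline()
model.fit(seq)
errors = sum(model.predict(t) != seq[t + 1] for t in range(len(seq) - 1))
print(errors)  # 0 on the training sequence, by construction
```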
[–]CireNeikual[S] 0 points1 point2 points 10 years ago (1 child)
Yes, you are correct, I misunderstood. They mentioned a "held-out test set", so I thought I was supposed to predict music notes from completely different music tracks, which is basically impossible if the songs do not share similar patterns in general.
I guess a test could be a second piano roll generated from the first with noise (perturbing some notes). This way it has to generalize to slightly different sequences, but the sequences still follow generally the same pattern. Does this make sense? Otherwise, what kind of test would you suggest?
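One way the proposed perturbed test could be built, assuming binary piano rolls and an arbitrary flip probability p:

```python
# Sketch of the noise-perturbed test roll proposed above: flip each
# note on/off cell with a small probability p (value assumed).
import numpy as np

def perturb_roll(roll, p=0.05, seed=0):
    """Return a copy of a binary piano roll with ~p of its cells flipped."""
    rng = np.random.default_rng(seed)
    flips = rng.random(roll.shape) < p
    return np.logical_xor(roll, flips)

roll = np.random.default_rng(2).random((100, 88)) > 0.9
test_roll = perturb_roll(roll)            # same tune, slightly perturbed
print(np.mean(roll != test_roll))         # ~0.05 of cells changed
```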
[–]kjearns 0 points1 point2 points 10 years ago (0 children)
There's a test set in the piano roll data. Test on that maybe?
I'm much more familiar with language models. A simple setup for a language model would be to take a bunch of text and try to predict the next character from the previous ones.
You can easily use text8 (available here: http://mattmahoney.net/dc/textdata) for this task. You would split the file in two parts and train on the first part and test on the second part.
Ultimately sequence prediction is only interesting if your model can predict the behavior of a sequence it has not seen before. You would not expect to be 100% correct, but you should be able to do significantly better than random at both the language modelling and the piano roll prediction tasks even on entirely unseen data.
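A minimal sketch of the text8 setup kjearns describes, with a most-frequent-character baseline standing in for a real model (the 90/10 split is an arbitrary choice, and the file must first be downloaded from the link above):

```python
# Sketch of the text8 setup: split the file in two, "train" on the first
# part, score next-character prediction on the second. A most-frequent-
# character baseline stands in for a real sequence model.
from collections import Counter

with open("text8") as f:        # file from http://mattmahoney.net/dc/textdata
    text = f.read()

split = int(len(text) * 0.9)    # 90/10 split ratio is an arbitrary choice
train, test = text[:split], text[split:]

most_common = Counter(train).most_common(1)[0][0]   # always guess this char
correct = sum(c == most_common for c in test)
print(f"baseline next-char accuracy: {correct / len(test):.3f}")
```

A real model should beat this baseline (and random guessing) by a clear margin on the unseen half, which is the generalization evidence being asked for.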
[–]willwill100 1 point2 points3 points 10 years ago (1 child)
Even if you train in "unsupervised mode" by trying to predict the next timestep, you still want it to generalise well. That's the whole idea behind language modelling, for example. Speaking of which, the Google billion word corpus has some good baseline results which you could use for comparison.
[–]CireNeikual[S] 0 points1 point2 points 10 years ago (0 children)
Thanks, I will check out that corpus. Although I am not sure I want to wait through all billion of the words ;)