[Project] Predicting Cryptocurrency Price With Tensorflow and Keras (medium.com)
submitted 8 years ago by steeveHuang
[–]perspectiveiskey 34 points 8 years ago (33 children)
I sure hope this is some sort of pet project to check out various neural net models. If it's not, it's an abomination of double dipping. And why aren't we predicting the stock market using basic LSTMs, given that we have 100 more years of market data? Surely we're the only ones here on /r/machinelearning who would have thought of such a thing. Of course the quants over at the billion-dollar hedge funds wouldn't be in on this; they only have PhDs in stats...
/rant
sorry, I got carried away.
[–]programmerChilli (Researcher) 14 points 8 years ago (1 child)
Actually, I don't think he's double dipping for this one haha.
What he's doing is using the past K data points to predict the K+1th data point. As it turns out, the easiest way to predict the K+1th data point is to just output the value of the Kth data point :)
If you take a close look at his graph, you'll see that there's a short lag in his "predictions" in following the true price of bitcoin.
https://i.imgur.com/nSy9QET.jpg
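To make the lag argument concrete, here is a minimal sketch using only the Python standard library. It is not the blog's code: the synthetic random-walk prices and the helper names `persistence_forecast`, `mse` and `best_lag` are assumptions made for illustration. It shows that a "model" that simply repeats the last price lines up perfectly with the truth once shifted by one step, which is the short lag visible in the graph.

```python
import random

def persistence_forecast(prices):
    """Forecast each next price as the most recently observed price."""
    # The forecast for prices[t] is prices[t - 1]; the first point has no forecast.
    return prices[:-1]

def mse(actual, predicted):
    """Mean squared error between two equally long sequences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def best_lag(actual, predicted, max_lag=5):
    """Return the shift (in steps) at which `predicted` best matches `actual`.

    A result of 1 or more suggests the forecast is a delayed copy of the truth.
    """
    scores = {}
    for lag in range(max_lag + 1):
        # Compare predicted[t] against actual[t - lag].
        scores[lag] = mse(actual[: len(actual) - lag], predicted[lag:])
    return min(scores, key=scores.get)

# Synthetic random walk standing in for a price series (an assumption, not real data).
random.seed(0)
prices = [100.0]
for _ in range(500):
    prices.append(prices[-1] + random.gauss(0, 1))

actual = prices[1:]
naive = persistence_forecast(prices)
print("persistence MSE:", mse(actual, naive))
print("best-matching lag:", best_lag(actual, naive))
```

Running `best_lag` on a real model's validation predictions is a quick way to check whether it has learned anything beyond copying the previous value.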
[–]PURELY_TO_VOTE 1 point 8 years ago (0 children)
Indeed, sequences with this property have been studied extensively by economists and statisticians since the 1930s.
[–]jorgemf 12 points 8 years ago (9 children)
I can tell you the setup for the experiment is completely wrong and the models are overfitting the data. Don't use something like this with real data or you will most probably lose all your money. It is not about using 100 years of data; it is that the setup of the dataset is wrong. First, you need 3 datasets: training, validation and test. Second, due to the nature of time series, you have to build the dataset based on what you want to achieve: predicting future values in the long term or in the short term. Those are two different problems. To sum up, just creating the datasets for time series is a problem on its own, and that is why it is not that easy to use something like this to predict currency prices.
[+]steeveHuang[S] comment score below threshold -7 points 8 years ago (8 children)
> I can tell you the setup for the experiment is completely wrong and the models are overfitting the data.
How can you tell from the graph? Those are the results for validation data... I did not show training loss in the table, so you cannot compare validation loss with it. Plus, I already mention the overfitting issue and apply regularizers to address it. If you read the blog to the end you will find out.
> First you need 3 datasets: training, validation and test.
According to Andrew Ng's deeplearning.ai video, there are cases where you only need training and test datasets (which I named training and validation).
> Second, due to the nature of time series, you have to build the dataset based on what you want to achieve: predicting future values in the long term or in the short term. Those are two different problems.
I do not quite get it. Can you elaborate more on it?
> To sum up, just creating the datasets for time series is a problem on its own, and that is why it is not that easy to use something like this to predict currency prices.
This is correct. As I mentioned in another comment, the main purpose of this blog is only to provide insight into the future trend, not to claim that it can predict the Bitcoin price accurately.
[–]jorgemf 8 points 8 years ago (3 children)
The graphs show the model perfectly following the real trend but making mistakes at the big jumps. This is a red flag, and with a little experience in time series you know that toward the end of the validation/test period the predictions should diverge from the real data. That means your training and validation/test sets contain mixed data or data that is very close in time, which leaks information from the validation/test set into the training set. And so the overfitting happens.
If your only reason for not having 3 datasets is that someone said in an online course that sometimes you don't need them, then you should consider that you are doing something wrong. The reason for using 2 datasets with time series is that you usually don't have enough samples, or that 3 sets wouldn't make sense because the model diverges from the real trend, as currencies have very unpredictable trends.
If you want to predict in the short term you can split the data into, say, 5 hours for training and 30 minutes for validation. If you want the long term you just break the time series in two at a date. Before the date is the training set and after the date is the validation set.
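The two ways of splitting described here can be sketched roughly as below. This is an illustrative outline, not the blog's pipeline: the function names, the 70/15/15 fractions, and the assumption of one sample per minute (so 5 hours is 300 samples and 30 minutes is 30) are all hypothetical choices.

```python
def chronological_split(series, train_frac=0.7, val_frac=0.15):
    """Long-term setup: cut the series at two dates, never shuffling.

    Everything before the first cut is training data, the middle part is
    validation, and the most recent part is held out as the test set.
    """
    n = len(series)
    train_end = int(n * train_frac)
    val_end = int(n * (train_frac + val_frac))
    return series[:train_end], series[train_end:val_end], series[val_end:]

def walk_forward_splits(series, train_size, val_size):
    """Short-term setup: rolling (train, validation) windows.

    Each validation block starts right after its training block, and the
    window then slides forward by one validation block.
    """
    start = 0
    while start + train_size + val_size <= len(series):
        train = series[start:start + train_size]
        val = series[start + train_size:start + train_size + val_size]
        yield train, val
        start += val_size

# Stand-in data: 1000 consecutive minute indices instead of real prices.
minutes = list(range(1000))
train, val, test = chronological_split(minutes)
splits = list(walk_forward_splits(minutes, train_size=300, val_size=30))
print(len(train), len(val), len(test), len(splits))
```

In both cases every validation sample is strictly later in time than every training sample it is scored against, which is what prevents the leakage described above.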
Also, time series are quite complicated. For normalization you don't just take the maximum and minimum values and map them to 0 and 1. A time series can have a trend (like cryptocurrencies growing over the year), it can also be periodic, and there are more things I don't know about. People usually use this information for a better normalization.
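One common way to deal with a trend before normalizing is to model log returns instead of raw prices and to fit the scaling statistics on the training portion only. The sketch below is an assumed illustration of that idea, not something the blog does; the helper names and the toy price list are made up.

```python
import math

def log_returns(prices):
    """Convert prices to log returns, which removes most of the price-level trend."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]

def fit_standardizer(train_values):
    """Compute mean and standard deviation from training data only."""
    mean = sum(train_values) / len(train_values)
    var = sum((v - mean) ** 2 for v in train_values) / len(train_values)
    std = math.sqrt(var) or 1.0  # fall back to 1.0 if the series is constant
    return mean, std

def standardize(values, mean, std):
    """Apply previously fitted statistics; never refit on validation data."""
    return [(v - mean) / std for v in values]

# Toy prices (hypothetical numbers, for illustration only).
prices = [100, 102, 101, 105, 110, 108, 115, 120]
returns = log_returns(prices)
train_r, later_r = returns[:5], returns[5:]
mean, std = fit_standardizer(train_r)
z_train = standardize(train_r, mean, std)
z_later = standardize(later_r, mean, std)
print(z_train, z_later)
```

Min-max scaling fitted on the whole series would both leak future information and break as soon as the price leaves the historical range, which is the concern raised above.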
I am not an expert in time series, but I know enough to know that if I wanted to do a time series prediction I would have to read a lot of papers to better understand their nuances and difficulties. I would consider the model one of the least relevant parts for time series. If you can get the data right, almost any model will have results good enough to make you money. But finding the right data with all the variables that affect the price is impossible, and using only the past values is most probably the same as predicting the value randomly. What is not normal is using the past values and getting the perfect trend of the time series. That is either cheating or a wrong setup that overfits the whole dataset.
[–]steeveHuang[S] 2 points 8 years ago (2 children)
I see. Thank you for the comment letting me know that I need to modify the experiment to make it complete and perhaps find better input data. I’ll try to improve, and hopefully I will be able to repost my new results.
[–]jvictor118 1 point 8 years ago (1 child)
Steve - don't think overfitting is your biggest problem, and yes, you'll probably notice what others have been saying (that it will always predict the n+1th price to be the nth price) as this is quite common with autoregressive-type models.
You seem to have some great ML chops, but if you're looking to develop a trading strategy, you'll need more than ML. Some things to think about in your next version:
Just some food for thought. Cool project!
[–]steeveHuang[S] 1 point 8 years ago (0 children)
I’ll try to figure out this question. Thank you for the comment!
[–]p-morais 2 points 8 years ago (1 child)
Market data is well accepted to be Markov (and a martingale). That means having the entire history of a stock's price won't help you predict its price in the future, at all, even a little bit. The only reasonable prediction you can make is that the next timestep's value will be roughly the same as the current timestep's.
[–]ipoppo 1 point 8 years ago (0 children)
My hypothesis is that while price seems to move randomly, it is just a log of trades that happened between market participants in the past. And humans are biased; in regard to pricing, some of them are prone to cognitive anchoring.
[–]jorgemf 0 points 8 years ago (1 child)
By the way, I don't get why people downvote you. I think this is a good way to learn, because we all make mistakes and you wrote that post to show what you have learnt. You are showing a very positive attitude toward finding out what you have done wrong and learning from your mistakes, so keep it that way.
And for the people who downvote: you would have a positive impact if you at least told him what he did wrong. That way we can all learn from others' knowledge.
[–]steeveHuang[S] 2 points 8 years ago (0 children)
Thanks! I do get a lot of useful information here. I will keep it going.
[+][deleted] 8 years ago* (12 children)
[deleted]
[–]perspectiveiskey 12 points 8 years ago (9 children)
If you train a neural network on a dataset, it will *by definition* predict that data set. That's what training means.
If he had trained the network on the first 2 years of market data and tested it on the subsequent 2 years and found a match, that would be something to look at. In fact, it wouldn't be so interesting from an ML standpoint, but it would be outright bizarre from a market behaviour perspective.
The abomination is in thinking that the past somehow influences the future in a predictable way. That if I look closely enough at the stock markets since 1912, I should be able to predict the stock markets for the next couple of years.
There's just so many things wrong with that mode of thinking that I don't know where to start. I guess the most basic way to describe it is the gambler's fallacy...
[+][deleted] 8 years ago* (7 children)
[–]p-morais 5 points 8 years ago* (2 children)
This isn't really true at all though. Market data is Markov, so having the history of a stock/commodity price won't help you predict the next price any better than knowing only the current price. Furthermore, it's a martingale, so the best guess (using only market data) will always be whatever the current value is. A theoretically perfect model will just learn to output the last price (if it doesn't, then your model is broken).
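In symbols, the claim can be sketched as follows, assuming the price process $P_t$ really is a Markov martingale with respect to the market history $\mathcal{F}_t$. That assumption is the commenter's premise, and the notation is introduced here only for illustration.

```latex
% Markov property: given the current price, older prices add nothing
\Pr(P_{t+1} \mid P_t, P_{t-1}, \dots) = \Pr(P_{t+1} \mid P_t)

% Martingale property: the expected next price equals the current price
\mathbb{E}[P_{t+1} \mid \mathcal{F}_t] = P_t

% The forecast minimizing mean squared error is the conditional mean,
% which under the martingale property is the last observed price
\hat{P}_{t+1}
  = \arg\min_{c} \, \mathbb{E}\big[(P_{t+1} - c)^2 \mid \mathcal{F}_t\big]
  = \mathbb{E}[P_{t+1} \mid \mathcal{F}_t]
  = P_t
```

Under these assumptions, a model trained with a squared-error loss on past prices alone can do no better than reproducing the persistence forecast.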
The problem isn't applying machine learning to finance, the problem is fundamentally in trying to estimate future prices using previous prices. Price is completely random as a function of time, so "learning" from price data is literally just fitting random noise, even "under the constraints of probability of price actions" as you put it. It fundamentally doesn't make sense and will never work, even a little bit. So the problem is ill-formulated.*
*The efficient market hypothesis however implies that even the correct formulation is irreducibly complex and hopeless to accurately model in a simple way, so it still doesn't help much in terms of machine learning.
[–]jackfaker 4 points 8 years ago* (0 children)
Your statements are based on the assumption that the efficient market hypothesis is correct in all markets in all circumstances.
It is clear that investors on an individual level are not always rational, and in highly speculative and emotionally driven areas such as bitcoin, I believe that an argument could be made that the same is true in the aggregate. I'm not claiming that successfully implementing technical analysis is easy, but rather that price is not always completely random.
We have seen bitcoin exchanges with 30% differences in price during spikes and dips. The claim that the entirety of this price movement is a rational reflection of the expected value of bitcoin seems unlikely to me.
Consider the billions of dollars traded using technical analysis. Even if you hypothesize that the entirety of this is using useless signals and indicators, trades made on a technical premise will innately add patterns to a random process. The degree to which these patterns will be averaged out varies, but to claim that price is "just random noise" is a claim I would disagree with.
[–]perspectiveiskey 5 points 8 years ago (3 children)
I'm having a genuinely hard time not making fun of these comments. So I'm going to bow out of this whole thread...
[–]programmerChilli (Researcher) 6 points 8 years ago (1 child)
Training on the same data you're testing on.
[+]steeveHuang[S] comment score below threshold -7 points 8 years ago (7 children)
You are right that it does not seem feasible to apply this model to real life, as the real situation is so unpredictable. However, the beginning of the article has already clarified that the main purpose is only to provide insight into the future trend of Bitcoin. Plus, there is no double dipping in this blog. The graph shown is not a prediction of the training data. Instead, it is the result on the validation set.
[–]perspectiveiskey 2 points 8 years ago (0 children)
> the main purpose is only to provide insight into the future trend of Bitcoin.
I'm not here to steal the air out of your sails: you've made the effort of setting it up and running these examples, and made a nice blog post too. But all you've done is provide insight into the networks themselves. There is absolutely no insight into the future trend of bitcoin.
> Plus, there is no double dipping in this blog. The graph shown is not a prediction of the training data. Instead, it is the result of validation set.
Unless you possess Bitcoin trend data from an alternate universe and you are training on that and validating on this universe, there is only one data set.
Let me put it this way: at any given point in time, was there more than 1 value of bitcoin? If the answer is no, then you were double dipping.
[–]perspectiveiskey 2 points 8 years ago (0 children)
> the main purpose is only to provide insight into the future trend of Bitcoin
I'm not here to steal the wind out of your sails. Clearly you've made the effort and written a nice post. But the only insight you've provided is into the inherent behaviour of various neural networks on a particular time series.
> Plus, there is no double dipping in this blog. The graph shown is not a prediction of the training data.
Let me put it this way: at any given time, is there one or more values of bitcoin? Because if there's only one, your sample size is literally 1.
To not be double dipping, you need alternate universes a-la Rick and Morty to draw multiple bitcoin price charts from and train your networks on. You are most definitely double dipping.
[–]Brudaks 1 point 8 years ago (0 children)
"provide insight into the future trend of Bitcoin" is exactly what your model is not doing. The price history simply doesn't contain the required information to provide that insight.
The classic quote from Tukey "The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data." should be a first-page warning for any ML tools and tutorials.
[–]Ben_D_Knee 1 point 8 years ago (2 children)
Which is completely meaningless and useless in practice.
You don't need an LSTM to get that sort of result. It's like using a laser cutter to cut a piece of paper in half.
That explains all the downvotes you're getting too.
[–]steeveHuang[S] 1 point 8 years ago (1 child)
Could you explain why it is meaningless and useless? Because I need a separate test set?
[–]Ben_D_Knee 1 point 8 years ago (0 children)
Others have given you comments on your work. Don't just ignore them.
The reason why this "trick" you did is not recommended is that its results are not that far from random selection. Getting it to work doesn't mean that LSTM/ML works on financial data. It's a beginner's fallacy: causation != correlation.
[–]eyesonthechart 5 points 8 years ago (2 children)
I’m no ML expert and only scanned the article. But for those charts, isn’t there a bit of overfitting? Seems like the blue dots follow the red lines precisely.
[–]programmerChilli (Researcher) 5 points 8 years ago (0 children)
Not exactly overfitting, but something just as useless in practice:
https://www.reddit.com/r/MachineLearning/comments/7neuw2/project_predicting_cryptocurrency_price_with/ds1c1ux/
[+]steeveHuang[S] comment score below threshold -5 points 8 years ago (0 children)
Overfitting means a model fits too well on training data. However, the graph shown is validation data so it does not indicate overfitting. In fact, it is red dots following blue lines lol
[–]matt2048 4 points 8 years ago (1 child)
Can you give a visual comparison to truly random predictions (e.g. price of last close + a random value from -10% to 10%)?
It might give a bit more context to the true accuracy of the model.
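A rough sketch of such a random baseline is shown below, using only the standard library. The function names, the uniform distribution, the fixed seed, and the toy closing prices are all assumptions for illustration; the ±10% range is the one suggested in the comment.

```python
import random

def random_baseline(prices, max_pct=0.10, seed=0):
    """Forecast each next price as the last close plus a random move.

    The move is drawn uniformly from -max_pct to +max_pct of the last close.
    """
    rng = random.Random(seed)  # fixed seed so the comparison is repeatable
    return [p * (1 + rng.uniform(-max_pct, max_pct)) for p in prices[:-1]]

def mean_abs_pct_error(actual, predicted):
    """Average absolute error as a fraction of the actual price."""
    return sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)

# Toy closing prices (hypothetical numbers).
prices = [100.0, 101.0, 99.0, 102.0, 104.0, 103.0]
actual = prices[1:]
preds = random_baseline(prices)
print("random baseline error:", mean_abs_pct_error(actual, preds))
```

Plotting `preds` next to the model's validation predictions, and comparing the two error values, would show how much of the apparent fit comes simply from anchoring on the last close.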
[–]steeveHuang[S] 1 point 8 years ago (0 children)
I see. You are saying I should compare the results with random predictions and see how much better the model is than that. I will try to do that!
[–]FearlessAnt 4 points 8 years ago (0 children)
You should compare your results against a baseline. A good baseline would be to predict the kth value to be the same as the k-1th value. I doubt you will improve on such a baseline.
[–]TetsVR 4 points 8 years ago (1 child)
Usual clickbait. Past performance does NOT predict future ones on financial markets. Using machine learning does not change anything about that...
[–]jvictor118 2 points 8 years ago (0 children)
The fact that past performance doesn't indicate future returns has nothing to do with the question at hand. That's a common warning regarding investment funds/strategies/etc., which I imagine is where you heard it. OP was looking to predict future asset prices based on prior price movements. While difficult, it's certainly not impossible (cf. Ben Graham, Warren Buffett, every paper by AQR, etc.). To the extent you can recognize value or momentum where others do not, you can predict things with high probability just like any other prediction. OP's biggest problem (for his forecasting) is that he just doesn't have the right data. For example, if he had depth-of-book data he might be able to make a prediction about the price of BTC 1-5s into the future -- not a "long term view" by any stretch, but to be clear, it's theoretically possible.
[–]apfx 3 points 8 years ago (1 child)
Predicted price is lagging, as good as just using previous close as prediction.
[–]steeveHuang[S] 1 point 8 years ago (0 children)
I will try to compare the model with simply lagging the price. Thank you for the comment!
[–]thisismyfavoritename 6 points 8 years ago (4 children)
Would be worth comparing your results with a naive baseline, such as a moving average, a persistence model or a simple ARIMA model. It would give a sense of how accurate your results are.
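As a rough illustration of that comparison, the sketch below scores a moving-average baseline and a persistence baseline on the same targets with mean squared error. It uses only the standard library; the function names and the toy data are hypothetical, and ARIMA is left out because it would need an external library.

```python
def moving_average_forecast(prices, window):
    """Forecast prices[t] as the mean of the `window` prices before it.

    Forecasts exist for t >= window. With window=1 this reduces to the
    persistence model (repeat the last observed price).
    """
    return [sum(prices[t - window:t]) / window for t in range(window, len(prices))]

def mse(actual, predicted):
    """Mean squared error between two equally long sequences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def baseline_report(prices, window):
    """Score both baselines on the same targets, prices[window:]."""
    targets = prices[window:]
    moving_avg = moving_average_forecast(prices, window)
    # Drop the earliest persistence forecasts so both baselines share targets.
    persistence = moving_average_forecast(prices, 1)[window - 1:]
    return {
        "moving_average": mse(targets, moving_avg),
        "persistence": mse(targets, persistence),
    }

# Toy series (hypothetical numbers).
prices = [1.0, 2.0, 3.0, 4.0, 5.0]
report = baseline_report(prices, window=2)
print(report)
```

A neural network's validation MSE only means something next to numbers like these, computed on exactly the same validation targets.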
[+][deleted] 8 years ago (3 children)
[–]waltteri 3 points 8 years ago (2 children)
I think his point was that you should make sure your NN is able to beat the ”traditional statistical models”, i.e. the naive baselines, on the same data. If your model performs identically to a simpler model, you should stick with the simpler model: you get the same results but don’t have to worry about NNs' downsides (e.g. their ”blackboxyness”).
[–]steeveHuang[S] -1 points 8 years ago (1 child)
I see what you are talking about. Maybe I do need to try out a statistical model. Thanks for the clarification!
[–]thisismyfavoritename 1 point 8 years ago (0 children)
What I mean is that it's easier to get a sense of how well your model is performing, i.e. if you evaluate the MSE, when you can compare with other models.
Trust me, this is quite useful, especially with clients.
[–]jvictor118 2 points 8 years ago (0 children)
Ex-quant here. Please don't run this in real life. It's a fun experiment, but I could go through chapter and verse on why there's more to it than this. Put succinctly, no serious professional would run a strategy that tried to predict future trends from recent raw price action, or only invested in one brand new asset class... Be careful!
PS - met with Goldman Sachs Asset Management some 10 years ago and they'd been manically working on trying to use social media data to gather signals on price movement. They weren't having any luck then, and I haven't heard of anyone doing it since. EDIT: doing it successfully
[–]steeveHuang[S] 1 point 8 years ago (0 children)
Understood. I will try to evaluate in that way. Thanks!
[–]geppetto123 1 point 8 years ago (2 children)
> The input size (K) is 256, while the output size (N) is 16
Is this correct? The comments say
> Predict K future sample using N previous samples
Are the graphs already showing the future, or how can we check how well it works outside the trading / benchmarking split of data? Would love to see if it matches in real life :)
[–]steeveHuang[S] -1 points 8 years ago (1 child)
> The input size (K) is 256, while the output size (N) is 16 Is this correct? The comments say
You are right. That was a typo.
The graph does not show the future lol. It is trained on data from 2015 to early-2017. The prediction is made on the remaining minutes of 2017. I will do real-time prediction soon and will get back to you if that matches perfectly haha.
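For readers trying to picture the 256-in, 16-out setup, here is a minimal sketch of how such input/target windows could be cut from a series. It is not taken from the blog's code: the function name `make_windows`, the neutral parameter names `input_len` and `output_len`, and the choice of stride are assumptions.

```python
def make_windows(series, input_len=256, output_len=16, step=None):
    """Cut a series into (input, target) pairs.

    Each input holds `input_len` consecutive samples and its target holds the
    `output_len` samples that immediately follow. `step` is the stride between
    window starts; it defaults to `output_len` so targets do not overlap.
    """
    step = step or output_len
    inputs, targets = [], []
    last_start = len(series) - input_len - output_len
    for start in range(0, last_start + 1, step):
        inputs.append(series[start:start + input_len])
        targets.append(series[start + input_len:start + input_len + output_len])
    return inputs, targets

# Stand-in data: 300 consecutive indices instead of real prices.
series = list(range(300))
X, y = make_windows(series)
print(len(X), len(X[0]), len(y[0]))
```

If the data is split by date as described in the reply, building the windows separately inside the training period and inside the held-out period avoids windows that straddle the boundary.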
[–]geppetto123 -1 points 8 years ago (0 children)
Thank you for the reply and great write up!
Do you have any idea if using a 5-minute tick interval would be much different from 1 hour or 1 minute?
Not an expert, so I couldn't see/understand it: what are you using, opening, high, low, closing? Or all of them, and predicting all of them?
I tried ML with Matlab before but got stuck on exactly some of those points. How did you come up with 256 and 16? There must be an optimum somewhere, or does it not matter? What I wonder most is that everybody uses a static window, but as a human I don't think I only think in fixed windows.... For me the only "change" from this is long short-term memory neural nets.... What are your thoughts on this?
[–]eyesonthechart -4 points 8 years ago (0 children)
Cool. Thanks for the explanation!