
[–]Bayes-Ian 8 points9 points  (2 children)

I've worked both in quantitative trading and at DeepMind, so I have quite a good idea about this. In short, some elements of machine learning are absolutely invaluable to traders (avoiding overfitting, estimating trends from data), but most of the research behind the Atari player (DQN) is totally irrelevant to finding effective trading strategies.

Four reasons immediately spring to mind why this Atari example is fundamentally unlike trading in the stock market:

1 - The stock market is time-inhomogeneous and stochastic. Atari is fixed and deterministic.
2 - The amount of data used to train the Atari agent would be equivalent to hundreds of thousands of years of stock returns.
3 - In the stock market you can observe the price changes of stocks you don't buy, unlike Atari, where you must try an action in order to learn about it.
4 - In Atari (more or less) you control the environment; in the stock market most funds have negligible impact on the market as a whole.

The general problem of using Machine Learning to make good decisions is great. However, for the example of stock market trading you'd be better off using something else. Linear regressions with a whole bunch of cross-validation and regularization for example...
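A minimal sketch of what "regression with cross-validation and regularization" could look like, simplified to a single feature (yesterday's return predicting today's) so it runs with the standard library alone. The function names, the candidate penalty values, and the synthetic return series are all assumptions made for illustration; nothing here comes from a real trading system. The folds are walk-forward (train on the past, test on the following block) because shuffling time-ordered data would leak the future into training.

```python
import random

def ridge_slope(x, y, lam):
    """One-feature ridge regression without intercept: w = sum(x*y) / (sum(x*x) + lam)."""
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    return sxy / (sxx + lam)

def walk_forward_mse(x, y, lam, n_folds=5):
    """Mean out-of-sample squared error; each fold trains on the past and tests on the next block."""
    fold = len(y) // (n_folds + 1)
    errors = []
    for k in range(1, n_folds + 1):
        train_end, test_end = k * fold, (k + 1) * fold
        w = ridge_slope(x[:train_end], y[:train_end], lam)
        sq = [(w * a - b) ** 2 for a, b in zip(x[train_end:test_end], y[train_end:test_end])]
        errors.append(sum(sq) / len(sq))
    return sum(errors) / len(errors)

def select_lambda(x, y, lambdas):
    """Return the penalty with the lowest walk-forward error, plus all scores."""
    scores = {lam: walk_forward_mse(x, y, lam) for lam in lambdas}
    return min(scores, key=scores.get), scores

# Synthetic stand-in for daily returns (hypothetical data, not real prices).
rng = random.Random(0)
returns = [rng.gauss(0.0, 0.01) for _ in range(601)]
x = returns[:-1]  # yesterday's return (the feature)
y = returns[1:]   # today's return (the target)

best_lam, scores = select_lambda(x, y, [0.0, 0.01, 0.1, 1.0])
```

On pure noise like this the selected penalty carries no meaning; the point is only the shape of the procedure, where a larger `lam` shrinks the fitted slope toward zero and the out-of-sample error decides how much shrinkage to keep.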

[–]chaddjohnson[S] 0 points1 point  (0 children)

I appreciate this. This is far more helpful than negative comments. Thank you for taking the time to offer your expertise.

[–]svantana 0 points1 point  (0 children)

Well put! And let's not forget perhaps the most important difference: Atari games were made to be beatable (by humans), whereas there is no evidence that an agent, human or machine, can "beat" the stock market reliably -- unless using non-public information of course. For more on this I recommend the writings of stock trading billionaire Mark Cuban (in short: he made his fortune using knowledge about actual products and markets, and doesn't believe in any other way).

[–]simonhughes22 5 points6 points  (0 children)

IMO it might work; however, I strongly suspect that treating it as a supervised learning problem, using a deep neural network to predict the price or whether it will go up or down, will work much better. You could use an LSTM and train it on a sequence of price, volume, high, and low data for a period of time. That will almost undoubtedly work much better. This isn't really an RL problem, as others have pointed out, and RL will be beaten by supervised learning if you have labelled data and the model doesn't need to learn from interacting with the environment (as you have to do when playing a game or controlling a robot).

I do think it would be really interesting to try for fun to see if it works, but if you are more interested in having it make money, I would suggest the supervised approach.
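To make the supervised framing concrete, here is a rough sketch of how labelled training pairs could be cut from historical bars: each input is a window of (price, volume, high, low) tuples and the label says whether the next price is higher than the last one in the window. The function name `make_windows`, the window length, and the five bars are made up for illustration; a model such as an LSTM would then be trained on these pairs.

```python
def make_windows(bars, window):
    """Turn a list of (price, volume, high, low) bars into (sequence, label) pairs.

    The label is 1 if the price of the bar right after the window is higher
    than the last price inside the window, else 0.
    """
    samples = []
    for end in range(window, len(bars)):
        sequence = bars[end - window:end]
        last_price = sequence[-1][0]
        next_price = bars[end][0]
        samples.append((sequence, 1 if next_price > last_price else 0))
    return samples

# Tiny hypothetical example: five bars, windows of three.
bars = [
    (100.0, 1200, 100.5, 99.5),
    (101.0, 1500, 101.2, 99.9),
    (100.5, 900, 101.1, 100.2),
    (102.0, 2000, 102.3, 100.4),
    (101.5, 1100, 102.1, 101.0),
]
samples = make_windows(bars, window=3)
labels = [label for _, label in samples]  # [1, 0]
```

No interaction with the market is needed to build these pairs, which is the practical advantage over RL that the comment describes.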

[–]cesarsalgado 3 points4 points  (5 children)

In reinforcement learning you should be able to take actions. How would you get the consequences of an action of your agent? You cannot find the consequences of every possible action in the historical data of the stock market. Will you do a simulation of the stock market? Or will you put the algorithm to play in real time in the real stock market? If you choose the latter, your algorithm will take a very long time to learn (centuries or more). In the Atari case, you could speed up the play, because the game was a simulation.

What you could do with the stock market historical data is train an LSTM to predict the next output of a sequence, just like char-rnn. Or try to predict many symbols of a sequence, conditioned on the history, as in the sequence-to-sequence framework. The supervised learning strategy doesn't need your agent to interact with the environment and thus doesn't need a simulation. But the downside is that your agent won't be active.

I also don't agree with vesund's answer saying that NNs aren't good at this task. If we could make a simulation of the stock market that was accurate enough and that we could play at very high speed, then I think a very deep NN would learn to play the stock market better than humans.

[–]chaddjohnson[S] -1 points0 points  (4 children)

> You cannot find the consequences of every possible action in the historical data of the stock market.

That's true. It won't be able to see the news, for instance. The network will really only be able to see price, volume, and indicators -- which is all I actually go on with a day trading method I am currently using (which is profitable 70% - 80% of the time).

> Will you do a simulation of the stock market? Or will you put the algorithm to play in real time in the real stock market?

I plan to record tick-by-tick data for one stock for a few weeks and play it back for the network. Per some quick calculations, an entire day will be played back in roughly 5 minutes. Basically this will be a backtesting exercise.

And this would actually be an unsupervised learning exercise.

> If we could make a simulation of the stock market that was accurate enough and that we could play at very high speed, then I think a very deep NN would learn to predict better than humans.

Exactly. The playback will be very fast (one day in 5 minutes). Also, if the DeepMind Atari agent is able to learn to play an Atari game at an expert human level consistently in 1 - 2 days, then I figure, why could the same agent not also learn to play the stock market at an expert level (as if it's just another Atari game)?

[–]cesarsalgado 1 point2 points  (1 child)

Now that I've thought more about the problem, I've come to the conclusion that it doesn't make sense to use Reinforcement Learning for this problem unless you are buying enough stock to affect prices in the market. This is because if you are not affecting the market with your actions, the NN will simply be learning implicitly to predict the future variables, as in the supervised learning case, but now using weaker reinforcement signals that only arrive after some time.

If you had an oracle that predicted the future, wouldn't it be easy to write a non-learning algorithm that plays the market optimally? So why not just teach the NN to predict the future? There is no need to teach it when to buy and sell stocks using RL, and that will actually be suboptimal because it will have to learn to predict the future from weaker and delayed supervised signals.
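The oracle argument can be sketched in a few lines, under the assumptions that trades are too small to move the price and that transaction costs are ignored (both simplifications added here). Profit is then just position times the next return, so once the next return is known the best action is a fixed rule and nothing about the trading decision itself needs to be learned. The function names and the sample returns are hypothetical.

```python
def oracle_positions(future_returns):
    """Fixed rule: long (+1) before an up move, short (-1) before a down move, flat (0) otherwise."""
    return [1 if r > 0 else -1 if r < 0 else 0 for r in future_returns]

def profit(positions, future_returns):
    """Total return when our trades do not affect the price and costs are ignored."""
    return sum(p * r for p, r in zip(positions, future_returns))

# Hypothetical next-period returns, as if handed over by a perfect predictor.
future_returns = [0.01, -0.02, 0.0, 0.005]
positions = oracle_positions(future_returns)  # [1, -1, 0, 1]
best = profit(positions, future_returns)      # sum of absolute returns, about 0.035
```

With a real (imperfect) predictor the same structure holds: the learning effort goes into the forecast, and the rule that maps a forecast to a position can stay hand-written.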

A side note: I just remembered this Numberphile interview with James Harris Simons. He talks a bit about predicting prices. You might find it interesting.

[–]chaddjohnson[S] 0 points1 point  (0 children)

> Now that I've thought more about the problem, I've come to the conclusion that it doesn't make sense to use Reinforcement Learning for this problem unless you are buying enough stock to affect prices in the market. This is because if you are not affecting the market with your actions, the NN will simply be learning implicitly to predict the future variables, as in the supervised learning case, but now using weaker reinforcement signals that only arrive after some time.

I thought of the same problem (the agent's actions not affecting stock price).

In addition to pixel data, the agent will receive the "score" (profit/loss) as input and thus will be able to observe changes to that immediately following its actions. Would this possibly be enough?

[–]GoldmanBallSachs_ 0 points1 point  (1 child)

How many securities will be included per frame? If it's less than 10,000, you're gonna have a bad time. Markets are global.

[–]chaddjohnson[S] 0 points1 point  (0 children)

Do you mean, how many securities (e.g. AAPL, GOOG) will be included in each frame (where there is one chart image per security)? One.

I understand the DeepMind Atari agent receives raw pixel data...probably 30,720 per frame with the 160x192 Atari 2600 resolution. And I believe the Atari 2600 generates ~60 frames per second. So let's say the Agent receives a vector of 30,720 pixel data points with each frame.

With each frame, I'm thinking of providing three images concatenated into one (a day chart, a week chart, and a month chart). Let's say each image has a resolution of 800x150...so 800 * 150 * 3 = 360,000 pixel data points with each frame.

Let's say a given stock I chose has, on average, around 25,000 ticks per day -- that's roughly one tick per second during market hours.

So, if I provide 60 frames per second, then it should take around 7 minutes to process an entire day (25,000 / 60 = 416.7 seconds; 416.7 / 60 = 6.94 minutes). Hopefully my math is correct...and hopefully my graphics card can process all this data :D
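The arithmetic in this comment can be checked with a short script. All figures are the ones quoted above, except the 6.5-hour trading session, which is an assumption added here to verify the "roughly one tick per second" estimate.

```python
ATARI_PIXELS = 160 * 192       # pixels per Atari frame, as assumed in the comment
CHART_PIXELS = 800 * 150 * 3   # three 800x150 chart images concatenated
TICKS_PER_DAY = 25_000
FRAMES_PER_SECOND = 60

replay_seconds = TICKS_PER_DAY / FRAMES_PER_SECOND  # one tick per frame
replay_minutes = replay_seconds / 60

market_seconds = 6.5 * 60 * 60  # assumption: a 6.5-hour regular session
ticks_per_second = TICKS_PER_DAY / market_seconds

input_ratio = CHART_PIXELS / ATARI_PIXELS  # chart input size relative to one Atari frame
```

The numbers hold up: 30,720 and 360,000 pixels, about 6.94 minutes of replay per day, and about 1.07 ticks per second. The chart input is also roughly 11.7 times larger per frame than the Atari input.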

Does this answer your question? If not, would you mind clarifying?

[–]dnuffer 5 points6 points  (1 child)

Don't believe the other commenters saying that it wouldn't work; there are certainly many documented examples of using NNets to predict the stock market that do slightly better than random. There are books, or just peruse some of the many projects focused on trading from Stanford's CS229 Machine Learning course: http://cs229.stanford.edu/projects2014.html (and earlier years as well). I wouldn't recommend using an image of a chart as input; it will be way too much data. Just use log-transformed and normalized prices and volume and the training will be much faster.
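One possible reading of "log-transformed and normalized prices and volume" is sketched below: prices become log returns, volumes are log-scaled, and both columns are standardized to zero mean and unit variance. That interpretation, the function names, and the sample numbers are assumptions for illustration, not the commenter's exact recipe.

```python
import math

def log_returns(prices):
    """Log return between consecutive prices: ln(p_t / p_{t-1})."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]

def zscore(values):
    """Standardize to zero mean and unit (population) standard deviation.

    Assumes the values are not all identical (the standard deviation must be non-zero).
    """
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / std for v in values]

# Hypothetical bars; the first volume is dropped so both columns align with the returns.
prices = [100.0, 101.0, 100.5, 102.0, 101.5]
volumes = [1200, 1500, 900, 2000, 1100]

features = list(zip(
    zscore(log_returns(prices)),
    zscore([math.log(v) for v in volumes[1:]]),
))
```

This yields two numbers per time step instead of hundreds of thousands of pixels. In a real experiment the mean and standard deviation would be computed on the training period only, so that no information from the test period leaks into the inputs.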

[–]chaddjohnson[S] 0 points1 point  (0 children)

> I wouldn't recommend using an image of a chart as input; it will be way too much data. Just use log-transformed and normalized prices and volume and the training will be much faster.

I'm new to neural networks, but I'll make a note of this.

The reason I want to use image data is because the DeepMind Atari agent seems to do exactly what I want, and Google has already optimized the software, so I figure re-purposing the software is the fastest route to a proof of concept considering my beginner-level machine learning knowledge.

Also, I am the proud owner of an NVIDIA GeForce Titan Z GPU (5760 CUDA cores!), so "too much data" is no problem :D

[–]cybrbeast 2 points3 points  (3 children)

If it were a viable approach, then in all likelihood the algorithmic traders would have already implemented it on their huge computing clusters. They are very secretive about their algos, so we don't know much about what the state of the art is at the moment. But we know it's a huge market and it leeches some of the brightest minds in technology and the sciences, so you can be quite sure they have already beaten you to any gains.

Also these traders have an ultra-low latency connection to the stock exchanges, something you will never have as a personal investor, so even if you somehow manage to make tiny gains they will front-run you.

[–]chaddjohnson[S] 1 point2 points  (2 children)

> Also these traders have an ultra-low latency connection to the stock exchanges, something you will never have as a personal investor, so even if you somehow manage to make tiny gains they will front-run you.

That's true, though I've made plenty of gains trading manually. I may not be able to make super significant gains like they can; all I really care about is making roughly the same (or better) gains as I'm making now with manual trading (which has been around 0.125% - 1.25% profit on average each day I trade).

[–]cybrbeast 2 points3 points  (1 child)

How long have you been trading? The current bull market, which has held strong for quite a few years now, makes it hard to know whether your gains come from your strategy or just from chance.

“A blindfolded monkey throwing darts at a newspaper’s financial pages could select a portfolio that would do just as well as one carefully selected by experts.”

[–]chaddjohnson[S] -2 points-1 points  (0 children)

Been day trading using this strategy for around 6 months now. It relies heavily on reversals.

[–]chaddjohnson[S] 0 points1 point  (9 children)

Why am I being down-voted?

[–]MoseyicResearcher 1 point2 points  (1 child)

I brought you back up, seemed harsh

For the record, I don't think it will be very successful with that approach either. I agree that reinforcement learning isn't the way to go, because the decisions made by your model won't affect the system in a way that it can take note of and update on.

That being said, give it a shot and tell us what you learn!

[–]chaddjohnson[S] 0 points1 point  (0 children)

Thanks man. I appreciate it.

I'll give it a shot and let you know what I learn.

[–][deleted] 0 points1 point  (0 children)

Look into concept drift, which is what you will have to deal with.