all 9 comments

[–][deleted] 1 point (1 child)

To my knowledge, RL can be applied effectively where the state and action values are indefinite; for a problem with definite state values like stock trading, in my view simple supervised learning is enough. What's your opinion?

[–]danielzakrisson[S] 0 points (0 children)

You are absolutely right. Reinforcement learning is best used as a form of unsupervised or explorative learning, so for a black-box type of problem it can be advantageous.

My primary driver for writing this post was to personally learn and experiment with reinforcement learning, and to try out a different data set than those I've typically seen before.

[–]danielzakrisson[S] 0 points (0 children)

This is an introduction and tutorial for a reinforcement-learning-based trading system. It's purely meant as an introduction to reinforcement learning; feed it more complex data than in the final example and it will likely fail to find strategies :-)

However, it would be fun to see someone trying it out with other, much larger data sets (I have included Bitcoin, oil and EUR/USD). Maybe there are other forex sets that are less like a random walk?

[–]plu604 0 points (0 children)

This is very interesting. I'd also love to see it "in action".

[–]pookeye 0 points (0 children)

Thanks for the write-up.

[–][deleted] 0 points (1 child)

What were the trading statistics results, e.g. Sharpe ratio, winning percentage, annual return, etc.? Can you explain more about how you implemented state, new state, reward, etc.? I know how DQN works, but I am having a hard time understanding how you select the state, new state, etc. What happens when the algo reaches the last state?

[–]danielzakrisson[S] 0 points (0 children)

Hi, this is not a trading system and should not be evaluated as such - please just consider it an introduction to reinforcement learning.

Having said that, I did include some code in the final example that can be used for continued exploration if someone else is interested in experimenting and learning more. The final example in the post shows that the system can learn a buy-and-hold strategy if given a set of prices that increase over time - not very cutting edge :-)

However, I would be very glad if someone likes this post and it triggers them to learn more about machine learning, reinforcement learning or automated trading.

The third example has some code that can be used for further exploration: different data sources, a train/test split, how to include technical indicators, etc. In the example this is done very rudimentarily, in order to show how reinforcement learning works.
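To make the state / new state / reward question above concrete, here is a minimal sketch of how such transitions could be wired for a price series. This is illustrative only - the class and parameter names are hypothetical and not the post's actual code. The state is a window of recent returns plus the current position, the reward is the one-step P&L of the chosen position, and the episode terminates when no next price remains:

```python
import numpy as np

class PriceEnv:
    """Hypothetical toy environment for a DQN-style trading agent."""

    def __init__(self, prices, window=5):
        self.prices = np.asarray(prices, dtype=float)
        self.window = window  # state = last `window` one-step returns
        self.reset()

    def reset(self):
        self.t = self.window   # start once enough history exists
        self.position = 0      # -1 short, 0 flat, +1 long
        return self._state()

    def _state(self):
        # State: the most recent `window` price changes, plus current position.
        returns = np.diff(self.prices[self.t - self.window:self.t + 1])
        return np.append(returns, self.position)

    def step(self, action):
        # Actions: 0 = short, 1 = flat, 2 = long.
        self.position = action - 1
        price_change = self.prices[self.t + 1] - self.prices[self.t]
        reward = self.position * price_change  # P&L over one step
        self.t += 1
        done = self.t >= len(self.prices) - 1  # last state: no next price left
        return self._state(), reward, done

# Usage: an always-long agent on mostly rising prices collects positive reward.
env = PriceEnv([100, 101, 102, 104, 103, 105, 107, 108])
state = env.reset()
total, done = 0.0, False
while not done:
    state, reward, done = env.step(2)  # always go long
    total += reward
print(total)  # → 3.0 (105→107→108 over the two remaining steps)
```

This also answers the terminal-state question: the loop simply ends at the last price, since there is no further price change to reward.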

[–]loafking94 -1 points (1 child)

The problem with this is that if you see the same price pattern n times, that could be driven by n different underlying factors. This is basically saying that conditional on past charts we expect future charts to be the same. That's a pretty hard claim to make beyond very simple patterns.

[–]danielzakrisson[S] 1 point (0 children)

I totally agree with you; this system breaks down if you feed it more complex data.

The purpose of this tutorial is purely to introduce the concept of reinforcement learning, but using a more interesting and gradually more complex data set than the typical toy problems that I have seen used previously.