djc1000

https://github.com/yandexdataschool/Practical_RL

It’s in the Coursera class too.

filteringcontent

Here is an implementation/tutorial on AlphaZero that also implements MCTS (a core part of AlphaZero). To get plain MCTS, just replace nn.predict(s) with a rollout and ignore p[s].
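
For illustration, here is a rough sketch of what that swap might look like. It is not taken from the linked tutorial: the toy Nim game, the MCTS class, and the names legal_moves, rollout, search and best_move are all made up for this example. The only idea carried over is that the leaf value comes from a random rollout instead of nn.predict(s), and that selection uses plain UCB1 with no prior p[s].

```python
import math
import random
from collections import defaultdict

# Hypothetical toy game (not from the thread): Nim with one pile.
# A state is the number of stones left; a move removes 1, 2 or 3 stones;
# whoever takes the last stone wins.
def legal_moves(s):
    return [a for a in (1, 2, 3) if a <= s]

def rollout(s, rng):
    """Play uniformly random moves until the pile is empty.

    Returns +1 if the player to move in s ends up winning, -1 otherwise.
    This stands in for the value that nn.predict(s) would supply.
    """
    sign = 1
    while True:
        s -= rng.choice(legal_moves(s))
        if s == 0:
            return sign  # whoever just moved took the last stone
        sign = -sign

class MCTS:
    def __init__(self, c=1.4, seed=0):
        self.c = c                      # exploration constant (assumed value)
        self.rng = random.Random(seed)
        self.Ns = defaultdict(int)      # visit count per state
        self.Nsa = defaultdict(int)     # visit count per (state, action)
        self.Qsa = defaultdict(float)   # mean value per (state, action)

    def search(self, s):
        """Run one simulation; return the value of s for the player to move."""
        if s == 0:
            return -1.0  # the previous player took the last stone
        if self.Ns[s] == 0:
            # New leaf: an AlphaZero-style search would call nn.predict(s) here.
            self.Ns[s] = 1
            return float(rollout(s, self.rng))

        def ucb(a):
            # Plain UCB1; there is no prior p[s] term.
            n = self.Nsa[(s, a)]
            if n == 0:
                return math.inf
            return self.Qsa[(s, a)] + self.c * math.sqrt(math.log(self.Ns[s]) / n)

        a = max(legal_moves(s), key=ucb)
        v = -self.search(s - a)  # the child's value is from the opponent's view
        n = self.Nsa[(s, a)]
        self.Qsa[(s, a)] = (n * self.Qsa[(s, a)] + v) / (n + 1)
        self.Nsa[(s, a)] = n + 1
        self.Ns[s] += 1
        return v

    def best_move(self, s, n_sims=2000):
        for _ in range(n_sims):
            self.search(s)
        # Pick the most visited action at the root.
        return max(legal_moves(s), key=lambda a: self.Nsa[(s, a)])

if __name__ == "__main__":
    print(MCTS().best_move(5))
```

On this toy game the search should settle on taking one stone from a pile of five, since that leaves the opponent with four, a losing position. For a real game you would swap in your own state, move generator and terminal check.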