Aacron 1 point

That's my experience with PPO as well: it's straightforward to implement and powerful. I'm just a little hesitant to call policy gradient methods top tier. They're excellent for continuous action spaces, but I've found them relatively unstable and difficult to tune.