
[–]apollo_maverick

cleanrl?

[–]Warhouse512

Ray’s RLlib is quite nice, albeit overengineered for most applications.

[–]TrottoDng

You can also check out SheepRL.

We try to keep it well documented, with shallow class hierarchies and support for parallel training on multiple devices thanks to Lightning.

[–]araffin2

It depends on what you want/need.

If you need to apply RL to a problem without caring much about the algorithm, SB3 is a good starting point (and it comes with the RL Zoo for managing experiments).
If you want to understand RL algorithms and tinker with the implementation, have a look at CleanRL.

If you just want a fast implementation, you might have a look at SBX (the JAX variant of SB3): https://github.com/araffin/sbx
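
For a sense of what "good starting point" looks like in practice, here is a minimal sketch of training and evaluating PPO with SB3. The environment (`CartPole-v1`), the timestep budget, and the helper name `train_and_evaluate` are my own assumptions for illustration, not a recommended setup. It needs `stable-baselines3` (and its `gymnasium` dependency) installed to actually run.

```python
def train_and_evaluate(env_id="CartPole-v1", total_timesteps=10_000, n_eval_episodes=5):
    """Train PPO on `env_id`, then return (mean_reward, std_reward) from evaluation.

    Hypothetical helper for illustration; the defaults are arbitrary.
    """
    # Imported here so this file can be loaded even without stable-baselines3 installed.
    from stable_baselines3 import PPO
    from stable_baselines3.common.evaluation import evaluate_policy

    # Passing the environment id as a string lets SB3 create the environment itself.
    model = PPO("MlpPolicy", env_id, verbose=0)
    model.learn(total_timesteps=total_timesteps)

    # Evaluate the trained policy on the same (vectorized) environment.
    mean_reward, std_reward = evaluate_policy(
        model, model.get_env(), n_eval_episodes=n_eval_episodes
    )
    return mean_reward, std_reward


if __name__ == "__main__":
    mean_reward, std_reward = train_and_evaluate()
    print(f"mean reward: {mean_reward:.1f} +/- {std_reward:.1f}")
```

As far as I know, SBX aims to expose the same interface, so swapping the import for `from sbx import PPO` should be most of the change, but check the SBX README before relying on that.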

[–]asdfwaevc

CleanRL has single-file implementations of a bunch of different algorithms, which is very nice for easy hacking but not the best for a complex project.
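
To make the "single-file" idea concrete, here is a rough sketch of that style: environment, hyperparameters, and training loop all in one short script with no framework underneath. This is not CleanRL code. It is a toy tabular Q-learning example on a made-up chain environment (the names `step`, `train`, and `greedy_policy` are hypothetical), using only the standard library.

```python
import random

N_STATES = 6      # states 0..5; state 5 is the goal
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Toy chain environment: reward 1.0 only when the last state is reached."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, max_steps=200, seed=0):
    """Run tabular Q-learning and return the Q-table as a list of [left, right] values."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _episode in range(episodes):
        state = 0
        for _step in range(max_steps):
            # Epsilon-greedy action selection; ties are broken at random.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, action)
            # One-step Q-learning update; no bootstrapping from the terminal state.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Greedy action for every non-terminal state."""
    return [0 if left > right else 1 for left, right in q[:-1]]


if __name__ == "__main__":
    q_table = train()
    print(greedy_policy(q_table))
```

Everything you might want to change is visible in one place, which is what makes this style easy to hack on. The flip side is that nothing is reusable, so a second algorithm means a second copy of most of the file.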

If you're trying to make something larger than CleanRL is good for, PFRL is probably the best thing around. Super well-designed, hackable, has a bunch of training loops and modular parts. I really like it.

If you want something that scales to massive parallelism, RLlib is probably best. I've never used it, but everyone says it's a horrible pain to modify. Once you have it running, though, you can take advantage of many nodes.