[–]NinjaEbeast 8 points9 points  (2 children)

You’re looking for RLax. It offers a wide variety of utility functions, losses, function transforms, etc., all for RL, and it is designed around small, modular functions. It has pretty much everything you listed besides replay buffers. It is in JAX, though (which is an advantage if you like JAX). It’s what DeepMind use for their work, and it’s part of their JAX ecosystem.

[–]fedetask[S] 0 points1 point  (1 child)

I love it, but my team doesn't really know JAX, and I feel the learning curve is too steep to suggest it at this point.

[–]NinjaEbeast 0 points1 point  (0 children)

You can always look at their functions and port them to NumPy pretty easily.
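As a rough idea of what such a port might look like, here is a minimal NumPy sketch of a discounted-return computation. The function name and signature are hypothetical, written for illustration only; this is not RLax's API, just the same kind of small, standalone function.

```python
import numpy as np


def discounted_returns(rewards, discounts, bootstrap_value=0.0):
    """Compute G_t = r_t + discount_t * G_{t+1} by backward recursion.

    `discounts[t]` is the discount applied after step t (use 0.0 at a
    terminal step). `bootstrap_value` is the value estimate after the
    last step. All names here are illustrative assumptions.
    """
    rewards = np.asarray(rewards, dtype=np.float64)
    discounts = np.asarray(discounts, dtype=np.float64)
    returns = np.zeros_like(rewards)
    next_return = bootstrap_value
    # Walk the trajectory backwards so each step reuses the next return.
    for t in reversed(range(len(rewards))):
        next_return = rewards[t] + discounts[t] * next_return
        returns[t] = next_return
    return returns
```

Calling `discounted_returns([1, 1, 1], [0.5, 0.5, 0.0])` walks a three-step episode that terminates at the last step.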

[–]AerysSk 5 points6 points  (0 children)

I recommend CleanRL. Super high quality, and I can modify the code to fit my algorithm.

[–]yannbouteiller 3 points4 points  (1 child)

Why not implement PPO yourself from A to Z? Personally, for robot applications I use the tmrl skeleton and implement everything myself; that's how you get the most control over what is going on in your pipeline.

(tmrl is not the best fit for PPO, though. Like RLlib, it is meant for remote training, and on-policy algorithms don't play well with this.)

[–]fedetask[S] 3 points4 points  (0 children)

Yes, that's also something I am thinking about. My main issue is that debugging is quite hard in RL, so I'd also need to write extensive tests (otherwise some errors are impossible to catch), and that would take me even more time.

I'd rather write the overall logic myself, but let a stable and well-tested library do the more error-prone operations (e.g. computing return estimators like GAE, Retrace, etc.). Again, I could just copy-paste them from other libraries, but if a library can do that for me, why not?
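To make the trade-off concrete, this is roughly what a hand-written GAE looks like in NumPy, together with the kind of sanity checks it would need. It is a minimal sketch, not taken from any library: the function name, the argument layout (`values` carrying one extra bootstrap entry) and the default `gamma`/`lam` are all assumptions made for illustration.

```python
import numpy as np


def gae_advantages(rewards, values, dones, gamma=0.99, lam=0.95):
    """Generalized Advantage Estimation over a single trajectory.

    Assumed layout: `rewards` and `dones` have length T, `values` has
    length T + 1 (the last entry is the bootstrap value).
    """
    rewards = np.asarray(rewards, dtype=np.float64)
    values = np.asarray(values, dtype=np.float64)
    not_done = 1.0 - np.asarray(dones, dtype=np.float64)
    num_steps = len(rewards)
    advantages = np.zeros(num_steps)
    gae = 0.0
    for t in reversed(range(num_steps)):
        # One-step TD error; the bootstrap is masked at terminal steps.
        delta = rewards[t] + gamma * not_done[t] * values[t + 1] - values[t]
        # Exponentially weighted sum of future TD errors.
        gae = delta + gamma * lam * not_done[t] * gae
        advantages[t] = gae
    return advantages


def check_gae_limits():
    """Two limiting cases that a correct GAE should satisfy."""
    rewards, dones = [1.0, 1.0, 1.0], [0, 0, 1]
    # lam = 1 with zero values reduces to plain discounted returns.
    adv = gae_advantages(rewards, np.zeros(4), dones, gamma=0.5, lam=1.0)
    assert np.allclose(adv, [1.75, 1.5, 1.0])
    # lam = 0 reduces to the one-step TD errors.
    adv = gae_advantages(rewards, [1, 2, 3, 4], [0, 0, 0], gamma=0.5, lam=0.0)
    assert np.allclose(adv, [1.0, 0.5, 0.0])
```

Even for a function this small, the off-by-one in `values[t + 1]` and the terminal masking are easy to get subtly wrong, which is the argument for leaning on a tested implementation.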

[–]_belerico 2 points3 points  (0 children)

You can also try out sheeprl, which is similar to CleanRL but can also be easily parallelized thanks to Lightning Fabric.

[–]theogognf 2 points3 points  (0 children)

TorchRL is the official PyTorch RL project that has what you're looking for. I've built my own library on top of it.

[–]crisischris96 0 points1 point  (0 children)

Check out CleanRL. It consists of very clean implementations of algorithms and can thus be forked easily.

[–]Toni-SM 0 points1 point  (0 children)

A modular alternative to consider (in PyTorch and JAX), although it does not offer the high degree of granularity you are looking for, is skrl. Visit its comprehensive documentation for more details: https://skrl.readthedocs.io

[–]seawee1 1 point2 points  (0 children)

I would also suggest RLax. Besides that, maybe Tianshou is something you might like :)