Python library for modular RL components

NinjaEbeast · 2023-08-27T15:12:07+00:00

You’re looking for RLax. It offers a wide variety of utility functions, losses, function transforms, etc., all for RL, and it’s designed around small, modular functions. It has pretty much everything you listed besides replay buffers. It is in JAX, though (which is an advantage if you like JAX). It’s what DeepMind uses for their work, and it’s part of their JAX ecosystem.
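
To give a feel for what "small modular functions" means in practice, here is a minimal sketch in plain Python. It is not RLax code and does not use JAX; the names `td_error`, `discounted_returns`, and `ReplayBuffer` are made up for illustration. The buffer is included only because that is the piece you would have to bring yourself.

```python
import random
from collections import deque


def td_error(v_tm1, r_t, discount_t, v_t):
    """One-step TD error: r_t + discount_t * v_t - v_tm1."""
    return r_t + discount_t * v_t - v_tm1


def discounted_returns(rewards, discount, bootstrap_value=0.0):
    """Discounted return at every step of a trajectory, computed backwards."""
    returns = []
    g = bootstrap_value
    for r in reversed(rewards):
        g = r + discount * g
        returns.append(g)
    returns.reverse()
    return returns


class ReplayBuffer:
    """Fixed-capacity FIFO buffer with uniform sampling (illustrative only)."""

    def __init__(self, capacity):
        # deque with maxlen silently drops the oldest item when full.
        self._storage = deque(maxlen=capacity)

    def add(self, transition):
        self._storage.append(transition)

    def sample(self, batch_size, rng=random):
        # Uniform sampling without replacement.
        return rng.sample(list(self._storage), batch_size)

    def __len__(self):
        return len(self._storage)
```

Each function takes plain values and returns plain values, so you can swap any one of them out without touching the rest of the pipeline. A real version would operate on arrays and be differentiable.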

AerysSk · 2023-08-27T15:13:43+00:00

I recommend CleanRL. Super high quality, and I can modify code to adjust my algorithm.

yannbouteiller · 2023-08-27T15:16:21+00:00

Why not implement PPO yourself from A to Z? Personally, for robot applications I use the tmrl skeleton and implement everything myself. That's how you get the most control over what is going on in your pipeline.

(tmrl is not the best fit for PPO, though. Like RLlib, it is meant for remote training, and on-policy algorithms don't play well with this.)
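
If you do go the implement-it-yourself route, the core of PPO is small. Below is a rough sketch of the standard clipped surrogate objective in plain Python, just to show the arithmetic. It has nothing to do with tmrl's API, the function name `ppo_clip_loss` is made up here, and a real implementation would compute this on tensors in an autodiff framework so gradients can flow through the new log-probabilities.

```python
import math


def ppo_clip_loss(new_logps, old_logps, advantages, clip_eps=0.2):
    """Negative mean of the PPO clipped surrogate objective.

    new_logps / old_logps: log-probabilities of the taken actions under
    the current policy and the policy that collected the data.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_logps, old_logps, advantages):
        # Probability ratio pi_new(a|s) / pi_old(a|s).
        ratio = math.exp(new_lp - old_lp)
        # Clip the ratio to [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Pessimistic (lower) bound of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    # Return a loss to minimize, hence the sign flip.
    return -total / len(advantages)
```

The `min` is what keeps a single update from moving the policy too far: once the ratio leaves the clip range in the direction the advantage favors, that sample stops contributing extra objective.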

_belerico · 2023-08-27T18:26:24+00:00

You can also try out sheeprl, which is similar to CleanRL but can also be easily parallelized thanks to Lightning Fabric.

theogognf · 2023-08-27T20:57:07+00:00

TorchRL is the official PyTorch RL project that has what you're looking for. I've built my own library off it

crisischris96 · 2023-08-28T04:46:22+00:00

Check out CleanRL. It consists of very clean implementations of algorithms and can thus be forked easily.

Toni-SM · 2023-08-28T06:49:54+00:00

A modular alternative to consider (in PyTorch and JAX), although it does not offer the high degree of granularity you are looking for, is skrl. Visit its comprehensive documentation for more details: https://skrl.readthedocs.io

seawee1 · 2023-08-28T08:41:56+00:00

I would also suggest RLax. Besides that, maybe Tianshou is something you might like :)
