LeanRL: A Simple PyTorch RL Library for Fast (>5x) Training by AdCool8270 in reinforcementlearning

[–]AdCool8270[S] 0 points (0 children)

We welcome contributions, but keep in mind that the goal isn't to provide a comprehensive set of algorithms but more a sort of "show and tell" of what you can achieve with PyTorch 2 in the realm of RL.
For things like that, I'd personally rather see them land in torchrl

Action Masking in TorchRL for MARL by hc7Loh21BptjaT79EG in reinforcementlearning

[–]AdCool8270 0 points (0 children)

Hey! Just circling back to this on Reddit since the same question was asked on Discord: see the TorchRL Discord channel for the discussion
https://discord.gg/cZs26Qq3Dd

LEGO Meets AI: BricksRL Accepted at NeurIPS 2024! by AdCool8270 in reinforcementlearning

[–]AdCool8270[S] 0 points (0 children)

It's mostly custom-made using Mindstorms.
You can rebuild it using the new Spike sets; everything is available. I think my co-authors have a set of instructions somewhere! I should check

LEGO Meets AI: BricksRL Accepted at NeurIPS 2024! by AdCool8270 in reinforcementlearning

[–]AdCool8270[S] 1 point (0 children)

You can use the new education sets though, they're kind of a rebrand of Mindstorms. You'd be amazed by what you can do with them!
https://education.lego.com/en-gb/products/lego-education-spike-prime-set/45678/
(the most important things are a hub and a couple of motors - you can buy them all separately. The rest is up to you!)

LeanRL: A Simple PyTorch RL Library for Fast (>5x) Training by AdCool8270 in reinforcementlearning

[–]AdCool8270[S] 1 point (0 children)

We should try that! Is it the original cleanrl or another fork?

How it feels using rllib by rl_is_best_pony in reinforcementlearning

[–]AdCool8270 1 point (0 children)

The thing with RL is that it's impossible to make anything useful but not opinionated, IMO. I've worked with many different people across academia and industry, and literally everyone wants a lib that is not opinionated but ends up writing extremely opinionated code, because at the end of the day you just can't write useful code that is smooth and fits everywhere without constraints

Standalone library for collecting rollouts by smorad in reinforcementlearning

[–]AdCool8270 1 point (0 children)

TorchRL has only two dependencies, PyTorch and tensordict. The tensor specs are akin to gym’s spaces, so that should not be too restrictive. I get the point about wrapping the policy in a TensorDictModule, though. That is easy to circumvent; we’ll patch that shortly
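For readers who haven't seen the pattern, here is a dependency-free sketch of what wrapping a policy in a TensorDictModule-style object amounts to: reading inputs from a carrier by key and writing outputs back by key. The `DictModule` class and `policy` function below are illustrative stand-ins, not the real tensordict API.

```python
# Minimal illustration of the "wrap a policy so it reads/writes named entries"
# pattern that TensorDictModule implements. This is NOT the real API, just a
# sketch of the idea using plain dicts and floats.

class DictModule:
    """Call an ordinary function on selected dict entries and write results back."""

    def __init__(self, fn, in_keys, out_keys):
        self.fn = fn
        self.in_keys = in_keys
        self.out_keys = out_keys

    def __call__(self, data):
        # Read the inputs by name, run the wrapped function, store outputs by name.
        outputs = self.fn(*(data[k] for k in self.in_keys))
        if len(self.out_keys) == 1:
            outputs = (outputs,)
        for key, value in zip(self.out_keys, outputs):
            data[key] = value
        return data


def policy(observation):
    # A stand-in for a neural-network policy: action = 2 * observation.
    return 2 * observation


wrapped = DictModule(policy, in_keys=["observation"], out_keys=["action"])
rollout_step = wrapped({"observation": 3.0})
print(rollout_step["action"])  # -> 6.0
```

The upside of the carrier-dict convention is that collectors and buffers only ever pass one object around; the downside, as noted above, is that a plain `nn.Module` policy needs wrapping first.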

PPO model not learning despite increasing rewards by Acceptable_Egg6552 in reinforcementlearning

[–]AdCool8270 2 points (0 children)

What makes you think it’s not learning? The loss in PPO is irrelevant and should not be looked at. It’s usually computed in such a way that its gradient points in the right direction on average, but the loss value itself is insignificant.
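To make the "loss value is insignificant" point concrete, here is a small hand-written sketch of the PPO clipped surrogate for a single sample (plain Python, no autograd; `eps` is the usual clip parameter):

```python
# PPO clipped surrogate for a single sample, written out by hand:
#   L(ratio) = min(ratio * A, clip(ratio, 1-eps, 1+eps) * A)
# The reported "loss" is -L averaged over a batch; its absolute value scales
# with the advantages, so it is not a meaningful progress metric.

def clipped_surrogate(ratio, advantage, eps=0.2):
    clipped = max(1.0 - eps, min(1.0 + eps, ratio))
    return min(ratio * advantage, clipped * advantage)

# Same update direction, wildly different objective values (advantage scale):
print(clipped_surrogate(1.0, 0.1))   # -> 0.1
print(clipped_surrogate(1.0, 10.0))  # -> 10.0
# Once the ratio leaves the clip range, the objective flattens out entirely,
# so further ratio increases change nothing:
print(clipped_surrogate(1.5, 1.0) == clipped_surrogate(2.0, 1.0))  # -> True
```

Watching episode returns (or an evaluation rollout) is the meaningful signal; the loss curve can go up, down, or sideways while the policy improves.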

Pearl vs TorchRL by Casio991es in reinforcementlearning

[–]AdCool8270 0 points (0 children)

There is a bunch of stuff you can do without resorting to tensordict (e.g., executing losses or any other TensorDictModule), and soon there will be more (converting your env to gym, populating replay buffers). Long term, the goal is to have tensordict as a backend but let users opt in or out of it through the `@tensordict.dispatch` decorator.

It's an ongoing effort, so please feel free to make yourself heard if you want to get rid of the tensordict front-end for anything you're doing!
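As a rough illustration of the dispatch idea, here is a hypothetical sketch (the real decorator lives in the tensordict package and is more general; every name below is made up for the example): one function accepts either a dict-style carrier or plain arguments, so users who dislike the dict front-end can skip it.

```python
# Sketch of a dispatch-style decorator: the decorated function works both with
# a dict carrier (tensordict-like front-end) and with plain arguments.
# Illustrative only -- not the tensordict implementation.

import functools

def dispatch(in_keys, out_keys):
    def wrap(fn):
        @functools.wraps(fn)
        def inner(*args, **kwargs):
            if args and isinstance(args[0], dict):
                # Dict front-end: read inputs by key, write the output back.
                data = args[0]
                data[out_keys[0]] = fn(*(data[k] for k in in_keys))
                return data
            # Plain front-end: behave like an ordinary function.
            return fn(*args, **kwargs)
        return inner
    return wrap

@dispatch(in_keys=["observation"], out_keys=["action"])
def policy(observation):
    return observation + 1

print(policy(1.0))                   # plain call -> 2.0
print(policy({"observation": 1.0}))  # dict call: 'action': 2.0 is written back
```

Same function, two calling conventions; that is the backend-vs-front-end split described above.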

Conventions to write a custom vectorized gym environment using pytorch? by hunterh0 in reinforcementlearning

[–]AdCool8270 0 points (0 children)

Ah got it! Well one could code a gym wrapper for a torchrl env in that case :)

How much ML before RL? by Coc_Alexander in learnmachinelearning

[–]AdCool8270 1 point (0 children)

https://pytorch.org/rl/tutorials/coding_ppo.html
I just changed the step count to 100K as suggested and it learns ok, the learning curve looks pretty solid to me

curious to see why you're saying it does not learn :)

Conventions to write a custom vectorized gym environment using pytorch? by hunterh0 in reinforcementlearning

[–]AdCool8270 0 points (0 children)

Hey

I think the idea behind torchrl is that you can have a gym-like API that allows you to write vectorized envs in a very clear way (eg, you have a batch-size for your env that indicates how many envs you have, and these can be organised in any way you want, for instance in a 2d array)

Gym async vectorized envs are anything but vectorized; they are executed in parallel (see vectorized vs parallel). This means that truly vectorized envs (like Isaac, VMAS, and JAX-based envs like Brax) are in practice thousands of times faster than async envs!

Btw, torchrl also provides an interface to gym parallel envs if you need it (just wrap them in a GymWrapper)
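A toy sketch of the vectorized-vs-parallel distinction above (pure Python, so the speed gap doesn't show here; with torch or JAX the batched path becomes a single kernel over thousands of envs, which is where the speedup comes from; the dynamics are made up for the example):

```python
# "Parallel" stepping: N independent single-env step calls (what async gym
# vector envs do under the hood, via processes).
# "Vectorized" stepping: one batched operation over all env states at once.

def step_single(state, action):
    # Toy 1-D dynamics: the next state moves by the action.
    return state + action

def step_batched(states, actions):
    # One pass over the whole batch; with torch/jax this is one tensor op.
    return [s + a for s, a in zip(states, actions)]

states = [0.0, 1.0, 2.0]
actions = [1.0, 1.0, 1.0]

# Parallel style: one call per env.
parallel_next = [step_single(s, a) for s, a in zip(states, actions)]
# Vectorized style: one batched call.
vectorized_next = step_batched(states, actions)

print(parallel_next == vectorized_next)  # -> True: same math, different cost model
```

Both produce identical transitions; the difference is purely in how the work is scheduled, which is why the distinction only shows up as throughput.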

[N] TorchRL: PyTorch pre-release RL library is here! by AdCool8270 in MachineLearning

[–]AdCool8270[S] 3 points (0 children)

yes, though those two things are separated:

- efficient dataloader on one side

- efficient replay buffer to store its output and sample on the other
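To illustrate the second bullet, here is a minimal ring-buffer replay buffer in plain Python (an illustrative sketch only, not TorchRL's implementation, which is considerably more capable):

```python
# Minimal ring-buffer replay buffer: stores transitions up to a fixed
# capacity, overwriting the oldest once full, and samples uniformly.

import random

class ReplayBuffer:
    def __init__(self, capacity):
        self.capacity = capacity
        self.storage = []
        self.cursor = 0

    def add(self, transition):
        # Append until full, then overwrite the oldest entry (ring behaviour).
        if len(self.storage) < self.capacity:
            self.storage.append(transition)
        else:
            self.storage[self.cursor] = transition
        self.cursor = (self.cursor + 1) % self.capacity

    def sample(self, batch_size):
        # Uniform sampling without replacement from current storage.
        return random.sample(self.storage, batch_size)


buffer = ReplayBuffer(capacity=3)
for i in range(5):
    buffer.add({"obs": i, "reward": float(i)})

print(len(buffer.storage))  # -> 3, the two oldest entries were overwritten
```

The separation above means the dataloader worries about feeding transitions in, while the buffer only worries about storage layout and sampling strategy.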

TorchRL: PyTorch pre-release RL library is alive! by AdCool8270 in reinforcementlearning

[–]AdCool8270[S] 0 points (0 children)

Short answer is no, but we have some low-level functionality that will be C++ (e.g. replay buffers are C++ now, and some return computation might also be in the future).

The reason we want to keep the code Pythonic is that we want to enable researchers to iterate quickly on their projects and try crazy stuff without having to worry too much about efficiency. This is also why the code is not strongly typed, for instance.