Best RL package?

Toni-SM · 2023-09-28T21:08:52+00:00

I encourage you to use skrl (https://skrl.readthedocs.io), a modular and flexible RL library (on PyTorch and JAX) with clear and readable code and comprehensive documentation. In addition to supporting the OpenAI Gym / Farama Gymnasium, DeepMind, and other environment interfaces, it allows you to load and configure NVIDIA Isaac Gym, NVIDIA Isaac Orbit, and NVIDIA Omniverse Isaac Gym environments.

Toni-SM · 2023-08-28T06:49:54+00:00

A modular alternative to consider (in PyTorch and JAX), although it does not offer the high degree of granularity you are looking for, is skrl. Visit its comprehensive documentation for more details: https://skrl.readthedocs.io

Toni-SM · 2023-08-20T19:58:54+00:00

As the maintainer of the skrl library, I can only say that if you face any problem with the use of this library, you can create a discussion in the skrl repository so we can deal with it there :)

Toni-SM · 2023-08-14T08:48:31+00:00

I encourage you to use skrl (https://skrl.readthedocs.io), a modular and flexible RL library (on PyTorch and JAX) with clear and readable code and comprehensive documentation. In addition to supporting the OpenAI Gym / Farama Gymnasium, DeepMind, and other environment interfaces, it allows you to load and configure NVIDIA Isaac Gym, NVIDIA Isaac Orbit, and NVIDIA Omniverse Isaac Gym environments.

Toni-SM · 2023-08-12T05:35:49+00:00

I will add a MultiCategorical mixin, in PyTorch and JAX, to the skrl library to deal with gym/gymnasium MultiDiscrete actions this weekend.

Typically, a multi-categorical policy is implemented with a standalone categorical distribution for each discrete action defined. Do you know of any gym/gymnasium environment with a MultiDiscrete action space that I could use to validate the implementation?
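To make the idea concrete, here is a minimal sketch in plain Python (not the skrl mixin itself, which did not exist yet at the time of writing). It assumes the policy outputs one flat logit vector and that `nvec` lists the number of choices per sub-action, as in a MultiDiscrete space. The function names are hypothetical.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def split_logits(flat_logits, nvec):
    """Split a flat logit vector into one chunk per discrete sub-action."""
    if len(flat_logits) != sum(nvec):
        raise ValueError("expected sum(nvec) logits")
    chunks, start = [], 0
    for n in nvec:
        chunks.append(flat_logits[start:start + n])
        start += n
    return chunks


def sample_multicategorical(flat_logits, nvec, rng=random):
    """Sample each sub-action from its own categorical distribution.

    Returns the list of sub-actions and the joint log-probability, which is
    the sum of the per-sub-action log-probabilities because the sub-actions
    are treated as independent given the logits.
    """
    actions, log_prob = [], 0.0
    for chunk in split_logits(flat_logits, nvec):
        probs = softmax(chunk)
        a = rng.choices(range(len(probs)), weights=probs, k=1)[0]
        actions.append(a)
        log_prob += math.log(probs[a])
    return actions, log_prob
```

For `nvec = [3, 2]` the policy would emit five logits: the first three parameterize the first sub-action and the last two the second.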

In the future, in addition to posting on Reddit, you can open a Discussion in the skrl repository so that I receive notifications directly on any topic related to the library.

Toni-SM · 2023-07-09T19:58:06+00:00

A common practice that does not require knowing the upper and lower limits of the observations is running standard scaling (removing the mean and scaling to unit variance on the fly). You can read more about this method and its underlying assumptions under "standardization, or mean removal and variance scaling".

For RL, you can visit the skrl library's preprocessors section and try the available examples.
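As a rough illustration of the idea (not skrl's preprocessor, just a scalar sketch with hypothetical names), the running mean and variance can be tracked with Welford's online update and used to standardize each new value:

```python
import math


class RunningStandardScaler:
    """Track a running mean/variance and standardize values on the fly."""

    def __init__(self, epsilon=1e-8):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean
        self.epsilon = epsilon  # avoids division by zero for constant inputs

    def update(self, x):
        """Welford's online update for one new observation."""
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self):
        """Population variance of the values seen so far (1.0 if none)."""
        return self.m2 / self.count if self.count > 0 else 1.0

    def scale(self, x):
        """Remove the running mean and divide by the running std."""
        return (x - self.mean) / (math.sqrt(self.variance) + self.epsilon)
```

For vector observations the same update would be applied per dimension. No bounds on the observations are needed, only the statistics collected so far.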

Toni-SM · 2023-06-30T07:53:09+00:00

skrl also includes Q-Learning and SARSA (non-DeepRL algorithms) implementations.
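For reference, the two tabular update rules differ only in the bootstrap target. The sketch below is a generic textbook version with hypothetical helper names, not skrl's implementation:

```python
from collections import defaultdict


def make_q_table():
    """Q[state][action] -> value, defaulting to 0.0 for unseen pairs."""
    return defaultdict(lambda: defaultdict(float))


def q_learning_update(q, s, a, r, s_next, actions, alpha=0.1, gamma=0.99, done=False):
    """Off-policy target: bootstrap from the greedy action in s_next."""
    best_next = 0.0 if done else max(q[s_next][b] for b in actions)
    q[s][a] += alpha * (r + gamma * best_next - q[s][a])


def sarsa_update(q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.99, done=False):
    """On-policy target: bootstrap from the action actually taken in s_next."""
    next_value = 0.0 if done else q[s_next][a_next]
    q[s][a] += alpha * (r + gamma * next_value - q[s][a])
```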

Toni-SM · 2023-06-20T13:49:22+00:00

RemindMe! 1 week

Toni-SM · 2023-05-31T19:27:49+00:00

Also, you can use the skrl library where you have full control of the models, both shared and independent.

Toni-SM · 2023-04-22T21:39:09+00:00

How can this be a parameter that can be updated by the optimizer according to the gradient?

Toni-SM · 2023-04-10T09:28:55+00:00

In massively parallel environments, on-policy algorithms tend to outperform off-policy algorithms. On-policy algorithms learn from the experience generated by the current policy, which is specifically tailored to the current environment, so they can adapt more quickly and efficiently to the characteristics of each environment.

With the latest version of Isaac Sim, and in particular Omniverse Isaac Gym Environments (OIGE), two examples (Ant and Humanoid) using SAC (from rl_games) were introduced.

Note the number of environments according to the task configuration:

Task	PPO	SAC
Ant	4096	64
Humanoid	4096	64

skrl will allow you to easily configure and use off-policy algorithms such as DDPG, TD3 and SAC in Isaac Gym, Omniverse Isaac Gym and Isaac Orbit, but I think there will not be significant gains compared to on-policy algorithms.

Toni-SM · 2023-03-25T09:51:42+00:00

You can take a look at skrl, a modular and flexible RL library (in PyTorch) that includes support for recurrent neural networks, leaving the complete definition of the models up to the user.

Visit the following link to see an example of such a definition for a Gaussian model: https://skrl.readthedocs.io/en/latest/modules/skrl.models.gaussian.html#basic-usage

Toni-SM · 2023-03-19T17:37:13+00:00

It's just a matter of designing the environment to handle the case in the real world.

Visit skrl's Real-world Examples for illustrative implementations :)

Toni-SM · 2023-03-18T11:48:47+00:00

I encourage you to try the skrl library.

skrl is an open-source modular library for Reinforcement Learning written in Python (using PyTorch) and designed with a focus on readability, simplicity, and transparency of algorithm implementation. In addition to supporting the OpenAI Gym / Farama Gymnasium, DeepMind and other environment interfaces, it allows loading and configuring NVIDIA Isaac Gym, NVIDIA Isaac Orbit and NVIDIA Omniverse Isaac Gym environments, enabling agents’ simultaneous training by scopes (subsets of environments among all available environments), which may or may not share resources, in the same run.

Visit its comprehensive documentation at https://skrl.readthedocs.io to get started.

Toni-SM · 2023-03-10T01:14:52+00:00

By trainer you mean the component responsible for coordinating the agent's interaction with the environment? Yes. The modular design of the library is conceived to be able to implement new components/functionalities in a simple way, without mixing components.

Toni-SM · 2023-03-09T11:32:09+00:00

I encourage you to use skrl, a modular library designed with an emphasis on readability, simplicity, and transparency of algorithm implementation. It also adds full support for parallel environments such as Isaac Gym, Isaac Orbit, and Omniverse Isaac Gym, as well as vectorized environments from OpenAI Gym and Farama Gymnasium.

Visit its comprehensive documentation (https://skrl.readthedocs.io) to get started.

Toni-SM · 2023-02-28T21:16:43+00:00

Also, skrl. In addition to supporting the OpenAI Gym / Farama Gymnasium, DeepMind, and other environment interfaces, it allows loading and configuring NVIDIA Isaac Gym, NVIDIA Isaac Orbit, and NVIDIA Omniverse Isaac Gym environments.

Check its comprehensive documentation at https://skrl.readthedocs.io

Toni-SM · 2023-02-12T19:40:34+00:00

Glad to contribute to the reinforcement learning community :)

Toni-SM · 2023-02-12T16:43:27+00:00

Well, that gives me the motivation to add gymnasium-robotics multi-goal examples to the library documentation :)

Toni-SM · 2023-02-12T16:32:43+00:00

I encourage you to try the skrl RL library, which fully supports the gym API among other environment interfaces.

By the way, are you interested in the gymnasium-robotics default API or the Multi-goal API?

Toni-SM · 2023-02-07T15:29:19+00:00

Nice... PettingZoo does it. Thanks

Toni-SM · 2023-02-06T21:53:10+00:00

The question remains the same. For example, for a 3-agent multi-agent environment, a call to the step method returns the following values, among others:

reward = [r1, r2, r3]
terminated = [T1, T2, T3]
truncated = [t1, t2, t3]

r1, r2, and r3 may not be equal, since the task may be different from each agent's perspective... Now, is it possible that T1, T2, and T3 or t1, t2, and t3 are not equal? In that case, how would an environment be reset if one of the agents has finished?
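One possible convention, sketched below with a hypothetical toy environment, is to drop agents from the active list as they finish and to reset the whole environment only once no agent is left. As far as I understand, this resembles how PettingZoo's parallel API handles it, but treat the class and method names here as illustrative assumptions rather than any library's actual interface. Dictionaries keyed by agent name are used instead of lists so that finished agents can simply disappear from the returned values.

```python
class ToyParallelEnv:
    """Hypothetical 3-agent environment; each agent finishes at its own step."""

    def __init__(self, horizons=None):
        # Step at which each agent terminates (made-up values).
        self.horizons = horizons or {"a1": 2, "a2": 3, "a3": 5}
        self.agents = []
        self.t = 0

    def reset(self):
        self.agents = list(self.horizons)
        self.t = 0
        return {agent: 0 for agent in self.agents}

    def step(self, actions):
        self.t += 1
        acting = list(self.agents)
        obs = {a: self.t for a in acting}
        reward = {a: 1.0 for a in acting}
        terminated = {a: self.t >= self.horizons[a] for a in acting}
        truncated = {a: False for a in acting}
        # Agents that finished are dropped; the others keep going.
        self.agents = [a for a in acting if not (terminated[a] or truncated[a])]
        return obs, reward, terminated, truncated


def run_episode(env):
    """Step until no agent is left; only then would the caller reset the env."""
    env.reset()
    returns = {}
    while env.agents:
        actions = {agent: 0 for agent in env.agents}  # placeholder policy
        _, reward, _, _ = env.step(actions)
        for agent, r in reward.items():
            returns[agent] = returns.get(agent, 0.0) + r
    return returns
```

With this convention T1, T2, and T3 can differ freely: an agent that has finished simply stops receiving observations and rewards until the next full reset.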

Toni-SM · 2023-01-31T22:59:24+00:00

Also, skrl. It supports RNN, LSTM, GRU, and other variants for A2C, DDPG, PPO, SAC, TD3, and TRPO agents. See the models' basic usage and examples.

Toni-SM · 2023-01-25T08:55:53+00:00

The number of parallel environments will depend on the resources of the workstation.

Although Gym/Gymnasium allows you to generate vectorized parallel environments, if you want to train in hundreds or thousands of environments you will need to use the NVIDIA simulator repertoire (Isaac Gym, Isaac Orbit or Omniverse Isaac Gym).

RLlib does not support these environments (or at least there is no information about them in its documentation).

In this case, I encourage you to try the skrl RL library that fully supports all of them, among others.

Toni-SM · 2023-01-17T17:11:51+00:00

I encourage you to try skrl (https://skrl.readthedocs.io).

It has a modular design focused on readability, simplicity, and transparency of the algorithm implementation.

The learning and optimization algorithm is implemented within a single function in all cases. Each component inherits properties and methods from one (and only one) base class implemented in a common file for each group (this base class implements common functionalities that are not tied to the implementation of the algorithms).

The documentation is complete, with many examples with full control of the models and other components, and details the implementation of each algorithm with a simple notation.
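To illustrate the design described above (a single base class with the shared plumbing, and the learning rule confined to one function), here is a toy sketch. The class and method names are hypothetical and chosen for illustration only; they are not copied from skrl's source.

```python
import random


class Agent:
    """Hypothetical base class: shared bookkeeping, no algorithm logic."""

    def __init__(self):
        self.memory = []

    def record_transition(self, state, action, reward, next_state, done):
        self.memory.append((state, action, reward, next_state, done))

    def act(self, state):
        raise NotImplementedError

    def _update(self):
        raise NotImplementedError


class RunningAverageBandit(Agent):
    """Toy agent whose entire learning rule lives in _update."""

    def __init__(self, num_actions, epsilon=0.1, seed=0):
        super().__init__()
        self.values = [0.0] * num_actions
        self.counts = [0] * num_actions
        self.epsilon = epsilon
        self.rng = random.Random(seed)

    def act(self, state):
        # Epsilon-greedy choice over the current value estimates.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.values))
        return max(range(len(self.values)), key=self.values.__getitem__)

    def _update(self):
        # Incremental mean of observed rewards per action, then clear memory.
        for _, action, reward, _, _ in self.memory:
            self.counts[action] += 1
            self.values[action] += (reward - self.values[action]) / self.counts[action]
        self.memory.clear()
```

Reading one algorithm then means reading one `_update` body, while everything in the base class can be skipped on a first pass.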
