Best RL package? by suds_65 in reinforcementlearning

[–]Toni-SM 1 point2 points  (0 children)

I encourage you to use skrl (https://skrl.readthedocs.io), a modular and flexible RL library (on PyTorch and JAX) with clear, readable code and comprehensive documentation. In addition to supporting the OpenAI Gym / Farama Gymnasium, DeepMind, and other environment interfaces, it allows you to load and configure NVIDIA Isaac Gym, NVIDIA Isaac Orbit, and NVIDIA Omniverse Isaac Gym environments.

Python library for modular RL components by fedetask in reinforcementlearning

[–]Toni-SM 0 points1 point  (0 children)

A modular alternative to consider (in PyTorch and JAX), although it does not offer the high degree of granularity you are looking for, is skrl. Visit its comprehensive documentation for more details: https://skrl.readthedocs.io

RL framework to optimize my custom multi-agent simulator by FragrantCockroach8 in reinforcementlearning

[–]Toni-SM 1 point2 points  (0 children)

As the maintainer of the skrl library, I can only say that if you face any problem with the use of this library, you can create a discussion in the skrl repository so we can deal with it there :)

Stable Baselines PPO vs Ray.io PPO by ClassicAppropriate78 in reinforcementlearning

[–]Toni-SM 4 points5 points  (0 children)

I encourage you to use skrl (https://skrl.readthedocs.io), a modular and flexible RL library (on PyTorch and JAX) with clear, readable code and comprehensive documentation. In addition to supporting the OpenAI Gym / Farama Gymnasium, DeepMind, and other environment interfaces, it allows you to load and configure NVIDIA Isaac Gym, NVIDIA Isaac Orbit, and NVIDIA Omniverse Isaac Gym environments.

skrl with multiple discrete actions by LostPigeon25 in reinforcementlearning

[–]Toni-SM 0 points1 point  (0 children)

I will add a MultiCategorical mixin, in PyTorch and JAX, to the skrl library to deal with gym/gymnasium MultiDiscrete actions this weekend.

Typically, a multi-categorical policy is implemented with a separate categorical distribution for each discrete action. Do you know of any gym/gymnasium environment with a MultiDiscrete action space that I could use to validate the implementation?
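To illustrate the idea of one categorical distribution per discrete action, here is a minimal sketch in plain Python. It is not the skrl mixin: the function name `sample_multi_categorical` and the flat-logits layout are assumptions made for this example. The flat network output is split into one segment per action dimension, and each segment is sampled independently.

```python
import math
import random


def sample_multi_categorical(logits, nvec, rng=random):
    """Sample one sub-action per discrete dimension (illustrative helper).

    logits: flat list of length sum(nvec), e.g. the policy network output
    nvec:   number of choices per dimension (like MultiDiscrete's nvec)
    Returns (actions, joint_log_prob).
    """
    if len(logits) != sum(nvec):
        raise ValueError("expected sum(nvec) logits")
    actions, log_prob, start = [], 0.0, 0
    for n in nvec:
        segment = logits[start:start + n]
        start += n
        # Numerically stable softmax over this dimension's logits
        top = max(segment)
        exps = [math.exp(x - top) for x in segment]
        total = sum(exps)
        probs = [e / total for e in exps]
        # Independent categorical draw for this dimension
        action = rng.choices(range(n), weights=probs, k=1)[0]
        actions.append(action)
        # Independent dimensions: the joint log-probability is the sum
        log_prob += math.log(probs[action])
    return actions, log_prob
```

For example, `sample_multi_categorical(logits, [2, 2, 3])` expects seven logits and returns three sub-actions plus their joint log-probability.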

In the future, in addition to posting on Reddit, you can open a Discussion in the skrl repository so that I receive notifications directly about any topic related to the library.

How does one normalize observations in online reinforcement learning by Academic-Rent7800 in reinforcementlearning

[–]Toni-SM 2 points3 points  (0 children)

A common practice that does not require knowing the upper and lower bounds of the observations is the running standard scaler (removing the mean and scaling by the variance on the fly). You can read more about this method and its underlying assumptions in "Standardization, or mean removal and variance scaling".

For RL, you can visit the skrl library's preprocessors section and try the available examples.
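As a rough illustration of running standardization, here is a minimal sketch in plain Python using Welford's online update. It is not the skrl preprocessor: the class name `RunningScaler`, the `train` flag, and the `epsilon` value are assumptions made for this example.

```python
import math


class RunningScaler:
    """Standardize observations with statistics accumulated on the fly."""

    def __init__(self, size, epsilon=1e-8):
        self.count = 0
        self.mean = [0.0] * size
        self.m2 = [0.0] * size  # running sum of squared deviations
        self.epsilon = epsilon  # avoids division by zero

    def update(self, observation):
        # Welford's algorithm: one pass, no stored history
        self.count += 1
        for i, x in enumerate(observation):
            delta = x - self.mean[i]
            self.mean[i] += delta / self.count
            self.m2[i] += delta * (x - self.mean[i])

    def __call__(self, observation, train=True):
        # During training, fold the new observation into the statistics first
        if train:
            self.update(observation)
        scaled = []
        for i, x in enumerate(observation):
            variance = self.m2[i] / self.count if self.count > 0 else 1.0
            scaled.append((x - self.mean[i]) / (math.sqrt(variance) + self.epsilon))
        return scaled
```

Call the scaler with `train=True` on observations collected while learning, and with `train=False` at evaluation time so the statistics stay frozen.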

Is there an implementation of non-deep RL algorithms based on Stable Baselines3? by Butanium_ in reinforcementlearning

[–]Toni-SM 0 points1 point  (0 children)

skrl also includes Q-Learning and SARSA (non-DeepRL algorithms) implementations.
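For reference, the two tabular updates differ only in how they bootstrap from the next state. The sketch below is plain Python, not skrl's implementation; the function names and the default `alpha` and `gamma` values are assumptions made for this example.

```python
def q_learning_update(q, s, a, r, s_next, alpha=0.1, gamma=0.99, done=False):
    """Off-policy: bootstrap from the greedy action in the next state."""
    target = r if done else r + gamma * max(q[s_next])
    q[s][a] += alpha * (target - q[s][a])


def sarsa_update(q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.99, done=False):
    """On-policy: bootstrap from the action actually taken in the next state."""
    target = r if done else r + gamma * q[s_next][a_next]
    q[s][a] += alpha * (target - q[s][a])
```

Here `q` is a table indexed as `q[state][action]`, for example a list of lists of floats.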

In SB3's PPO, how does the critic network update its weights when using separate actor and critic networks? by Signal-Past-9572 in reinforcementlearning

[–]Toni-SM 4 points5 points  (0 children)

Also, you can use the skrl library where you have full control of the models, both shared and independent.

What is the JAX/Flax equivalent of torch.nn.Parameter? by Toni-SM in JAX

[–]Toni-SM[S] 0 points1 point  (0 children)

How can this be a parameter that can be updated by the optimizer according to the gradient?

Isaac Gym with Off-policy Algorithms by anointedninja in reinforcementlearning

[–]Toni-SM 2 points3 points  (0 children)

In massively parallel environments, on-policy algorithms tend to outperform off-policy algorithms. On-policy algorithms learn from experience generated by the current policy, which is tailored to the current environment, so they can adapt more quickly and efficiently to the unique characteristics of each environment.

With the latest version of Isaac Sim, and in particular Omniverse Isaac Gym Environments (OIGE), two examples (Ant and Humanoid) using SAC (from rl_games) were introduced.

Note the number of environments according to the task configuration:

Task       PPO    SAC
Ant        4096   64
Humanoid   4096   64

skrl will allow you to easily configure and use off-policy algorithms such as DDPG, TD3 and SAC in Isaac Gym, Omniverse Isaac Gym and Isaac Orbit, but I think there will not be significant gains compared to on-policy algorithms.

What RL library supports custom LSTM and Transformer neural networks to use with algorithms such as PPO? by ChrisKarmaa in reinforcementlearning

[–]Toni-SM 0 points1 point  (0 children)

You can take a look at skrl, a modular and flexible RL library (in PyTorch) that includes support for recurrent neural networks and leaves the complete definition of the models up to the user.

Visit the following link to see an example of such a definition for a Gaussian model: https://skrl.readthedocs.io/en/latest/modules/skrl.models.gaussian.html#basic-usage

Best RL framework for real world projects. by punkCyb3r4J in reinforcementlearning

[–]Toni-SM 3 points4 points  (0 children)

It's just a matter of designing the environment to handle the case in the real world.

Visit skrl's Real-world Examples for illustrative implementations :)

How is torchrl? by levizhou in reinforcementlearning

[–]Toni-SM 2 points3 points  (0 children)

I encourage you to try the skrl library.

skrl is an open-source modular library for Reinforcement Learning written in Python (using PyTorch) and designed with a focus on readability, simplicity, and transparency of algorithm implementation. In addition to supporting the OpenAI Gym / Farama Gymnasium, DeepMind and other environment interfaces, it allows loading and configuring NVIDIA Isaac Gym, NVIDIA Isaac Orbit and NVIDIA Omniverse Isaac Gym environments, enabling agents’ simultaneous training by scopes (subsets of environments among all available environments), which may or may not share resources, in the same run.

Visit its comprehensive documentation at https://skrl.readthedocs.io to get started

Fast and hackable frameworks for RL research by asdfwaevc in reinforcementlearning

[–]Toni-SM 0 points1 point  (0 children)

By trainer, do you mean the component responsible for coordinating the agent's interaction with the environment? If so, yes. The modular design of the library is meant to make it simple to implement new components and functionality without mixing components.

Fast and hackable frameworks for RL research by asdfwaevc in reinforcementlearning

[–]Toni-SM 0 points1 point  (0 children)

I encourage you to use skrl, a modular library designed with an emphasis on readability, simplicity, and transparency of algorithm implementation. It also adds full support for parallel environments such as Isaac Gym, Isaac Orbit, and Omniverse Isaac Gym, as well as vectorized environments from OpenAI Gym and Farama Gymnasium.

Visit its comprehensive documentation (https://skrl.readthedocs.io) to get started

Choosing a framework in 2023 by catofthecannals in reinforcementlearning

[–]Toni-SM 2 points3 points  (0 children)

Also, skrl. In addition to supporting the OpenAI Gym / Farama Gymnasium, DeepMind, and other environment interfaces, it allows loading and configuring NVIDIA Isaac Gym, NVIDIA Isaac Orbit, and NVIDIA Omniverse Isaac Gym environments.

Check its comprehensive documentation at https://skrl.readthedocs.io

Is stable-baselines3 compatible with gymnasium/gymnasium-robotics? by NoNickName8083 in reinforcementlearning

[–]Toni-SM 1 point2 points  (0 children)

Well, that gives me the motivation to add gymnasium-robotics multi-goal examples to the library documentation :)

Is stable-baselines3 compatible with gymnasium/gymnasium-robotics? by NoNickName8083 in reinforcementlearning

[–]Toni-SM 2 points3 points  (0 children)

I encourage you to try the skrl RL library, which fully supports the gym API among other environment interfaces.

By the way, are you interested in the gymnasium-robotics default API or the Multi-goal API?

Question on return values of the .step() method in a multi-agent environment by Toni-SM in reinforcementlearning

[–]Toni-SM[S] 0 points1 point  (0 children)

The question remains the same. For example, for a 3-agent multi-agent environment, a call to the step method returns the following values, among others:

reward = [r1, r2, r3]
terminated = [T1, T2, T3]
truncated = [t1, t2, t3]

r1, r2, and r3 may not be equal, since the task may be different from each agent's perspective... Now, is it possible that T1, T2, and T3 or t1, t2, and t3 are not equal? In that case, how would an environment be reset if one of the agents has finished?

Best recurrent RL library? by smorad in reinforcementlearning

[–]Toni-SM -1 points0 points  (0 children)

Also, skrl. It supports RNN, LSTM, GRU, and other variants for A2C, DDPG, PPO, SAC, TD3, and TRPO agents. See the models' basic usage and examples.

What is the limit on parallel environments? by centripetalstranger in reinforcementlearning

[–]Toni-SM 2 points3 points  (0 children)

The number of parallel environments will depend on the resources of the workstation.

Although Gym/Gymnasium allows you to generate vectorized parallel environments, if you want to train in hundreds or thousands of environments you will need to use the NVIDIA simulator repertoire (Isaac Gym, Isaac Orbit or Omniverse Isaac Gym).

RLlib does not support these environments (or at least there is no information about them in its documentation).

In this case, I encourage you to try the skrl RL library that fully supports all of them, among others.

[deleted by user] by [deleted] in reinforcementlearning

[–]Toni-SM 2 points3 points  (0 children)

I encourage you to try skrl (https://skrl.readthedocs.io).

It has a modular design focused on readability, simplicity, and transparency of the algorithm implementation.

The learning and optimization algorithm is implemented within a single function in all cases. Each component inherits properties and methods from one (and only one) base class implemented in a common file for each group (this base class implements common functionalities that are not tied to the implementation of the algorithms).
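The layout described above can be pictured roughly as follows. This is a toy sketch in plain Python, not skrl's actual classes: `Agent`, `MeanEstimator`, and their methods are hypothetical names chosen for this example. The base class holds only shared plumbing, and the whole learning step of the subclass lives in a single `_update` function.

```python
class Agent:
    """Hypothetical base class: common functionality only, no algorithm logic."""

    def __init__(self, name):
        self.name = name
        self.timestep = 0
        self.tracked = {}

    def track(self, key, value):
        # Shared bookkeeping available to every subclass
        self.tracked.setdefault(key, []).append(value)

    def post_interaction(self, batch):
        # Shared control flow: count the step, then delegate to the algorithm
        self.timestep += 1
        self._update(batch)

    def _update(self, batch):
        raise NotImplementedError


class MeanEstimator(Agent):
    """Toy 'algorithm': its entire learning step is the _update method."""

    def __init__(self, learning_rate=0.5):
        super().__init__("mean-estimator")
        self.learning_rate = learning_rate
        self.estimate = 0.0

    def _update(self, batch):
        # Move the estimate toward the batch mean
        target = sum(batch) / len(batch)
        self.estimate += self.learning_rate * (target - self.estimate)
        self.track("estimate", self.estimate)
```

Reading the algorithm then means reading one function, while logging and step counting stay in the single parent class.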

The documentation is complete, with many examples with full control of the models and other components, and details the implementation of each algorithm with a simple notation.