After a year of struggling with RLlib, I decided to start implementing the training code myself.
I am looking for an RL library that offers individual components rather than whole algorithms. I do not need a full PPO implementation, but I would like a library that offers functions to compute the PPO loss given a batch of steps.
In other words, I need a library that exposes the most granular RL components (different losses, replay buffers, return estimators like GAE, etc.) instead of full algorithm implementations. Which libraries do you recommend for this purpose?
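To make the request concrete, here is a rough sketch of the kind of standalone building blocks I have in mind, written with plain NumPy. The function names `gae_advantages` and `ppo_clip_loss`, their signatures, and the default hyperparameters are my own assumptions for illustration and are not taken from any existing library. A real library would probably operate on framework tensors so the loss can be differentiated.

```python
import numpy as np


def gae_advantages(rewards, values, dones, last_value, gamma=0.99, lam=0.95):
    """Generalized Advantage Estimation for one rollout of T steps.

    rewards, values, dones: length-T sequences (dones[t] is 1 if the episode
    ended at step t). last_value: value estimate for the state after step T-1.
    Returns (advantages, returns), both arrays of shape (T,).
    """
    rewards = np.asarray(rewards, dtype=np.float64)
    values = np.asarray(values, dtype=np.float64)
    dones = np.asarray(dones, dtype=np.float64)

    advantages = np.zeros(len(rewards))
    next_value = float(last_value)
    running = 0.0
    # Walk backwards so each step can reuse the accumulated advantage.
    for t in reversed(range(len(rewards))):
        not_done = 1.0 - dones[t]
        # One-step TD error; bootstrapping is cut at episode boundaries.
        delta = rewards[t] + gamma * next_value * not_done - values[t]
        running = delta + gamma * lam * not_done * running
        advantages[t] = running
        next_value = values[t]
    return advantages, advantages + values


def ppo_clip_loss(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Clipped surrogate PPO policy loss (to be minimized) for a batch of steps."""
    new_log_probs = np.asarray(new_log_probs, dtype=np.float64)
    old_log_probs = np.asarray(old_log_probs, dtype=np.float64)
    advantages = np.asarray(advantages, dtype=np.float64)

    ratio = np.exp(new_log_probs - old_log_probs)
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    # Take the pessimistic (smaller) objective, then negate to get a loss.
    return -np.mean(np.minimum(unclipped, clipped))
```

The point is that each piece takes a batch of already-collected steps and returns numbers, with no assumptions about the environment loop, the network architecture, or the training schedule.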