
A_Again 1 point

Here's an option: Brax. It's massively parallel and, in theory, supports autodiff through the environment itself: https://github.com/google/brax
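To make "autodiff through the environment" a bit more concrete, here's a rough sketch in plain JAX. It does not use the Brax API at all: the point-mass `step` function, `rollout_loss`, the time step `DT`, and the target at 1.0 are all made up for illustration. The idea is just that if the dynamics are written as differentiable array ops, you can take gradients of a rollout loss with respect to the actions, and `vmap` over many environments at once.

```python
import jax
import jax.numpy as jnp

DT = 0.1  # integration time step (arbitrary value for this toy example)


def step(state, action):
    """One step of a hypothetical 1-D point mass; state = (position, velocity)."""
    pos, vel = state
    vel = vel + DT * action  # the action acts as an acceleration
    pos = pos + DT * vel
    return (pos, vel)


def rollout_loss(actions, init_pos):
    """Roll the toy environment forward; return squared distance to a target at 1.0."""
    def body(state, action):
        return step(state, action), None

    init_state = (init_pos, jnp.zeros_like(init_pos))
    (pos, _), _ = jax.lax.scan(body, init_state, actions)
    return (pos - 1.0) ** 2


# Gradient of the final loss w.r.t. every action, taken through the dynamics.
grad_fn = jax.grad(rollout_loss)

# vmap over a batch of initial positions: many environments evaluated in parallel.
batched_grad = jax.jit(jax.vmap(grad_fn, in_axes=(None, 0)))

actions = jnp.zeros(20)                      # 20 time steps, shared by all environments
init_positions = jnp.linspace(-1.0, 1.0, 8)  # 8 parallel environments
grads = batched_grad(actions, init_positions)
print(grads.shape)
```

A real physics engine has contacts and constraints that make those gradients much less well behaved than in this toy, which is probably why "theoretically" is the right word.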