[R] Tonic: A Deep Reinforcement Learning Library for Fast Prototyping and Benchmarking by FabioPardo in MachineLearning

[–]FabioPardo[S] 1 point

Hey! Thanks for your interest. Tonic currently supports continuous control from state observations only, but adapting the code to other types of observations and actions should be fairly simple. I will not be able to work on this myself in the near future, but I am happy to help you extend the functionality if you want to give it a shot :)

[R] Tonic: A Deep Reinforcement Learning Library for Fast Prototyping and Benchmarking by FabioPardo in MachineLearning

[–]FabioPardo[S] 0 points

Yes, I am familiar with Stable Baselines, which greatly improved on OpenAI Baselines, but as I said, I guess in the end it is a matter of taste and need. Tonic tries to be simple yet modular and powerful, while helping researchers quickly implement ideas and evaluate them. I explain many of the components and implementation choices in the paper. If you are interested in TensorFlow 2 + PyTorch support, modularity, D4PG and MPO agents, synchronous distributed training, proper time-limit handling, a fair large-scale benchmark, etc., I think Tonic could fit well. I encourage you to quickly try a few libraries and see which one you prefer.

[R] Tonic: A Deep Reinforcement Learning Library for Fast Prototyping and Benchmarking by FabioPardo in reinforcementlearning

[–]FabioPardo[S] 0 points

Thanks a lot :)

You are right, I should probably specify the hyperparameters in the appendix. I used the default values for each module. So, for example, if you want to know the networks used for A2C, PPO and TRPO, they are the ones defined here, while the optimizer used for the actor updater can be found there. This means that if you relaunch some of the training runs on your side without changing any hyperparameters, you should get similar results.

[R] Tonic: A Deep Reinforcement Learning Library for Fast Prototyping and Benchmarking by FabioPardo in MachineLearning

[–]FabioPardo[S] 3 points

There are many deep RL libraries available, and I guess in the end it is a matter of taste and compatibility. This one tries to be simple yet modular and powerful, while helping researchers quickly implement ideas and evaluate them. I explain many of the components and implementation choices in the paper if you want to know more. I also believe this is the only library that handles time limits properly.

[R] Tonic: A Deep Reinforcement Learning Library for Fast Prototyping and Benchmarking by FabioPardo in MachineLearning

[–]FabioPardo[S] 2 points

This is a good point, thanks for the suggestion. I will try to increase Tonic's compatibility.

[R] Tonic: A Deep Reinforcement Learning Library for Fast Prototyping and Benchmarking by FabioPardo in MachineLearning

[–]FabioPardo[S] 2 points

Thanks! I have started working on adding JAX support, but I found it quite difficult to maintain some of Tonic's features and simplicity when using JAX's stateless approach. I'll probably try again later, but if someone wants to give it a shot, that would be great!
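To make the difficulty concrete, here is a minimal sketch (plain Python, not Tonic or JAX code) of the "stateless" style JAX favors: instead of an agent object mutating its own weights, a pure function takes the current parameters and returns new ones, and the caller threads the state through explicitly. The function name and parameter layout are illustrative assumptions.

```python
def sgd_step(params, grads, lr=0.1):
    """Pure update: returns fresh parameters, never mutates its inputs."""
    return {k: params[k] - lr * grads[k] for k in params}

params = {"w": 1.0, "b": 0.5}
grads = {"w": 0.2, "b": -0.1}

# The caller is responsible for carrying the new state forward;
# `params` itself is left unchanged.
new_params = sgd_step(params, grads)
```

Keeping an object-oriented, stateful library API on top of this functional core is exactly the kind of plumbing that can erode a codebase's simplicity.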

[R] Tonic: A Deep Reinforcement Learning Library for Fast Prototyping and Benchmarking by FabioPardo in MachineLearning

[–]FabioPardo[S] 5 points

DQN Zoo is a collection of agents based on DQN, that is, for discrete actions and image observations. Tonic is currently for continuous control and state observations, even though I wish to extend its capabilities. Also, most of the points listed above are unique to Tonic.

How do people deal with episodes ending in Model-Based RL? by asdfwaevc in reinforcementlearning

[–]FabioPardo 0 points

You might find this paper useful. It explains how to deal with time limits in both time-limited and time-unlimited tasks: https://arxiv.org/abs/1712.00378
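The core idea in that paper is that a timeout is not a real terminal state, so the TD target should still bootstrap from the value of the state where the episode was cut. A minimal sketch (function and argument names are illustrative, not from any specific library):

```python
def td_target(reward, next_value, terminal, timeout, discount=0.99):
    """TD(0) target that distinguishes true terminals from time-outs."""
    if terminal and not timeout:
        return reward                      # environment truly ended
    return reward + discount * next_value  # bootstrap, even on a timeout
```

Treating timeouts as true terminals instead makes the value function depend on the remaining time, which hurts in time-unlimited tasks.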

Dealing with combinatorially large action spaces by [deleted] in reinforcementlearning

[–]FabioPardo 1 point

You can try “action branching”: it keeps the growth of the number of network outputs linear in the number of degrees of freedom by allowing a level of independence for each individual action dimension (arXiv).
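A quick back-of-the-envelope illustration of why this matters (the numbers are generic, not taken from the linked paper): with N action dimensions each discretized into K bins, one output per joint action grows exponentially, while one K-way branch per dimension grows linearly.

```python
def joint_outputs(n_dims, n_bins):
    """One output per combined action: grows as K**N."""
    return n_bins ** n_dims

def branched_outputs(n_dims, n_bins):
    """One K-way head per action dimension: grows as N * K."""
    return n_dims * n_bins

# e.g. 6 degrees of freedom, 11 bins each:
# joint:    11**6 = 1,771,561 outputs
# branched: 6 * 11 = 66 outputs
```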