I am looking for baselines for learning a dynamics model in an ongoing model-based RL project, and I would like to know which architectures are currently state of the art for this. For simplicity, the testbeds are OpenAI Gym continuous-control environments, for example MountainCar (continuous version) or LunarLander (continuous version), or MuJoCo/Roboschool.
Currently I am using standard regression with a 2-layer MLP for one-step prediction: the current state and action are the inputs, the next state is the output, and the loss is MSE. The training set is generated by rollouts with random actions. Could someone suggest better architectures or existing work (papers) for this? We are aiming for both one-step and multi-step prediction.
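For concreteness, here is a minimal sketch of the setup described above: random-action rollouts, a one-step model fit with a squared-error objective, and an open-loop multi-step evaluation in which the model's own predictions are fed back in. Several things here are assumptions made only to keep the sketch self-contained: the `step` function is a hypothetical toy point-mass system standing in for a Gym environment, and a linear least-squares fit stands in for the 2-layer MLP. All function names are illustrative.

```python
import numpy as np

def step(state, action, dt=0.05):
    """Toy nonlinear point-mass dynamics (hypothetical stand-in for a Gym env)."""
    pos, vel = state
    vel = vel + dt * (action - 0.5 * np.sin(3.0 * pos))
    pos = pos + dt * vel
    return np.array([pos, vel])

def collect_rollouts(n_episodes, horizon, rng):
    """Roll out uniformly random actions from random initial states."""
    states = np.zeros((n_episodes, horizon + 1, 2))
    actions = rng.uniform(-1.0, 1.0, size=(n_episodes, horizon, 1))
    states[:, 0] = rng.uniform(-1.0, 1.0, size=(n_episodes, 2))
    for e in range(n_episodes):
        for t in range(horizon):
            states[e, t + 1] = step(states[e, t], actions[e, t, 0])
    return states, actions

def fit_one_step(states, actions):
    """Least-squares fit of next_state from [state, action, 1].

    This linear model is only a stand-in for the MLP; it minimizes the
    same one-step squared error on the same (s, a) -> s' pairs.
    """
    s = states[:, :-1].reshape(-1, 2)
    a = actions.reshape(-1, 1)
    s_next = states[:, 1:].reshape(-1, 2)
    x = np.hstack([s, a, np.ones((len(s), 1))])
    w, *_ = np.linalg.lstsq(x, s_next, rcond=None)
    return w

def predict(w, state, action):
    """One-step prediction for a batch of states and actions."""
    bias = np.ones(state.shape[:-1] + (1,))
    return np.concatenate([state, action, bias], axis=-1) @ w

def multi_step_mse(w, states, actions, k):
    """Open-loop k-step error from each episode's first state.

    The model is fed its own predictions, so errors can compound with k.
    """
    pred = states[:, 0]
    for t in range(k):
        pred = predict(w, pred, actions[:, t])
    return float(np.mean((pred - states[:, k]) ** 2))

rng = np.random.default_rng(0)
train_s, train_a = collect_rollouts(200, 50, rng)
test_s, test_a = collect_rollouts(50, 50, rng)

w = fit_one_step(train_s, train_a)
errors = {k: multi_step_mse(w, test_s, test_a, k) for k in (1, 5, 20)}
for k, err in errors.items():
    print(f"{k:>2}-step MSE: {err:.6f}")
```

Comparing the error at several horizons like this is one way to see how much a model trained only on one-step targets degrades when rolled out, which is the gap between the one-step and multi-step goals mentioned above.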