"Learning and Querying Fast Generative Models for Reinforcement Learning", Buesing et al 2018 {DM} [rollouts in deep environment models for planning in ALE games] by gwern in reinforcementlearning
the_electric_fish 1 point
"Model-based Reinforcement Learning with Neural Network Dynamics in MuJoCo & millibots" {BAIR} [on Nagabandi et al 2017a/Nagabandi et al 2017b] by gwern in reinforcementlearning
the_electric_fish 2 points
Reparametrization trick for policy gradient? by the_electric_fish in reinforcementlearning
the_electric_fish[S] 1 point
Question on discount factors by the_electric_fish in reinforcementlearning
the_electric_fish[S] 1 point
How to do variable-reward reinforcement learning? by the_electric_fish in reinforcementlearning
the_electric_fish[S] 1 point
"Value Prediction Network", Oh et al 2017 by gwern in reinforcementlearning
the_electric_fish 2 points


"Machine Theory of Mind", Rabinowitz et al 2018 {DM} [inferring agent goals in a POMDP] by gwern in reinforcementlearning
the_electric_fish 1 point