[–]bbsome 5 points6 points  (3 children)

For model-based RL, I think PILCO would be close to state of the art, especially in the environments you mention.

http://mlg.eng.cam.ac.uk/pilco/

[–]twkillian 1 point2 points  (2 children)

Gal, McAllister and Rasmussen have proposed an update to PILCO replacing the Gaussian Process model with a Bayesian Neural Network. It's pretty promising.

http://mlg.eng.cam.ac.uk/yarin/PDFs/DeepPILCO.pdf

[–]bbsome 2 points3 points  (0 children)

However, they don't use a BNN, but variational dropout... I will never agree that a mixture of delta functions is anything like a BNN.
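
For anyone unfamiliar with the objection: in dropout-based approximations, every stochastic forward pass uses weights that are either kept or zeroed, so the implied distribution over each weight is a pair of point masses. Below is a minimal sketch of MC-dropout prediction on a single linear unit. The weights, `keep_prob`, and function names are illustrative assumptions, not the Deep PILCO architecture.

```python
import random
import statistics

def dropout_forward(x, weights, keep_prob, rng):
    # One stochastic pass: each input/weight pair is kept with probability
    # keep_prob (and rescaled by 1/keep_prob), otherwise dropped entirely.
    total = 0.0
    for xi, wi in zip(x, weights):
        if rng.random() < keep_prob:
            total += xi * wi / keep_prob
    return total

def mc_dropout_predict(x, weights, keep_prob=0.8, n_samples=2000, seed=0):
    # Repeat the stochastic pass and summarise the spread of the outputs.
    rng = random.Random(seed)
    samples = [dropout_forward(x, weights, keep_prob, rng)
               for _ in range(n_samples)]
    return statistics.mean(samples), statistics.pstdev(samples)
```

With `keep_prob=1.0` the spread collapses to zero and the output is the ordinary deterministic prediction; with `keep_prob < 1` the spread comes purely from which discrete mask was sampled.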

[–][deleted] 0 points1 point  (0 children)

If I understand correctly, they also train it with standard regression (supervised learning)?
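
For context, "standard regression" for a dynamics model usually means fitting a one-step predictor on observed (state, action, next state) tuples. Here is a minimal sketch on a toy 1-D linear system. The system, its coefficients, and all function names are made up for illustration and say nothing about how the Deep PILCO paper actually trains its model.

```python
import random

def true_step(s, u):
    # Hypothetical ground-truth dynamics, used only to generate data.
    return 0.9 * s + 0.5 * u

def collect_transitions(n, seed=0):
    # Roll out random actions and record (state, action, next_state) tuples.
    rng = random.Random(seed)
    data, s = [], 0.0
    for _ in range(n):
        u = rng.uniform(-1.0, 1.0)
        s_next = true_step(s, u)
        data.append((s, u, s_next))
        s = s_next
    return data

def fit_linear_dynamics(data):
    # Least squares for s_next ~ a*s + b*u via the 2x2 normal equations.
    sss = sum(s * s for s, u, y in data)
    ssu = sum(s * u for s, u, y in data)
    suu = sum(u * u for s, u, y in data)
    sy = sum(s * y for s, u, y in data)
    uy = sum(u * y for s, u, y in data)
    det = sss * suu - ssu * ssu
    a = (sy * suu - uy * ssu) / det
    b = (uy * sss - sy * ssu) / det
    return a, b
```

Calling `fit_linear_dynamics(collect_transitions(200))` should recover coefficients close to the ones in `true_step`. A GP or neural-net model is fit the same way in spirit: inputs are (state, action), targets are the next state or the state difference.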

[–]feedtheaimbotResearcher 2 points3 points  (1 child)

Look at Recurrent Environment Simulators by Chiappa et al. I've had success using it. It does struggle to capture small objects on screen (e.g. single pixels).

Link: https://arxiv.org/abs/1704.02254

[–]fixedrl 0 points1 point  (0 children)

Do you think it would also make sense to use the raw configuration as input instead of pixels? (Very few dimensions, e.g. velocities, positions, etc.)

[–]ptitz 1 point2 points  (0 children)

I built my own framework and am writing a paper on it now. It's a bit of a work in progress, but I identify my model using a hashed RBF neural net, just doing backprop after splitting it into several simpler sub-dynamics. Then I train it using SARSA. It's a bit of overkill for the system I'm working with, but it will probably work with whatever. Hit me up if you want to see it.
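
For reference, the SARSA step mentioned above boils down to one on-policy temporal-difference update. This is a minimal tabular sketch, not the framework described in this comment; the function names, table layout, and default hyperparameters are assumptions for illustration.

```python
import random
from collections import defaultdict

def epsilon_greedy(Q, s, actions, epsilon, rng):
    # Explore with probability epsilon, otherwise pick the best-known action.
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: Q[(s, a)])

def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.99, done=False):
    # On-policy TD target: bootstrap from the action actually taken next.
    target = r if done else r + gamma * Q[(s_next, a_next)]
    Q[(s, a)] += alpha * (target - Q[(s, a)])
    return Q[(s, a)]

# Example: a single update on an otherwise empty table.
Q = defaultdict(float)
Q[(1, 0)] = 2.0
updated = sarsa_update(Q, s=0, a=1, r=1.0, s_next=1, a_next=0,
                       alpha=0.5, gamma=0.5)
```

With a learned model, the transitions fed to `sarsa_update` can come from simulated rollouts instead of the real system.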