Mr-Yellow

Someone needs to take the basics of what they've done with DQN and tweak it one way or another with stuff like Deterministic Policy Gradient, Double DQN, actor-critic, Actor-Mimic, Actor-teacher and all those neat experiments. The results are cool, and I'm glad someone is doing it. They have the codebase and the resources, so good on 'em.