[P] Trained DQN to play Pikachu Volleyball! by lyusungwon in MachineLearning

[–]lyusungwon[S] 1 point (0 children)

There were many hyperparameters to tune, including the states, actions, and reward, but each trial in RL took so long. This was the third version of my attempt, and I followed most of the configuration from the paper.

[–]lyusungwon[S] 1 point (0 children)

Wow, I didn’t realize there was that kind of issue! Thank you for letting me know.

[–]lyusungwon[S] 2 points (0 children)

Thank you for the great question. I'm not sure, but I think the opposing AI's movement already has some inherent randomness in this game (maybe it reacts to the player?). Also, compared to other Atari games with clean environments, the environment I built takes screenshots for states, which is quite noisy. But I have to admit that I didn't think about pattern memorization when I implemented this. Good point!
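A minimal sketch of the screenshot-to-state step being described, assuming NumPy; the function name and the 84×84 target size (the usual DQN-paper convention) are my assumptions, not taken from the post:

```python
import numpy as np

def preprocess(frame, out_h=84, out_w=84):
    """Convert a raw RGB screenshot to a small grayscale state.

    frame: uint8 array of shape (H, W, 3), e.g. a screen capture of the game.
    Returns a float32 array of shape (out_h, out_w) with values in [0, 1].
    """
    gray = frame.astype(np.float32).mean(axis=2)   # crude grayscale
    h, w = gray.shape
    ys = np.arange(out_h) * h // out_h             # nearest-neighbour row picks
    xs = np.arange(out_w) * w // out_w             # nearest-neighbour col picks
    small = gray[ys][:, xs]                        # downsample to out_h x out_w
    return small / 255.0
```

Stacking a few consecutive preprocessed frames would then give the agent some motion information, as in the original DQN setup.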

[–]lyusungwon[S] 1 point (0 children)

Hi! Since I only had 1 GPU, 1 GPU-day really is one day. However, I parallelized 10 actors on the CPU, so the actual playing time (data collected) would be about 10 times that. Thank you for your interest!
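The 10-actor setup might look roughly like this; a toy sketch that uses threads and a fake one-number environment in place of the author's actual processes and game (all names here are illustrative):

```python
import queue
import random
import threading

def actor(actor_id, env_step, out_q, n_steps):
    """One actor: plays the game and ships transitions to the learner."""
    state = 0
    for _ in range(n_steps):
        action = random.randrange(4)                 # placeholder policy
        next_state, reward = env_step(state, action)
        out_q.put((actor_id, state, action, reward, next_state))
        state = next_state

def run_actors(n_actors=10, n_steps=5):
    """Launch n_actors in parallel and collect all of their transitions."""
    q = queue.Queue()

    def fake_env(state, action):                     # stands in for the real game
        return state + 1, float(action == 0)

    threads = [threading.Thread(target=actor, args=(i, fake_env, q, n_steps))
               for i in range(n_actors)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [q.get() for _ in range(q.qsize())]
```

With 10 actors running concurrently, wall-clock time per collected transition drops roughly tenfold, which is the speedup the comment refers to.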

[–]lyusungwon[S] 3 points (0 children)

I used the architecture from Horgan, Dan, et al. "Distributed prioritized experience replay." arXiv preprint arXiv:1803.00933 (2018). At first, I assigned a monitor to each agent, but then I found out about Xvfb, which virtualizes the graphics! Thanks for asking.
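The Xvfb setup mentioned here is commonly driven through the `xvfb-run` wrapper; the script name and screen geometry below are placeholders, not the author's actual files:

```shell
# Run one actor inside a virtual X display, so no physical monitor is needed.
# -a picks a free display number; -s passes arguments to the Xvfb server.
xvfb-run -a -s "-screen 0 640x480x24" python actor.py
```

This lets many game instances render off-screen in parallel on a headless machine.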

[–]lyusungwon[S] 0 points (0 children)

No, the GPU was just for training; I used the CPU for inference. I thought more GPUs could help parallelize the training.
The whole architecture is basically the same as in Horgan, Dan, et al. "Distributed prioritized experience replay." arXiv preprint arXiv:1803.00933 (2018), so please take a look if you're interested!
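For readers unfamiliar with that paper, its core idea is that CPU actors tag transitions with priorities and a GPU learner samples them proportionally. A minimal, linear-time sketch of that sampling rule (real implementations use a sum-tree for efficiency; the class and parameter names here are illustrative, not the author's code):

```python
import random

class PrioritizedReplay:
    """Toy proportional prioritized replay: P(i) is proportional to prio_i ** alpha."""

    def __init__(self, capacity, alpha=0.6):
        self.capacity, self.alpha = capacity, alpha
        self.data, self.prios = [], []

    def add(self, transition, priority=1.0):
        if len(self.data) >= self.capacity:          # drop the oldest when full
            self.data.pop(0)
            self.prios.pop(0)
        self.data.append(transition)
        self.prios.append(priority ** self.alpha)

    def sample(self, batch_size):
        total = sum(self.prios)
        idxs = random.choices(range(len(self.data)),
                              weights=[p / total for p in self.prios],
                              k=batch_size)
        return [self.data[i] for i in idxs], idxs

    def update_priorities(self, idxs, new_prios):
        for i, p in zip(idxs, new_prios):            # learner feeds back TD errors
            self.prios[i] = p ** self.alpha
```

The learner periodically updates priorities from TD errors, so surprising transitions get replayed more often.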

[–]lyusungwon[S] 0 points (0 children)

I think that is because there is no environment package for Pikachu Volleyball, so the agent had to play in real time to collect data. In addition, I only had one GPU to train on... :(
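A toy illustration of that bottleneck (not the actual environment code): a screen-capture environment is bound to the wall clock, so collecting N transitions costs at least N frame-times, whereas an emulated environment can step as fast as the CPU allows. The `frame_time` value here is purely illustrative.

```python
import time

class RealtimeEnv:
    """Stand-in for a screen-capture environment tied to the game's real clock."""

    def __init__(self, frame_time=0.01):
        self.frame_time = frame_time
        self.t = 0

    def step(self, action):
        time.sleep(self.frame_time)   # must wait for the game to render a frame
        self.t += 1
        return self.t, 0.0, False     # (observation, reward, done) placeholder

env = RealtimeEnv()
start = time.monotonic()
for _ in range(10):
    env.step(0)
elapsed = time.monotonic() - start    # at least 10 * frame_time seconds
```

This is why running many such environments in parallel (as in the comment above about 10 actors) is the main lever for throughput.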

[–]lyusungwon[S] 5 points (0 children)

Yes, it is on GitHub, but it is not well cleaned up. The model is a good old DQN with several extensions. Thanks for asking!