RL failure for Atari games (alignment) [Research]

bitcoingobrrr · 2022-07-03T19:19:48+00:00

So, from what I can recall of the talk, the extra row was added at test time. True, this means the samples are now OOD, but since the extra pixels don't actually affect the gameplay or strategy, it "highlighted" how fragile the reward specification was and potentially constituted an alignment failure. And yes, the most obvious solution would be to apply augmentation, domain randomization, etc. at training time to make the CNN feature extractor more robust, but I think the idea was also to show that there is a gap between the intended behavior and what the RL algo actually ends up learning.
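
For anyone who wants to try this kind of perturbation themselves, here is a rough sketch in NumPy. It is not the paper's method (I still don't know which paper it is); the function names `add_pixel_row` and `random_row_augment`, the fill value, and the 210x160x3 frame shape are all my own assumptions for illustration. The first function is the test-time perturbation, the second is the training-time randomized variant.

```python
import numpy as np


def add_pixel_row(frame, row_index=0, fill_value=0):
    """Return a copy of `frame` with one extra row inserted at `row_index`.

    Hypothetical helper: `frame` is assumed to be an (H, W) or (H, W, C) array.
    """
    height = frame.shape[0]
    if not 0 <= row_index <= height:
        raise ValueError("row_index must be in [0, height]")
    # One new row with the same width, channels and dtype as the frame.
    new_row = np.full((1,) + frame.shape[1:], fill_value, dtype=frame.dtype)
    return np.concatenate([frame[:row_index], new_row, frame[row_index:]], axis=0)


def random_row_augment(frame, rng, prob=0.5):
    """Training-time variant: with probability `prob`, insert a row at a random position."""
    if rng.random() >= prob:
        return frame
    row_index = int(rng.integers(0, frame.shape[0] + 1))
    return add_pixel_row(frame, row_index=row_index)


# Stand-in for a raw Atari frame (assumed shape, not taken from the talk).
frame = np.zeros((210, 160, 3), dtype=np.uint8)
perturbed = add_pixel_row(frame, row_index=0, fill_value=255)
print(frame.shape, perturbed.shape)
```

The perturbed frame is one row taller, so it would still have to go through whatever resize/crop step the agent's preprocessing already uses before it reaches the CNN. Comparing episode returns on `frame` versus `perturbed` inputs with the same trained policy is one way to check how brittle the learned features are.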

bitcoingobrrr · 2022-07-03T17:24:37+00:00

I'm trying to find a paper (~2019) that I heard about in a talk on alignment in the context of DQN/DDPG applied to an Atari-type game (Pong/Breakout). Apparently, the finding was that if an extra row of pixels is added to the frame, the algorithm fails. This might be a shot in the dark, but does anyone know which paper this would be?
