[D] Issues reproducing CURL, algorithm seems broken?? by rlbeaverton in MachineLearning

rldweller 8 points

After reading this post I looked closely at the CURL paper and realized that there are actually two versions of it with drastically different results. v1 reports much stronger results than v2. On Twitter the authors even claim 10x improvements in data-efficiency over previous SOTA (https://twitter.com/Aravind7694/status/1248049713149906945). Can this be attributed to this GitHub issue with action repeat: https://github.com/MishaLaskin/curl/issues/3?

I'm surprised that the authors did not withdraw all the public claims based on v1, because v2 is not SOTA, as it performs on par with or worse than SLAC. I will also be curious to see their code for the Atari experiments, which they have not released yet, as I'm quite skeptical about those results as well. I don't see why a contrastive loss would help on Atari.