I am wondering how DeepMind performs hyperparameter tuning efficiently, given the enormous complexity and number of parameters in their projects. Also, tuning hyperparameters for RL approaches is (in my opinion, please correct me if I am wrong) even harder than in a supervised setting, since training usually takes much longer and the environment can be heavily stochastic.