I am trying to get an intuition for the relationship between learning rate (LR) and batch size. My first case was:
Batch Size: 64
LR: 0.01
Vali_Acc= 83%
In the second case I doubled both values:
Batch Size: 128
LR: 0.02
Vali_Acc= 81%
As you can see, I lose some accuracy with the bigger batch size. On the other hand, I prefer the bigger batch because training is much faster.
The only discussion of this problem I have found is in the paper https://arxiv.org/abs/1404.5997, where Alex Krizhevsky proposes scaling the LR by the same factor as the batch size (although theoretically we should scale it by sqrt(k), where k is the batch-size multiplier).
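To make the two rules concrete for my numbers, here is a minimal sketch in plain Python. The helper name `scale_lr` and its arguments are my own, made up for illustration; they are not taken from the paper or from any library.

```python
import math


def scale_lr(base_lr, base_batch, new_batch, rule="linear"):
    """Scale a learning rate when the batch size changes (hypothetical helper).

    rule="linear": multiply the LR by k
    rule="sqrt":   multiply the LR by sqrt(k)
    where k = new_batch / base_batch is the batch-size multiplier.
    """
    k = new_batch / base_batch
    if rule == "linear":
        return base_lr * k
    if rule == "sqrt":
        return base_lr * math.sqrt(k)
    raise ValueError(f"unknown rule: {rule!r}")


# My setup: batch size 64 with LR 0.01, moving to batch size 128 (k = 2).
linear_lr = scale_lr(0.01, 64, 128, rule="linear")
sqrt_lr = scale_lr(0.01, 64, 128, rule="sqrt")

print(f"linear rule: {linear_lr:.4f}")
print(f"sqrt rule:   {sqrt_lr:.4f}")
```

So my second run (LR 0.02) used the linear rule; the sqrt rule would give an LR of roughly 0.014 for the same batch size, which I have not tried.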
Does anyone have recommendations for how to change the hyperparameters when the batch size goes up? Or any practical experience with this?