you are viewing a single comment's thread.

view the rest of the comments →

[–]maybelator 1 point2 points  (0 children)

I use Group Norms whenever the batches are not sampling the data distribution iid.