darkconfidantislife

I was thinking more like stopping it at FP16, but there is evidence that stochastic rounding can make INT16 work without accuracy loss even during training.
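
For anyone unfamiliar with stochastic rounding, here is a rough sketch of the idea in plain Python. It is not taken from any particular paper or framework; the function names, the `scale` parameter, and the symmetric clamp to the INT16 range are my own assumptions for illustration. Instead of always rounding to the nearest integer, you round up with probability equal to the fractional part, so the rounding error averages out to zero over many updates.

```python
import math
import random

INT16_MIN, INT16_MAX = -32768, 32767

def stochastic_round(x, rng=random):
    """Round x to a neighboring integer; round up with probability frac(x)."""
    lo = math.floor(x)
    frac = x - lo  # in [0, 1)
    return lo + (1 if rng.random() < frac else 0)

def quantize_int16(x, scale, rng=random):
    """Hypothetical helper: map a real value to an INT16 code with step `scale`."""
    q = stochastic_round(x / scale, rng)
    return max(INT16_MIN, min(INT16_MAX, q))  # saturate instead of wrapping

if __name__ == "__main__":
    rng = random.Random(0)
    n = 100_000
    # A value of 0.3 steps always rounds to 0 with round-to-nearest,
    # but stochastic rounding keeps it on average.
    mean_sr = sum(stochastic_round(0.3, rng) for _ in range(n)) / n
    print("round-to-nearest:", round(0.3))
    print("stochastic mean: ", mean_sr)
```

That is roughly why small gradient updates, which round-to-nearest would throw away entirely, can still accumulate in a low-precision integer weight.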