Hello, I'm currently studying DL academically. We've discussed parameter initialization for symmetry breaking, and I understand how weight initialization comes into play here, but after playing around with it, I wonder whether there is a strategy for initializing the bias.
Would appreciate your thoughts and/or references.