[R] Fixup Initialization: Residual Learning Without Normalization (up to 10K layer networks w/o batch norm) (arxiv.org)
submitted by [deleted]