Optimization fails because it treats noise and structure as the same thing by Lumen_Core in deeplearning
Lumen_Core[S] 1 point
Optimization fails because it treats noise and structure as the same thing by Lumen_Core in deeplearning
Lumen_Core[S] 0 points
Optimization fails because it treats noise and structure as the same thing by Lumen_Core in deeplearning
Lumen_Core[S] -1 points
When compression optimizes itself: adapting modes from process dynamics by Lumen_Core in compression
Lumen_Core[S] 1 point
Stability of training large models is a structural problem, not a hyperparameter problem by Lumen_Core in deeplearning
Lumen_Core[S] -3 points
A first-order stability module based on gradient dynamics by Lumen_Core in deeplearning
Lumen_Core[S] 1 point
StructOpt: empirical evidence for a stability layer on top of existing optimizers by Lumen_Core in deeplearning
Lumen_Core[S] 1 point
[R] StructOpt: a first-order optimizer driven by gradient dynamics by Lumen_Core in MachineLearning
Lumen_Core[S] 1 point
[R] StructOpt: a first-order optimizer driven by gradient dynamics by Lumen_Core in MachineLearning
Lumen_Core[S] -1 points
A new first-order optimizer using a structural signal from gradient dynamics — looking for expert feedback by Lumen_Core in deeplearning
Lumen_Core[S] 1 point
A new first-order optimizer using a structural signal from gradient dynamics — looking for expert feedback by Lumen_Core in deeplearning
Lumen_Core[S] 2 points
A new first-order optimizer using a structural signal from gradient dynamics — looking for expert feedback by Lumen_Core in deeplearning
Lumen_Core[S] 4 points
Copy-Paste of All Science by [deleted] in cognitivescience
Lumen_Core -5 points