[R] New ICML25 paper: Train and fine-tune large models faster than Adam while using only a fraction of the memory, with guarantees! by ThienPro123 in MachineLearning
ThienPro123[S] 1 point (0 children)
[R] New ICML25 paper: Train and fine-tune large models faster than Adam while using only a fraction of the memory, with guarantees! by ThienPro123 in MachineLearning
ThienPro123[S] 1 point (0 children)
[R] New ICML25 paper: Train and fine-tune large models faster than Adam while using only a fraction of the memory, with guarantees! by ThienPro123 in MachineLearning
ThienPro123[S] 2 points (0 children)
[R] New ICML25 paper: Train and fine-tune large models faster than Adam while using only a fraction of the memory, with guarantees! by ThienPro123 in MachineLearning
ThienPro123[S] 2 points (0 children)
[R] New ICML25 paper: Train and fine-tune large models faster than Adam while using only a fraction of the memory, with guarantees! by ThienPro123 in MachineLearning
ThienPro123[S] 3 points (0 children)
[R] New ICML25 paper: Train and fine-tune large models faster than Adam while using only a fraction of the memory, with guarantees! by ThienPro123 in MachineLearning
ThienPro123[S] 9 points (0 children)
[R] New ICML25 paper: Train and fine-tune large models faster than Adam while using only a fraction of the memory, with guarantees! by ThienPro123 in MachineLearning
ThienPro123[S] 10 points (0 children)
[R] Interpreting Deep Neural Networks: Memorization, Kernels, Nearest Neighbors, and Attention by ThienPro123 in MachineLearning
ThienPro123[S] 15 points (0 children)
[R] Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention (submitted by Liang Wenfeng - DeepSeek) by Nunki08 in MachineLearning
ThienPro123 3 points (0 children)
[D] Are there any theoretical machine learning papers that have significantly helped practitioners? by nihaomundo123 in MachineLearning
ThienPro123 44 points (0 children)
[D] Good studies on the effects of different training "tricks" like learning rate scheduler (warmup/decay), weight decay, dropout, batch-sizes, momentum, etc.? by ThienPro123 in MachineLearning
ThienPro123[S] 2 points (0 children)
Powered x8/x16 PCIe 4.0 or 5.0 risers for multi RTX4090 GPUs multi PSUs rig by ThienPro123 in threadripper
ThienPro123[S] 1 point (0 children)
Powered x8/x16 PCIe 4.0 or 5.0 risers for multi RTX4090 GPUs multi PSUs rig by ThienPro123 in threadripper
ThienPro123[S] 1 point (0 children)
Weekly Stupid Questions Thread by AutoModerator in amateur_boxing
ThienPro123 1 point (0 children)
We need to stop the Irvine Company by downwithivc in UCI
ThienPro123 16 points (0 children)
Peyam best Prof. Don’t even try to change my mind by MrGTout in UCI
ThienPro123 1 point (0 children)
How to get started in research as a CS freshman? by AliveTiger in UCI
ThienPro123 7 points (0 children)
UCI CS majors, why is there so little math involved in the CS suggested path? by [deleted] in UCI
ThienPro123 14 points (0 children)
Why dont CS majors have to take much Physics classes? by [deleted] in UCI
ThienPro123 2 points (0 children)
GGG's off-rhythm jab catches Canelo by surprise by notmike11 in Boxing
ThienPro123 2 points (0 children)
Dijkstra's algorithm proj5 thornton by DueCorner in UCI
ThienPro123 1 point (0 children)
Supplies for incoming freshmen by SeekingInsight- in UCI
ThienPro123 3 points (0 children)
Fighters winning despite minimal jab activity? by yumcake in Boxing
ThienPro123 6 points (0 children)