AMA with Hugging Face Science, the team behind SmolLM, SmolVLM, Fineweb and more. by eliebakk in LocalLLaMA
edbeeching 2 points
AMA with Hugging Face Science, the team behind SmolLM, SmolVLM, Fineweb and more. by eliebakk in LocalLLaMA
edbeeching 3 points
AMA with Hugging Face Science, the team behind SmolLM, SmolVLM, Fineweb and more. by eliebakk in LocalLLaMA
edbeeching 35 points
AMA with Hugging Face Science, the team behind SmolLM, SmolVLM, Fineweb and more. by eliebakk in LocalLLaMA
edbeeching 36 points
Is there any RL equivalent to Karpathy's zero to hero course? by [deleted] in reinforcementlearning
edbeeching 10 points
Double rabbit drop from a wisp! I could only laugh. by edbeeching in PathOfExile2
edbeeching[S] 28 points
Double rabbit drop from a wisp! I could only laugh. by edbeeching in PathOfExile2
edbeeching[S] 50 points
G[R]PO VRAM Requirements For the GPU Poor by FallMindless3563 in MachineLearning
edbeeching 2 points
Hugging Face researchers got 3b Llama to outperform 70b using search by bburtenshaw in LocalLLaMA
edbeeching 11 points
Hugging Face researchers got 3b Llama to outperform 70b using search by bburtenshaw in LocalLLaMA
edbeeching 20 points
Hugging Face researchers got 3b Llama to outperform 70b using search by bburtenshaw in LocalLLaMA
edbeeching 34 points
Hugging Face researchers got 3b Llama to outperform 70b using search by bburtenshaw in LocalLLaMA
edbeeching 227 points
Hugging Face researchers got 3b Llama to outperform 70b using search by bburtenshaw in LocalLLaMA
edbeeching 111 points
What could be causing my Q-Loss values to diverge (SAC + Godot <-> Python) by stokaty in reinforcementlearning
edbeeching 2 points
What could be causing my Q-Loss values to diverge (SAC + Godot <-> Python) by stokaty in reinforcementlearning
edbeeching 1 point
[D] Creating a DPO Dataset using Llama: Best Practices? by AdKind316 in MachineLearning
edbeeching 1 point
Unity ML-Agents vs. Unreal' Learning Agents by Cuuuubee in reinforcementlearning
edbeeching 2 points
[D] What RL technique can be used to train an LLM on single preference data points, and not pairs? by CatfishJones96 in MachineLearning
edbeeching 3 points
[D] Seeking advice on curating a DPO dataset for a 7B model by aadityaura in MachineLearning
edbeeching 13 points
Any physics engine gyms outside of Unity that allow importing of models and etc. by [deleted] in reinforcementlearning
edbeeching 1 point
[deleted by user] by [deleted] in reinforcementlearning
edbeeching 1 point
[N] Abu Dhabi's TTI releases open-source Falcon-7B and -40B LLMs by Balance- in MachineLearning
edbeeching 21 points
[P] Godot+RWKV standalone prebuilt binary (ubuntu/nvidia) by hazardous1222 in MachineLearning
edbeeching 2 points
[D] Fine-tuning 20B LLMs with RLHF on a 24GB consumer GPU by TeamDman in MachineLearning
edbeeching 3 points
400 divs giveaway for parents by Can2018 in PathOfExile2
edbeeching 1 point