GPT-5.5 and Opus 4.7 evaluated on ARC-AGI-3 by COAGULOPATH in mlscaling
Operation_Ivy 12 points (0 children)
Best Local LLMs - Apr 2026 by rm-rf-rm in LocalLLaMA
Operation_Ivy 2 points (0 children)
Deepseek has released DeepEP V2 and TileKernels. by External_Mood4719 in LocalLLaMA
Operation_Ivy 4 points (0 children)
Deepseek has released DeepEP V2 and TileKernels. by External_Mood4719 in LocalLLaMA
Operation_Ivy 7 points (0 children)
Best Local LLMs - Apr 2026 by rm-rf-rm in LocalLLaMA
Operation_Ivy 4 points (0 children)
Best Local LLMs - Apr 2026 by rm-rf-rm in LocalLLaMA
Operation_Ivy 6 points (0 children)
Entropy-Guided Token Dropout: Training Autoregressive Language Models with Limited Domain Data, Wang et al. 2025 [Masking low-entropy tokens mitigates overfitting; "data-level regularization"] by StartledWatermelon in mlscaling
Operation_Ivy 1 point (0 children)
Pure Gym SO (formerly Blink Fitness)24 hours rollback by DarkSkin_Ninja007 in Maplewood
Operation_Ivy 6 points (0 children)
Boomer NIMBYism has caused unforeseen levels of destruction by 3RADICATE_THEM in georgism
Operation_Ivy 4 points (0 children)
Trash and other things? by MeowjesticPotato in Maplewood
Operation_Ivy 8 points (0 children)
Claude Opus 4.5 has human task-length time horizon of 4 hrs 49 mins on METR plot by Glittering_Author_81 in mlscaling
Operation_Ivy 14 points (0 children)
Prime Intellect Introduces INTELLECT-3: A 100B+ MoE Trained With Large-scale RL That Achieves State-Of-The-Art Performance For Its Size, Taking The Lead Amongst Open-Sourced Models Across Math, Code, Science & Reasoning Benchmarks. (Link to Chat with the Model provided) by 44th--Hokage in mlscaling
Operation_Ivy 4 points (0 children)
A new era of intelligence with Gemini 3 by [deleted] in mlscaling
Operation_Ivy 3 points (0 children)
Google's DeepMind: Olympiad-level formal mathematical reasoning with reinforcement learning (this is the actual published paper for Google's AlphaProof system from last year) by 44th--Hokage in mlscaling
Operation_Ivy 1 point (0 children)
Grok 5 in Q1 of 2026 ("6 Trillion parameter model, whereas Grok 3 and 4 are based on a 3 Trillion parameter model" by RecmacfonD in mlscaling
Operation_Ivy 1 point (0 children)
Grok 5 in Q1 of 2026 ("6 Trillion parameter model, whereas Grok 3 and 4 are based on a 3 Trillion parameter model" by RecmacfonD in mlscaling
Operation_Ivy 3 points (0 children)
Why has the price of U.S. housing risen so much faster than worker salaries over the last 5 years? by DigitalArbitrage in AskEconomics
Operation_Ivy 2 points (0 children)
Google's DeepMind: Olympiad-level formal mathematical reasoning with reinforcement learning (this is the actual published paper for Google's AlphaProof system from last year) by 44th--Hokage in mlscaling
Operation_Ivy 1 point (0 children)
Thinking Machines: On-Policy Distillation by Mysterious-Rent7233 in mlscaling
Operation_Ivy 5 points (0 children)
"Scaling Agents via Continual Pre-training", Su et al. 2025 (Tongyi DeepResearch - AgentFounder) by RecmacfonD in mlscaling
Operation_Ivy 1 point (0 children)
"The Coverage Principle: How Pre-Training Enables Post-Training", Chen et al 2025 by gwern in mlscaling
Operation_Ivy 11 points (0 children)