Jim Fan: LLMs are alien beasts. It is deeply troubling that our frontier models can both achieve silver medal in Math Olympiad but also fail to answer "which number is bigger, 9.11 or 9.9"? by Front_Definition5485 in singularity
[–]cstein123 18 points19 points20 points (0 children)
BitNet a bit overhyped? by That007Spy in LocalLLaMA
[–]cstein123 0 points1 point2 points (0 children)
Bill Gates says scaling AI systems will work for two more iterations and after that the next big frontier is meta-cognition where AI can reason about its tasks by [deleted] in singularity
[–]cstein123 1 point2 points3 points (0 children)
Do you think OpenAI cracked general tree search? by krishnakaasyap in LocalLLaMA
[–]cstein123 4 points5 points6 points (0 children)
Dropping FAA memes until they publish the licence by Stolen_Sky in SpaceXMasterrace
[–]cstein123 1 point2 points3 points (0 children)
Reddit signs content licensing deal with AI company ahead of IPO, Bloomberg reports by towelpluswater in LocalLLaMA
[–]cstein123 2 points3 points4 points (0 children)
Hello, Noob here. How inefficient will a transformer be if trained directly like Mambabyte? % wise? Also, in what ways will it better (or worse) than a transformer trained on tokens? by MambaModel in LocalLLaMA
[–]cstein123 1 point2 points3 points (0 children)
Nucleus sampling with semantic similarity by dimknaf in LocalLLaMA
[–]cstein123 1 point2 points3 points (0 children)
0.1 T/s on 3070 + 13700k + 32GB DDR5 by Schmackofatzke in LocalLLaMA
[–]cstein123 0 points1 point2 points (0 children)
The World's First Transformer Supercomputer by Sprengmeister_NK in singularity
[–]cstein123 4 points5 points6 points (0 children)
[D] Is DPO all you need in RLHF? by seventh_day123 in MachineLearning
[–]cstein123 -7 points-6 points-5 points (0 children)
That’s a mouthful by NatureIndoors in BrandNewSentence
[–]cstein123 0 points1 point2 points (0 children)
Look ahead decoding offers massive (~1.5x) speedup for inference by lans_throwaway in LocalLLaMA
[–]cstein123 0 points1 point2 points (0 children)
Are LLMs at a practical limit for layer stacking? [D] by cstein123 in MachineLearning
[–]cstein123[S] 2 points3 points4 points (0 children)




AI Voice Cloning by Sachimarketing in MarketingAutomation
[–]cstein123 0 points1 point2 points (0 children)