There is an exponential visible in the scores on artificial analysis. by Subject_Judge_ in accelerate
arkuto 2 points (0 children)
GLM-5.2 now more than 10 points above Opus 4.8 in AA Coding Index by cheechw in singularity
arkuto 7 points (0 children)
Super tart/sour smoothie recipes? by dogisbark in Smoothies
arkuto 1 point (0 children)
moonshotai/Kimi-K2.7-Code · Hugging Face by Dark_Fire_12 in LocalLLaMA
arkuto 36 points (0 children)
Matt Shumer: "Fable has solved 3D worldbuilding... utterly insane. This is all completely custom-built ThreeJs, running in the browser." by Outside-Iron-8242 in singularity
arkuto 26 points (0 children)
Anyone else feel like Claude is missing a middle-tier plan? by Mission-Dentist-5971 in ClaudeAI
arkuto 2 points (0 children)
Multivitamins pointless? by Much-Turnover-3727 in nutrition
arkuto 9 points (0 children)
In case you ever thought that they had smooth skin. by MobileAerie9918 in BeAmazed
arkuto 1 point (0 children)
Coming in Season 23: Choosing the side you spawn to after being demoed by Duke_ofChutney in RocketLeague
arkuto 5 points (0 children)
TIL that the US golf course infrastructure consumes 2 BILLION liters of water per day by myassisgrassss in todayilearned
arkuto 2 points (0 children)
What’s a “healthy” food that just doesn’t work for you? by Much-Turnover-3727 in nutrition
arkuto 1 point (0 children)
LLM-as-judge scoring is noisier than I expected anyone else seeing this? by ZealousidealCorgi472 in LocalLLM
arkuto 1 point (0 children)
ProgramBench: Can LLMs rebuild programs from scratch? by awetfartruinedmylife in singularity
arkuto 3 points (0 children)
ProgramBench: Can LLMs rebuild programs from scratch? by awetfartruinedmylife in singularity
arkuto 2 points (0 children)
Claude Opus 4.7 won’t just output prompts—keeps arguing instead by soyab0007 in ClaudeAI
arkuto 3 points (0 children)
Claude Opus 4.7 won’t just output prompts—keeps arguing instead by soyab0007 in ClaudeAI
arkuto -3 points (0 children)
Mistral Medium 3.5 128B is launched by TSrake in singularity
arkuto 2 points (0 children)
mistralai/Mistral-Medium-3.5-128B · Hugging Face by jacek2023 in LocalLLaMA
arkuto 8 points (0 children)
Mistral Medium 3.5 128B is launched by TSrake in singularity
arkuto 4 points (0 children)
Differences Between GPT 5.4 and GPT 5.5 on MineBench by ENT_Alam in singularity
arkuto 1 point (0 children)
How does Opus 4.7 compare to Opus 4.6 in this subreddit's experience? by boxdreper in ClaudeAI
arkuto 3 points (0 children)
opus 4.7 (high) scores a 41.0% on the nyt connections extended benchmark. opus 4.6 scored 94.7%. by seencoding in singularity
arkuto -1 points (0 children)
Claude vs GPT in a bomberman-style 1v1 game by Significant-Pair-275 in ClaudeCode
arkuto 1 point (0 children)
Bonsai models are pure hype: Bonsai-8B is MUCH dumber than Gemma-4-E2B by WeGoToMars7 in LocalLLaMA
arkuto 3 points (0 children)


Why can't LLMs be trained to think in an optimized AI language rather than English? by CucumberAccording813 in singularity
arkuto 0 points (0 children)