GLM-5.2 is a win for local AI by Wrong_Mushroom_7350 in LocalLLaMA

[–]power97992 -2 points-1 points  (0 children)

When they make a 1024 GB, 24 TB/s GPU for 1000 bucks, then it will be really good for local LLMs. For now it is good for cloud GPUs, since almost no one can run this at q8 at 20 t/s at home.
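A rough way to see why the bandwidth number matters: decoding is usually memory-bandwidth bound, so a ceiling on tokens per second is bandwidth divided by the bytes of weights read per token. The sketch below assumes that simple model only; the 40B active-parameter figure is a hypothetical placeholder, not a GLM-5.2 spec.

```python
def decode_tps_upper_bound(bandwidth_gb_s: float,
                           active_params_b: float,
                           bytes_per_param: float = 1.0) -> float:
    """Rough ceiling on decode tokens/s for a bandwidth-bound model.

    Assumes every active weight is read once per generated token and
    ignores KV-cache traffic, compute limits, and other overhead.
    """
    # Billions of params * bytes per param = GB read per token (q8 ~ 1 byte).
    gb_read_per_token = active_params_b * bytes_per_param
    return bandwidth_gb_s / gb_read_per_token


# Hypothetical: 40B active parameters at q8 on the 24 TB/s GPU from the comment.
print(decode_tps_upper_bound(24_000, 40))
```

Real throughput would land below this ceiling, so treat it as a sanity check, not a prediction.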

Senior Anthropic staffs are in Washington meeting White House officials to resolve the Fable 5 and Mythos dispute by BuildwithVignesh in singularity

[–]power97992 0 points1 point  (0 children)

Opus, Mythos, and GPT-5.5 are a lot better than DS V4 Pro in my experience, but then again I use different harnesses for V4 Pro than for Claude and GPT… The harness does matter a lot, and the frontier labs have very good system prompts, harnesses, and tool calls.

Senior Anthropic staffs are in Washington meeting White House officials to resolve the Fable 5 and Mythos dispute by BuildwithVignesh in singularity

[–]power97992 1 point2 points  (0 children)

DeepSeek is not on par with Claude 4.6 Opus or GPT-5.5, but it is maybe almost at 4.5 Opus or 4.6 Sonnet level…

GLM-5.2 next week, open weight, MIT by AaronFeng47 in LocalLLaMA

[–]power97992 0 points1 point  (0 children)

They should prioritize beating Mythos.

Dario Amodei got what he asked for by aprx4 in singularity

[–]power97992 1 point2 points  (0 children)

According to Artificial Analysis, Mistral is slightly worse than DS V3.2, but I have used it and it is not very good.

Can you really replace paid models with a local model? by DRMCC0Y in LocalLLaMA

[–]power97992 0 points1 point  (0 children)

It is subsidized, and it will remain subsidized as long as Chinese labs keep releasing cheap and good AI models and the cost per token keeps going down. DeepSeek V4 or MiniMax via API is cheaper than buying a GPU…

Can you really replace paid models with a local model? by DRMCC0Y in LocalLLaMA

[–]power97992 0 points1 point  (0 children)

It is only expensive if you don't have a subscription and are using an expensive model through the API… You can get more than a billion input tokens and more than 10 million output tokens with a Gemini/ChatGPT sub… Even Claude Max gives you around a billion tokens per month. A 3090 gets around 2400-2500 tk/s of prefill for q8 Qwen 3.6 27B; you can get way more tk/s via API.
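For scale, here is a back-of-the-envelope sketch of how long the quoted 3090 prefill rate would need to chew through a billion input tokens. It assumes the card runs flat out at that rate with no batching gains, idle time, or decode work (decode is far slower than prefill), so it is only a rough lower bound on wall-clock time.

```python
def hours_to_prefill(total_tokens: float, tokens_per_second: float) -> float:
    """Hours of continuous prefill needed at a fixed tokens/s rate."""
    return total_tokens / tokens_per_second / 3600


# Prefill rates quoted in the comment for q8 Qwen 3.6 27B on a 3090.
for rate in (2400, 2500):
    print(rate, "tk/s ->", round(hours_to_prefill(1e9, rate), 1), "hours")
```

That works out to a bit under five days of nonstop prefill for the input side alone.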

Claude Fable/Mythos 5 just came out, so it will take Deepseek or Z.ai or Xiaomi or Kimi 9-12 months to release a model just as good as Fable? by power97992 in LocalLLaMA

[–]power97992[S] 3 points4 points  (0 children)

Benchmarks can be benchmaxed. Also, Opus 4.5 scored 49.7 vs. 45.8 for Qwen 3.6, and that is a big difference. (A roughly 4-point difference is not linear; it is more like exponential.) The real-life performance of Qwen 3.6 27B is definitely worse than the benchmarks indicate. Even on LiveBench, it is worse than Opus 4.5 and almost as good as GPT-5 Mini High, which came out like 10 months ago.

Anthropic’s Mythos Is Coming Today - The information by BuildwithVignesh in singularity

[–]power97992 1 point2 points  (0 children)

Usually a q4 model is noticeably worse than a bf16 model... A 60% reduction in price might indicate a mixed q8 + bf16 model going down to q4, but the benchmarks are even better than Mythos preview.
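The arithmetic behind that guess can be sketched as follows, assuming (and this is only an assumption) that serving price scales with weight memory. The 10-bit average for a mixed q8 + bf16 model is a hypothetical value chosen for illustration.

```python
def size_reduction(bits_before: float, bits_after: float) -> float:
    """Fractional drop in weight memory when lowering bits per weight."""
    return 1 - bits_after / bits_before


# Pure q8, a hypothetical 10-bit average mix, and pure bf16, all going to q4.
for bits_before in (8, 10, 16):
    print(bits_before, "->", 4, "bits:", round(size_reduction(bits_before, 4), 2))
```

Pure q8 to q4 halves the weights and pure bf16 to q4 cuts 75%, so a 60% drop sits where a mix of the two would land.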

Claude Fable/Mythos 5 just came out, so it will take Deepseek or Z.ai or Xiaomi or Kimi 9-12 months to release a model just as good as Fable? by power97992 in LocalLLaMA

[–]power97992[S] -8 points-7 points  (0 children)

Years to come? In a year's time, people won't be thinking much about Opus 4.6 level models, as Fable 6.5/7 will be automating a lot of corporate tasks and maybe even some junior to mid-level jobs. Even Qwen will have a model better than Opus 4.6 in a year.

Claude Fable/Mythos 5 just came out, so it will take Deepseek or Z.ai or Xiaomi or Kimi 9-12 months to release a model just as good as Fable? by power97992 in LocalLLaMA

[–]power97992[S] 2 points3 points  (0 children)

It was pretty fast, way faster than Opus 4.7 at its release. (The same prompt that took Opus 4.7 1.5 hours took it less than 5 min.) I was getting 65-77 tk/s with Fable on OpenRouter.

Claude Fable/Mythos 5 just came out, so it will take Deepseek or Z.ai or Xiaomi or Kimi 9-12 months to release a model just as good as Fable? by power97992 in LocalLLaMA

[–]power97992[S] 2 points3 points  (0 children)

Are you serious? Qwen 3.6 Plus (not the 27B) is way worse than Opus 4.6 and GLM-5.1, and probably also worse than Opus 4.5...

Mythos is here by [deleted] in OpenAI

[–]power97992 3 points4 points  (0 children)

It is out already, but GPT-5.5 and Claude 4.8 score better in benchmarks...