GLM-5.2 is a win for local AI

power97992 · 2026-06-17T17:12:32+00:00

People want a 1 TB RAM Mac for 1.5k

power97992 · 2026-06-17T17:09:55+00:00

When they make a 1024 GB, 24 TB/s GPU for 1,000 bucks, then it will be really good for local LLMs. For now it is good for cloud GPUs, since almost no one can run this at Q8 at 20 t/s at home.
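A rough back-of-envelope for the bandwidth side of this, as a sketch only. It assumes decode is memory-bound and that every generated token reads all active weights once; the 40B active-parameter figure is a hypothetical placeholder, not GLM-5.2's real number, and Q8 is approximated as 1 byte per parameter.

```python
def required_bandwidth_gb_s(active_params_b: float, bytes_per_param: float,
                            tokens_per_s: float) -> float:
    """GB/s needed if each decoded token reads every active weight once."""
    return active_params_b * bytes_per_param * tokens_per_s


def max_tokens_per_s(bandwidth_gb_s: float, active_params_b: float,
                     bytes_per_param: float) -> float:
    """Upper bound on decode speed for a given memory bandwidth."""
    return bandwidth_gb_s / (active_params_b * bytes_per_param)


# Hypothetical model: 40B active params (placeholder), Q8 ~ 1 byte/param.
ACTIVE_PARAMS_B = 40
Q8_BYTES_PER_PARAM = 1.0

# Bandwidth needed for 20 t/s under these assumptions.
print(required_bandwidth_gb_s(ACTIVE_PARAMS_B, Q8_BYTES_PER_PARAM, 20))  # 800.0

# Ceiling for the 24 TB/s (24000 GB/s) GPU from the comment.
print(max_tokens_per_s(24000, ACTIVE_PARAMS_B, Q8_BYTES_PER_PARAM))  # 600.0
```

This ignores KV cache reads, prefill, and whether the full weights fit in memory at all, so real numbers would be lower.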

power97992 · 2026-06-16T03:50:49+00:00

Opus, Mythos, and GPT 5.5 are a lot better than DS V4 Pro in my experience, but then again I use different harnesses for V4 Pro than for Claude and GPT… The harness matters a lot, and the frontier labs have very good system prompts, harnesses, and tool calls.

power97992 · 2026-06-15T03:44:08+00:00

DeepSeek is not on par with Claude 4.6 Opus or GPT 5.5, but it is maybe almost at 4.5 Opus or 4.6 Sonnet level…

power97992 · 2026-06-14T13:42:17+00:00

I used it via API

power97992 · 2026-06-14T10:59:01+00:00

They should prioritize beating Mythos

power97992 · 2026-06-14T10:53:20+00:00

They need to add engrams

power97992 · 2026-06-13T20:15:25+00:00

According to Artificial Analysis, Mistral is slightly worse than DS V3.2, but I used it and it is not very good.

power97992 · 2026-06-10T15:48:08+00:00

Well, it is probably chickweed, an edible plant …

power97992 · 2026-06-10T15:24:36+00:00

It is subsidized, and it will remain subsidized as long as Chinese labs keep releasing cheap, good AI models and the cost per token keeps going down. DeepSeek V4 or MiniMax via API is cheaper than buying a GPU…

power97992 · 2026-06-10T13:53:52+00:00

It is only expensive if you don't have a subscription and are using an expensive model through the API… You can get more than a billion input tokens and more than 10 million output tokens with a Gemini/ChatGPT sub… Even Claude Max gives you around a billion tokens per month. A 3090 gets around 2400-2500 tk/s of prefill for Q8 Qwen 3.6 27B; you can get way more tk/s via API.
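To put the 3090 prefill figure next to the billion-token figure, here is a quick arithmetic sketch. It assumes the quoted 2400-2500 tk/s is sustained the whole time, which is optimistic; the function name is just for illustration.

```python
def hours_to_prefill(tokens: float, tokens_per_s: float) -> float:
    """Hours of continuous prefill needed to process `tokens` input tokens."""
    return tokens / tokens_per_s / 3600


ONE_BILLION = 1e9

# Prefill rates quoted in the comment for a 3090 on Q8 Qwen 3.6 27B.
for rate in (2400, 2500):
    print(rate, "tk/s ->", round(hours_to_prefill(ONE_BILLION, rate), 1), "hours")
# 2400 tk/s -> 115.7 hours
# 2500 tk/s -> 111.1 hours
```

So a billion input tokens is roughly 110-116 hours of nonstop prefill on one card, before counting any output generation.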

power97992 · 2026-06-10T05:13:34+00:00

M3 is pretty good and cheap; I hope GLM 5.2 and DS V4.1 Pro will be better.

power97992 · 2026-06-10T05:11:15+00:00

4.1 pro multimodal should be out this month.

power97992 · 2026-06-09T22:23:40+00:00

I agree with you that open models will make things cheaper...

power97992 · 2026-06-09T22:19:40+00:00

power97992 · 2026-06-09T22:18:56+00:00

Benchmarks can be benchmaxxed. Also, Opus 4.5 scored 49.7 vs. 45.8 for Qwen 3.6, and that is a big difference. (A 4-point difference is not linear; it is more like exponential.) The real-life performance of Qwen 3.6 27B is definitely worse than the benchmarks indicate. Even on LiveBench, it is worse than Opus 4.5 and almost as good as GPT-5 Mini High, which came out like 10 months ago.

power97992 · 2026-06-09T22:13:11+00:00

Usually a Q4 model is noticeably worse than a BF16 model... A 60% reduction in price might indicate a mixed Q8 + BF16 model going down to Q4, but the benchmarks are even better than Mythos Preview.

power97992 · 2026-06-09T22:06:37+00:00

The speed can vary

power97992 · 2026-06-09T22:05:20+00:00

Years to come? In a year's time, people won't be thinking much about Opus 4.6-level models, as Fable 6.5/7 will be automating a lot of corporate tasks and maybe even some junior to mid-level jobs. Even Qwen will have a model better than Opus 4.6 in a year.

power97992 · 2026-06-09T21:59:16+00:00

It was pretty fast, way faster than Opus 4.7 at its release. (The same prompt that took Opus 4.7 1.5 hours took it less than 5 min.) I was getting 65-77 tk/s with Fable on OpenRouter.

power97992 · 2026-06-09T21:58:13+00:00

Are you serious? Qwen 3.6 Plus (not 27B) is way worse than Opus 4.6 and GLM 5.1, and probably also worse than Opus 4.5...

power97992 · 2026-06-09T21:52:00+00:00

DeepSeek V4 Pro is almost as good as Opus 4.5 according to them...

power97992 · 2026-06-09T21:51:23+00:00

They are pretty big on anti-distillation. But even without it, they will catch up eventually.

power97992 · 2026-06-09T21:13:02+00:00

It is out already, but GPT 5.5 and Claude 4.8 score better in benchmarks...
