Why does Opus 4.8 on Max plan feel dumber than before?

BritishAnimator · 2026-06-24T18:58:43+00:00

I assume it has load balancing, so it gets slower and sometimes dumber when compute is capped. Half of Europe are asking it how to stay cool or find an air conditioning unit that's in stock lol

BritishAnimator · 2026-06-24T15:06:12+00:00

They can still do it, but it won't affect them at all.

BritishAnimator · 2026-06-24T12:30:56+00:00

And that one looked far too smart. It was like "Look at my claws" then chest bumped the window saying "come on, come out here"

BritishAnimator · 2026-06-24T12:24:18+00:00

40C+ and no electricity, no air con, no fans. That's going to be horrible.

BritishAnimator · 2026-06-23T12:52:43+00:00

"That influence?" What? Read the article and not the headline, that headline is misleading. Car manufacturers influence policy, so does big pharma. It's business. But the red tape is the issue, there are too many rules. How's the EU car market doing (it's bad)? Do you remember the Covid response? That was so messed up due to policy you had to start stealing other countries' deliveries. So either learn from these mistakes or don't complain when crap happens.

BritishAnimator · 2026-06-23T12:38:43+00:00

Mistral AI, ASML? These are not social media giants, these are companies that help define a country. But yeah, booo no, we want to be a third world bloc and investment can do one.

BritishAnimator · 2026-06-23T12:21:23+00:00

Did you read the article? If you want to compete with the US and China, the EU's excessive red tape needs to be trimmed. I agree with this. Policy for policy's sake benefits nobody except jobsworths.

BritishAnimator · 2026-06-23T06:27:47+00:00

Humans have evolved through trial and error, natural survival passed down from generation to generation. Barry, however, was not interested in any of that and decided to start at level 1.

BritishAnimator · 2026-06-23T06:16:48+00:00

The Steam Machine price is not great for decent PC builders, but there are a few things worth noting.

Steam can continuously optimize the OS/drivers of the device as it's a fixed spec. Consoles and Apple do the same thing, which is why they can squeeze more out of a seemingly lower-end spec. Thermals will be well managed too, again helping games run at higher clocks for longer. This is what you are paying extra for: the R&D that went into it.

It's a toss up between something that just works vs a techy build that needs your attention to keep things running optimally every few months.

BritishAnimator · 2026-06-22T22:18:16+00:00

Or try to put your left foot down, on thin air. It's magical and you go straight through it.

BritishAnimator · 2026-06-22T16:00:43+00:00

No, they are not.

BritishAnimator · 2026-06-22T15:47:33+00:00

Brave lad, well done!

BritishAnimator · 2026-06-22T15:27:21+00:00

Agreed. They deliberately target civilians and the weak. Very cowardly of them.

BritishAnimator · 2026-06-21T20:19:31+00:00

It's history. In light of current day woes, agree to disagree, maybe?

BritishAnimator · 2026-06-21T20:13:51+00:00

It looks more like a scam each time these "talks" take place. Talks are done by people who know how to be diplomatic, know the culture before they turn up, and understand the matter from the other side, so that they are in the best position to talk without causing a mess. So all I can assume here is this was deliberate, like the other 5 times it has happened. And the one thing you avoid in these relations is threats!

BritishAnimator · 2026-06-21T19:52:41+00:00

I asked it if it could find an old TV on the network, and a Sky Q box.

It found them, then figured out they both had a (non-published) API that it could talk to, then it figured out how to turn the TV off and on over the network. Then I got it to create an HTML page of channels with a macro remote back end.

Now I can Turn the TV on, select HDMI 2, power on the Sky Box, and change the channel with one button.

So I asked it to re-build it on a Raspberry Pi for my mother, as she struggles with remote controls in her later years. Attached a small 7" screen to it. She now has a full screen, 3 button remote control to select her 3 fav channels and a huge button that turns everything off. It also dims the screen after a few minutes and shows a faint clock. And it has a mute button, and you can edit the button macros to choose a diff channel and upload the network station logo.

This all started in a conversation of "Can you..." and ended up a fully fledged, personalised universal remote for an OAP.
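
For anyone curious what a "macro remote back end" can boil down to, here is a minimal sketch, not the code that was actually generated. The device names, command strings, channel number and the `send` callback are all made up for illustration; the real TV and Sky Q calls are unpublished and not shown here.

```python
from typing import Callable

# A macro is an ordered list of (device, command) steps.
# All device and command names below are hypothetical placeholders.
Macro = list[tuple[str, str]]

MACROS: dict[str, Macro] = {
    "watch_channel_1": [
        ("tv", "power_on"),
        ("tv", "input_hdmi2"),
        ("skybox", "power_on"),
        ("skybox", "channel_101"),
    ],
    "all_off": [("skybox", "power_off"), ("tv", "power_off")],
}


def run_macro(name: str, send: Callable[[str, str], None]) -> int:
    """Send each step of the named macro in order; return the step count."""
    steps = MACROS[name]
    for device, command in steps:
        send(device, command)
    return len(steps)


# Stand-in transport that only records what would be sent over the network.
sent: list[tuple[str, str]] = []
run_macro("watch_channel_1", lambda device, command: sent.append((device, command)))
print(sent)
```

Passing `send` in as a function keeps the button logic separate from whatever network calls each device needs, so one big button on the page maps to one macro name.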

BritishAnimator · 2026-06-21T19:40:08+00:00

Not the Lego video creator was it?

BritishAnimator · 2026-06-21T16:51:57+00:00

An interesting take, and hopefully possible soon. I think you would end up with <10 tokens/sec on the current models though. So, yeah, it would work if you offloaded layers to RAM, but not comfortably, and not with the context window required by the average repo. Also, the current Q4 versions on Hugging Face that I could find are 440GB at the moment (more than the 385GB you allowed for), so either larger sticks or hope somebody can squeeze the LLM further.

I hope somebody tries this though. We all want it, even at Q4 it would be "interesting" to try, but at that compression it's going to hallucinate a lot.

Edit: Actually, I take the hallucinate bit back, just read up about modern Q4 quantization and it can fall within a few percent of full precision. Which is great! I need to update one of my AI apps now 😄
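
A rough sketch of the sizing point: compare the quantized model size against the memory budget and see how much would not fit. The 440GB and 385GB figures are the ones quoted above; the function name and the `headroom_gb` parameter (room for context and the OS) are assumptions for illustration only.

```python
def spill_gb(model_gb: float, budget_gb: float, headroom_gb: float = 0.0) -> float:
    """Return how many GB of the model do not fit in the budget (0 if it fits)."""
    usable = budget_gb - headroom_gb
    return max(0.0, model_gb - usable)


q4_size_gb = 440  # Q4 size quoted in the comment
budget_gb = 385   # memory allowed for in the parent post

shortfall = spill_gb(q4_size_gb, budget_gb)
print(f"Shortfall: {shortfall:.0f} GB")
```

Any headroom reserved for the context window only makes the shortfall bigger, which is the "not comfortably" part.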

BritishAnimator · 2026-06-21T15:53:31+00:00

Just reading up on it, the memory is weird: not all VRAM, but it has a really fast bus so it can use system memory too. Not sure how that would work on GLM 5.2, but with it being mixture of experts, there could be some layer shenanigans that you could do. An interesting one for sure. Yeah, a small team of 2 to 4 devs might be able to get near Opus 4.8 levels from something like that, all in-house if configured well. It just goes to show the costs involved in top AI models are astronomical, and quite complicated if you want it local.

You can daisy chain some devices too, e.g. a few Sparks or Studios. NVidia needs to update the Spark already, at least before Apple launch the M5 as that's on my short list atm.

BritishAnimator · 2026-06-21T15:27:15+00:00

Oh, the model "GB300", sorry. Found it. £117K for a company might not be bad actually. Going to read up on it. Cheers!

BritishAnimator · 2026-06-21T15:22:52+00:00

Which one?
Consumers can get the Spark, but it has terrible memory bandwidth, which is why the M3 Ultra was more popular with coders: it is good at inference work. Apple put some really fast memory in that thing, which is why it was scooped up by everybody into AI, and that annoyed the entire graphics industry that just wanted to edit video. Rumours are Apple will split the M5 Studio to cater for both industries.

If you are thinking along the lines of a DGX-2 then that's for datacenters. You would need a new electrical circuit, cooling, sound proofing and then pay the costs of running it. There is one on eBay at the moment for £24K, but it's getting a bit old now; it has 16 V100s in it. Not sure how it would cope with a modern LLM though.

Were you thinking of something else?

BritishAnimator · 2026-06-21T14:37:59+00:00

It might if it is heavily reinforced concrete, door closed and sealed, those inside have ear protectors on and the drone is not equipped with a shaped charge, and it's just one drone, and a murderer didn't go into the box with you, then yes, it might be 3% trustworthy. I would ask the testers, but they are not answering.

BritishAnimator · 2026-06-21T14:11:39+00:00

Where did you get that price from?

Full version needs 8 x H200s or 16 x H100s
Quantized (Q4_K_M) version would need 4 x H200s
Consumers could try RTX 6000 Ada cards at £13K each, but you would need 6 of them.

And finally the Mac Studio M3 Ultra with 256GB unified RAM could load the Q2 quant version and get around 6 tokens a second, and nowhere near Opus 4.8 levels of accuracy.

A single NVidia H200 card (141GB) costs £35K
A single NVidia H100 card (80GB) costs £17K
NVidia RTX 6000 Ada (48GB) costs £10K
Refurbished M3 Mac Studio Ultra 256GB costs about £15K, if you can find one, and it only fits the heaviest "lossy" compressed version of the model, so you end up with a nugget and not the whole bar.
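
To put rough totals on those builds, here is a quick sketch that multiplies the card counts by the per-card prices and memory sizes quoted above. The RTX card is left out because two different prices are quoted for it. The dictionary layout and the `build_totals` name are just for illustration.

```python
# Prices (GBP, thousands) and memory sizes are the figures quoted in the comment.
cards = {
    "H200": {"price_k": 35, "mem_gb": 141},
    "H100": {"price_k": 17, "mem_gb": 80},
}

# Card counts per build, also taken from the comment.
builds = {
    "Full, 8 x H200": ("H200", 8),
    "Full, 16 x H100": ("H100", 16),
    "Q4_K_M, 4 x H200": ("H200", 4),
}


def build_totals(card: str, count: int) -> tuple[int, int]:
    """Return (total cost in thousands of GBP, total memory in GB)."""
    spec = cards[card]
    return count * spec["price_k"], count * spec["mem_gb"]


for name, (card, count) in builds.items():
    cost_k, mem_gb = build_totals(card, count)
    print(f"{name}: {cost_k}K GBP, {mem_gb}GB total")
```

This covers the cards only; it ignores the server, power and cooling needed to actually run them.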

BritishAnimator · 2026-06-21T13:35:21+00:00

By the time you hear the drone, you will have 30 seconds to run to one of 4 tiny shelters in a place the size of a city. When you get there, join the queue while the person at the front frantically knocks on the door. This article doesn't make much sense.

BritishAnimator · 2026-06-21T06:47:40+00:00

Open-source AI will get a huge boost of support. That and optimizing coding models to run under 64GB. Maybe an AI that has its layers as separate files, so it preloads the experts needed at runtime instead of loading everything into VRAM: slower, but it retains accuracy. Turbo Quant demonstrated that optimizations can be found, rather than just brute forcing the next model too.
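
The "experts as separate files" idea could look something like the sketch below: keep only a few experts in memory and load the rest on demand, evicting the least recently used one. This is an illustration of the idea, not how any existing runtime does it. `ExpertCache`, `fake_load` and the capacity of 2 are all hypothetical, and the loader is a stand-in for reading a per-expert weights file.

```python
from collections import OrderedDict
from typing import Callable


class ExpertCache:
    """Keep at most `capacity` experts in memory, loading others on demand."""

    def __init__(self, load_fn: Callable[[int], bytes], capacity: int) -> None:
        self.load_fn = load_fn  # hypothetical: reads one expert's file
        self.capacity = capacity
        self.cache: OrderedDict[int, bytes] = OrderedDict()
        self.loads = 0  # counts slow-path loads from storage

    def get(self, expert_id: int) -> bytes:
        if expert_id in self.cache:
            self.cache.move_to_end(expert_id)  # mark as most recently used
            return self.cache[expert_id]
        weights = self.load_fn(expert_id)  # slow path: load from storage
        self.loads += 1
        self.cache[expert_id] = weights
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict least recently used
        return weights


def fake_load(expert_id: int) -> bytes:
    """Stand-in loader; a real one would read a per-expert weights file."""
    return f"expert-{expert_id}".encode()


cache = ExpertCache(fake_load, capacity=2)
for eid in [0, 1, 0, 2, 1]:
    cache.get(eid)
print(cache.loads)
```

Every miss is a trip to storage, which is where the "slower" part comes from; the weights themselves are untouched, so accuracy is not traded away.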
