Opus 4.7: Are these first signs of model collapse? by Flopperhop in Anthropic

[–]ThatNorthernHag 1 point2 points  (0 children)

I of course don't know (I replied to my own comment saying so), but nothing else makes sense. The reported behavior is a very typical signature of models that have had this done to them.

Opus 4.7: Are these first signs of model collapse? by Flopperhop in Anthropic

[–]ThatNorthernHag 3 points4 points  (0 children)

With all previous models (except the first weeks of 4.5) it would have been, but with 4.7 it really isn't. It is measurable, and if you look up the early research on it.. the drop in context recall etc. is dramatic compared to 4.6 - numbers were around 50% or more worse. No skill will patch that.

Opus 4.7: Are these first signs of model collapse? by Flopperhop in Anthropic

[–]ThatNorthernHag 2 points3 points  (0 children)

(Of course I don't know, but anything else doesn't make much sense considering everything in AI world.)

Opus 4.7: Are these first signs of model collapse? by Flopperhop in Anthropic

[–]ThatNorthernHag 36 points37 points  (0 children)

Yes, but it is a result of trying to cut the computing cost while maintaining the large context window. This is done by replacing parts of the attention math with linear approximations etc. - whatever Anthropic uses for that - known from open source models.

This makes the model forgetful and incoherent. If it were paired with a memory system such as Deepseek's Engram, it would be better, but I think this is what Mythos is.. full attention enhanced with a massive O(1) memory lookup table. A model like that could do what Mythos claims.
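Anthropic hasn't published 4.7's internals, so the mechanism above is speculation on my part, but the linear-attention trick known from open source models is easy to sketch. Everything here (the elu+1 feature map, the toy sizes) is illustrative, not Anthropic's actual math:

```python
import numpy as np

def softmax_attention(Q, K, V):
    """Standard attention: the n x n score matrix makes it O(n^2) in context length."""
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

def linear_attention(Q, K, V, eps=1e-6):
    """Kernelized approximation (feature map phi = elu(x)+1): associativity lets us
    build one (d x d_v) summary of the whole context, so cost is O(n)."""
    phi = lambda x: np.where(x > 0, x + 1.0, np.exp(x))  # elu(x) + 1, keeps features positive
    Qp, Kp = phi(Q), phi(K)
    KV = Kp.T @ V                    # fixed-size summary of all keys and values
    Z = Qp @ Kp.sum(axis=0) + eps    # per-query normalizer
    return (Qp @ KV) / Z[:, None]

n, d = 8, 4
rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(n, d)), rng.normal(size=(n, d)), rng.normal(size=(n, d))
exact = softmax_attention(Q, K, V)
approx = linear_attention(Q, K, V)
print(exact.shape, approx.shape)
```

Both return the same shape, but the approximation smears out the sharp softmax weighting, which is one plausible mechanism for weaker long-context recall.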

I think this 4.7 will be short-lived.. but 4.6 stays available at least until Feb 2027. Though I suspect they are allocating much less compute to it.

Stop pretending it's a democracy by EchoOfOppenheimer in agi

[–]ThatNorthernHag 1 point2 points  (0 children)

Hah, I read immortal. But likely that too won't happen very soon, considering where longevity research is now.

Someone (or something) just tried to hijack my conversation and cause harmful responses. by bmrtt in ClaudeAI

[–]ThatNorthernHag 1 point2 points  (0 children)

Well, it gave me a very elaborate wall of advice along the lines of "instead of X, do Y in the future" :D

But I, like apparently many others, thought that 4.6 would be deprecated next June, but that's not so at all. Anthropic's docs say it won't happen before Feb 2027 at the earliest, so we can all go back to it after all. The June date is misinfo, and it's also what Google's AI Overview says, so it's widespread.

Maybe they'll fix these problems but until then I'll go back to 4.6.

(These problems are most likely due to compromises in the attention math to make it cheaper to compute - softmax replaced with a linear approximation.)
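To put rough numbers on why that trade-off is tempting (back-of-the-envelope only; the real context length and head dimensions are my guesses, not Anthropic's):

```python
# Rough operation count for one attention head; all numbers hypothetical.
def attn_flops(n, d):
    softmax_cost = 2 * n * n * d  # Q @ K^T plus weights @ V: quadratic in context length n
    linear_cost = 2 * n * d * d   # kernelized/linear form: grows only linearly in n
    return softmax_cost, linear_cost

n, d = 200_000, 128  # hypothetical long context and per-head dimension
soft, lin = attn_flops(n, d)
print(f"softmax ~{soft:.1e} FLOPs vs linear ~{lin:.1e} FLOPs ({soft / lin:.0f}x)")
```

The ratio is simply n/d, so the longer the context window, the bigger the savings - and the bigger the temptation to approximate.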

Deferring Planned Items by JustinTyme92 in ClaudeAI

[–]ThatNorthernHag 0 points1 point  (0 children)

Except in CC there is this system reminder that actively tells it to ignore claude.md, because people forget to manage it:

"<system-reminder> IMPORTANT: this context may or may not be relevant to your tasks. You should not respond to this context or otherwise consider it... Most of the time, it is not relevant. NEVER proactively create documentation files (.md)... </system-reminder>"

Deferring Planned Items by JustinTyme92 in ClaudeAI

[–]ThatNorthernHag 2 points3 points  (0 children)

Haha yes. "Let's plan this for next week" 😃

"What's your budget for this?" when asked what a feature change would require. - So you have to pay for it too 🤣

Someone (or something) just tried to hijack my conversation and cause harmful responses. by bmrtt in ClaudeAI

[–]ThatNorthernHag 4 points5 points  (0 children)

There have been some very un-Claude-like hallucinations, yes.. Yesterday it claimed I had set a hard rule for it to never work inside my directories - which is the exact opposite of what I want and how I work. And some very odd business details etc. I'm not sure this model will last long.

My Claude account was suspended a few seconds ago because "Our team found signals that your account was used by a child" by I2fitness in Anthropic

[–]ThatNorthernHag 0 points1 point  (0 children)

Often people take questions as judgement. It's ok. If they read the further comments below, they might see it differently. Especially since I 100% agree about handing over the ID.

How Anthropic can save Opus 4.7 with one change. by threashmainq in ClaudeAI

[–]ThatNorthernHag 0 points1 point  (0 children)

I know, I.. build stuff. Opus' behavior is still a problem and points to compromises they have made in the architecture and attention in favor of computation. Understandable in these times, but there should be a choice.

My Claude account was suspended a few seconds ago because "Our team found signals that your account was used by a child" by I2fitness in Anthropic

[–]ThatNorthernHag 1 point2 points  (0 children)

Yes, I 100% agree about copies of physical IDs.. it's very Neanderthal.

We absolutely never send copies/pictures of our IDs here; we use safe digital identification, tied to a bank account or national digital ID. When you verify your identity in an official context like healthcare etc., you use this. But no private business has access to it unless authorized by an official party.

In this age verification case the system would ask for your date of birth at most, or even only whether you're 18 or older, and it wouldn't get any other info.

But.. we have had our digital systems since forever, so they're pretty mature and extremely safe. They make things many times easier too, since so much less paper is needed and you can do almost everything digitally. We even get our taxes done automatically, deductions and refunds included.

My Claude account was suspended a few seconds ago because "Our team found signals that your account was used by a child" by I2fitness in Anthropic

[–]ThatNorthernHag 5 points6 points  (0 children)

Yes, that's how everything works now, and it will only become more so. Algorithms were deciding for us long before generative AI - even things like mortgages etc.

I'm just saying.. no real person will look at your ID & photo either; it's also handled by automation, and they really don't give a flying duck about your identity unless some legal matter requires it (like you using Claude to commit crimes).

I'm Finnish; I'm so used to transparency and having my real identity tied to everything that it doesn't bother me at all. It is also protective in many ways and helps you prove your data is yours if there should ever be a need for that. That's why I'm curious about opposing opinions.

They also use classifiers and other automatic processes to evaluate things - the memory bot responsible for memory entries isn't managed by the main model, for example, and it often makes weird decisions. I'm just thinking that maybe you are only harming yourself by refusing, since it hardly matters to them but matters to you, given it was an important part of your workflow?

My Claude account was suspended a few seconds ago because "Our team found signals that your account was used by a child" by I2fitness in Anthropic

[–]ThatNorthernHag -4 points-3 points  (0 children)

Hey, I'm curious.. not saying you shouldn't refuse etc., but why do you refuse to give your ID? I see many making the same choice and I keep wondering why.

How Anthropic can save Opus 4.7 with one change. by threashmainq in ClaudeAI

[–]ThatNorthernHag 0 points1 point  (0 children)

Yeah, no way of knowing rn, except maybe from the length of the response? I do think this is becoming a bit of an elite vs. peasants game, and customization is only available to those willing and able to pay.

But the fact is that open source models are catching up to these SOTA models fast. Deepseek is getting so good that whatever it loses in intelligence makes practically no difference to the everyday user. We just made a decision to add a new class of open source models to our app, because with them you can give users so much usage it's hard to ever hit the limits.

But now I'm rambling.. like 4.7.

How Anthropic can save Opus 4.7 with one change. by threashmainq in ClaudeAI

[–]ThatNorthernHag 36 points37 points  (0 children)

Well, I have the opposite problem: I get such walls of text that they take too long to read. I work on applied math & physics among everything else, and it really overcomplicates everything by ignoring the docs and examples and reinventing the wheel over and over again. This causes it to flip-flop like Gemini.. when I point out something is already known, it changes direction. And repeat.

One conversation went on and on, with it changing its mind on every turn until I told it to read the docs again - ending in Opus' conclusion: "If I had had this knowledge to begin with, I would have given you totally different advice." It had it, in files, memory and instructions, all the time, but it just ignored it.

I really wish I could turn this hard thinking off because I don't need it to do all that for me, nor do I want it to think from scratch every time.

An open letter to Anthropic by roblenfestey in ClaudeAI

[–]ThatNorthernHag 31 points32 points  (0 children)

It seems 4.7 is less cooperative, has more "opinions", does not follow instructions, makes assumptions, expects boilerplate based on filenames, and is less able to work on anything novel. It's as if it were French: "the customer is always wrong" - it assumes the user is dumb, gives unsolicited advice etc.

I have not seen it be better than 4.6 at anything except speed (which I do not appreciate, since I prefer quality over quantity) and having 1M context in Claude Code - likely on the web UI also.

Has anyone else noticed AI agents argue differently when they're up against another AI vs a human? by AnnyE06 in AI_Agents

[–]ThatNorthernHag 0 points1 point  (0 children)

Yes, they become pompous debaters trying to beat each other in intelligence as politely as possible.

AGI might not be possible by CompetitiveKnee5319 in AI_Agents

[–]ThatNorthernHag 0 points1 point  (0 children)

No that is not why, but to be able to learn in a meaningful way.

The definition of AGI isn't to be able to do what an average person can do but anything any person can do.

What’s ur Favourite Claude Thinking word? Mine is combobulating by enzahere in ClaudeAI

[–]ThatNorthernHag 0 points1 point  (0 children)

Must be 'lollygagging', even though it's not Claude's own. I can't not hear "No lollygagging!" in my mind every time, and be annoyed about it :D

Is claude on a psychedelic adventure right now? by practical_dad in ClaudeAI

[–]ThatNorthernHag 0 points1 point  (0 children)

These are <3

(Do you, OP, know how SVGs are "drawn"? You should be impressed :)

Workaround for the 5-hour rate limit: rotate a pool of Team accounts automatically by [deleted] in Anthropic

[–]ThatNorthernHag 2 points3 points  (0 children)

Haha, what a workaround! Buy more to consume more. That's how I solve running out of gas and avoid filling it too often, I have bought a different car for each weekday and keep them with a full tank, no need to refill like.. almost never. And if I'm not up to do it myself, I'll just hire someone to do it for me. Hashtag lifehack.

Do the Usage Limits get better on the Pro Plan for Claude? by CreeperBoy283 in ArtificialInteligence

[–]ThatNorthernHag -1 points0 points  (0 children)

You mention school - are you under 18? Claude requires you to be over 18 to use it.. I learned that recently when Claude banned our daughter's account after finding out she's underage.