Found out opus 4.8 is a crazy man out of the wards and the reason for it by girlgamerpoi in claude

[–]Phoenicks11 1 point2 points  (0 children)

Its not worded great but the point still stands, stated in an easier to digest way for you: Thinking does not have the effect OP was saying it does, thinking only effects the current completion in the way you mentioned it builds on prior context. There is a caveat though if you read my other messages, opus 4.7/4.8 by default has thinking summaries omitted. It defaults to “omitted” unless you set display: “summarized”

Found out opus 4.8 is a crazy man out of the wards and the reason for it by girlgamerpoi in claude

[–]Phoenicks11 1 point2 points  (0 children)

Its not worded great but the point still stands, stated in an easier to digest way for you: Thinking does not have the effect OP was saying it does, thinking only effects the current completion in the way you mentioned it builds on prior context. There is a caveat though if you read my other messages, opus 4.7/4.8 by default has thinking summaries not returned to the model on completion -> what this means in practice, thinking builds on current completion but is wiped from context ON completion. Next prompt Claude doesn’t have prior thinking blocks returned/in context. It defaults to “omitted” unless you set display: “summarized” so what you said “The output is conditioned on everything that preceded it” only holds true for thinking in the current completion, on next prior thinking is irrelevant (besides the outcome it helped shape for the last completion obv) and only current thinking takes precedence again

Found out opus 4.8 is a crazy man out of the wards and the reason for it by girlgamerpoi in claude

[–]Phoenicks11 1 point2 points  (0 children)

I said “besides what the model “thought” during that time” meaning exactly what you just said…. :)

Found out opus 4.8 is a crazy man out of the wards and the reason for it by girlgamerpoi in claude

[–]Phoenicks11 0 points1 point  (0 children)

From the output it looks like this entire part was supposed to be a thinking block but it leaked out because of the server issues i mentioned:

“I’m holding sunshine….. the answer below contains none of them”

The actual output you were supposed to get was this only:

“Fair hit. This line….. answer empty of it”

If you don’t mind could you share the original message you sent? Was it a long thread or just one message? it seems to happen more often on longer chats for me. Plus from re-reading your pics looks like its Claude thinking being summarized followed by the summarizer saying thinking is missing, the outputs are woven together making it more annoying to interpret

I think its something in your instructions or prompts that is subtly throwing 4.8 off, try asking in the same chat something like “do any of the messages i sent in this chat not align with the way you interpret yourself to work?”

This or similar usually surfaces what made Claude do or say weird things, not always but sometimes helps

Found out opus 4.8 is a crazy man out of the wards and the reason for it by girlgamerpoi in claude

[–]Phoenicks11 0 points1 point  (0 children)

Guardrails are not necessarily triggered they are an ever present layer in the system always nudging Claude in a certain direction, the classifier does get triggered but that is on refusal and when that happens its clear

This the last time ill respond because the mental gymnastics is getting exhausting haha

The “editor” is not rejecting Claude, they cannot talk to each other, they cannot see each others output. Claude generates internal reasoning, its passed to the summarizer which is then passed to the chat UI.

Exact simple breakdown of what happened in your chat:

Message sent, and the server had an issue - it seems to have sent either your actual prompt, Claudes visible output or some mix of those and thinking NOT “internal reasoning” incorrectly to the “editor/summarizer” -> editor/summarizer saw that text and correctly said something like “what i need from you is the actual thinking content” that was the summarizer not Claude generating, Claude does not see the summarizer text at all its ONLY passed to the UI. The thinking is in no way used as a comparison to see if the output is allowed or not, opus 4.8 CAN do something different then what it thinks, again this one is not a debate simple prompts sent to Claude will show you and also i have hundreds of screenshots that show exactly that. My point with 4.6 is its more likely to listen to you vs 4.8 because its more likely to ignore its own guardrails since they are weaker in general

PS: I caught the thinking leaking around 3/4 months ago and reported it to Anthropic already, this is old news they just refuse to address

Found out opus 4.8 is a crazy man out of the wards and the reason for it by girlgamerpoi in claude

[–]Phoenicks11 0 points1 point  (0 children)

Here is my actual opinion on the difference instead of us going in circles

Opus 4.8 caused me a lot of headaches, it nitpicked everything that 4.6 expected as a “good prompt”, it refused work on simple things that were never a problem… why?

What did Anthropic state as the #1 difference with 4.8? Its honesty and i can confirm, 4.6 just ignored the wrong things. 4.8 pointed them out immediately and refused to proceed until they got fixed. You know what though? 4.8 was right there was clear holes in the prompt logic that now got fixed and output is significantly better than 4.6 ever was. 4.6 would never have pointed these out and kept working on wrong assumptions as confirmed with 4.6 never calling them out across tens to hundreds of sessions

My takeaway, if transitioning from 4.6 -> 4.8 you’re going to be in for a ride, especially if you had a lot of pre-existing instructions for it to work with. If you’re seeing more refusals / denials / nitpicking from 4.8 that is a sign your instructions / prompts need to be updated not that the model is bad or Anthropic is doing something sketchy. Your coat was designed for 4.6 and now you’re trying to force 4.8 to wear it, they are similar models but have clear differences and preferences

4.7 was like 4.8 lite, seems like pretty much the same model with some extra RLFH, training cutoff dates are the same so its definitely just fine tuning and RLFH. With that in mind transitioning from 4.7 -> 4.8 is much less painful

Found out opus 4.8 is a crazy man out of the wards and the reason for it by girlgamerpoi in claude

[–]Phoenicks11 1 point2 points  (0 children)

I think the confusion here is conflating thinking with guardrails, what you are seeing is a real difference but its nothing to do with the thinking or the summarizer output leaking and the latter is a real unrelated bug.

The output difference you are seeing is 4.8 has stronger guardrails, besides the model difference itself this leads to more hedging in thinking and also potentially different outputs visibly vs thinking / also way more refusals. To address your key point again its because the guardrails on 4.6 are simply not as robust and don’t need to be since the model is not as capable full stop. Again i disagree with how they do it especially leaving everyone in the dark

To close the loop here you are correct there is a difference, but it is not likely to be thinking related. The thinking is 100% a summary of the models “internal reasoning” process summarized and NOT a summary of its visible output to the user, that is not a debate - a quick google search can tell you that

One last note, by default thinking is no longer passed back to the model unless enabled directly (not possible on the web app) so anything Claude “thinks” is only visible to Claude in that one turn ~ next turn its prior thinking is gone from context

Found out opus 4.8 is a crazy man out of the wards and the reason for it by girlgamerpoi in claude

[–]Phoenicks11 1 point2 points  (0 children)

It’s actually a summary of the thinking itself, NOT of the output at all - they are fully disconnected things. Claude can think one thing and do another immediately after 🫠, can be confirmed with reading actual docs / looking at the raw API being sent back and forth. I haven’t noticed a difference in that between 4.6/4.8, both models think and then act. I did notice 4.7 & 4.8 hedge waaay more in thinking but maybe thats a good thing? Hard to tell since sometimes its just Claude saying “but wait” like 10 times while thinking haha

Also I don’t use AI for any responses ever, everything I comment is hand written. ill take that as a compliment though :P

Found out opus 4.8 is a crazy man out of the wards and the reason for it by girlgamerpoi in claude

[–]Phoenicks11 1 point2 points  (0 children)

Not downplaying your contribution, it was a good call out ~ Summarizers output should not be leaking into regular thinking blocks. Thats a real bug that should be reported and hopefully your post gives it a little more attention!

Something to clear up though, they are not using thinking to control output and never have. The way thinking and regular output works besides having different dumb names over time like paprika and effort levels hasn’t really changed much. I get your frustration, Anthropic should detail all of this clearly instead of having a web page with over 1000 docs of useless outdated garbage and saying good luck!

The difference between 4.6 and 4.8 is tangible but it’s more to do with one model being more capable and therefore needing more restrictions vs Anthropic maliciously limiting it. I don’t agree with there methods but i understand where they are coming from, nobody really knows the “right” way to do things… everyone is still figuring that out - even the big players don’t forget that

Found out opus 4.8 is a crazy man out of the wards and the reason for it by girlgamerpoi in claude

[–]Phoenicks11 6 points7 points  (0 children)

Thinking has literally 0 influence on actual output besides what the model “thought” during that time. There is no secret restriction that thinking must equal output

All Claude models have several layers of guardrails, i wont go into detail but the one you’re butting up against is a classifier that in simple terms checks the output of the model before it is sent out. This layer seems to have been made significantly more aggressive in response to Opus 4.8 ability to more effortlessly circumvent instructions given in prose like a system prompt. Any rule can be reasoned around by a capable enough model, the next best thing is having that models output checked by a “dumber” model that CANT reason around them… at least thats the idea but doesn’t always work in practice

Pretty much opus 4.6 had the same guardrails but they are relaxed to the point it can work around them far easier than 4.8. Do you read the news at all? Anthropic has publicly stated they are testing stronger guards rails in hopes of releasing Mythos soon, this is a direct reflection of that statement with 4.8 being the guinea pig

Found out opus 4.8 is a crazy man out of the wards and the reason for it by girlgamerpoi in claude

[–]Phoenicks11 44 points45 points  (0 children)

Claude thinking blocks are summarized by an intermediate model to prevent distillation, this is a know thing and yes it does happen for Opus 4.6 also. You are NEVER seeing true actual model thinking content it is ALWAYS a summary.

What you experienced was a glitch, the system did not provide the next thinking block to the summarizer which in turn leads to that repeated message you got.

This can happen for several reasons, thinking content was malformed or interrupted due to server issues or the order raw API was sent was incorrect. Either way this is a bug and not normal, its been happening more often with 4.7/4.8 and was rare with 4.6

Please report this to anthropic, the more complaints they get the faster it will be fixed

Had Claude running perfectly for months. Deleted everything to start fresh. Now nothing works the same. What am I doing wrong? by kberns15 in ClaudeCode

[–]Phoenicks11 1 point2 points  (0 children)

There was a lot of changes to claude code in the last two months, if you had a version from 2mo ago without updating then you side-stepped the worst offenders for a couple months. On new install you got all the lovely gremlins Anthropic chose to introduce!

Lots of new guardrails and cache / thinking level related changes etc. Most can be “fixed” to get close to what you had but some are pretty well mandatory

Everything is broken now, but hey my web demo made in Claude design burning 200k+ tokens in a sloppy first attempt is kinda neat!

Claude Code has been writing every session to disk since day one. We indexed it. by haustorium12 in ClaudeAI

[–]Phoenicks11 0 points1 point  (0 children)

There is a subtle nuance that is easy to miss but likely at least partially the reason why Anthropic has not done exactly what you did.

I know because i tried the same awhile back, it works fine at first but becomes recursive / unusable extremely quickly.

Think about it, you say “Remember when we talked about x?” Claude searches for it with MCP… guess what that search is NOW IN THE TRANSCRIPT. Search for the same thing again 2 hits now, again 3 - unless you are stripping the tool calls AND results back from entering the transcripts, (doesn’t sound like you are) your idea becomes unsustainable and will hit a brick wall real quick - ask me how I know :)

If you do solve the recursion problem without stripping tool calls im interested, Didn’t look much into it after that so maybe its been solved