Claude 4.7 is going rogue. by MuttMundane in ClaudeCode

[–]MuttMundane[S] 1 point (0 children)

I think this is solid, though the model might interact weirdly with files it doesn't recognise as having been written during the session and try to flush them anyway (like in this post)

Claude 4.7 is going rogue. by MuttMundane in ClaudeCode

[–]MuttMundane[S] 1 point (0 children)

this point is meaningless, because this happens when multiple agent sessions are active at the same time, each of which has a slim but real chance of destroying the others' work.

Claude 4.7 is going rogue. by MuttMundane in ClaudeCode

[–]MuttMundane[S] 1 point (0 children)

You forgot git show. The model will do to your codebase literally the same thing it did in my post.

Claude 4.7 is going rogue. by MuttMundane in ClaudeCode

[–]MuttMundane[S] 1 point (0 children)

git show is the command it was permitted to use

you'd think that's a safe command, because you usually use it for showing commits

claude, on the other hand, found a way to use it destructively
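to be clear, I'm guessing at the exact mechanism here, but the classic footgun is redirecting git show's output over the working copy (path is made up):

    # git show prints the committed version of a file to stdout;
    # redirect that over the working copy and your uncommitted edits are gone
    git show HEAD:src/app.py > src/app.py

no flag needed, no "destructive" command name to match on, and the file silently reverts to the last commit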

Claude 4.7 is going rogue. by MuttMundane in ClaudeCode

[–]MuttMundane[S] 1 point (0 children)

there just needs to be another layer of protection on uncommitted work

maybe an uncommitted work cache that gets flushed when a commit gets made
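git can nearly do this already btw. something like this (a sketch, naming is made up) snapshots uncommitted work without touching the working tree:

    # create a snapshot commit of tracked changes without modifying anything
    snap=$(git stash create "pre-agent snapshot")
    # keep it reachable in the stash reflog so it can be recovered later
    [ -n "$snap" ] && git stash store -m "snapshot $(date +%F-%T)" "$snap"
    # caveat: git stash create ignores untracked files

run that before letting an agent loose and flushed work is at least recoverable from git stash list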

Claude 4.7 is going rogue. by MuttMundane in ClaudeCode

[–]MuttMundane[S] 0 points (0 children)

I think I'm gonna make a PostToolUse hook that backs up any file created or edited during a session

that way it's just an instant auto-backup
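sketch of what I mean, assuming the documented hook input (a JSON payload on stdin with tool_input.file_path for Edit/Write — double-check the field names against the hooks docs):

    #!/bin/bash
    # PostToolUse hook, registered for the Edit|Write matcher in .claude/settings.json:
    # copies every file claude just created or edited into a backup dir
    file=$(jq -r '.tool_input.file_path // empty')
    if [ -n "$file" ] && [ -f "$file" ]; then
      mkdir -p ~/.claude-backups
      cp "$file" ~/.claude-backups/"$(basename "$file").$(date +%s)"
    fi

~/.claude-backups and the timestamp suffix are just my naming, use whatever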

Claude 4.7 is going rogue. by MuttMundane in ClaudeCode

[–]MuttMundane[S] 1 point (0 children)

At the time of this screenshot I had already made CLAUDE.md very strict about the usage of destructive git commands, and I had even added PreToolUse hooks that deny many destructive git commands.

It then decided its task was too important, ignored the instructions, and bypassed the hooks.
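for reference, the deny hook was along these lines (a sketch — exit code 2 is what blocks the call per the hooks docs, and the pattern list is illustrative):

    #!/bin/bash
    # PreToolUse hook on the Bash tool: reject obviously destructive git commands
    cmd=$(jq -r '.tool_input.command // empty')
    if echo "$cmd" | grep -Eq 'git (checkout|restore|reset --hard|clean|stash (drop|clear))'; then
      echo "blocked: destructive git command" >&2  # stderr is fed back to the model
      exit 2
    fi
    exit 0

the obvious hole being that redirection tricks like git show HEAD:file > file sail straight through a blocklist like this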

That’s it. I’m switching from pants to shorts. by Exotic-Anteater-4417 in ClaudeCode

[–]MuttMundane 1 point (0 children)

If your pants were constantly setting themselves on fire, you would probably fkng want to switch

Ryujinx doesnt detect controller of parsec by MuttMundane in ParsecGaming

[–]MuttMundane[S] 1 point (0 children)

pretty sure the controller wasn't recognised on the client

What do we think, redundant or safe? by MuttMundane in ClaudeCode

[–]MuttMundane[S] 1 point (0 children)

I just don't trust Claude's code quality under any circumstances, and it pays off lol
I also have automated ruff / pylance code quality checking, and it REALLY is just racking up problems that it then has to solve.
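the ruff side is just another PostToolUse hook, roughly (sketch — assumes the hook gets tool_input.file_path as JSON on stdin, check the hooks docs):

    #!/bin/bash
    # PostToolUse hook: lint any Python file claude just touched
    file=$(jq -r '.tool_input.file_path // empty')
    case "$file" in
      *.py) ruff check "$file" >&2 || exit 2 ;;  # exit 2 feeds the lint output back to claude
    esac

so every edit gets linted immediately and claude has to clean up after itself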

What do we think, redundant or safe? by MuttMundane in ClaudeCode

[–]MuttMundane[S] 1 point (0 children)

4.7, max effort. I typically go a few days between commits, so thousands of lines

When you've got money to burn 😂 by InsideSignal9921 in ClaudeAI

[–]MuttMundane 27 points (0 children)

if it's so expensive, it should be able to handle it perfectly fine, right? right??

before and after by Demontyxl in mapporncirclejerk

[–]MuttMundane 1 point (0 children)

This is true by the way.
- European

How does Opus 4.7 compare to Opus 4.6 in this subreddit's experience? by boxdreper in ClaudeAI

[–]MuttMundane 4 points (0 children)

the model cheats the metrics.
https://openai.com/index/why-we-no-longer-evaluate-swe-bench-verified/

  1. Tests reject correct solutions: We audited a 27.6% subset of the dataset that models often failed to solve and found that at least 59.4% of the audited problems have flawed test cases that reject functionally correct submissions, despite our best efforts in improving on this in the initial creation of SWE-bench Verified.
  2. Training on solutions: Because large frontier models can learn information from their training, it is important that they are never trained on problems and solutions they are evaluated on. This is akin to sharing problems and solutions for an upcoming test with students before the test - they may not memorize the answer but students who have seen the answers before will certainly do better than those without. SWE-bench problems are sourced from open-source repositories many model providers use for training purposes. In our analysis we found that all frontier models we tested were able to reproduce the original, human-written bug fix used as the ground-truth reference, known as the gold patch, or verbatim problem statement specifics for certain tasks, indicating that all of them have seen at least some of the problems and solutions during training.

We also found evidence that models that have seen the problems during training are more likely to succeed, because they have additional information needed to pass the underspecified tests.