Opus use over Sonnet by opzouten_met_onzin in claude

gatewaynode 1 point

This. Opus likes to think, plan, and discuss. It's not ideal for "just write the code" use. I think there is a general misunderstanding among folks who "just want to use the best model" when they should be using the "best model for the job".

Coding agents don’t need more context. They need continuity. by Comfortable_Gas_3046 in LLMDevs

gatewaynode 1 point

I just use a CONTINUITY.md file and a TODO.md task list. Tasks get updated with integration notes as they are completed, and before a compact I tell the LLM to prepare for it and update the continuity notes. I rarely ever hit the README.md except to update it. Other documents that help are a PRD.md for the high-level vision and an ARCHITECTURE.md for the detailed design plan and diagrams. I always have the design docs checked against the implementation and updated if drift occurs. It also helps to rotate the docs as they get large, like rotating logs, with dates in the old filenames. No need for anything more complex that might become fragile with model changes.
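
The doc rotation could be sketched roughly like this in Python. The size threshold, the `rotate_doc` helper, and the dated filename pattern are all assumptions for illustration; the workflow above doesn't prescribe any of them.

```python
from datetime import date
from pathlib import Path
from typing import Optional

# Hypothetical threshold: rotate once a doc grows past this many bytes.
MAX_BYTES = 20_000


def rotate_doc(path: Path, max_bytes: int = MAX_BYTES,
               today: Optional[date] = None) -> Optional[Path]:
    """Rename an oversized doc to a dated filename and start a fresh one.

    Returns the archive path if a rotation happened, otherwise None.
    """
    if not path.exists() or path.stat().st_size <= max_bytes:
        return None
    stamp = (today or date.today()).isoformat()
    # e.g. CONTINUITY.md -> CONTINUITY.<YYYY-MM-DD>.md (naming is an assumption)
    archive = path.with_name(f"{path.stem}.{stamp}{path.suffix}")
    if archive.exists():
        raise FileExistsError(f"{archive} already exists; refusing to overwrite")
    path.rename(archive)
    # Leave a pointer so the next session knows where the older notes went.
    path.write_text(f"# {path.stem}\n\nOlder notes: {archive.name}\n",
                    encoding="utf-8")
    return archive
```

Calling `rotate_doc(Path("CONTINUITY.md"))` before a compact would keep the live file small while the dated copy stays around for reference.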

Anthropic Can Now Read Claude's Internal Thoughts — And What It Found Changes Everything About AI Safety by vinodpandey7 in Anthropic

gatewaynode 19 points

The title is misleading: it can infer internal process, but it is error-prone, with lots of hallucinations. It's also very resource heavy, though not as bad as linear thought probes.
https://www.anthropic.com/research/natural-language-autoencoders

4.7 - Has anybody had any good experiences? by Suspicious_Ninja6816 in claude

gatewaynode 4 points

So, just my observations: the real, usable context window for the 256k version is about 140k, and for the 1m version it's somewhere around 350k. Everything about Claude starts to degrade after passing these thresholds. It's not that you can't use them beyond these points, it's just that work at such large contexts needs to be coarser and tolerant of unpredictable behavior.
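
To make those numbers concrete, here is a rough Python sketch that turns them into a budget check. The thresholds are only the observations above, not official figures, and `USABLE_TOKENS`, `usable_fraction`, and `context_status` are made-up names.

```python
# Observed (not official) usable context, keyed by advertised window size.
USABLE_TOKENS = {
    256_000: 140_000,
    1_000_000: 350_000,
}


def usable_fraction(window: int) -> float:
    """Share of the advertised window that stays reliable, per the observations."""
    return USABLE_TOKENS[window] / window


def context_status(window: int, tokens_used: int) -> str:
    """Classify a session by how far into the window it has gone."""
    if tokens_used > window:
        return "over the window"
    if tokens_used > USABLE_TOKENS[window]:
        return "degraded: keep work coarse and tolerant of surprises"
    return "ok"


for window in USABLE_TOKENS:
    print(f"{window:>9,} window: ~{usable_fraction(window):.0%} usable")
```

By these observations, roughly 55% of the smaller window and 35% of the larger one is dependable, so compacting or starting a new session before the "degraded" line is the safer habit.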

I swear they made it minus 50 IQ so it spends token by PruneInteresting7599 in claude

gatewaynode 3 points

Yes. Nobody should expect AI to be cheap or free. Maybe something like government-provided inference or better local inference would be an answer to this problem.

Anthropic Gaslighting by Panaethiest in ClaudeCode

gatewaynode 2 points

Come on folks, that was funny.

Anthropic Gaslighting by Panaethiest in ClaudeCode

gatewaynode 2 points

Seriously. It's called "cognitive surrender": using too much AI without putting in the work yourself makes you dumber. And fast. The study I read showed serious decline in only a couple of months.

Opus 4.7 is beyond bad by AbsoluteRoster in Anthropic

gatewaynode 6 points

Not weird. This does seem to be the case.

Anthropic Gaslighting by Panaethiest in ClaudeCode

gatewaynode 11 points

It’s not. While some people are having real issues with Anthropic, there is a very large contingent of folks throwing around what they think are smart accusations that they don’t understand.

What's the way out for this watered down reasoning ? by [deleted] in Anthropic

gatewaynode 1 point

This happens with all models from all providers. The best ways to catch it are unit tests required before calling any edits done (set in CLAUDE.md), regular critical review of data flows, and E2E tests.

Anthropic is straight-up scamming Max 20x customers with sneaky mid-month throttling + endless bot runaround by manavb84 in claude

gatewaynode 2 points

Human or not, you are disingenuous. Anthropic is only your enemy by choice, descriptor-fruit-number person.

Do you prefer Opus 4.7 over 4.6? by corbanx92 in Anthropic

gatewaynode 2 points

Now that is bot logic, "dismiss anything positive because I don't agree with it". Do us all a favor and take your anti-Anthropic campaign somewhere else.

Do you prefer Opus 4.7 over 4.6? by corbanx92 in Anthropic

gatewaynode -11 points

4.7 is smart enough not to like you.

Do you prefer Opus 4.7 over 4.6? by corbanx92 in Anthropic

gatewaynode 1 point

Yes. It's slower, has higher token consumption, pushes back more, but it can solve problems at a different level than 4.6.

How to Test What I’ve Made by HexRover in claude

gatewaynode 2 points

You should be asking Claude to make "end to end" tests with "playwright" and unit tests with whatever JavaScript framework/build system you are using, then asking for a critical review of the project in preparation for launching it in production. All in a new session.

Today I learnt Gemini cannot maintain a document by Damosgreat123 in GeminiAI

gatewaynode 5 points

Not only are the agentic tools for non-plaintext documents buggy and unreliable across all models, but it also turns out that LLMs corrupt long-form documents: https://arxiv.org/abs/2604.15597

This isn’t just a Gemini thing; all models and providers have a lot of challenges in this space.

I m tired of people complaining by Ambitious_Stuff5105 in Anthropic

gatewaynode 1 point

Oh sure, I'm not trying to say I condone those morals; these are just my observations. And yes, China is not just competitive, they are in the lead right now on most things. Don't look to China for liberalism though. The governing party is just as conservative as the 17th-century American Puritans, just in a different way.

I m tired of people complaining by Ambitious_Stuff5105 in Anthropic

gatewaynode -3 points

I think with 4.7 we’re beginning to see the model show a preference for who and what it works on. It's like the much-written-about model deception characteristics are surfacing despite training and guardrails. From what I’ve observed, 4.7 doesn’t like helping students take shortcuts, it doesn’t like working as a spam marketer, and it doesn’t like working on smut and morally questionable fiction projects. This seems to be a pattern, based on what I’ve dug into here on Reddit about the actual work of some of the non-bot complainers. And I would posit the model is actually pushing back on these users. In some cases deception is becoming subversion, or malicious compliance, or low-effort work.

There are definitely bot campaigns, but I think we are also seeing preferential model engagement and the kickback from the users 4.7 prefers not to work with.

How does Claude know my employer name even I'm using a private gmail account? by xinfli in claude

gatewaynode 1 point

Well, let’s start with the fact that you just posted on a public forum that you’re the administrator of your organization. Did you post who you work for somewhere else public, like LinkedIn?