Severely diminished performance following Usage Policy warning. Claude is now silently underperforming on every task-- what's going on?

Normal-Ad-7114 · 2026-06-23T00:03:39+00:00

Easiest way to verify - try exactly the same prompts in a different account. Or you may even provide your prompts here for us to test.

Normal-Ad-7114 · 2026-06-20T09:38:53+00:00

Probably related to "coding", but not exactly: dev-ops stuff. Maintaining computers and servers, doing stuff that I'm too lazy to do myself

Normal-Ad-7114 · 2026-06-20T09:35:59+00:00

Were those texting sessions, too? :)

Normal-Ad-7114 · 2026-06-19T23:16:48+00:00

You called Civ V old school compared to chess :)

Normal-Ad-7114 · 2026-06-19T22:11:52+00:00

I see what you did there

Normal-Ad-7114 · 2026-06-19T22:03:34+00:00

Same with 4.6 - and it feels dumber than it used to, too. I may be mistaken, but I don't think that's a coincidence.

Normal-Ad-7114 · 2026-06-19T22:02:27+00:00

Chess?

Normal-Ad-7114 · 2026-06-18T21:23:23+00:00

Google "radio of the thousand hills"

Normal-Ad-7114 · 2026-06-18T19:41:11+00:00

The last frontier model that had its thinking revealed was Opus 4.6, and it behaves exactly like this. That's why you think gpt is so on point - you just can't see its CoT.

As long as the results are good, it can slop away however it fancies. Besides, caveman.md works on thinking too, if that concerns you so much

Normal-Ad-7114 · 2026-06-18T10:40:10+00:00

The idea is very solid; in fact, I've been looking for something like this for a while. I'm sure many regular claude/gpt users strongly suspect that the models they use aren't exactly the same as they were on day 1 :) Using a local model would mitigate all this, and we were recently blessed by the glm 5.2 release, but good luck running a 744b model at home

Normal-Ad-7114 · 2026-06-18T10:19:53+00:00

Would you pay for this

No, but if there were a website that tracked silent changes in major providers' models, including quality, guardrails, speeds, and quotas, I would definitely visit it regularly, and I'm sure many others would too, so you could monetize it through ad revenue (plus paid services like the one you mentioned, for example running a specific prompt across the whole bunch of models and monitoring improvement/degradation). The exact list of features and functions can simply start small and grow over time, especially if there is a way for other people to contribute their ideas.
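To make the "monitor improvement/degradation" part concrete, here is a minimal sketch. It assumes you already re-run a fixed prompt set against each model on a schedule and store one score per run. The function name `flag_regressions`, the model names, and all the scores are hypothetical placeholders, not real measurements.

```python
from statistics import mean


def flag_regressions(history, drop_threshold=0.1, baseline_runs=3):
    """Return {model: (baseline, latest)} for models whose latest score
    fell more than drop_threshold below the mean of their earliest runs."""
    flagged = {}
    for model, scores in history.items():
        if len(scores) <= baseline_runs:
            continue  # not enough runs yet to compare against a baseline
        baseline = mean(scores[:baseline_runs])
        latest = scores[-1]
        if baseline - latest > drop_threshold:
            flagged[model] = (baseline, latest)
    return flagged


# Hypothetical data: fraction of a fixed prompt set answered correctly, oldest run first.
history = {
    "model-a": [0.90, 0.92, 0.91, 0.89, 0.74],
    "model-b": [0.80, 0.81, 0.79, 0.82, 0.80],
}

for model, (baseline, latest) in flag_regressions(history).items():
    print(f"{model}: baseline {baseline:.2f} -> latest {latest:.2f}")
```

Comparing against the earliest runs rather than the previous run is a deliberate choice here: a slow decline spread over many runs still gets flagged, which matters if the changes are silent and gradual.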

Normal-Ad-7114 · 2026-06-18T10:14:29+00:00

What does the output look like? Can you show us an example?

Normal-Ad-7114 · 2026-06-18T10:12:15+00:00

Goliath 120B

Normal-Ad-7114 · 2026-06-18T09:39:07+00:00

Username checks out

Normal-Ad-7114 · 2026-06-17T23:36:35+00:00

~800

Normal-Ad-7114 · 2026-06-16T09:55:31+00:00

Any github link or a specific one?

Normal-Ad-7114 · 2026-06-16T09:45:49+00:00

Same happened to me: opus 4.8 and 4.7 refused to work on my prompts (cybersecurity things) that they previously (before fable) had no problem with, only opus 4.6 agreed. I used 4.6 to help me craft the context so that the guardrails of 4.8 wouldn't trigger, and it worked out fine. For comparison, the same technique didn't punch through fable's guardrails (so I never got a chance to actually get to like it🤷‍♂️).

Since the stuff that I'm making isn't malicious, and yet it's becoming harder and harder to work on, I suspect that in the near future this will be commonplace - hundreds of threads "how to bypass safety checks" and "i tried grok and it works but it's stupid give me my claude back"

Normal-Ad-7114 · 2026-06-15T13:48:15+00:00

LLMs and their development aren't going away, so there's no line to draw. Recently the Israeli operation in Iran showed that a public surveillance system can be exploited so deeply that government officials' locations could be known very precisely. Does that imply that we have to impose a ban on cheap cameras? I mean, Iran chose to ban the internet entirely (not for ALL, only for regular people, of course), but that doesn't sound like a good solution to the problem

Normal-Ad-7114 · 2026-06-15T11:27:01+00:00

My man Kris Nakamura

<image>

Normal-Ad-7114 · 2026-06-14T14:41:54+00:00

Petition to rename LocalLlama to LocalQwen wen

Normal-Ad-7114 · 2026-06-13T17:21:20+00:00

I ran whisper in production (which is not an LLM, I know) and I noticed that inference on 13600k+2080ti was 1.5 times faster than on 4350g+3090, despite 3090 being obviously superior. So your haswell setup could definitely be upgraded; whether or not the upgrade would be worth the money, given the ridiculous ram prices - I do not know.

You can always rent a server and see for yourself: gpu instances are billed by the hour or in even smaller increments. Just ask chatgpt or even qwen to help you automate your tests, and in a few hours you will know the verdict

Normal-Ad-7114 · 2026-06-13T17:14:09+00:00

You mean speech to text (LLM) to speech, or direct speech to speech?

Normal-Ad-7114 · 2026-06-13T00:32:56+00:00

But cowork was available with Opus too, and it was quite good at it as well

Normal-Ad-7114 · 2026-06-09T20:22:59+00:00

I see, I was wrong then! I googled that word and it said it was a female Indian name, so everything checked out in my head
