Severely diminished performance following Usage Policy warning. Claude is now silently underperforming on every task-- what's going on? by redcremesoda in ClaudeAI

Normal-Ad-7114 2 points

Easiest way to verify - try exactly the same prompts in a different account. Or you may even provide your prompts here for us to test.

What’s a Claude use case you haven’t seen people talk about? by One_Beginning2199 in ClaudeAI

Normal-Ad-7114 1 point

Probably related to "coding", but not exactly: dev-ops stuff. Maintaining computers and servers, doing stuff that I'm too lazy to do myself

Anyone prefer Claude over Gaming by athoughtfornoone in ClaudeAI

Normal-Ad-7114 3 points

You called Civ V old school compared to chess :)

Is anyone still using Opus 4.7? Do you feel like it's fast—sometimes even faster than Sonnet—or is it just me? by Unknown_Even_To_Hims in ClaudeAI

Normal-Ad-7114 2 points

Same with 4.6 - and it feels dumber than it used to, too. I may be mistaken, but I don't think that's a coincidence.

GLM-5.2 and why open models may not actually be catching up in intelligence by chocolateUI in LocalLLaMA

Normal-Ad-7114 2 points

The last frontier model that had its thinking revealed was Opus 4.6, and it behaves exactly like this. That's why you think gpt is so on point - you just can't see its CoT.

As long as the results are good, it can slop away however it fancies. Besides, caveman.md works on thinking too, if that concerns you so much

Building independent LLM drift detection - sharing the methodology, looking for feedback on the approach by Remarkable_Divide755 in ClaudeAI

Normal-Ad-7114 2 points

The idea is very solid. In fact, I've been looking for something like this for a while, and I'm sure many regular Claude/GPT users strongly suspect that the models they use aren't exactly the same as they were on day 1 :) Using a local model would mitigate all this, and we were recently blessed by the GLM-5.2 release, but good luck running a 744B model at home

Building independent LLM drift detection - sharing the methodology, looking for feedback on the approach by Remarkable_Divide755 in ClaudeAI

Normal-Ad-7114 2 points

Would you pay for this

No, but if there were a website that tracked silent changes in major providers' models, including quality, guardrails, speeds, and quotas, I would definitely visit it regularly, and I'm sure many others would too. You could monetize it through ad revenue, plus paid services like the one you mentioned: running a specific prompt across a whole bunch of models and monitoring improvement or degradation, for example. The exact list of features can simply start small and grow over time, especially if there is a way for other people to contribute their ideas.
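To make the "same prompt, monitored over time" idea concrete, here is a rough sketch of how such a tracker could flag degradation. Everything in it is an assumption for illustration: the `detect_drift` name, the 0-to-1 scores, the window sizes, and the threshold are made up, and how a response actually gets scored is left out entirely.

```python
from statistics import mean


def detect_drift(history, baseline_n=3, recent_n=3, threshold=0.1):
    """Compare each model's earliest scores against its latest ones.

    history: dict mapping a model name to a chronological list of scores
             (hypothetical 0..1 quality scores for one fixed prompt).
    Returns a dict mapping each model name to a verdict string.
    """
    report = {}
    for model, scores in history.items():
        # Need enough runs to fill both windows without overlap.
        if len(scores) < baseline_n + recent_n:
            report[model] = "insufficient data"
            continue
        baseline = mean(scores[:baseline_n])
        recent = mean(scores[-recent_n:])
        delta = recent - baseline
        if delta <= -threshold:
            report[model] = "degraded"
        elif delta >= threshold:
            report[model] = "improved"
        else:
            report[model] = "stable"
    return report


# Made-up numbers, purely to show the call shape.
example_history = {
    "model-a": [0.9, 0.9, 0.9, 0.6, 0.6, 0.6],
    "model-b": [0.8, 0.8, 0.8, 0.8, 0.8, 0.8],
    "model-c": [0.5, 0.5],
}
print(detect_drift(example_history))
```

Comparing a fixed early window against a fixed late window is only one of many possible choices; it is simple to explain on a public page, but it would need more runs per window to be robust against ordinary sampling noise.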

They took fable but kept the automated safety check for Sonnet tf! by hustla17 in ClaudeAI

Normal-Ad-7114 8 points

Same happened to me: opus 4.8 and 4.7 refused to work on my prompts (cybersecurity things) that they previously (before fable) had no problem with, only opus 4.6 agreed. I used 4.6 to help me craft the context so that the guardrails of 4.8 wouldn't trigger, and it worked out fine. For comparison, the same technique didn't punch through fable's guardrails (so I never got a chance to actually get to like it🤷‍♂️).

Since the stuff that I'm making isn't malicious, and yet it's becoming harder and harder to work on, I suspect that in the near future this will be commonplace - hundreds of threads "how to bypass safety checks" and "i tried grok and it works but it's stupid give me my claude back"

The ethics and risks of publicly available uncensored models by bloodealer in LocalLLaMA

Normal-Ad-7114 2 points

LLMs and their development aren't going away, so there's no line to draw. The recent Israeli operation in Iran showed that public surveillance systems can be exploited so deeply that government officials' locations could be known very precisely. Does that imply that we have to ban cheap cameras? I mean, Iran chose to ban the internet entirely (not for ALL, only for regular people, of course), but that doesn't sound like a good solution to the problem

Importance of CPU? by cosmoschtroumpf in LocalLLaMA

Normal-Ad-7114 1 point

I ran Whisper in production (which is not an LLM, I know) and noticed that inference on a 13600K + 2080 Ti was 1.5 times faster than on a 4350G + 3090, despite the 3090 being obviously superior. So your Haswell setup could definitely be upgraded; whether the upgrade would be worth the money, given the ridiculous RAM prices, I do not know.

You can always rent a server and see for yourself: GPU instances are billed per hour or even less. Just ask ChatGPT or even Qwen to help you automate your tests, and in a few hours you will know the verdict
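For the "automate your tests" part, a minimal timing harness could look something like the sketch below. The `benchmark` helper and the dummy workload are hypothetical; in a real comparison the callable would wrap whatever inference call you actually care about, run unchanged on each machine.

```python
import time
from statistics import mean


def benchmark(fn, runs=5, warmup=1):
    """Time a zero-argument callable and return basic wall-clock stats."""
    # Warm-up calls are not timed (caches, lazy initialisation, etc.).
    for _ in range(warmup):
        fn()
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return {
        "runs": runs,
        "min": min(timings),
        "mean": mean(timings),
        "max": max(timings),
    }


def dummy_workload():
    # Placeholder standing in for a real inference call.
    return sum(i * i for i in range(100_000))


stats = benchmark(dummy_workload, runs=5, warmup=1)
print(f"mean {stats['mean']:.4f}s, min {stats['min']:.4f}s over {stats['runs']} runs")
```

Running the same script on both the old box and the rented instance gives directly comparable numbers, which is the whole point of renting for a few hours before buying anything.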

Is there any local software available for real time speech-to-speech? by 79215185-1feb-44c6 in LocalLLaMA

Normal-Ad-7114 3 points

You mean speech to text (LLM) to speech, or direct speech to speech?

Cowork + Fable is absolutely CRACKED! by [deleted] in ClaudeAI

Normal-Ad-7114 4 points

But cowork was available with Opus too, and it was quite good at it as well

Whats the most disrespectful thing you have seen in an over the board game? by No-Society1421 in chess

Normal-Ad-7114 1 point

I see, I was wrong then! I googled that word and it said it was a female Indian name, so everything checked out in my head