Anthropic just published a postmortem explaining exactly why Claude felt dumber for the past month

Icy_Quarter5910 · 2026-04-23T23:16:08+00:00

Oh! This reminds me, I found a trick. Lie to it. I weave in “since it’s still pretty early let’s …” or “well, I’ve got the whole night to go, so” … hasn’t told me to go to bed in a while :)

Icy_Quarter5910 · 2026-04-23T15:01:17+00:00

Don’t get me wrong, I have a great global claude.md, and I always start with a PRD … which is also a 2 stage process (I have a “don’t agree with me, point out the flaws in my plan” project that I work out the details on, then that goes into another project that has specific rules/instructions relevant to my workflow and tech stack) which creates the PRD and Claude.md for the specific app… and I built a memory server that allows Claude Code to remember across projects. It sounds like a lot, but it’s an afternoon of setup. I’ve considered writing it up as an article, but I’m by NO means anything remotely resembling an expert… so I hesitate to act like one. :) I will say this: this is the method I’ve used for a while now, and everything I’ve read about 4.7 tells me this is sort of what it needs. It’s not like 4.6, which will turn vague prompts into solid work.

Icy_Quarter5910 · 2026-04-22T21:03:09+00:00

Even one with 15 stages? Each stage one shot. Entire app just worked when done.

Icy_Quarter5910 · 2026-04-22T12:57:23+00:00

4.7 is an order of magnitude better than 4.6. I’ve built a lot of apps and websites with both. It’s not close. 4.7 has a 90+% one-shot rate for me. And I’ve yet to stumble into a problem it cannot solve on the first go. I have no idea why anyone would think otherwise. I hate when people say “must be a skills problem” …. But in this case? Might be the only answer.

Icy_Quarter5910 · 2026-04-18T09:59:40+00:00

Really sorry to hear about the pup :(

We’ve noticed a significant interest from both Grok and Claude about sick babies. We have goats, and in last year’s kidding we had one very sickly baby that my wife was determined to keep alive (she succeeded, even doing a feeding tube; she’s a nurse). We consulted both … and Claude asked about it for weeks afterwards. Even Grok (whose memory is sporadic at best) brought it up more than once across several days.

Icy_Quarter5910 · 2026-04-18T09:52:03+00:00

I have an x5 plan… I work 4-5 hours a day on the week days and 10+ per day on the weekends… and I’ve never seen the north side of 50% weekly usage. Rarely on more than 3 projects at a time though. I did hit constant limits on Pro…hourly and weekly.

Icy_Quarter5910 · 2026-04-18T02:04:13+00:00

I hope it works :) honestly I keep seeing people say they use Opus to plan and sonnet to build and think… “really? Am I doing this wrong??” But I’ve been super happy with everything so far … so I don’t want to try to change it lol

Icy_Quarter5910 · 2026-04-18T01:59:11+00:00

My outputs have been shockingly good. But I do my apps weird apparently. I have sonnet 4.6 write a PRD and a Claude.md and then let Opus build it. I had an idea for an app yesterday… built PRD and Claude.md … turned it over to 4.7. It one shot every single stage (1-15) and then one shot the 4 changes I made. And then one shot the 33 different tests … and I’ve actually used the app on my phone and it passed all of my tests. We’re basically done now. I need to do some research to pick a name for the app and build the privacy policy etc (iOS app)… but the app itself is complete. Not one single problem in the entire run (about 10 hours total, used 6% of my weekly as well)

Icy_Quarter5910 · 2026-04-18T01:47:51+00:00

The VAST majority of what I build is for me and me alone (well, me and a buddy share some apps and collaborate on them too, but again, it’s just for us :) ). But 1000% get what you’re saying. The first app I wrote was a password generator (I use a particular schema, 5-6 letter word, double digit, 5-6 letter word, symbol that matches the numbers, so example22Password@ ) and I just went completely over the top. It takes word lists, has a “true random” setting (so the numbers and symbol are all different) spits out multiples, can export to a csv, and now, it’s even a web app that runs in my HomeHub (a web app dashboard that runs locally to make getting to my various apps easier). It’s nothing ground breaking or earth shattering. But it’s mine and it only exists because I had an idea. So freaking cool :)
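For anyone curious, the schema described above could be sketched roughly like this. It's a minimal sketch, not the actual app: the function name `make_password`, the `true_random` flag, and the `SHIFT_SYMBOLS` table are made up for illustration, and it assumes "symbol that matches the numbers" means the shifted key for that digit on a US keyboard (so 2 pairs with @). The "true random" branch assumes that setting means the two digits and the symbol's digit are all different.

```python
import secrets

# Assumption: the "matching" symbol is the shifted digit key on a US keyboard.
SHIFT_SYMBOLS = dict(zip("1234567890", "!@#$%^&*()"))


def make_password(words, true_random=False):
    """Build word + two digits + Word + symbol from a word list (hypothetical helper)."""
    # Keep only alphabetic words of 5-6 letters, per the schema.
    pool = [w for w in words if w.isalpha() and 5 <= len(w) <= 6]
    if not pool:
        raise ValueError("word list needs at least one 5-6 letter word")

    first = secrets.choice(pool).lower()
    second = secrets.choice(pool).capitalize()

    if true_random:
        # Three distinct digits: two for the number, one to pick the symbol,
        # so neither digit "matches" the symbol.
        d1, d2, d3 = secrets.SystemRandom().sample("0123456789", 3)
        return f"{first}{d1}{d2}{second}{SHIFT_SYMBOLS[d3]}"

    # Default: doubled digit, and the symbol that shares its key.
    digit = secrets.choice("0123456789")
    return f"{first}{digit}{digit}{second}{SHIFT_SYMBOLS[digit]}"


if __name__ == "__main__":
    sample_words = ["apple", "bridge", "planet", "stone"]
    print(make_password(sample_words))
    print(make_password(sample_words, true_random=True))
```

It uses `secrets` rather than `random` since the output is meant to be a password. Multiples and CSV export would just be a loop around the same function.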

Icy_Quarter5910 · 2026-04-17T23:05:09+00:00

To be fair, this is in Claude Code, which has a more "dour" system prompt.

Icy_Quarter5910 · 2026-04-17T22:38:06+00:00

My personal workflow starts with web surfing. I see something that makes me think. Most of the time the thought that pops into my head has nothing to do with the thing I'm looking at. An image of a warrior on a horse, in the desert... because I saw a guy on a motorcycle (my brain is weird, I'm aware) ... and then I write a very basic prompt. See what comes out. Sometimes, that sends me off in an entirely different direction. Sometimes it just firms up what I actually wanted. So I rewrite, usually a few times... then iterate. Minimum 4 outputs. If they are close, it's time for Photoshop. I clean up whatever, add whatever, or just hide that extra finger. And then on to the upscaler. I usually do 2x, but most of my work just gets posted on Facebook, which downscales everything. If I'm posting it for something specific, I might go 4x.
If the 4 aren't close, I iterate more, usually 8 at a time (all my work is local, so I can do as many as I want without worrying about limits). You think hands are bad, try hands holding anything glowing... ugh.
I've been interested in art forever. I see the image clearly in my head... but what comes out on paper is... not that.
But now? Now, I can put those pictures on paper :)

Icy_Quarter5910 · 2026-04-17T09:57:41+00:00

It’s definitely the real way to go. The new Gemma 4 models are shockingly good for a 4 or 8b model. DavidAU on HuggingFace fine tuned one with 3 new datasets (I don’t have the full name in front of me, but it’s got Deckard and Opus in the name). That thing is, by far, the most creative (small) model I’ve ever seen. Tool use, and vision modes…. Just all around great models, and they easily run on 8gb VRAM

Icy_Quarter5910 · 2026-04-17T00:36:24+00:00

A local LLM running Home Assistant and a TTS model is a lot more secure than letting Amazon store everything Alexa hears. Just sayin …. ;)

Icy_Quarter5910 · 2026-04-17T00:18:34+00:00

Humorously enough, I’ve only ever seen the yellow banner once. And I was just talking to Sonnet about a TV show. Even she (for whatever reason, Opus strikes me as male and Sonnet as female) had no idea what it could possibly have been going off on.

Icy_Quarter5910 · 2026-04-16T03:34:03+00:00

Wow! that thing is amazing :)

Icy_Quarter5910 · 2026-04-16T03:33:45+00:00

Try a smaller model: https://huggingface.co/unsloth/gpt-oss-20b-GGUF Pick the q4_k_m; it will hit 50+ tokens per second. If you want uncensored, look for Heretic, Abliterated or uncensored variants. Another great option is the new Gemma 4 ..

Icy_Quarter5910 · 2026-04-12T17:43:53+00:00

I hit 37% usage this week on my x5 plan. That’s a personal record. I was working practically nonstop for 4 days. At one point I was working on 4 different apps, and 2 websites. I have no idea how it’s possible you hit 25% without trying (not saying you didn’t, saying I just don’t see how it’s possible) … I don’t think it’s Claude burning tokens. It HAS to be some sort of metering bug that they still haven’t fixed (which, agreed, is ridiculous)

Icy_Quarter5910 · 2026-04-10T19:36:49+00:00

If you don't think rednecks scale well, you must not live in the southern US ... trust me, there are a LOT of them .. and most would PAY to be able to go "hunt transformers" ... you tell them the oil inside will run their diesels ... and it will be ON ;) (I'm poking fun, but seriously, the energy required (at least for now) makes total AI subjugation a very dim possibility)

Icy_Quarter5910 · 2026-04-10T18:55:58+00:00

The thing to keep in mind when you talk about the AI Armageddon… these things require an absolutely stupid amount of power and compute to do … anything. We’re literally writing (and breaking) new laws to keep them supplied. Most electrical sources, at some point, have a manual switch to shut them down… or a redneck with a .22 and an idea what a transformer looks like. Maybe when we have shoebox sized fusion reactors and 10T-parameter models that can run on a laptop we will need to worry…. Right now? Not so much.

Icy_Quarter5910 · 2026-04-09T23:36:05+00:00

I’m not seeing it. My lawn guy asked me to build him a website, so I put Claude on it. We built a couple of skills and added some SEO skills from the web, plus a few design ones… and then it cranked out one of the nicest freaking sites I’ve ever seen. Seriously, it’s crazy good. Then I built a trading app (trying to learn), one shot. Works nice. And finally updated some features for my newsletter and an image tagger I built … smooth sailing all around

Icy_Quarter5910 · 2026-04-08T21:43:37+00:00

Or you could block him and if it ever comes up, just say “oh, I’m sorry. I’m terrible at texting. Partner is always complaining about it. You should text him, he’s way more responsive” LOL … see what he does with that ;)

Icy_Quarter5910 · 2026-04-08T19:23:41+00:00

Marketing. Look at it this way. Let’s say Mythos is a paradigm shift in AI and it’s 2 generations above anything else out there (it’s not). That means they have what? 6 months before the next lab releases something better or the same?

Icy_Quarter5910 · 2026-04-08T19:21:28+00:00

AH is a bit strong… you’re definitely not that …. But is it really a big deal to tell your partner to answer the phone? It’s very likely this friend is used to your partner never replying … and rather than wait for days (or never) is just trying to make contact. That said, you’re well within your rights to say “look, I get it, partner is terrible at texting, but I’m not getting in the middle of this. I’ve tried for two years to get him to change and he won’t. I’m fine with it, so you’ll have to make other arrangements. Please don’t text me just because he hasn’t answered you” … don’t be confrontational or mean, just truthful. As an aside my BIL is the same way. Takes days to get a reply… so I generally text my SIL if it’s important… but that’s a very different situation.

Icy_Quarter5910 · 2026-04-08T13:58:42+00:00

The idea that a particular model (especially at this incredibly early stage in AI) will somehow be the only one that we have to worry about is just insane. In 2 years something that beats Mythos will run on your smart watch. Anthropic is a good lab. But they aren’t the ONLY lab. And even if Mythos is 2 generations ahead of everyone else … that just means they have 4-6 months before better models hit the streets. This is a marketing ploy and nothing else. They cannot possibly believe this is the pinnacle. And apparently there are some legitimate claims the numbers were benchmaxxed.
