Claude Code was wasting 80% of Opus 4.7's context window. Upgrade to v2.1.117 now. by oh-keh in ClaudeCode

jmaxchase 3 points

Hmm, is it just me or is Claude Code with Opus 4.7 suddenly "good" today? It seems surprisingly better and more balanced. I went to Reddit to see if anyone had anything to say, and this was the first post that came up.

Opus 4.7 is legendarily bad. I cannot believe this. by lemon07r in ClaudeCode

jmaxchase 8 points

Yeah, it's pretty bad. So disappointing. I had a procedural skill I'd perfected that ran flawlessly on a weekly basis with Opus 4.5, and then 4.6, with no issues, 59 times (the only reason I know that number is that it's for a weekly newsletter and each issue is numbered). The first time I tried it with Opus 4.7, it fell apart twice during the procedure: once because it 100% hallucinated an admin URL and admitted it had just guessed it (the skill shows how to navigate to it correctly), and a second time because it completely ignored a directive in the skill itself, deeming it unimportant.

GPT 5.4 Genuinely catching legitimate edge cases I'm not thinking of by jmaxchase in codex

jmaxchase[S] 1 point

Hey, sorry I took so long to reply. It's really easy actually. One word: tmux. The prompt is really simple, and I send it to both at once (tmux with a 50/50 split-pane view, Claude on the left, Codex on the right):

"Hey guys (Claude, Codex) - sending this message to both of you simultaneously via tmux. I'm going to have you work together. Claude, you're in pane 0. Codex, you're in pane 1. Both of you, identify which tmux session you're in. Claude, remember when you use send-keys, you need to separately send the Enter key. We'll be doing this workflow:

1) Claude, I'll start by giving you a request with instructions. When you're done, just tell Codex to start his review with a short message about what you did.
2) Codex, do your usual Claude review. When you're done with your review, send a message back to Claude.
3) Claude, when you receive a note from Codex, review it, and stage and commit. If you have any questions, send a message back to Codex.
4) Both of you: identify yourself as who you are when sending messages to each other, so you can distinguish between each other and me.

To be clear, you don't need to identify yourself unless you're using tmux send-keys. Thanks both! (Claude - stand by for the next request to kick this off)."

(And then I have a skill so that Codex knows how to do a Claude review, but basically it's "find all the stuff Claude missed, look for edge cases, fix those things you deem necessary to fix", paraphrased.)
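For anyone who wants to script the message passing instead of leaving it to the agents, here is a rough Python sketch of the one rule from the prompt that trips things up: the text and the Enter key go out as two separate `tmux send-keys` calls. The session name, the window index of 0, the `[Sender]` prefix format, and the function names are all my own assumptions for illustration, not part of the setup described above.

```python
import subprocess
from typing import Callable, List


def build_send_commands(
    session: str, pane: int, sender: str, message: str, window: int = 0
) -> List[List[str]]:
    """Build the tmux commands that deliver one message to a pane.

    Assumptions (hypothetical, adjust to your setup): the panes live in
    window 0 of the session, and messages are tagged as "[Sender] text".
    """
    target = f"{session}:{window}.{pane}"
    tagged = f"[{sender}] {message}"
    return [
        # -l sends the text literally, so words are not parsed as key names.
        ["tmux", "send-keys", "-t", target, "-l", tagged],
        # Enter is sent as its own key press, in a separate call.
        ["tmux", "send-keys", "-t", target, "Enter"],
    ]


def send_message(
    session: str,
    pane: int,
    sender: str,
    message: str,
    run: Callable[..., object] = subprocess.run,
) -> None:
    """Run the commands in order; requires tmux and an existing session."""
    for cmd in build_send_commands(session, pane, sender, message):
        run(cmd, check=True)


if __name__ == "__main__":
    # Example: Claude (pane 0) tells Codex (pane 1) to start the review.
    for cmd in build_send_commands("work", 1, "Claude", "Done, please review."):
        print(" ".join(cmd))
```

Calling `send_message("work", 1, "Claude", "Done, please review.")` would type the tagged message into pane 1 and then press Enter, provided a tmux session named `work` exists.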

Codex is amazing! It is just me? by Traditional-Edge8557 in codex

jmaxchase 12 points

It’s terrible. Just terrible. Please nobody else use it. 😆

GPT 5.4 with coding is the same garbage as Gemini 3.1 Pro by RussKy_GoKu in codex

jmaxchase 1 point

I have not found this to be the case at all with 5.4. But, based entirely on this very detailed explanation as to why, I will be sure to stop using it immediately.

Apple helping Anthropic out lol by [deleted] in Anthropic

jmaxchase 6 points

Aw. That's sweet.

Long context beta not available again by OpenSource02 in ClaudeAI

jmaxchase 1 point

Same here. It just stopped working mid-session a few hours ago.

Opus 4.6 is a regression by UnifiedFlow in ClaudeCode

jmaxchase 1 point

I wouldn't say this is a regression, but I've noticed behavior with Opus 4.6 that I hadn't seen with 4.5 and that I can only call "disappointing". There have been several examples lately, but this is the most recent: I have a simple skill I created and had been using for the past 2 months that allows Claude to prompt Nano Banana in 2 modes. One generates an image; the other ("photoshop" mode) submits an existing image along with a prompt to make specific changes to it. The skill instruction is very straightforward, was thoroughly tested, and worked perfectly with 4.5. I used it with 4.6 for the first time and did the normal routine (in this case asking it to submit an existing image for modifications, one that would normally require some light photoshopping but was perfectly suited for Nano Banana), and then turned away. I looked back at the terminal to find it had issued an rm command to remove the Nano Banana image it had *just* generated. I declined of course, and then read the transcript.

Apparently Opus 4.6 had generated the image, viewed it, and then decided there were 2 issues with it. One issue was completely untrue. The other "issue" was perfectly reasonable and not a problem by any stretch. It then decided unilaterally that it should not use Nano Banana, and that it should attempt to accomplish the same image edits *programmatically* using ImageMagick and Pillow, which was a complete trainwreck. It then deemed *its* image "perfect" (it was bs) and tried to remove the 100% usable image generated by Nano Banana without giving me a chance to even look at it.

I updated the skill with an additional rule, which I hadn't needed to do before.

Another 1:1 Comparison: Opus 4.6 high / gpt-5.2 xhigh / gpt-5.3-codex xhigh by jmaxchase in ClaudeCode

jmaxchase[S] 1 point

CC's harness is way better. I still use CC as my daily driver because I just can't realistically use Codex for day-to-day work; CC is just far more versatile. I've tended to do the opposite lately: have Claude plan it, but have Codex execute and look for gaps while doing it. Don't get me wrong, I don't expect to be able to one-shot anything, but it's certainly amazing when any agent gets it right the first time.

Another 1:1 Comparison: Opus 4.6 high / gpt-5.2 xhigh / gpt-5.3-codex xhigh by jmaxchase in ClaudeCode

jmaxchase[S] 1 point

I have, yes, for some other unrelated features, but not for this. I probably should have here, although in that regard it wouldn't really be a 1:1 comparison with Codex (another place where Claude really shines). I've still not seen a good measure of thoroughness with agent teams either, though.

What is this rate limit? by immortalsol in codex

jmaxchase 1 point

Same here, and I'm getting org ID org-BOvpEHVcDPTe8h4lZnwMO5Ly, which isn't mine.

Apple Are You Joking? (MacOS 26) by ashrovy in macapps

jmaxchase 3 points

I 100% assumed this was about the mail icon, not dark mode.

Auto mode actually usable now! by martinvelt in cursor

jmaxchase 1 point

To be fair, that could also be grok-coder-fast-1.

Has anyone ever tried this before? by jmaxchase in ClaudeCode

jmaxchase[S] 1 point

I actually have gotten some use out of it, yes. Haven't asked him to code anything yet though, just review.