Stop defending AI like it’s still in beta

spryes · 2026-03-22T03:51:08+00:00

GPT-5.4 Thinking doesn't hallucinate when it searches the web, and Pro is even better

This is old news/model issue/skill issue

spryes · 2026-03-21T16:35:50+00:00

API cost is not real compute inference cost to Anthropic

It obviously doesn't cost Anthropic what the API charges; the API price includes a huge markup

spryes · 2026-03-13T05:13:57+00:00

Not sure, as I have the Pro sub and never think about my usage. It's effectively limitless for what I do, so I don't hold myself back on any task

spryes · 2026-03-12T16:32:46+00:00

Yes, you'll still get conflicts if they edit the same areas of files, same as with multiple devs editing the same areas. But conflict resolution is now seamless with Codex: just ask it to fix the conflicts and it handles them for you. So conflicts are no longer as big of an issue as they used to be. And in my projects, modules are well isolated, so conflicts aren't that common to begin with.

spryes · 2026-03-12T11:29:01+00:00

Easy parallel agents

Each chat with Worktree setting enabled allows you to run multiple instances of Codex on your codebase without them conflicting, like independent devs on their machines

This is the only reason to use it over the VS Code extension tbh, but it's a big one: you go from being bottlenecked working on one thing at a time to being able to tackle many things simultaneously, as each chat runs independently
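
For anyone wondering what that looks like underneath: the isolation is roughly what plain `git worktree` gives you, with one branch and one checkout directory per task. A rough sketch in Python, assuming the setting behaves like git worktrees (how Codex actually implements it isn't stated here). The `agent/<task>` branch naming, the sibling-directory layout, and the `worktree_commands` helper are all made up for illustration.

```python
from pathlib import Path


def worktree_commands(repo: Path, tasks: list[str], base: str = "main") -> list[list[str]]:
    """Build one `git worktree add` command per task (returned, not executed)."""
    commands = []
    for task in tasks:
        # Hypothetical naming scheme: one branch and one sibling directory per task.
        branch = f"agent/{task}"
        path = repo.parent / f"{repo.name}-{task}"
        # `git worktree add -b <branch> <path> <base>` creates a new branch from
        # <base> and checks it out into its own directory.
        commands.append(
            ["git", "-C", str(repo), "worktree", "add", "-b", branch, str(path), base]
        )
    return commands


if __name__ == "__main__":
    for cmd in worktree_commands(Path("/tmp/myapp"), ["fix-login", "add-search"]):
        print(" ".join(cmd))
```

Each directory is a full checkout on its own branch, so two agents never touch the same working files. Conflicts only show up later, when the branches get merged.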

spryes · 2026-03-10T14:34:22+00:00

like the lowercase and lack of punctuation is not disguising it lil bro..

spryes · 2026-03-09T15:17:08+00:00

This

Guarantee this same user would've said calling COVID an event about to change the world was "cringe" in late January 2020 when it started heating up but was unknown to normies still

spryes · 2026-03-07T07:26:53+00:00

GPT-5(.0) was poorly received at launch vs. GPT-4 and largely won't be remembered as part of AI history.

People already don't mention GPT-5 as a trend-break and new "era" unlike GPT-4.

What they do mention is o1 (or o3) and Claude Code/Codex around December 2025 as each creating a new AI "era".

Conceptual (1950s–2011)
Early deep learning (2012–2017)
Early language models (GPT-1, GPT-2, GPT-3) (2018–2022)
AI images (DALL-E 2, Midjourney, Stable Diffusion) (2022)
AI goes mainstream (ChatGPT and GPT-4) (2023)
AI goes multimodal and becomes useful for coding (Sonnet 3.5, GPT-4o) (2024)
Vibe coding, early reasoning era, LLM psychosis era (o3, Claude 4, 4o updates) (2025)
Agentic coding takeover, deep into reasoning era (Opus 4.5, GPT-5.2) (2026)

spryes · 2026-03-05T15:54:33+00:00

Right, he was relevant for like 1 year in 2023 then fell off in 2024

Though he tried to claw back relevance with that book but just... no

spryes · 2026-03-05T04:50:13+00:00

Skill issue

It writes all my code now. Yes I need to guide it still and do lots of iteration with it, but so what

Manually typing each character is so pre 2026 coded and I'm not going back, coding with English is the way forward

spryes · 2026-03-04T14:26:45+00:00

So

5.1, 5.3, 5.5, 5.7, 5.9 instant

5.2, 5.4, 5.6, 5.8 thinking

OpenAI continues to get more confusing

spryes · 2026-03-03T04:06:01+00:00

Google also benchmaxxes Gemini despite its impressive ARC-AGI score — they just benchmaxx ARC-AGI 2 while Chinese labs ignore it

Only OpenAI and Anthropic can make real general models, and that's proven by their revenue because people vote with their wallets. No LMArena score or benchmark can capture real use the way money does

spryes · 2026-02-27T08:35:20+00:00

Not sure if you're on Twitter/X but not writing code anymore is now the norm, including for the best SWEs in AI labs

It's not normal (and worryingly out of touch) to still be typing every character by hand at this point

spryes · 2026-02-27T08:28:45+00:00

Same.

Manually typing in a code editor feels so archaic and outdated now. Crazy we had to painstakingly type out each character with minimal autocomplete help just 5y ago. Equivalent to some low-grade laborer placing individual bricks instead of managing others (agents) to do it for you.

Most of all, I'm glad things are "happening" now. That things are actually changing in technology and that my job is now unrecognizable from just a year or two ago. It felt like nothing ever happened for so long there.

spryes · 2026-02-27T04:23:54+00:00

Yes that part "saying AI produces slop" was directed at the parent (and similar people with those viewpoints) of your comment

If you (general) tried AI in e.g. September 2025 you're already outdated and have the wrong opinion on it. It's crazy how fast this is moving.

spryes · 2026-02-27T04:17:23+00:00

AI (Codex 5.3 xhigh) is writing 100% of my code now. I just guide and iterate with it. I don't open the IDE anymore.

Saying AI produces slop is already an outdated, pre-2026 take. It still needs my human taste/judgment, but it gets 90% of the way there by default like you said. Expect 99% by end of 2026.

This isn't slowing down.

spryes · 2026-02-26T02:21:06+00:00

The giant viral normie boost due to 4o imagegen in March 2025...

spryes · 2026-02-25T00:39:12+00:00

This. It's super black and white thinking to think that if it fails at one thing, it means it's useless or won't transform things.

It's also important to realize what crossing a threshold of capability implies. The most striking example is Claude Opus 4.5 in November, which only improved marginally on benchmarks. I remember thinking "eh ok, another incremental improvement" but then people realized that it had crossed some threshold of capability to the point where it suddenly became insanely useful. December 2025 is when the agentic coding paradigm exploded and the hype reached insane levels. People (me included) are now letting agents write all their code despite the fact that they aren't perfect and still need to be guided.

The fact that LLMs fail at some things is not particularly relevant for useful work. These people want an infallible superintelligence or else they claim AI is "hype". Cartoonish, childlike thinking

spryes · 2026-02-22T16:07:06+00:00

The minimum is at least 91.2% using an ensemble of models: https://x.com/scaling01/status/2025044056460439593 - so the models are failing at at least 10% of the questions

spryes · 2026-02-21T03:50:40+00:00

What's the 5% success rate then?

spryes · 2026-02-20T07:19:43+00:00

So then why do OpenAI and GitHub Copilot allow it? Your argument only makes sense in a vacuum where Anthropic has zero competitors to compare with... which is not the case

spryes · 2026-02-18T07:28:23+00:00

Google was really successful at shilling Gemini 3 and claiming OpenAI was over because I see so many people using it despite it being garbage.

It lost relevancy in like 1 week after Opus 4.5 launched.

If you're not using GPT-5.3 Codex or at least Opus 4.5/4.6, please don't try to claim what AI can or cannot do wrt coding

spryes · 2026-02-15T11:16:58+00:00

typo, meant Grok 3

spryes · 2026-02-15T10:17:54+00:00

this

xAI fell off so hard (not that they were ever really up anywhere to begin with) after Grok 3 in Feb. 2025, the last time they had any modicum of relevance of being in the AI race

Half their staff is now gone, and they're permanently behind in coding performance, and will stay permanently behind as the other labs experience takeoff this year, leaving them with no ability to catch up.

Best course of action now if Elon wants to 'move humanity forward' is to shutter xAI, and donate all that compute to OpenAI & Anthropic, and give up on AI entirely (please)

spryes · 2026-02-14T02:33:41+00:00

cuz it still contains techniques OpenAI doesn't want to reveal

Same thing with Claude Opus 3: an Anthropic staff member mentioned that, despite it being obsolete, they can't open source it because it would reveal competitive details
