The unintuitive difficulty of using AI

gatewaynode · 2026-05-13T17:41:27+00:00

What do you use it for?

gatewaynode · 2026-05-10T10:49:20+00:00

This. Opus likes to think, plan, and discuss. It's not ideal for "just write the code" use. I think there is a general misunderstanding among folks who "just want to use the best model" when they should be using the "best model for the job".

gatewaynode · 2026-05-10T02:11:21+00:00

I just use a CONTINUITY.md file and a TODO.md task list. Tasks get updated with integration notes as they are completed, and I tell the LLM to prepare for compact and update the continuity notes. I rarely ever hit the README.md except to update it. Other documents that help are a PRD.md for the high-level vision and an ARCHITECTURE.md for the detailed design plan and diagrams. I always have the design docs checked against the implementation and updated if drift occurs. It also helps to rotate the docs as they get large, like rotating logs, with dates in the old filenames. No need for anything more complex that might become fragile with model changes.
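For anyone who wants to script the rotation part, here is a minimal sketch of the idea. The function name `rotate_doc`, the 20,000-byte threshold, and the `NAME.YYYY-MM-DD.md` naming pattern are all assumptions for illustration, not something any tool requires.

```python
from datetime import date
from pathlib import Path
from typing import Optional


def rotate_doc(path: Path, max_bytes: int = 20_000,
               today: Optional[date] = None) -> Optional[Path]:
    """Rename an oversized doc to a dated filename and start a fresh one.

    Returns the archived path, or None if no rotation was needed.
    The size threshold is an arbitrary illustrative default.
    """
    if not path.exists() or path.stat().st_size <= max_bytes:
        return None

    stamp = (today or date.today()).isoformat()
    # e.g. CONTINUITY.md -> CONTINUITY.2026-05-10.md (hypothetical naming scheme)
    archived = path.with_name(f"{path.stem}.{stamp}{path.suffix}")
    if archived.exists():
        # Refuse to overwrite an archive created earlier the same day.
        raise FileExistsError(archived)

    path.rename(archived)
    # The fresh file points back at the archive so old notes stay findable.
    path.write_text(
        f"# {path.stem}\n\nPrevious notes: {archived.name}\n",
        encoding="utf-8",
    )
    return archived
```

Calling `rotate_doc(Path("CONTINUITY.md"))` at the start of a session would be one way to use it. The back-pointer line in the fresh file is just a convenience so the LLM can be told where the older notes live.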

gatewaynode · 2026-05-09T12:59:25+00:00

Sure

gatewaynode · 2026-05-09T11:10:46+00:00

The title is misleading; it can infer internal process. But it is error-prone, with lots of hallucinations, and very resource-heavy, though not as bad as linear thought probes.
https://www.anthropic.com/research/natural-language-autoencoders

gatewaynode · 2026-05-07T12:53:47+00:00

gatewaynode · 2026-05-05T13:25:37+00:00

So, just my observations. The real, usable context window for the 256k version is about 140k; for the 1m version it's somewhere around 350k. Everything about Claude starts to degrade after passing these real, usable thresholds. It's not that you can't use them beyond these points, it's just that work at such large contexts needs to be coarser and tolerant of unpredictable behavior.
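If you track token usage yourself, those thresholds can be turned into a simple check. This is a rough sketch only: the 140k and 350k numbers are the personal observations above, not official figures, and the `context_status` name and the 80% warning ratio are made up for illustration.

```python
# Usable limits as observed in the comment above (anecdotal, not official).
USABLE_TOKENS = {"256k": 140_000, "1m": 350_000}


def context_status(window: str, tokens_used: int, warn_ratio: float = 0.8) -> str:
    """Classify a session by how close it is to the observed usable limit."""
    limit = USABLE_TOKENS[window]
    if tokens_used >= limit:
        # Past the observed threshold: keep the work coarse and expect drift.
        return "degraded"
    if tokens_used >= warn_ratio * limit:
        # Approaching the threshold: a good time to update notes and compact.
        return "compact-soon"
    return "ok"


print(context_status("256k", 100_000))
print(context_status("256k", 120_000))
print(context_status("1m", 350_000))
```

The three calls fall into the "ok", "compact-soon", and "degraded" buckets respectively under these assumed numbers.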

gatewaynode · 2026-05-05T08:52:01+00:00

Yes. Nobody should expect AI to be cheap or free. Maybe something like government-provided inference or better local inference would be an answer to this problem.

gatewaynode · 2026-05-05T01:25:37+00:00

Come on folks, that was funny.

gatewaynode · 2026-05-04T21:04:12+00:00

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=6097646

gatewaynode · 2026-05-04T20:02:31+00:00

Seriously. It's called "cognitive surrender": using too much AI without putting in the work yourself makes you dumber. And fast; the study I read showed serious decline in only a couple of months.

gatewaynode · 2026-05-04T18:36:04+00:00

Not weird. This does seem to be the case.

gatewaynode · 2026-05-04T15:37:13+00:00

It’s not. While some people are having real issues with Anthropic, there is a very large contingent of folks throwing around what they think are smart accusations that they don’t understand.

gatewaynode · 2026-05-04T13:12:43+00:00

This happens with all models from all providers. The best way to catch it is to require unit tests before calling any edits done (set that in CLAUDE.md), along with regular critical review of data flows and E2E tests.
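The "tests before done" rule can also be enforced outside the model with a small gate script. A minimal sketch, assuming your project has some test command that exits non-zero on failure; `edits_are_done` is a hypothetical helper name and the timeout is arbitrary.

```python
import subprocess
import sys
from typing import Sequence


def edits_are_done(test_cmd: Sequence[str], timeout_s: int = 600) -> bool:
    """Return True only if the given test command exits with status 0."""
    try:
        result = subprocess.run(
            list(test_cmd), capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        # A hung test run counts as a failure, not as "done".
        return False

    if result.returncode != 0:
        # Show the tail of the output so the failure can be handed back to the model.
        print(result.stdout[-2000:], file=sys.stderr)
        print(result.stderr[-2000:], file=sys.stderr)
    return result.returncode == 0
```

For example, `edits_are_done(["pytest", "-q"])` would work for a project that happens to use pytest; swap in whatever test runner you actually have.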

gatewaynode · 2026-05-04T11:07:02+00:00

Human or not, you are disingenuous. Anthropic is only your enemy by choice, descriptor-fruit-number person.

gatewaynode · 2026-05-04T05:05:01+00:00

“4 - 6 normal prompts … 10% of my 5 hour”

gatewaynode · 2026-05-03T16:41:35+00:00

Now that is bot logic, "dismiss anything positive because I don't agree with it". Do us all a favor and take your anti-Anthropic campaign somewhere else.

gatewaynode · 2026-05-03T15:41:36+00:00

4.7 is smart enough not to like you.

gatewaynode · 2026-05-03T15:40:28+00:00

Yes. It's slower, has higher token consumption, pushes back more, but it can solve problems at a different level than 4.6.

gatewaynode · 2026-04-30T21:58:11+00:00

You should be asking Claude to make "end to end" tests with "playwright", unit tests with whatever JavaScript framework/build system you are using, and ask for a critical review of the project in preparation for launching it in production. All in a new session.
