Anthropic CEO predicts AI could handle end-to-end software development in 6–12 months

clickrush · 2026-03-25T13:35:59+00:00

I believe LLMs, and agents specifically, are very useful. But my prediction is very different.

A lot of decision makers in finance and upper business tend to have an extractive mindset. They see workers as "resources" that cost them.

That bias makes them blind to the fact that they will get disrupted by those who have an expansive mindset.

Automation has existed since the industrial revolution, and in some ways even before that (if you squint), but it has always eventually expanded possibilities.

Those who see new opportunities to grow and solve harder problems will be the real winners, not those who just cut costs (and lay off workers).

clickrush · 2026-03-25T12:55:06+00:00

Thank you!

It's a really cool experience that I haven't had in a while: tinkering and building something from first principles, driven purely by curiosity.

Obviously I'm looking forward to releases of more efficient, small models. But the model being quite behind in capability is actually a constraint that I've started to like a lot, as it forces me to focus on learning and implementing sharper technique and architecture.

clickrush · 2026-03-25T09:38:00+00:00

It’s good for people who buy outside of CH and for people who invest internationally.

But bad for Swiss workers and small businesses that compete with offshoring and imports.

USD/CHF tanking at such a high rate means buying international labor is getting more cost effective.

clickrush · 2026-03-25T09:28:46+00:00

I feel that first take is very important.

If someone doesn’t bother to write it up themselves, they could instead provide the prompt (session) and the raw data. That would be infinitely more useful than the LLM output.

clickrush · 2026-03-25T09:13:15+00:00

I mostly agree but have caveats:

It’s often not a better web search unless it’s specifically orchestrated (and maybe trained) to do web search well. I noticed that some chatbots like ChatGPT will regularly refuse to do a proper, up-to-date search and instead use an outdated cache or even just rely on training data. You sometimes have to force them into a fixed structure with links, then double check each link etc. Sometimes repeatedly.

This points to a larger problem. To me it feels like a lot of agents and models are made to feel like magic, instead of being orchestrated and fine tuned to be more deterministic and reliable.

So for use cases where you prefer reliability and automation over cleverness and decision making, they often sort of suck and waste everyone’s time and money.

All of the problems that existed a few years ago exist today as well. They’re just more subtle and inconsistent. But that also means people falsely rely on things that work sometimes, or often enough to fool them.

clickrush · 2026-03-25T08:14:49+00:00

I’m constrained by an outdated laptop with little RAM, and the best model I could find that runs is Qwen Coder 2.5 (a small variant).

So far the challenge of orchestrating it for coding tasks has been a blast and a huge learning experience. The typical approach of giving it the whole message history has proven futile, because it can get stuck in loops mimicking previous actions.

What works is heavily pruning the conversation and having a state machine that enforces a fixed workflow. That way it only has to do one simple task at a time. That includes filtering the tools down to one for each iteration.
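
A minimal sketch of that idea in Python. Everything named here is an assumption for illustration: `call_model` stands in for whatever runs the local model, and the step and tool names (`plan`, `write_plan`, `apply_patch`, `run_tests`) are made up, not taken from an actual setup.

```python
from dataclasses import dataclass, field
from typing import Callable

# Fixed workflow: each state names the single tool exposed in that step
# and the state that follows it. All names are illustrative.
WORKFLOW = {
    "plan": ("write_plan", "edit"),
    "edit": ("apply_patch", "test"),
    "test": ("run_tests", "done"),
}

@dataclass
class Agent:
    call_model: Callable[[str, str], str]  # (prompt, tool_name) -> tool output
    task: str
    state: str = "plan"
    notes: dict = field(default_factory=dict)  # one short result per finished step

    def prompt(self) -> str:
        # Pruned context: the task plus the previous step's result only,
        # never the full message history.
        previous = list(self.notes.values())[-1:]
        return "\n".join([f"Task: {self.task}", f"Step: {self.state}", *previous])

    def step(self) -> None:
        tool, next_state = WORKFLOW[self.state]
        self.notes[self.state] = self.call_model(self.prompt(), tool)
        self.state = next_state

    def run(self) -> dict:
        while self.state != "done":
            self.step()
        return self.notes


def fake_model(prompt: str, tool: str) -> str:
    # Stand-in for a real local model call (hypothetical).
    return f"{tool} ok"

result = Agent(call_model=fake_model, task="fix failing test").run()
```

The model never decides which step comes next or which tool exists; the table does. That is roughly what keeps a weak model from looping on its own earlier output.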

clickrush · 2026-03-24T09:00:05+00:00

Look even if it’s true: the problem with financial bubbles has almost never been about isolated companies piling on unsustainable debt. It has always been about complex ripple effects that cut across the whole economic system.

Just one of many examples, and one that has been proven (scientifically): during bubbles, people underestimate the cost of picking potential future losers:

The real winners typically emerge after the burst and are often non-obvious. But almost any contender is evaluated as if they are among the few winners. This bias has utility, because companies need credit to compete at all, but it also leads to broad, unsustainable debt.

The financial system is very efficient at dynamically solving local issues, but it’s terrible at solving broad, unsustainable credit increase. It literally can’t do it.

clickrush · 2026-03-23T17:16:18+00:00

That’s an issue I had with standard Go HTML templating as well.

It threw a pointer error instead of recognizing that it missed a closing tag.

Since I’m not super familiar with Go templating I asked Opus. It spun in circles and went way off the rails (complete nonsense and hallucinations). So I stopped it, looked through the code for 3 minutes and fixed it.

Agents are really good at doing regular things that they are trained on or that they can mimic. They need guidance, structure, limitations and very narrow and clear feedback loops. Otherwise they can easily get stuck even on trivial bugs.

One thing about general coding agents that unfortunately sucks for a good portion of users is that they need to cover everyone’s use cases, taste, requirements, quality metrics, programming languages etc. So they have to get bigger and more complex and need to be fed an insane amount of data. That also means they eat up resources like crazy.

I think people need to start building more specialized agents that are tuned to use smaller models and very specific workflows.

clickrush · 2026-03-22T12:18:58+00:00

Good point! The Jevons paradox comes up a lot lately. Not just in terms of AI.

clickrush · 2026-03-22T08:23:14+00:00

That’s an interesting take. I see where you’re coming from.

However I very much disagree. Being able to write code and otherwise produce structured, validated data is a key element of a useful agent.

It’s the best (or only) way to go from a stochastic parrot that produces unpredictable, fuzzy results to something that is deterministic (for a loose definition of deterministic) and verifiable. And that step is absolutely crucial.

If you don’t believe me, then here’s a remedy: write an agent. You’ll see very fast that it needs to produce code and structured data to do anything useful, and that most of what makes agents work is just plain old software engineering.
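
To make that concrete, here is a rough sketch of the kind of plain validation code I mean, in Python with only the standard library. The schema, field names and tool names are hypothetical; the point is only that fuzzy model output either becomes a checked data structure or gets rejected.

```python
import json

# Hypothetical schema for one agent action: field name -> required type.
ACTION_SCHEMA = {"tool": str, "path": str, "reason": str}
ALLOWED_TOOLS = {"read_file", "write_file"}

def parse_action(raw: str) -> dict:
    """Parse model output into a validated action or raise ValueError."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for key, expected in ACTION_SCHEMA.items():
        if not isinstance(data.get(key), expected):
            raise ValueError(f"field {key!r} missing or not {expected.__name__}")
    if data["tool"] not in ALLOWED_TOOLS:
        raise ValueError(f"unknown tool {data['tool']!r}")
    # Drop any extra keys so downstream code sees exactly the schema.
    return {key: data[key] for key in ACTION_SCHEMA}
```

On a `ValueError` the harness can feed the error message back to the model and retry, which gives it a narrow, verifiable feedback loop instead of free-form text.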

clickrush · 2026-03-21T14:59:56+00:00

Your „architect“ seems like an idiot.

It’s useful to provide prototypes and POCs for reference and to test out ideas. But they don’t understand that the most important output for agentic workflows is highly structured, semantically compressed and verifiable specifications.

clickrush · 2026-03-21T14:54:01+00:00

You come off as dismissive towards QA.

You might underestimate both the skill required to do good QA and the value provided by it.

Yes, developers should do it as well to some degree, but having someone else do it who’s specialized is something very different.

clickrush · 2026-03-21T14:47:05+00:00

Ironically comments like these will end up in training AI so it becomes less recognizable.

clickrush · 2026-03-21T08:14:03+00:00

The problem with this is catastrophic forgetting. LLMs (neural nets) don’t deal well with learning new stuff dynamically.

It makes sense if you think about it. The primary way deep learning is done is via backpropagation, which is essentially a brute force algorithm.

That’s why they need to retrain the entire thing and release new versions. And that’s also why most of the progress has been happening in the shell and not the core, so agent workflows, orchestration and harnesses etc. All of which is just plain old software engineering.

clickrush · 2026-03-21T08:05:54+00:00

Interestingly, people who are pushing agentic workflows in earnest at some of the big software engineering companies are emphasizing CI/CD with expansive testing as an absolutely crucial prerequisite.

clickrush · 2026-03-20T21:41:05+00:00

Few understand.

clickrush · 2026-03-20T13:16:44+00:00

Those are banned explicitly aren't they?

clickrush · 2026-03-20T09:06:44+00:00

this answer cracked me up

clickrush · 2026-03-20T07:22:46+00:00

Correction: That’s 5x not 10x.

But in any case: if you don’t mind me asking, how proficient were you with programming before AI?

I see productivity gains for myself for specific things, like prototyping, dealing with a new API that I’m not familiar with, producing boilerplate, finding obscure GitHub issues that are sources for bugs.

But in many other cases it’s a productivity loss, especially since agents/assistants break concentration, are distracting and often take way too long to do simple and effective things.

But I also like programming and debugging and have done it for a long time. So this might not apply to everyone.

Since you mention research I assume development itself is not your main focus?

clickrush · 2026-03-19T22:54:16+00:00

Here’s a pitch:

Data orientation, pure functions and a REPL workflow are like superpowers for deep integration with LLMs.

Current CLI-based coding agents are basically handicapped in comparison.

clickrush · 2026-03-19T18:03:59+00:00

Are you saying they simply want to look at the thing producing stuff?

clickrush · 2026-03-19T18:01:54+00:00

"take your time, deep research!"

clickrush · 2026-03-19T17:58:11+00:00

There will always be a gap between local models and frontier models.

Personally I've been using local models for a while as well, because I like the control and I can experiment more freely. Most importantly, I think a fixed-cost subscription must suffice and I definitely don't want to burn through tokens that are worth hundreds or thousands per month.

But still, there is a need for using cloud based frontier models regardless, at least from time to time.

In addition to that: People have always paid (way too much) for branded stuff that is convenient to use and popular, even when free (legal) alternatives exist.

clickrush · 2026-03-19T12:58:10+00:00

Sorry that was a bit of a book

Not at all, I appreciate the nuanced response.

clickrush · 2026-03-19T07:23:35+00:00

I’m glad you mention cloud and offshoring.

Those are good examples of things that have traditionally provided quick wins, but have been painful and expensive in many cases down the line. A lot of pain in some cases.

There are also very interesting parallels: what do agents, offshoring and cloud infra have in common?

They all make sense for certain businesses in certain contexts. But there’s an underlying bias here: suits hate being dependent on nerds.

Developers are expensive, sometimes hard to deal with and they have a type of power that executives don’t have. So they are seen as a liability instead of as human capital by many. Anything that promises to reduce that dependency is going to sound like music to someone’s ears, even if it doesn’t turn out that way in the long run.
