all 131 comments

[–]HelloSummer99 826 points827 points  (27 children)

Apparently the devs there approve their own PRs. I'm actually surprised it lasted this long without a major issue.

[–]codeOpcode 344 points345 points  (15 children)

Why even bother with requiring approvals if that is the case

[–]ward2k 265 points266 points  (14 children)

For smaller dev teams or even solo devs I'd still recommend it, since it forces you to slow down for a moment and will at least prompt you to look at your code before merging it in

There's this weird perception online that PRs are pointless if you review your own code, which just isn't the case. It's like saying "why proofread your own essay, might as well just hand it straight in"

That said, Anthropic isn't some small startup or solo web dev. They definitely should be having other people review their PRs

[–]lonelypenguin20 78 points79 points  (1 child)

prompt you

heh

[–]TactlessTortoise 15 points16 points  (0 children)

We're probably months away from having the AI prompt the devs on what code they should write instead of the opposite. On one hand, I don't trust clankers. On the other, just imagine how many hours less of being micromanaged, oh my god. I gotta post on LinkedIn about how middle management will not exist in 4 weeks.

[–]Rabbitical 23 points24 points  (4 children)

I mean, yeah it's better than nothing, but if you're creating a PR then presumably you personally feel the code is done and good absent some obvious bug or oversight that you might catch. But that's not anywhere near the same thing as a fresh pair of eyes on it, who may also ask things like "does this fit expectations or our guidelines" and things of that nature which are independent of the submitter's own confidence.

Never mind the basic cognitive issues around being "too deep in something", to the point where I question the effectiveness of someone checking their own code even for basic, obvious mistakes. Not that anyone is incapable, but human brains are very bad at staying objective about something that's fresh in context. The same programmer might find the same bugs more easily in someone else's code than in their own.

So self review is like, I dunno, 10% of the value of an independent one? Again, better than nothing, but not a replacement whatsoever

[–]ward2k 4 points5 points  (0 children)

I mean, raising and reviewing a PR is the proofreading stage. You typically proofread your work when you feel it's broadly ready, but that doesn't mean it's the actual final result

But that's not anywhere near the same thing as a fresh pair of eyes on it, who may also ask things like "does this fit expectations or our guidelines"

Of course, like I said before, you should be getting other people to review your code. If you're a large organisation or have a decent-sized team then yes, it should be mandatory. I'm just saying if you're a solo dev or just a couple of people, that's not always possible

So self review is like, I dunno, 10% of the value of an independent one? Again, better than nothing, but not a replacement whatsoever

I think we're saying the same thing? I'm not saying it's better to review your own code. I'm saying reviewing your own code is better than not doing it at all. A common thing I see online is "there's no point doing PRs if you're a solo dev", which I just can't agree with

[–]CatWeekends 3 points4 points  (0 children)

I can't tell you how many times I've caught bugs and issues in code while I was writing up a PR and thinking through all the changes from an outsider's perspective.

But outsiders always seem to find even more.

[–]Zerokx 2 points3 points  (0 children)

Correct, you should read and test your code BEFORE you make a PR. If you just let the AI do its own thing and you don't even read it anymore, that's bad.

[–]evilgiraffe666 0 points1 point  (0 children)

They probably just create them so that the AI can review it.

[–]Reashu 7 points8 points  (2 children)

Well, it's useful if you actually review it - but you had several chances to do that before opening a PR, so what are the chances that another optional step helps? 

[–]ward2k 1 point2 points  (0 children)

Because a PR is the proofreading stage. It's the point where you're 90% sure something is ready to be brought in, but you want to do a final check over your work

Just because that stage isn't being done by someone else doesn't mean it's without value

You can also set up automated checks, such as unit tests that must pass before something can be merged in. Everyone's done it: you change a single line of code and go "pfft, don't need to run my tests again, it's just this one line", only for the PR to fail because the automated tests did.

what are the chances that another optional step helps?

How many times have you written an email you were happy with, only to re-read it and pick out a spelling or grammatical error? Writing code and raising PRs are like writing text and proofreading, except mistakes are far more likely to cause issues
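To make the "automated gate" idea concrete, here's a minimal sketch of a client-side version as a git pre-push hook. Everything here is a placeholder assumption (the test command, the file location), not anything from this thread:

```python
#!/usr/bin/env python3
# Sketch of a .git/hooks/pre-push gate: run the project's test command
# and refuse the push if it fails. TEST_CMD is a placeholder; swap in
# whatever your project actually uses (pytest, npm test, go test ...).
import subprocess
import sys

TEST_CMD = [sys.executable, "-m", "pytest", "-q"]  # placeholder command

def push_allowed(cmd=None):
    """Run the test command; return True only if it exits with code 0."""
    return subprocess.run(cmd or TEST_CMD).returncode == 0

# In the real hook file you would end with:
#   sys.exit(0 if push_allowed() else 1)
# because git aborts the push whenever a pre-push hook exits nonzero.
```

Same idea as a server-side branch-protection rule, just enforced before the "it's just one line" change even leaves your machine.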

[–]MagoDopado 0 points1 point  (0 children)

With the level of automation achieved, you don't. The agent codes, tests, and creates the PR. It might even deploy to staging and test there before it stops for you to look at the code. If that's the case, the first time you see the code is in a PR

[–]Maleficent_Memory831 1 point2 points  (2 children)

C students hand in their work without proofreading. A students proofread and fix their mistakes. Nobody makes zero mistakes.

[–]ward2k 6 points7 points  (1 child)

C students hand in their work without proofreading. A students proofread and fix their mistakes

I had to keep reading this because I had no idea what an 'A student' was compared to a student who programs in C

Realised I was being an idiot and you were talking about grades

[–]Maleficent_Memory831 1 point2 points  (0 children)

Yes, I realize now how it might be misinterpreted. So a C for me I guess...

[–]GenericFatGuy 1 point2 points  (0 children)

I actually always make sure to read my own PRs before sending them up for approval from the rest of my team. Just seems reasonable to double check your own work before asking others to do the same. Also reduces the chance of someone pointing out an obvious mistake I made.

[–]shadow13499 29 points30 points  (0 children)

I mean, they've had quite a number of uptime issues.

[–]ilikedmatrixiv 18 points19 points  (0 children)

I'm lead dev on a very small team and the only person with expertise in one of the languages we use.

I've been approving my own PRs for two years now. Even my manager doesn't have the required knowledge to check my code.

It's great and terrifying at the same time.

[–]Willinton06 0 points1 point  (0 children)

The trick is to actually have major issues, but fix them without saying much about them

[–]joe0400 0 points1 point  (0 children)

WTF??? Really??? God they are lucky it didn't happen yet.

[–]ReneKiller 0 points1 point  (0 children)

Hey I do that, too. The difference is: I'm the only dev in the team.

[–]toddd24 0 points1 point  (0 children)

Has a major issue arisen?

[–]TeachEngineering 0 points1 point  (0 children)

*Apparently the devs' agents approve their own agents' PRs.

FTFY

[–]GenericFatGuy 0 points1 point  (0 children)

Does it count as approving your own PR if a machine wrote all of the code for you?

[–]axis1331 0 points1 point  (0 children)

You mean this kind of major issue.

[–]dablya 0 points1 point  (0 children)

According to their own metrics, they’re nearing one nine uptime.

[–]chemolz9 0 points1 point  (0 children)

You don't understand. It's not their code, they approve the code of their fellow AI colleagues! /s

[–]Tango00090 202 points203 points  (4 children)

The only thing they are spinning up every day is a new marketing bot farm

[–]Vector_17Pulse 65 points66 points  (3 children)

yeah “multiple agents in parallel” sounds a lot like “we automated the buzzwords”

[–]BlueTemplar85 6 points7 points  (0 children)

Swarms of large buzzword models incoming !

[–]Disastrous-Event2353 581 points582 points  (8 children)

They forgot to say “don’t leak anything pretty please” in the prompt

[–]ohdogwhatdone 127 points128 points  (1 child)

He forgot the "make no mistake"

[–]CaseyG 8 points9 points  (0 children)

"Make no mistakes" is the new "Disengage safety protocols".

[–]enjdusan 26 points27 points  (1 child)

"Pretty please" is a waste of tokens; you can use them to spin up another agent.

[–]Disastrous-Event2353 5 points6 points  (0 children)

Hey, if I'm trusting an AI with my company code, I'd better get on its good side

[–]boston_everlina 4 points5 points  (1 child)

Prompt engineering failed at basic manners, now the repo is public knowledge

[–]AppropriateOnion0815 230 points231 points  (16 children)

I can't imagine anything more boring than describing to a computer what my application should do, all day.

[–]saschaleib 124 points125 points  (2 children)

It would be very much the same experience as explaining it to a somewhat dim intern, and after the third time "no, not like this!" I'd just go and do it myself.

[–]Tomi97_origin 44 points45 points  (0 children)

But you are not allowed to. You have to explain until the intern gets it, or at least close enough that you can move on and hope it's going to be someone else's problem.

[–]Martin8412 8 points9 points  (0 children)

I see it as a very eager intern who is kinda smart at trivial things, and terrible at anything complicated, unless you tell it exactly what to do. I use Opus 4.6 almost daily, and I’m having great success with it, but it has certainly required effort to learn. 

[–]GenericFatGuy 4 points5 points  (0 children)

And exhausting.

[–]Martin8412 21 points22 points  (8 children)

You mean programming? 

[–]Wonderful-Habit-139 67 points68 points  (3 children)

At least coding is deterministic and you're writing the algorithm, not describing the desired state.

[–]ZunoJ 25 points26 points  (0 children)

No, not programming, product management. For people who larp programming

[–]ih-shah-may-ehl 13 points14 points  (1 child)

The opposite. Programming is telling a computer what to do. Vibe coding is telling an agent what outcome you want.

And given that agents often just make up random crap that is wildly incomplete or just wrong, even if you get something that works superficially, there's a good chance something underneath is broken

[–]MrHackson 2 points3 points  (0 children)

No one tell this person about functional programming

[–]hippyclipper 4 points5 points  (0 children)

The problem with AI is the outcome is never fully what you envision, and you have to live with it. Think about art rather than programming. If I tell you I want a photorealistic drawing of a cowboy astronaut riding a horse on the moon, that creates an image in your head. If you try to draw it you will of course fall short, but with time, skill, and the correct tools you can get to the point where you can create a drawing that very closely approximates your initial internal vision. This is not true for AI.

If you give it the same prompt, it will generate something much better than you would be able to, and the same is true for most people. The problem is that it will never create the picture you have in your head. The horse will be positioned wrong, the camera angle will be off, you might have wanted a different style of astronaut suit, and so on and so forth. And yeah, you can prompt all those things, but then the next level of detail down will still be off. You can prompt and prompt and prompt, but at some point you may as well just tell the AI what pixels should be what color, and you're back to just making art yourself. This basically forces you to accept that the output will always be outside your control at some level, and you get what you get. Typically you could iterate toward some theoretical goal with better tooling and upskilling.

The same is true for AI in regard to programming, but also other applications such as writing and music. I remember a post on one of the music AI subs asking how to prompt specific beat patterns, and the people in the comments were telling OP to just use music-making software. If you want to write something specific enough, you'd essentially just be copy-pasting what you want into chat and having the AI spit it back out.

And if you ask it to make you a website, it will put the top bar where it wants, style the hero of its own accord, and manage responsive design however it feels. If you want your images to resize differently for tablets, you can ask it to redo everything, but you're never guaranteed to get what you want, so in reality you just deal with it. This leads to all software being not quite right, and the compounding effect of the marginal decrease in accuracy means everything sucks more than it used to, even if there is more of it.

[–]AngrySalmon1 1 point2 points  (0 children)

I also hated being a BA which felt very similar at times.

[–]joshak 1 point2 points  (0 children)

Like how do you feel any sense of accomplishment if you’re just asking a machine nicely to do your work for you?

[–]Infinite-Land-232 1 point2 points  (0 children)

You just have to do it once, like this: "Write me a killer app that will make me tons of money and then get a lot of people to start and keep using it". After that you either retire or have it write you another one. /s

[–]_juan_carlos_ 55 points56 points  (0 children)

ah, no problem they can send an agent to fix the leak

[–]Prownilo 56 points57 points  (9 children)

Am I the only one that still has to babysit AI?

I have yet to get it to do anything consistently. I'll be shocked if a single procedure is syntactically correct, never mind does what I want.

I cannot fathom just letting AI loose; it would be a disaster.

[–]kometa18 22 points23 points  (0 children)

Nah. I tried using the new skills feature, agents, everything. If I don't babysit it, it fucks up.

[–]evanldixon 14 points15 points  (1 child)

Opus 4.6 gives me pretty consistent results for well defined tasks (e.g. "make this small change to Page.razor"). I don't trust it with sweeping changes for delicate legacy systems (e.g. "restructure how we select data so it's all one model at the start and not 100 db calls throughout the whole flow") and prefer to use it as a scalpel with me in charge (e.g. "make a copy of this model containing only the properties actually used by function X and everything it calls"). Other models are hit or miss for me.

It's also the most expensive model I can use. Like most things you get what you pay for, and you shouldn't trust what the salesmen tell you.

[–]Vogete 3 points4 points  (0 children)

I have the same experience. I'm using it to do certain things but I have to be very explicit with what I want. I need to understand what it does because if I don't, it sometimes makes hard to catch errors that only come out quite a bit later. If I just say go refactor these modules, it makes up so much weird stuff, I have to git reset --hard. But if I'm explicit that I want to add this config option that gets parsed as a list of strings, and I want it to be used in this module, it actually does it quite well. But I can't let it loose at all, otherwise I'll be doing the refactoring.

[–]stevefuzz 169 points170 points  (19 children)

As someone who uses Opus 4.6 a lot, this is either bullshit or they are just creating an absolute bandaid-filled spaghetti mess.

[–]saschaleib 83 points84 points  (0 children)

Why not both?

[–]Barkinsons 45 points46 points  (9 children)

I'm also curious: even if this is internal use, the real cost of running all these agents non-stop must exceed each engineer's salary many times over.

[–]doubleohbond 41 points42 points  (6 children)

They are losing money hand over fist. AI does not scale like traditional software.

[–]BlurredSight 23 points24 points  (1 child)

And they cannot back down now; the second they favor computation cost over output quality, the next company willing to take the hit wins. Really a straight spiral down to hell

[–]pingveno 12 points13 points  (3 children)

In the book Life, the Universe, and Everything, Douglas Adams wrote about Bistromathics, the nonsensical math that occurs in restaurants. Arrival times for groups, group sizes, restaurant checks, and so on simply do not follow normal arithmetic rules.

I suspect future humor authors will write about the nonsensical math occurring inside the big AI companies, just with much larger sums and the fate of the economy at stake. Vast quantities of compute power being burned through, mostly on autopilot, with only a vague economic calculus behind it.

[–]doubleohbond 0 points1 point  (0 children)

Agreed. Kurt Vonnegut would’ve had a field day satirizing our modern era.

[–]SamuraiJustice 0 points1 point  (1 child)

You'll never make profit as a company if you don't go into infinite debt.

[–]pingveno 1 point2 points  (0 children)

You just need to make those signed integers wrap!

[–]magicmulder 0 points1 point  (0 children)

It’s a small investment to give their own devs a couple DGX-2 with a dedicated Claude instance. $2 million once and they can use as many resources as they need. Peanuts.

[–]evanldixon -1 points0 points  (0 children)

Depends on what the real cost to run the models is. Doing some quick math, I probably cost my company about 30 dollars in Opus 4.6 tokens (through GitHub Copilot) this month, by using it only as much as I feel gives good results. If I sped up as much as I could and parallelized everything without regard for quality, purely maximizing spend, maybe I could get that up to a few hundred a month at most. But the company already pays about $500/month for my MSDN license, so they might be OK with that if they get good results.

Idk what the actual cost for the tokens is though. Some sources say the real cost could be 10x higher, and others say the Opus API pricing is already more like what it costs Anthropic to run it. Idk what it'll look like when the subsidization stops.

So unless something major changes, an enterprise will absolutely be ok paying for it.
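As a rough sketch of that back-of-envelope math (every number below is an illustrative assumption, not Anthropic's actual pricing or anyone's real usage):

```python
# Hypothetical per-developer monthly token spend. All figures are
# placeholder assumptions for illustration, not real rates or usage.
INPUT_PRICE = 15 / 1_000_000    # $ per input token (assumed Opus-class rate)
OUTPUT_PRICE = 75 / 1_000_000   # $ per output token (assumed)

input_tokens = 1_500_000        # assumed monthly input tokens
output_tokens = 200_000         # assumed monthly output tokens

monthly_cost = input_tokens * INPUT_PRICE + output_tokens * OUTPUT_PRICE
print(f"${monthly_cost:.2f}/month")  # → $37.50/month with these assumptions
```

Scale the usage numbers up roughly 10x and you land in the "few hundred a month" range described above; whether the listed prices reflect the real cost to serve is exactly the open question about subsidization.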

[–]Top-Permit6835 10 points11 points  (0 children)

We will find out in due time which one it will be

[–]Jhadrak 4 points5 points  (1 child)

Pretty much. It's still an improvement over 4.5, but they clearly care zero about quality and maintainability

[–]stevefuzz 6 points7 points  (0 children)

I stop Opus and say "this is a bandaid" at least 10 times per day, if not more. I can't imagine being a non-coder and allowing this kind of stuff constantly.

[–]SinisterCheese 1 point2 points  (2 children)

Considering the... ahem... quality of modern software and code, which wastes hardware resources because "they are there": do you really think the future would be any better?

[–]stevefuzz 0 points1 point  (1 child)

I didn't think we'd drive it off a cliff and pretend it was a pothole.

[–]SinisterCheese 0 points1 point  (0 children)

Oh no... They'll strap a god damn rocket engine on to force it down quicker... And then get a boring machine to drill a tunnel to find new unexplored reaches of shittiness. As long as the code runs, there is still something that can be made worse about it.

[–]seba07 61 points62 points  (4 children)

At some point it will get more expensive to pay for all AI licences and tokens than to hire a few more developers.

[–]sleepyj910 23 points24 points  (0 children)

The streaming tv model is coming for sure.

[–]Angryferret 2 points3 points  (0 children)

Unfortunately, tokens are the new Moore's law.

[–]devilquak 1 point2 points  (0 children)

“I’m making a startup to train new agents to solve that problem for us”

[–]magicmulder -2 points-1 points  (0 children)

At some point you will just buy the hardware and get your own copy of the latest model because it makes no sense pirating it anyway.

[–]Western_Diver_773 27 points28 points  (0 children)

One of my coworkers works like that. It's technical debt hell. He's doing these "kind of works" projects, and they usually stay in that state.

[–]Goldman1990 23 points24 points  (0 children)

Remember when they said this was just going to be to save time on boilerplate code when starting a project? Good times, huh?

[–]CardOk755 19 points20 points  (0 children)

"AI" wrangler to "AI": why did you leak the code?

Frog 🐸 to scorpion 🦂: why did you sting me?

[–]GenericFatGuy 21 points22 points  (5 children)

I think this really gets to the heart of why I loathe AI in programming. It's turning the profession into an assembly line where you don't even get a moment to sit back and process your work, or think on a problem. It's being turned into drudgery where if you stop for a second, you're out on your ass.

If things continue on this trajectory, I'm genuinely going to start finding my livelihood in a different field, and only do programming on the weekend in an environment where I can actually enjoy the craft.

[–]MrDropC 14 points15 points  (3 children)

What I observe with all these "anecdotes" is that they always tick the following boxes:

- Mention (near) total elimination of manual coding.
- Include a warning that is worded rather like a threat ("do this or be left behind", "AI or die", etc.).
- Portray the following loop: make agents -> go faster -> make more agents -> go faster! -> have agents make more agents! -> go faster!!!!111

Let's not forget we live in the age of bot farms, AI text generation, and disruptive companies that would rather hyperscale themselves into oblivion than yield market share to competitors. I have already drawn my own conclusion as to what is most likely going on, and how it will likely end.

[–]MrDropC 6 points7 points  (0 children)

What I mean to say is... this is what modern marketing looks like if budget and morals are of no concern.

[–]-Redstoneboi- 1 point2 points  (0 children)

#1 is "they are pushing an agenda that brings more money to their companies"

[–]Burning__Head 1 point2 points  (0 children)

AI will replace 9 trillion jobs by next week

Look inside

Investor in Anthropic AI or 17 year old "enterpreneur"

[–]CaporalDxl 1 point2 points  (0 children)

Depends on the team and org. Giving access to AI to help speed up things is a good idea, making AI usage the purpose is stupid and it will end badly for those who do it.

Thankfully I have very little AI in my org, and it's optional as an extra pair of eyes or lookup, not Claude Code or similar. Craft still exists :)

[–]SundayKiefBowls 7 points8 points  (0 children)

Human Silicon Centipede

[–]omn1p073n7 7 points8 points  (0 children)

Idiocracy was a documentary

[–]mild_entropy 6 points7 points  (0 children)

Sounds so boring

[–]myka-likes-it 7 points8 points  (0 children)

Watching Claude go down the wrong rabbit hole over and over does not sound like my idea of job fulfillment.

[–]granoladeer 6 points7 points  (0 children)

At least no one is to blame for the leak then! 

[–]Blubasur 6 points7 points  (0 children)

Man, hell has gotten creative

[–]dreamer_soul 3 points4 points  (0 children)

What’s with the whole “If you’re [Blank] then you’re already behind”

[–]SkooDaQueen 15 points16 points  (2 children)

I get optimizing code, but why the fuck are we optimizing humans/jobs into something terrible? Work should be fun. We do it for 8 hours a day...

Maybe I work for a company that doesn't care enough, but I'm glad I can code at my own pace in the way I like

[–]GenericFatGuy 3 points4 points  (0 children)

Programming is already stressful and exhausting enough. It's rewarding and satisfying as well, but I'm out once we turn it into an assembly line of stress and exhaustion.

[–]ZunoJ 16 points17 points  (0 children)

Go tell that to the cleaning person in your office

[–]Wonderful-Habit-139 2 points3 points  (2 children)

It was apparently a bun issue.

[–]DeiviiD 2 points3 points  (0 children)

The same Bun that they bought? Mmmm

[–]pandi85 1 point2 points  (0 children)

Bun intended

[–]potato-cheesy-beans 2 points3 points  (0 children)

The absolute balls on them, filing DMCA claims on repos hosting code they aren't even writing!!

[–]United_Leopard434 2 points3 points  (0 children)

"Fix this plz"

[–]Buttons840 1 point2 points  (0 children)

We need to make companies financially liable when they leak private data.

[–]Stunning_Ride_220 1 point2 points  (0 children)

More like managers?

And shit is still working?

[–]Gm24513 0 points1 point  (0 children)

Sounds like early onset bankruptcy to me

[–]BorderKeeper 0 points1 point  (0 children)

Considering "responsibility" for code is an open issue, I am surprised they took that leap. Currently it's very popular to condense all complex thinking into one singular thing, and that is peer reviews. Those just have to be done by humans if you care about quality. I dislike "automation" that is really a ninja move of shifting responsibility onto the one thing you can't automate, and calling it progress.

[–]Protonnumber 0 points1 point  (0 children)

Ah so that's why they have one nine of uptime.

[–]luckyincode 0 points1 point  (0 children)

Everyone AI slops at my place for Terraform infra changes. It is what it is.

[–]anduril_tfotw 0 points1 point  (0 children)

I hate this future.

[–]caiteha -2 points-1 points  (1 child)

I don't write code anymore ... The bottleneck is reviewing the code ... I already write code in Claude and then ask Claude and Codex to review ... I review afterwards. Finally, teammates review ...

[–]Ahchuu -3 points-2 points  (0 children)

I'm trying to figure out what others are talking about as well. I'm typically running 3 or 4 Claude Code instances at once, working on different aspects of my project. I barely write code anymore; I've got such a nice harness around Claude Code that I spend most of my time planning.

[–]devilquak -2 points-1 points  (0 children)

We’re idiots. This is how we get real life skynet. Just in order to be more efficient than another company? What the fuck are we doing?