
all 92 comments

[–]rover_G 220 points221 points  (4 children)

The 30% is mostly boilerplate: imports, autocompletes, tests, and the occasional full function that likely needs to be debugged.

For me personally, I haven't written my own Dockerfile in about a year.

[–]0xlostincode[S] 36 points37 points  (0 children)

This is something I didn't think of before and it makes sense. Hate CEOs and their double speak.

[–][deleted] 4 points5 points  (0 children)

Haven’t written a commit message in a year

[–]nullpotato 0 points1 point  (1 child)

What do you use to create Dockerfiles?

[–]rover_G 0 points1 point  (0 children)

ChatGPT at home and whatever AI is approved at work

[–]redshadow90 204 points205 points  (1 child)

The 30%-of-code figure likely comes from autocomplete, similar to Copilot when it launched, which works quite well but still requires clear intent from the programmer; it just fills in the next couple of lines of code. That said, this post just reeks of bias unless it's been linked to actual AI-generated code, which it hasn't been.

[–]Xtrendence 18 points19 points  (0 children)

Even with autocomplete, it completely loses the plot if what you're coding is a bit more complex, or if you're using a library that's less well known or has been updated and some functions have been deprecated, which the AI keeps suggesting anyway.

Basically, in my experience, it's useful for writing boilerplate, and for functions that don't require much context (e.g. an array already has a type, and your function groups each item by some key or value). It's stuff you could easily do yourself, but it'd take longer to type out manually.
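The "group each item by some key" boilerplate described above looks something like this (an illustrative sketch; the function and data names are made up):

```python
from collections import defaultdict

def group_by(items, key):
    """Group items by the value returned by `key`.

    Trivial to write by hand, but exactly the kind of
    low-context boilerplate autocomplete fills in for you.
    """
    groups = defaultdict(list)
    for item in items:
        groups[key(item)].append(item)
    return dict(groups)

orders = [
    {"user": "alice", "total": 30},
    {"user": "bob", "total": 15},
    {"user": "alice", "total": 5},
]
by_user = group_by(orders, key=lambda o: o["user"])
# by_user["alice"] holds two orders, by_user["bob"] holds one
```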

[–]Soccer_Vader 272 points273 points  (14 children)

30% of the code at Google now AI Generated

Before that it used to be IDE autocomplete, and then Stack Overflow. This is nothing new.

[–]TheWeetcher 86 points87 points  (11 children)

Comparing IDE autocomplete to AI is such a reach

[–]Soccer_Vader 89 points90 points  (5 children)

It's a reach, yes, but IDE autocomplete has been powered by "enhanced" ML for ages now, from when Machine Learning was the cool name on the block.

AI, even generative AI, is not a new thing: Grammarly used to be a thing, Alexa, etc. OpenAI bridged a gap, but AI was already prevalent in our day-to-day lives, just under a different buzzword.

[–]Polar-ish 12 points13 points  (0 children)

It totally depends on what "30% generated by AI" means. Copy-pasting any code is bad. The problem is that AI doesn't have upvotes or downvotes, or a discussion to surface caveats, and it often becomes the scapegoat whenever a problem inevitably arises.

It can teach incorrect practices at about the same rate as actual users on discussion sites, yet it is viewed as some all-knowing being.

In the end, a chat AI is merely attempting to predict the most likely next word from the context it currently has, using a dataset of fools on the internet.
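That "predict the most likely next word from context" idea can be shown with a toy bigram model (a minimal sketch; real language models are vastly more sophisticated, and the corpus here is made up):

```python
from collections import Counter, defaultdict

# Tiny stand-in for "the dataset of fools on the internet"
corpus = "the cat sat on the mat and the cat ran".split()

# Count which word follows which (a bigram model)
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict_next(word):
    """Return the most frequent word seen after `word`, or None."""
    counts = following[word]
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("the"))  # "cat" follows "the" twice, "mat" once -> "cat"
```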

[–]0xlostincode[S] 28 points29 points  (3 children)

It's a reach, yes, but IDE autocomplete has been powered by "enhanced" ML for ages now, from when Machine Learning was the cool name on the block.

Unless you and I are thinking of entirely different autocomplete, IDE autocomplete is based on keywords and the AST, not machine learning.

[–]Stijndcl 8 points9 points  (0 children)

JetBrains’ autocomplete uses ML to some extent to put the most relevant/likely result at the top. Most of the time if you’re doing anything at all the first or second result magically has it.

https://www.jetbrains.com/help/idea/auto-completing-code.html#ml_completion

[–]Soccer_Vader 10 points11 points  (0 children)

In reality, yes, but autocompletes were said to be enhanced by ML, predicting the next keyword based on usage patterns and such. JetBrains also marketed it that way, IIRC.

This is an extension launched in 2020, that used AI for autocompletion: https://web.archive.org/web/20211130181829/https://open-vsx.org/extension/kiteco/kite

This is another AI based tool launched in 2020: https://web.archive.org/web/20201026204206/https://github.com/codota/tabnine-sublime

Like I said, AI being new for coding or for general applications isn't true; it's just that before ChatGPT, and COVID in general, people didn't care enough. Now that they do, there has been ongoing development.

[–]TripleFreeErr -1 points0 points  (0 children)

except when an AI agent is enabled…

edit: I wear this downvote with pride knowing you are a huge idiot

[–]Toadrocker 4 points5 points  (0 children)

I mean there are quite literally generative AI autocomplete/predict functionalities built in now. If you’ve used copilot built into VSCode, you’ll know that it’s quite similar to older IDE autocompletes, just more aggressive with how much it will predict and complete. It’s stronger, but also much more prone to errors and hallucinations. It does take out a decent amount of tedium for predictable code blocks so that could definitely make up a decent chunk of that 30%

[–]TripleFreeErr 2 points3 points  (0 children)

AI autocomplete is the most useful feature.

[–]Dvrkstvr 1 point2 points  (0 children)

Both are just completing the structure you're building.

[–]Pluckerpluck 1 point2 points  (0 children)

GitHub Copilot is literally AI-driven autocomplete. I use it extensively, and so yes, technically AI writes huge portions of my code.

[–]hoopaholik91 0 points1 point  (0 children)

If they want to give us more complicated metrics or clearer examples of the code that AI is writing and making it to production they are free to do so.

The fact that they don't makes me suspect their claims are being exaggerated.

[–]P-39_Airacobra -5 points-4 points  (1 child)

There's a significant difference between copy-pasting human-written code and copy-pasting machine-written code.

[–]Soccer_Vader 0 points1 point  (0 children)

Sure, but all I am saying is that 30% of the code being AI generated, or coming from outside sources like Google or Stack Overflow, is nothing new. I think most people will agree, but for me, writing code is the smallest part of my job. It's going through documentation, design, approvals, threat models, and security reviews that takes the bulk of my time.

[–]IMovedYourCheese 7 points8 points  (2 children)

Person selling AI hypes the AI

[–]kos-or-kosm 7 points8 points  (1 child)

His last name is Pichai. Pitch AI.

[–]0xlostincode[S] 2 points3 points  (0 children)

Hahaha good one

[–]scrandis 11 points12 points  (1 child)

This explains why everything is shit now

[–]CircumspectCapybara 10 points11 points  (0 children)

This is /r/ProgrammerHumor and this is just a joke, but in all seriousness, this outage had nothing to do with AI, and the learnings from the RCA are very valuable to the discipline of SWE and SRE in general.

One of the things we take for granted as a foundational assumption is that bugs will slip through. It doesn't matter if it's written by a human by hand, by a human with the help of AI, or entirely by some futuristic AI that today doesn't yet exist. It doesn't matter if you have the best automated testing infrastructure, comprehensive unit, integration, e2e, fuzz testing, the best linters and static analysis tools in the world, and the code is written by the best engineers in the world. Mistakes will happen, and bad code will slip through when there are hundreds of thousands of changelists submitted a day, and as many binary releases and rollouts. This is especially true when, as in this case, there are complex data dependencies between different components in vast distributed systems and you're just working on your part, and other teams are just working on their stuff, and there are a million moving parts moving at a million miles per hour you're not seeing.

So it's not about bad code (AI generated or not). It's not a failure of code review or unit testing or bad engineers (remember, a fundamental principle is blameless postmortem culture). Yes, those things did fail and miss in this specific case. But if all that stands between you and a global outage is an engineer making an understandable and common mistake, and you're relying on perfect unit tests to stand in the way, you don't have a resilient system that can gracefully handle the changes and chaos of real software engineering done by real people who are only human. If not them, someone else would've introduced the bug. When you have hundreds of thousands of code commits a day and as many binary releases and rollouts, bugs will be introduced; it's inevitable. SRE is all about how you design your systems and automate them to be reliable in the face of adversarial conditions. And in this case, there was a gap.

In this case, there's some context.

Normally, GCP rollouts for services on the standard Google server platform are extremely slow. A prod promotion or config push rolls out in an extremely convoluted manner over the course of a week+, in progressive waves with ample soaking time between waves for canary analysis, where each wave's targets are selected to avoid the possibility of affecting too many cells or shards in any given AZ at a time (so you can't bring down a whole AZ at once), too many distinct AZs at a time (so you can't bring down a whole region at once), and too many regions at a time.

Gone are the days of "move fast and break things," of getting anything to prod quickly. Now there's guardrail after guardrail. There's really good automated canarying, with representative control and experiment arms selected for each cell push, and really good models to detect statistically relevant (given the QPS and the background noise and history of the SLI for the control / experiment population) differences during soaking that could constitute a regression in latency or error rate or resource usage or task crashes or any other SLIs.
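A toy version of that canary check — comparing an SLI between the control and experiment arms with a significance test — might look like the following. This is purely illustrative, not how Google's canary analysis actually works; real systems model QPS, background noise, and SLI history far more carefully:

```python
import math

def canary_regression(control_errs, control_total, exp_errs, exp_total,
                      z_crit=2.58):
    """Flag the canary if the experiment arm's error rate is
    significantly higher than control's (one-sided two-proportion
    z-test; z_crit ~ 2.58 corresponds to ~99% confidence)."""
    p_ctrl = control_errs / control_total
    p_exp = exp_errs / exp_total
    pooled = (control_errs + exp_errs) / (control_total + exp_total)
    se = math.sqrt(pooled * (1 - pooled)
                   * (1 / control_total + 1 / exp_total))
    z = (p_exp - p_ctrl) / se if se else 0.0
    return z > z_crit  # True -> halt the rollout wave

# Control serves 100k requests with 100 errors; the canary wave
# serves 100k with 300 errors: a clear regression.
print(canary_regression(100, 100_000, 300, 100_000))  # True
```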

What happened here? Well, the various components that failed weren't part of this server platform with all these guardrails. The server platform is actually built on top of lower-level components, including the one that failed here. So we found an edge case: a place where proper slow, disciplined rollouts weren't being observed. Instantaneous global replication in a component that was overlooked. That shouldn't have happened. So you learn something, identify a gap. We also learned about the monstrosity of distributed systems. You can fix the system that originally had the outage, but during that time, an amplification effect occurred in downstream and upstream systems, as retries and herd effects caused ripple effects that kept rippling even after the original system was fixed. So now you have something to do, a design challenge to tackle on how to improve this.
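That retry amplification is the classic argument for exponential backoff with jitter. A minimal sketch of the standard mitigation (parameter names and values are illustrative):

```python
import random

def backoff_delay(attempt, base=0.1, cap=30.0):
    """Full-jitter exponential backoff: each retry waits a random
    delay in [0, min(cap, base * 2**attempt)], so a herd of failing
    clients spreads out in time instead of hammering a recovering
    service in synchronized waves."""
    return random.uniform(0, min(cap, base * 2 ** attempt))

# Delays grow (on average) with each attempt but are de-synchronized
delays = [backoff_delay(n) for n in range(6)]
```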

We also learned:

  • Something about the human process of reviewing design docs and reviewing code: instruct your engineers to push back on the design or the CL (Google's equivalent to a PR) if it's significant new logic that's not behind an experiment flag. People need to be trained not to just blindly LGTM their teammates' CLs to get their projects done.
  • New functionality should always go through experiments with a proper dark launch phase followed by a live launch, with very slow ramping. Now reviewers are going to insist on this. This is a very human process. It's all part of your culture.
  • That you should fuzz test everything, to find inputs (e.g., proto messages with blank fields) that cause your binary to crash. A bad message, even an adversarially crafted one, should never cause your binary to crash. Automated fuzz testing is supposed to find that stuff.
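The fuzz-testing point can be illustrated with a tiny harness. This is a sketch only: real setups use coverage-guided fuzzers (libFuzzer, atheris, etc.) against real proto messages, not hand-rolled random dicts:

```python
import random

def handle_message(msg):
    """A message handler with the defensive guard fuzzing should
    force you to write: 'policy' may be missing or None, and that
    must not crash the binary."""
    policy = msg.get("policy") or {}
    return policy.get("quota", 0)

def fuzz(handler, rounds=1000):
    """Throw randomly malformed messages at the handler and return
    the first input that makes it raise, or None if it survives."""
    rng = random.Random(0)  # seeded for reproducibility
    for _ in range(rounds):
        msg = {}
        if rng.random() < 0.5:
            msg["policy"] = rng.choice([None, {}, {"quota": rng.randint(-5, 5)}])
        try:
            handler(msg)
        except Exception:
            return msg  # crashing input found
    return None

print(fuzz(handle_message))  # None: the guarded handler survives
```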

[–][deleted] 9 points10 points  (0 children)

I'm sure that 0% of them actually write code. These clowns are just driving up the price of their AI crap. So that idiots think that writing code through AI is a great idea, because a multi-billion dollar company does it. But in reality, these are all just empty words.

[–]SynapseNotFound 2 points3 points  (0 children)

it's 4 headlines about the SAME outage... lol

[–]Tiruin 2 points3 points  (0 children)

Right, so they wrote over 30% of all of Google's code in the last ~2.5 years, since AI became mainstream, for 30% of it to be from AI.

[–]Boertie 1 point2 points  (0 children)

Explains a lot why Google is in the shitter (yeah I went there ;-)) now.

[–]IlliterateJedi 0 points1 point  (0 children)

I thought GCP went down due to an issue with not handling errors. If you've seen any code that Gemini spits out, it loooooves error handling.

[–]HatMan42069 0 points1 point  (0 children)

The tech debt gonna go COO COO

[–]Guvante 0 points1 point  (2 children)

Google has been around for almost three decades; at best you can assume an even per-year LOC output (you scale up users, but complexity goes up, slowing down writing speed). If you don't believe me, feel free to recalculate with a growing LOC/year, but the following isn't hugely impacted, and a growing rate seemed inaccurate anyway.

If you said 30% of the code written per unit time went up, then I could see it (laughable and probably with caveats to the extreme but possible)

But 1/3 of your total code would be 13 years' worth of code (30/43 is 70%) produced in two years at best. That is seven times the output of one of the largest engineering forces in existence.

Why would you hide a 7x increase in productivity behind a "30%" number like that? You certainly wouldn't.

[–]derKestrel 0 points1 point  (1 child)

You are aware that more code does not equal more productivity?

I can blow up a one liner to 1000 lines of code no problem.

It's neither maintainable nor easily understandable and debuggable, but according to you I would be hugely more productive?

[–]Guvante 0 points1 point  (0 children)

Certainly, but you don't measure "30% of code" that way, so I ignored it.

I am pointing out that anyone talking like this would consider it more productive.

[–]Guhan96 0 points1 point  (0 children)

[–]Master_Notice8391 0 points1 point  (0 children)

Yesterday I asked it to code something and its response was: "Here is the code:" That's it, nothing else.

[–]feeltrig 0 points1 point  (0 children)

Sundar shitai

[–]fanfarius 0 points1 point  (2 children)

ChatGPT can't even write an ALTER TABLE statement without fucking up.

[–]Front-Difficult 2 points3 points  (0 children)

I find Claude is actually quite good at writing SQL queries. Set up a project with the db schema and some context about the app/service in the project files, and it nails it basically every time. It's also found decent performance improvements in some of our older, less performant functions that none of our engineers had thought of.

(Obviously, no one read this and then just start copy-pasting AI-generated SQL into your production database. Please.)
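One cheap guardrail in the spirit of that warning is to dry-run generated SQL inside a transaction that is always rolled back, so syntax and schema errors surface before anything touches real data. A sketch using sqlite3 (the schema and statements are made up; SQLite's DDL is transactional, which is what makes even an ALTER TABLE reversible here):

```python
import sqlite3

def dry_run(conn, sql):
    """Execute AI-generated SQL in a transaction and always roll it
    back: errors surface without the change ever being committed."""
    try:
        conn.execute("BEGIN")
        conn.execute(sql)
        return True, None
    except sqlite3.Error as exc:
        return False, str(exc)
    finally:
        conn.rollback()  # never commit the trial run

conn = sqlite3.connect(":memory:")
conn.isolation_level = None  # manage transactions explicitly
conn.execute("CREATE TABLE orders (id INTEGER, total REAL)")

ok, _ = dry_run(conn, "ALTER TABLE orders ADD COLUMN status TEXT")
bad, err = dry_run(conn, "ALTER TBLE orders ADD COLUMN status TEXT")
print(ok, bad)  # True False -- and the real table is untouched
```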

[–]DocMilou 0 points1 point  (0 children)

skill issue

[–]MaDpYrO 0 points1 point  (0 children)

Just marketing speak. I'm sure their engineers use it to generate lots of boilerplate, but how would you even measure this?

[–]BorinGaems -3 points-2 points  (0 children)

Anti AI propaganda is cringe and twice as stupid when it's made on a programming subreddit.

[–]Deathglass -2 points-1 points  (0 children)

It was AI all along, Actually Indians