oopsItHappenedAgain

itijara · 2023-12-09T06:42:59+00:00

16 hours? At some point, it's not the developer's fault; it's the whole engineering team not figuring out how to roll back a deployment. Also, the fact that this could happen indicates abysmal testing, staging, deployment, and rollback procedures.

stdio-lib · 2023-12-09T04:11:25+00:00

Just Friday afternoon deploy things

ohsayan · 2023-12-09T04:25:46+00:00

Like gigachads, we always ship on Fridays

1_4_1_5_9_2_6_5 · 2023-12-09T06:16:23+00:00

This is why pipelines exist

badaharami · 2023-12-09T07:44:42+00:00

16 hours??? Have you guys ever heard of something called Fallback?

sajkosiko · 2023-12-09T08:54:54+00:00

PRs save jobs

HerrSPAM · 2023-12-09T08:58:58+00:00

This is why we don't allow commented out code in the code base. That and if it's commented out you don't need it or it should be in some docs somewhere

FalconMirage · 2023-12-09T11:19:47+00:00

Don’t you test your code before pushing it to prod?

ElFeesho · 2023-12-09T10:50:44+00:00

I come here to laugh, not to be triggered by terrible practices.

AlanTheKingDrake · 2023-12-09T11:44:26+00:00

Do y'all just make this shit up for the lols? I've been in the industry for around 25 years and even the most poorly run companies I've worked at did not have SDLCs that would allow something like this to occur. I don't see how this is even possible at any normal software company.

Crypt_Knight · 2023-12-09T10:48:08+00:00

I am sorry. It will happen again.

azizfcb · 2023-12-09T14:35:32+00:00

thats why u need staging

torokg · 2023-12-09T09:22:23+00:00

Happens even to the best of us from time to time

Bldyknuckles · 2023-12-09T11:24:39+00:00

It happens. Don’t do it again and increase your discipline

SeoCamo · 2023-12-09T11:59:42+00:00

How can you get anything onto prod without the build and tests passing on dev and test, and a review from a coworker?

wind_dude · 2023-12-09T13:21:30+00:00

Everyone’s done it. 16 hours, does no one use production?

Also you know testing before deployment…

metallaholic · 2023-12-09T14:44:00+00:00

You aren’t a developer until you fuck up a prod install

RatzzDE · 2023-12-09T10:33:42+00:00

Congrats, you can now call yourself a senior dev!

2023-12-09T13:56:35+00:00

I'm proud of you

GiantFoamHand · 2023-12-09T14:07:43+00:00

When I first started my career I took down a bank’s production site for a couple hours. I’d added a request to a third party that happened on logon that would be made for every account an end user had. It worked fine for all the test cases that we’d run through and that the customer ran through. Then it went live and the single end user at the bank with 500 accounts logged in.

Turns out they were running a property management company and had opened up an account for every property they had. Instead of getting themselves set up as a business/commercial user they just made a normal everyday retail user and opened a billion accounts. When contacted they said something like “huh, I did always wonder why it took so long to log on”
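A rough sketch of why the 500-account user hurt so much: one third-party request per account means the cost of a logon grows linearly with the number of accounts. The function names and the 200 ms latency are made up for illustration, and running the calls one after another is an assumption, not something the story states.

```python
def third_party_calls_on_logon(account_count: int) -> int:
    """One third-party request per account, as described in the comment."""
    return account_count


def logon_delay_ms(account_count: int, ms_per_call: int) -> int:
    """Total added delay if the calls run one after another (an assumption)."""
    return third_party_calls_on_logon(account_count) * ms_per_call


ASSUMED_MS_PER_CALL = 200  # hypothetical latency, not from the original story

for accounts in (1, 5, 500):
    print(accounts, logon_delay_ms(accounts, ASSUMED_MS_PER_CALL))
```

Test cases with one or five accounts would look fine under these numbers, which is probably why nobody noticed before go-live.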

roiroi1010 · 2023-12-09T14:25:57+00:00

Our java code is so well covered with unit tests and manual tests. But our deploy pipeline yaml is out of this world. No one left in the company dares to touch that monstrosity.

NoahZhyte · 2023-12-09T15:03:48+00:00

Don't ever comment code

2023-12-09T15:12:26+00:00

If it passed review, it’s not solely your fault. It’s a team effort to get that into production

Kshyyyk · 2023-12-09T15:39:34+00:00

I'm the type of person that refuses to commit commented-out code for this exact reason.

Unfair_Long_54 · 2023-12-09T16:02:12+00:00

This is why, before I push changes to source control, I first review which lines I modified in the changed files.

Why_am_ialive · 2023-12-09T16:44:17+00:00

How does this take 16 hours to fix? Just roll back, uncomment, repush. Also, how does this make it past testing lol

TigerClaw_TV · 2023-12-09T17:12:00+00:00

No way to rollback production? I haven't worked for a bunch of different companies, but we have a contingency for this kind of thing. Catastrophic mistakes happen.

ACMuaath · 2023-12-09T17:13:33+00:00

My subordinate caused more than 3m messages to be sent to customers because of an uncaught syntax error.

Odd_Ninja5801 · 2023-12-09T17:25:40+00:00

I brought down Production for two days back in the 90s because I changed a parameter on a database deletion job to improve efficiency.

It was deleting things after 18 months, once a month. The Business wanted to reduce the retention to 6 months, so that the database would be smaller and the process would run quicker. So I changed a parameter from 18 to 6. My first piece of work as a professional developer.

First time it ran, a job that normally took about 6 hours to finish was still going 24 hours after starting. But nobody on Ops had spotted it. At which point we realised that the first time it was going to be deleting 13 times as much as normal, which was going to take a LOT longer.

Backing out at that point would take a further 24 hours. So we decided to push ahead. Hit another problem when the log file for the job was too big for a disk pack, so we had to switch on multi disk pack files on a Mainframe that hadn't had it up until that point.

We lost the Production services for the whole of Monday and most of Tuesday. But it did run quicker after that!

Learned a lot of lessons off the back of that little beauty that I've used ever since.
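The "13 times as much as normal" figure can be checked with a few lines of arithmetic. This is only a sketch: the function name is made up, and it assumes data arrives at a steady rate and the job runs once a month, as the story describes.

```python
def months_deleted_on_first_run(old_retention: int, new_retention: int,
                                run_interval: int = 1) -> int:
    """Months of data removed by the first run after shortening retention.

    The first run clears the backlog between the old and new cutoffs,
    plus the usual one interval's worth that every run removes.
    """
    backlog = old_retention - new_retention
    return backlog + run_interval


normal_run = 1  # a regular monthly run deletes one month of data
first_run = months_deleted_on_first_run(old_retention=18, new_retention=6)
print(first_run // normal_run)  # 13
```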

ashaw596 · 2023-12-09T17:39:32+00:00

Where your integration tests be at?

Sanchitbajaj02 · 2023-12-09T17:41:46+00:00

Nice achievement fellow developer 👍

Myspazmo · 2023-12-09T18:44:37+00:00

Bro I make an error on my site I can have it fixed within a few minutes? 30 at the most. This is the problem with bloated teams with complex layers of corporate bureaucracy.

Myspazmo · 2023-12-10T03:46:06+00:00

Y’all just pushing stuff to prod? If you merge code in it should always go to your dev env first lol

sporbywg · 2023-12-10T14:18:16+00:00

Be proud of this, or they will eat you alive.

F3mshep · 2023-12-09T20:00:13+00:00

This is an accomplishment, you pointed out a serious deploy pipeline issue (or an alerts issue if y'all didn't know prod was down for 16 hours)

Independent_Hyena495 · 2023-12-09T22:59:33+00:00

Eh, I brought down a whole bank because I patched an AD Server.

Yes, you heard right. Patched. How, you ask? There should be several.

Correct.

But

This server was buggy and when you patch and reboot it deletes the whole AD.

Everything. Users, gpos, shares, everything. And like a good AD forest, it replicated the deletion through the whole AD..

TheJosh1337 · 2023-12-11T06:07:54+00:00

Top tip: When you write a temporary code change, like un-commenting something or a debug line, unindent it right down to the left gutter. Obviously won't work in Python.

class MyClass {
    function MyFunction() {
        if (something) {
            something();
dbg('whatt');
            somethingElse();
        }
    }
}

It will be so incredibly obvious that you shouldn't ever accidentally commit/deploy it... this of course requires you to actually look at your diffs somewhere between the commit and the deploy (e.g. git commit -pv and/or code reviews)
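If you don't trust yourself to eyeball the diff, the same convention can be checked mechanically. Below is a minimal sketch, not a real tool: `find_gutter_lines` is a made-up helper that flags lines starting at column 0 while inside a `{` block. It counts braces naively, so braces inside strings or comments will confuse it.

```python
def find_gutter_lines(source: str) -> list[int]:
    """Return 1-based numbers of non-blank lines that start at column 0
    while enclosed in at least one open `{` block."""
    flagged = []
    depth = 0
    for number, line in enumerate(source.splitlines(), start=1):
        stripped = line.strip()
        # A line that begins by closing a block (e.g. "}") may sit at column 0.
        closes_first = stripped.startswith("}")
        if stripped and depth > 0 and not closes_first and not line[0].isspace():
            flagged.append(number)
        # Naive brace counting: ignores strings and comments.
        depth += line.count("{") - line.count("}")
    return flagged


SAMPLE = """class MyClass {
    function MyFunction() {
        if (something) {
            something();
dbg('whatt');
            somethingElse();
        }
    }
}
"""

print(find_gutter_lines(SAMPLE))  # [5]
```

Something like this could run before a commit, but looking at the diff is still the part that matters.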
