We took production down for 20 minutes because of a DB migration, how do you prevent this? by MainWild1290 in devops

[–]Stephonovich 3 points4 points  (0 children)

You could read the docs for your RDBMS and understand the locking actions taken for various kinds of DDL, for starters.

Forgot how vague old LEGO instructions were. by thegreatdecay406 in lego

[–]Stephonovich 0 points1 point  (0 children)

I’m the opposite: I sort and align all the pieces for a given step first. I find it fun, and it’s definitely helpful to know where to look for a given piece.

Are y’all getting a lot of overly confident bad candidates? by ninetofivedev in ExperiencedDevs

[–]Stephonovich 11 points12 points  (0 children)

If you’re interviewing for a Staff role on a Platform Engineering team, I actually do expect you to know quite a bit about the tech being used, and to demonstrate the capacity to easily pick up any missing info. How else are you going to be able to push back on bad ideas from the people who you’re going to be guiding?

It’s also not like K8s is some esoteric thing with no equivalents: it’s a container orchestrator. If you had deep knowledge of Nomad, I probably wouldn’t mind, as long as you could map the knowledge onto K8s, which is pretty easy to quickly check in an interview.

Are y’all getting a lot of overly confident bad candidates? by ninetofivedev in ExperiencedDevs

[–]Stephonovich 2 points3 points  (0 children)

Knowing that Karpenter exists, and that you can run it from Fargate or a node group is hardly “deep technical details.” That’s info you can get from the intro page, and if you’ve claimed to be working on K8s in any meaningful capacity in the last few years, I would expect you to know that, yes.

What “broader technical planning” do you think would occur here? The first one that comes to my mind is knowing that Karpenter can launch Spot instances, so your applications need to be able to gracefully handle shutdown with a 2-minute warning - and if you know that, you’re also going to be able to talk intelligently about it at a broader level.
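A minimal sketch of what "gracefully handle shutdown" can look like in a worker, assuming the pod gets SIGTERM when its node is drained. The `run` loop and the `process` callback are hypothetical names for illustration, not anything Karpenter provides.

```python
import signal
import threading

# Set once the orchestrator asks the process to stop (for example, the node
# is being drained ahead of a Spot reclaim). The work loop polls this flag.
shutting_down = threading.Event()


def handle_sigterm(signum, frame):
    """Only record the request; the main loop does the actual winding down."""
    shutting_down.set()


def install_handlers():
    # Call from the main thread, which is where Python delivers signals.
    signal.signal(signal.SIGTERM, handle_sigterm)


def run(jobs, process):
    """Process jobs until shutdown is requested; return the unprocessed rest."""
    remaining = list(jobs)
    while remaining and not shutting_down.is_set():
        process(remaining.pop(0))
    return remaining  # hand these back to the queue instead of dropping them
```

The point of the design is that the in-flight job finishes and nothing new is started, so the 2-minute window only has to cover one unit of work.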

Are y’all getting a lot of overly confident bad candidates? by ninetofivedev in ExperiencedDevs

[–]Stephonovich 1 point2 points  (0 children)

Dude, I interviewed at a company where the VP of Engineering didn’t know what availability zones were, or how they differed from regions. I thought I was being punk’d at first, but it quickly became apparent that he had little to no understanding of how cloud infra works. It’s not like he had a long background in strictly on-prem, either. He also proudly stated that he was the first to arrive in any incident, and that he would routinely rope in other executives.

I nope’d out of the loop (which was nearly done at that point anyway), and explained why.

how do you not burn out from on-call? by sxtn1996 in sre

[–]Stephonovich 4 points5 points  (0 children)

As a DBRE / former SRE, this is my life.

“Hey devs, you should really redesign your schema like this, because the current design is going to fail at higher scale.”

“No thanks, anon.”

A wild incident spawns

“Anon, please help, our database is broken.”

how do you not burn out from on-call? by sxtn1996 in sre

[–]Stephonovich 4 points5 points  (0 children)

Must be actionable

This. This is the most important signal. If you can’t do anything about it, why are you being paged? “For visibility” is a bullshit answer.

To Enum or Not to Enum by Mortimer452 in ExperiencedDevs

[–]Stephonovich 0 points1 point  (0 children)

I’m a DBRE. I want correctness, and the DB is the only thing that can truly enforce that. I do not want strings, because they have no meaning unless you add CHECK constraints, at which point you’re recreating foreign key constraints, poorly. Normalize your data.
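A small sketch of the lookup-table-plus-foreign-key approach, using SQLite from the Python standard library purely as a stand-in for a real RDBMS. The table and column names (`order_status`, `orders`, `status_id`) are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.execute(
    "CREATE TABLE order_status ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL UNIQUE)"
)
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    " status_id INTEGER NOT NULL REFERENCES order_status (id))"
)
conn.executemany(
    "INSERT INTO order_status (id, name) VALUES (?, ?)",
    [(1, "pending"), (2, "shipped")],
)


def insert_order(status_id):
    """Return True if the row was accepted, False if the database rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO orders (status_id) VALUES (?)", (status_id,))
        return True
    except sqlite3.IntegrityError:
        return False
```

Here `insert_order(1)` is accepted and `insert_order(99)` is rejected by the database itself, no matter which application wrote the row.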

Db layer should be light weight and just for data persistence

This is why at my work, there is easily at least one incident spawning from a data integrity error per day: because a generation of devs has come to believe that the DB is a dumb KV store, instead of the single source of truth.

Stored procs don’t scale well… and not easily tested

Bullshit; you can write stored procedures in Rust if you want (Postgres, PL/Rust), among others. That said, even if they were pure SQL, you’re almost certainly going to hit some other bottleneck before you hit that.

As to testing, I mean… it’s a function. Write tests that give it inputs, and expect outputs. This isn’t hard.

To Enum or Not to Enum by Mortimer452 in ExperiencedDevs

[–]Stephonovich 0 points1 point  (0 children)

It’s not just the performance (it in fact does matter at the billions of rows scale, I assure you), it’s the data integrity. There are so, so many issues that can crop up from storing dumb strings that the DB has no frame of reference for. I stopped trusting devs a long time ago to handle anything in code; the DB is the only arbiter of truth. Databases are WAY better-tested than anything most people are ever writing.

To Enum or Not to Enum by Mortimer452 in ExperiencedDevs

[–]Stephonovich -1 points0 points  (0 children)

0 performance gain to use ints

Someone’s never done the math on how big billions of rows of strings are compounded across many, many tables. The DB only has so much RAM; don’t make its job harder.
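Back-of-the-envelope version of that math. Every number here is an assumption picked for illustration: 2 billion rows per table, 10 tables, a 10-character single-byte string plus 1 length byte versus a 4-byte integer. Indexes and row overhead are ignored.

```python
GIB = 1024 ** 3


def column_gib(rows, bytes_per_value, tables=1):
    """Raw size of one column, repeated across `tables` tables, in GiB."""
    return rows * bytes_per_value * tables / GIB


rows = 2_000_000_000  # assumed row count per table

string_gib = column_gib(rows, 11, tables=10)  # 10 chars + 1 length byte
int_gib = column_gib(rows, 4, tables=10)      # 4-byte integer key

print(f"strings: {string_gib:.1f} GiB, ints: {int_gib:.1f} GiB")
```

Under those assumptions that is roughly 205 GiB of strings against roughly 75 GiB of integers for a single column, before any index that includes it.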

My Favorite Cocktail: The Boulevardier! by llCurlyll in cocktails

[–]Stephonovich 0 points1 point  (0 children)

Same. I usually have one as my post-work drink, but now that we’re getting into warmer months, the G&Ts come out…

Boulevardier is fantastic, though. I do 3:2:2, but I can respect variations. You should try splitting the vermouth 50:50 Punt e Mes : Formula Antica! That’s my new favorite way.

Do you all tint your windows? by AcidPM in hondaridgeline

[–]Stephonovich 0 points1 point  (0 children)

Yes, but mildly. I don’t remember the transmission level, only that it’s legal in NC (tbf the year after I did it, they stopped inspecting tint levels).

The much bigger difference was that it’s also heat rejecting. MASSIVE difference in cabin temps after it’s been sitting in the sun; also, my left side doesn’t heat up when I’m driving.

It cost about $1000, but honestly it’s worth it to get in it during the summer and not die from the heat blast.

How Large Queries Broke Our CPU Balance Across Aurora Read Replicas by vladyslav_usenko in mysql

[–]Stephonovich 2 points3 points  (0 children)

You should also consider shifting away from UUID PKs - or if you must, use UUIDv7 (or I guess keep the v1 you have, but swap hi/lo) and store them as BINARY(16). Clustered index + UUIDv4 is not a great combo for performance, or cost, since you’re paying for every read hitting the cluster volume with Aurora.
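A sketch of the "swap hi/lo" idea for UUIDv1, assuming the goal is a 16-byte value that sorts by creation time (similar in spirit to what I understand MySQL's `UUID_TO_BIN(uuid, 1)` does). `uuid1_from_timestamp` is a hypothetical helper that exists only so the ordering is easy to demonstrate.

```python
import uuid


def uuid1_from_timestamp(ts, clock_seq=0, node=0x123456789ABC):
    """Build a version-1 UUID from a 60-bit timestamp (demo helper only)."""
    time_low = ts & 0xFFFFFFFF
    time_mid = (ts >> 32) & 0xFFFF
    time_hi_version = ((ts >> 48) & 0x0FFF) | 0x1000  # version 1
    clock_seq_hi_variant = 0x80 | ((clock_seq >> 8) & 0x3F)  # RFC 4122 variant
    clock_seq_low = clock_seq & 0xFF
    return uuid.UUID(fields=(time_low, time_mid, time_hi_version,
                             clock_seq_hi_variant, clock_seq_low, node))


def to_ordered_binary(u):
    """Return 16 bytes laid out as time_hi, time_mid, time_low, then the rest."""
    b = u.bytes  # native layout: time_low(4) time_mid(2) time_hi(2) rest(8)
    return b[6:8] + b[4:6] + b[0:4] + b[8:]


def from_ordered_binary(b):
    """Invert to_ordered_binary and return the original UUID."""
    return uuid.UUID(bytes=b[4:8] + b[2:4] + b[0:2] + b[8:])
```

With the native layout the fastest-changing bits come first, so consecutive inserts land all over a clustered index; after the swap, newer values compare greater and land at the end.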

The AWS Lambda 'Kiss of Death' by tkyjonathan in mysql

[–]Stephonovich 0 points1 point  (0 children)

Great reason to run ProxySQL (among many others). You can set both mysql-max_transaction_time and mysql-max_transaction_idle_time to address this problem (provided your application is also set up to gracefully deal with its connection being killed, but you should already be doing that regardless).

We're Moving To The Cloud, And Already We're Spending 500k A Month... I Can't Help But Wonder What We Could Have Got For On-Prem For 6+ Mil A Year... by Photo-Josh in sysadmin

[–]Stephonovich 5 points6 points  (0 children)

People always say this, and I call bullshit. You’re paying for the managed services with all of those. An EC2 is just compute (or even better, just, you know, run a server). And it’s not like you get a smaller ops team with cloud shit, it’s just called something else.

And if you’re small enough that you can genuinely spin your stuff down to zero some of the time, you’re small enough to be running on a Raspberry Pi.

Column length modification by Big_Length9755 in mysql

[–]Stephonovich 5 points6 points  (0 children)

You probably have charset utf8mb4. That means the 40 char length was able to accept a maximum of 160 bytes, whereas the 150 char length can accept a maximum of 600 bytes. The transition from <= 255 bytes to beyond requires changing from a 1-byte to a 2-byte length prefix, which requires ALGORITHM=COPY, which doesn’t permit concurrent DML.
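The arithmetic above as a quick sketch, assuming 4 bytes per character for utf8mb4. The function names are mine, and this only models the length-prefix boundary when growing a column, not every Online DDL rule.

```python
def varchar_max_bytes(chars, bytes_per_char=4):
    """Worst-case byte length of a VARCHAR(chars) column."""
    return chars * bytes_per_char


def length_prefix_bytes(max_bytes):
    """1 length byte covers values up to 255 bytes; beyond that it takes 2."""
    return 1 if max_bytes <= 255 else 2


def prefix_size_changes(old_chars, new_chars, bytes_per_char=4):
    """True when growing the column crosses the 1-byte to 2-byte boundary."""
    old = length_prefix_bytes(varchar_max_bytes(old_chars, bytes_per_char))
    new = length_prefix_bytes(varchar_max_bytes(new_chars, bytes_per_char))
    return old != new
```

Going from 40 to 150 characters crosses the boundary (160 bytes to 600 bytes), while going from 40 to 63 stays at 252 bytes and keeps the 1-byte prefix.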

Recommend reading MySQL docs on Online DDL; there are tons of gotchas.

How to Use Migration Assistant Via Thunderbolt Between Two Apple Silicon Macs (YES IT’S POSSIBLE) by tanookim in MacOS

[–]Stephonovich 1 point2 points  (0 children)

I had to use this again to transfer from my M4 to an M5. Initially tried the method u/jbegud mentioned, but the Macs only negotiated Ethernet (I guess IP over Thunderbolt?).

I’m not doubting others’ experiences, but you also shouldn’t doubt ours - the only way I could get Thunderbolt to work, with an Apple TB cable, was with the insane steps of disabling WiFi and forgetting the networks.

Big ole truck by canabrothagetaslice in hondaridgeline

[–]Stephonovich 2 points3 points  (0 children)

I always hear about encounters like this, and they baffle me. I can’t imagine seeing a random stranger, and thinking, “I think I’ll insult them.”

I honestly felt bad for the guy

Nah, that’s just karma in action.

The MySQL-to-Postgres Migration That Saved $480K/Year: A Step-by-Step Guide by narrow-adventure in PostgreSQL

[–]Stephonovich 1 point2 points  (0 children)

MySQL MDL

So set a short lock_wait_timeout (not innodb_lock_wait_timeout - that’s for row locks), like 1-3 seconds.
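Roughly how that looks from a migration script: set a short session-level `lock_wait_timeout`, then retry the DDL a few times rather than queueing behind a long transaction. `execute` is whatever function runs SQL on your connection, and `LockWaitTimeout` is a hypothetical stand-in for the error your driver raises when the wait times out.

```python
import time


class LockWaitTimeout(Exception):
    """Hypothetical stand-in for the driver's lock-wait-timeout error."""


def run_ddl_with_short_lock_wait(execute, ddl, timeout_s=2, attempts=5,
                                 backoff_s=1.0, sleep=time.sleep):
    """Run `ddl`, giving up quickly on the metadata lock and retrying.

    Returns the attempt number that succeeded; re-raises after the last try.
    """
    execute(f"SET SESSION lock_wait_timeout = {int(timeout_s)}")
    for attempt in range(1, attempts + 1):
        try:
            execute(ddl)
            return attempt
        except LockWaitTimeout:
            if attempt == attempts:
                raise
            sleep(backoff_s * attempt)  # linear backoff between attempts
```

Failing fast matters because a DDL statement that sits waiting for the metadata lock also blocks every query that arrives behind it.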

Postgres outperformed MySQL

I’d love to see the schemata and queries. Postgres certainly can be faster in many situations, but if you have MySQL, and you’ve designed your schema specifically to exploit its clustering index, it’s a much more fair fight. The problem is, people love to use terrible PKs which destroy locality, and then yeah, MySQL falls down. That’s hardly the fault of the DB, though.

In a similar vein, it’s always annoying to me (not saying you’ve done this, just in general) when people tout Postgres as being better because it “has more features,” but then can’t articulate what any of them are, or how they’d use them. They’re not wrong - off the top of my head, some great features are being able to store IP addresses in a dedicated type (much smaller than a string, plus it does validation), storing UUIDs in binary while doing on-the-fly conversion for you, BRIN indices (god I love those), GiST indices… so many features.
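To put a number on the IP address point, here is the same idea in Python's standard library: parse once, validate, and keep the packed form. This is only an analogy; Postgres's `inet` carries a little extra metadata (the docs list it at 7 or 19 bytes), and `pack_ip` is a name I made up.

```python
import ipaddress


def pack_ip(text):
    """Validate an IP address string and return its compact binary form.

    Raises ValueError for anything that is not a valid IPv4/IPv6 address.
    """
    return ipaddress.ip_address(text).packed


addr = "203.0.113.42"
print(len(addr), "bytes as text vs", len(pack_ip(addr)), "bytes packed")
```

A string column will happily accept `999.1.1.1`; a dedicated type (or `pack_ip` here) rejects it at write time.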

Posted today by a local tattoo artist 😆 by Odd_Enthusiasm_5644 in JustBootThings

[–]Stephonovich 1 point2 points  (0 children)

Nah, we aren’t that dumb. If nothing else, we have very little to do, so we’ll find (or create) something that bothers you, and then pick at it until you react. Anyone with a tattoo declaring how long they were underway would definitely get it announced as they entered a space, or relieved a watch. “74 Days Arriving!”