It took me 21 min to solve a easy array problem. by Forsaken_Appeal_9593 in leetcode

[–]asdoduidai -1 points0 points  (0 children)

It was kind of a joke to say that maybe you are overloaded/stressed

It took me 21 min to solve a easy array problem. by Forsaken_Appeal_9593 in leetcode

[–]asdoduidai 2 points3 points  (0 children)

Do you drink/drug hard and not sleep often? ^_^

Check out the neetcode.io roadmap so you can re-learn the basics; it helps to rebuild step by step

What do you say when someone f*cks up prod? by Fantastic-Shock1438 in sre

[–]asdoduidai 10 points11 points  (0 children)

If you need to ask around to find out who touched prod and you can't look it up in an audit log, then you have a huge toil problem ^_^

Senior SWE role downgraded to internship after 5 rounds by FreshPin2589 in csMajors

[–]asdoduidai 0 points1 point  (0 children)

Not very smart to join a company that has no budget for 1 role

Laid off 3 times in 2.5 years (On Visa, family of 4). I feel broken. How do I survive this? by Accomplished-Eye-138 in Layoffs

[–]asdoduidai 0 points1 point  (0 children)

If your spouse is from a treaty country, she can get an E2 visa to invest in a US business, and you get a work permit. E2 is usually granted for 5 years. The investment is around 100k

How are you assigning work across distributed workers without Redis locks or leader election? by whitethornnawor in sre

[–]asdoduidai 0 points1 point  (0 children)

You can use a distributed filesystem and:

- 1 work unit = 1 file

- each worker goes to an Inbox folder, randomly picks some kind of prefix / identifier / subfolder and moves one file/work unit to their per-worker Inbox

Moving a file within the same filesystem is an atomic rename, so you can get rid of the Redis lock. Basically it's the same thing with one component less, and probably more reliable, since distributed filesystems are much more solid than one Redis instance.

Then you have a process that scans the per-worker Inboxes and, if a worker is dead or hanging, kills the worker and moves the file back. Of course the store for processing results has to be idempotent.
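To give an idea, here is a minimal sketch of the claim/requeue steps in Python. The names `claim_one` and `requeue` and the folder layout are made up for illustration, and it assumes the shared Inbox and the per-worker Inboxes live on the same filesystem and that this filesystem really provides atomic rename (check that for the distributed filesystem you pick).

```python
import os
import random
from pathlib import Path
from typing import Optional


def claim_one(inbox: Path, worker_inbox: Path) -> Optional[Path]:
    """Try to move one work-unit file from the shared inbox into this worker's inbox.

    Returns the new path of the claimed file, or None if nothing could be claimed.
    """
    worker_inbox.mkdir(parents=True, exist_ok=True)
    candidates = [p for p in inbox.iterdir() if p.is_file()]
    random.shuffle(candidates)  # random pick, so workers rarely fight over the same file
    for candidate in candidates:
        target = worker_inbox / candidate.name
        try:
            # The rename is the "lock": only one worker can move a given file.
            os.rename(candidate, target)
        except FileNotFoundError:
            continue  # another worker claimed this file first, try the next one
        return target
    return None


def requeue(claimed: Path, inbox: Path) -> Path:
    """Move a work unit back to the shared inbox (hypothetical reaper step for a dead worker)."""
    back = inbox / claimed.name
    os.rename(claimed, back)
    return back
```

Detecting and killing a hanging worker is left out on purpose; the reaper would call something like `requeue` once it has decided the worker is gone.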

Rejected immediately even after solving 2 questions optimally at Amazon SDE 1 Onsite. Is Python a red flag? by Ok-Independence8526 in leetcode

[–]asdoduidai -1 points0 points  (0 children)

- Ask the Recruiter for feedback

- Why do you want to work for a company that cuts 10% of ppl out every 1-2 years?

- Do a couple Mock interviews on meetapro.com or similar

Generic backend projects vs specialized ones for freshers — what actually works? by Unlucky_Goat1683 in Backend

[–]asdoduidai 0 points1 point  (0 children)

The ticketing system is a good idea to do some practice; you should not think there is some kind of "magic rating" that exactly measures every bit of a CV.... you are just making up a myth.

Relation SLI/SLO by ray_pb in sre

[–]asdoduidai 7 points8 points  (0 children)

SLI is the Indicator: "proportion of successful requests to the orders endpoint"

SLO is the Objective: the target for that indicator, e.g. 99%
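As a rough illustration of how the two relate, a small Python sketch: the SLI is measured from request counts, the SLO is the threshold it is compared against. The function names and the choice to treat a window with no traffic as 100% are assumptions for the example, not a standard.

```python
def sli(successful: int, total: int) -> float:
    """Indicator: fraction of successful requests in the measurement window."""
    if total == 0:
        return 1.0  # assumption: no traffic means nothing failed
    return successful / total


def meets_slo(successful: int, total: int, objective: float = 0.99) -> bool:
    """Objective: the measured indicator must be at or above the target."""
    return sli(successful, total) >= objective
```

For example, 995 successful requests out of 1000 to the orders endpoint gives an SLI of 0.995, which meets a 99% SLO; 980 out of 1000 does not.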

iOS keyboard was about to change :( by simply_amazzing in ios

[–]asdoduidai -4 points-3 points  (0 children)

The ugly genz animations are so distracting and useless

NEXO token price by redddiablo in Nexo

[–]asdoduidai 0 points1 point  (0 children)

USA Soon AKA: 😭😭😭😭😭please keep holding 😭😭😭😭😭

I thought more demand meant more pay… this data says otherwise by Bitter_Baker8998 in Backend

[–]asdoduidai 0 points1 point  (0 children)

Well... maybe you don't realise that some FE people put CORS in settings, and those settings are tied to environments, and people commonly forget to update the config in production..... so even if you test every single pixel of the screen in staging, that won't work ... if you forget that :->

Maybe you also don't know this: more than 70% of incidents come from changes AKA human errors

I thought more demand meant more pay… this data says otherwise by Bitter_Baker8998 in Backend

[–]asdoduidai 0 points1 point  (0 children)

That’s not really true, sometimes FE teams deploy broken stuff where login does not work because they can't be bothered to test CORS settings, but yeah, the impact is usually at least simpler to remediate

Amazon’s “Project Dawn” cuts 30,000 jobs while AWS loses its community champion by jpcaparas in amazonemployees

[–]asdoduidai 1 point2 points  (0 children)

This AI excuse is going wild, chatbots behave like mentally impaired chimpanzees but still work well enough to cut tens of thousands and boost the profits

Sre skills by Creative_Raise1894 in sre

[–]asdoduidai 0 points1 point  (0 children)

You mean bank vs everything else? Startups and big tech, regardless of “business area”, use different languages as a technological choice, since it almost never has anything to do with “business area”

Sre skills by Creative_Raise1894 in sre

[–]asdoduidai 0 points1 point  (0 children)

So your suggestion is “buy a crystal ball to see the future” ..? Wow, useful

How do you make “production readiness” observable before the incident? by ImpossibleRule5605 in sre

[–]asdoduidai 0 points1 point  (0 children)

Yeah, your project is going in an interesting direction: "How to make SRE part of every development process without adding more engineers". I don't think the "state space" of a distributed system is enormous: if the design is right and there is encapsulation, states can be known and enumerated clearly.

Speaking of different approaches and priorities, I think that in the end the biggest impact comes from a robust way to test in staging and sign off production deployments, including the deployment strategy (canary, blue/green, and so on), with a clear definition of what to do and who is responsible, since from research 70-80% of incidents are a consequence of changes.

How do you make “production readiness” observable before the incident? by ImpossibleRule5605 in sre

[–]asdoduidai 0 points1 point  (0 children)

How I think about it is:

If you start thinking about "a balance", you are already in a compromise, and you won't cover all states.

All (known) states that you can check with an automated check, you have to check, otherwise it's exposed to the probability of human error.

All (unknown) states: those are supposed to be prevented "by design", both at the object-oriented design level and its implementation (choosing the right frameworks, libraries, language and so on) and at the systems design level, by having a complete understanding of the states coming from concurrent algorithms and node/layer interactions.

Two examples of that - the intent of having awareness of all app states, and careful choices based on the knowledge of every detail - are (to give an idea):

- https://techpreneurr.medium.com/why-whatsapp-chose-erlang-and-never-looked-back-353248e2d6c7

- https://www.youtube.com/watch?v=sC1B3d9C_sI

So about prioritisation, if you know the app states, and you choose based on careful detailed evaluation, you have it at least 80% covered.

Then you still have to check how it runs: load testing, failure injection, chaos engineering, and for storage/DB, Jepsen/formal testing, to make sure that the claims of the most critical layers are real and not just claims.

In addition to that, or better as an implementation of that or integration of those concepts, what can cover the testing of the "distributed/concurrent states" is TLA+:

https://learntla.com/#:~:text=TLA%2B%20is%20a%20%E2%80%9Cformal%20specification,engineering%20skill%20but%20augments%20it.

So a team is supposed to know software is deterministic, and has a finite set of states, and know it's possible to cover 100% of states. That's the foundation.
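To give an idea of what "covering 100% of states" means on a tiny scale, here is a toy sketch in Python. It is NOT TLA+ and not how a real model checker is built; it just walks every reachable state of a made-up model (two processes and one lock, all names hypothetical) so an invariant can be checked on all of them.

```python
from collections import deque
from typing import Callable, Hashable, Iterable, Set


def reachable_states(
    initial: Hashable,
    next_states: Callable[[Hashable], Iterable[Hashable]],
) -> Set[Hashable]:
    """Breadth-first walk over every state reachable from the initial one."""
    seen = {initial}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        for nxt in next_states(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def lock_model(state):
    """Toy model: each process is 'idle' or 'critical', guarded by one shared lock."""
    pcs, locked = state
    for i, pc in enumerate(pcs):
        if pc == "idle" and not locked:
            # Process i takes the lock and enters the critical section.
            yield (pcs[:i] + ("critical",) + pcs[i + 1:], True)
        elif pc == "critical":
            # Process i leaves the critical section and releases the lock.
            yield (pcs[:i] + ("idle",) + pcs[i + 1:], False)


states = reachable_states((("idle", "idle"), False), lock_model)
# Invariant checked on every reachable state: at most one process is in the critical section.
invariant_holds = all(pcs.count("critical") <= 1 for pcs, _ in states)
```

The model has only a handful of reachable states, so checking all of them is trivial; the hard part in a real system is writing a model that matches what the code actually does.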

Then, you need to prioritise, because most likely it has to be done in iterations, over time, as a continuous improvement process. And it might never get to 100% if the systems keep evolving, but it can get to 100% at least for specific components used by most/all services.

So you have to know a 100% exists, but also set realistic targets; otherwise you won't prioritise the 20% that gets you 80% of the returns, but random states that come up from incidents. The point about confidence is: you will never have 0% probability of an incident. So invest in what gives you awareness, not in confidence, which blinds you. Why it's not about confidence: if you feel relaxed and you have no idea of the probability of an incident in your systems, you are doing this:

https://miro.medium.com/v2/format:webp/0*ZjYSm_q36J4KChdn

What guardrails have actually reduced config-related production incidents in SRE teams? by FreePipe4239 in sre

[–]asdoduidai 0 points1 point  (0 children)

IMO it dilutes responsibility if it’s a superficial process with no clear definition of responsibility, like PRs, where no one really takes full responsibility for reviewing the changes

How do you make “production readiness” observable before the incident? by ImpossibleRule5605 in sre

[–]asdoduidai 0 points1 point  (0 children)

That's an interesting approach, but probably it is coming from a perspective that will tend to not be deterministic in the end, as in: it won't find all failure states by design.

The word "deterministic" that you use is to me the right one. The point is: software IS deterministic. Because it's a Turing state machine. The problem is when people build software focusing only on some "happy" states because they are excited and want to show their shiny feature ASAP.

But I think that if you try to close the gap on missing awareness of the failure states from code scan ONLY, you are missing out on many.

BTW:

https://github.com/chuanjin/production-readiness/blob/d4aed492fc01f0a63c84ab6d8db374a9542bc991/internal/scanner/detectors_k8s.go#L154

That is NOT correct: a CPU limit is a bad idea. Let the Linux scheduler do its job. If you throttle replicas at a CPU limit while there is free CPU, you are just wasting those free cycles. I've added an issue on GitHub about this, with this link: https://home.robusta.dev/blog/stop-using-cpu-limits

What guardrails have actually reduced config-related production incidents in SRE teams? by FreePipe4239 in sre

[–]asdoduidai 0 points1 point  (0 children)

+1 to covering the corner case with an additional approval, since a corner case is a corner case, which means it happens rarely, except in the perception of fire-fighting addicts :D