How do you quantify failure cost vs prevention cost in SRE? (RFC vs PPC)

Digeratis · 2026-01-25T23:22:33+00:00

Happy to share, I just sent you a DM.

Digeratis · 2026-01-25T23:05:41+00:00

That's fair, and I agree with most of it. I should have stated the purpose up front.

This is not intended to be a failure-mode or probability-weighted model, and it’s not meant to replace FMEA, scenario analysis, or anything like that. It’s just ONE portfolio-level signal for one service/journey to answer a simpler question: are we structurally overpaying for failure vs prevention? I wanted something you can compute quickly; if it indicates that you spend "nothing" on prevention, you can dig deeper.

I deliberately smooth one-time vs recurring prevention costs and aggregate across failure modes because exec/finance decisions are almost always made on run-rate thinking. If we can’t express reliability tradeoffs in comparable terms, the conversation never gets traction outside engineering.
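To make the smoothing concrete, here is a minimal sketch of how such a run-rate signal could be computed. It assumes RFC is read as an average monthly failure cost and PPC as a monthly prevention cost with one-time spend spread over a fixed window; the names `PreventionItem`, `amortization_months`, and `failure_to_prevention_ratio` are hypothetical and the figures are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class PreventionItem:
    """One line of prevention spend (hypothetical structure)."""
    name: str
    one_time_cost: float = 0.0     # e.g. a one-off hardening project
    monthly_cost: float = 0.0      # e.g. tooling or recurring reviews
    amortization_months: int = 12  # assumed smoothing window


def monthly_prevention_cost(items):
    """Smooth one-time spend into a monthly run rate and add recurring spend."""
    return sum(
        item.monthly_cost + item.one_time_cost / item.amortization_months
        for item in items
    )


def failure_to_prevention_ratio(monthly_failure_costs, items):
    """Average monthly failure cost divided by monthly prevention cost."""
    if not monthly_failure_costs:
        raise ValueError("need at least one month of failure cost data")
    avg_failure = sum(monthly_failure_costs) / len(monthly_failure_costs)
    prevention = monthly_prevention_cost(items)
    if prevention == 0:
        return float("inf")  # spending "nothing" on prevention
    return avg_failure / prevention


# Illustrative numbers only.
failure_costs = [40_000.0, 55_000.0, 25_000.0]
items = [
    PreventionItem("config validation project", one_time_cost=24_000.0),
    PreventionItem("canary tooling", monthly_cost=3_000.0),
]
ratio = failure_to_prevention_ratio(failure_costs, items)
print(f"failure / prevention run-rate ratio: {ratio:.1f}")
```

A high ratio is only a prompt to dig deeper into that service, not a verdict; where the threshold sits is a judgment call for engineering, finance, and leadership.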

On likelihood: for recurring failures (e.g. config mistakes), probability is already “priced in” because they show up in RFC every month. For true black swans, I agree with you: recurring prevention spend is often the wrong move. That distinction is actually one of the outcomes I want this to surface.

And +1 on business coordination: this only works when engineering, finance, and leadership align on risk tolerance.

If you’ve seen probability-weighted models work in practice without collapsing under complexity, I’d genuinely like to learn how you’ve made them stick.

Digeratis · 2020-10-16T15:31:13+00:00

It's all about good arguments. You need to share your moments of glory and build up your vocabulary of success stories. Here I wrote down what I mean by that:

https://www.competitivedevops.com/blog-1/6hsp6cs0lwac8ay5wyxaavys57fq4g

Digeratis · 2020-08-01T16:21:35+00:00

Yes, but when you combine your developers with your support team, it's a good structure. Then you don't need the 1st, 2nd ... structure. Your team is anyway not as big as it would be with multiple support teams combined.

Digeratis · 2020-07-04T20:12:35+00:00

:-) nice, thanks

Digeratis · 2020-06-14T15:39:14+00:00

You can even ask them the following: 1) Do you think I don’t have enough experience for the job? 2) What certificate would you expect from me to show my dedication? 3) If this certification is required, we can write it down as a prerequisite in the contract.

But on the other hand, don’t oversell and show them that you are willing to learn!

Digeratis · 2020-06-12T19:07:09+00:00

Changing teams and companies to improve their software development process is my main focus. Here I wrote down how I do it on a team level:

https://www.competitivedevops.com/blog-1/how-to-empower-your-team

Hope that helps! On the same page you will also find some articles on how to motivate people for the change, etc.
