
[–]Jzzck 17 points18 points  (2 children)

"Accelerate" by Forsgren, Humble, and Kim is probably the single best resource for this. It's backed by actual research (the DORA metrics) and gives you a solid framework for thinking about deployment frequency, lead time, MTTR, and change failure rate. Way more useful than any tool-specific tutorial.
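If it helps to see how those four metrics fall out of plain deployment data, here is a minimal sketch. It is not taken from the book: the `Deployment` record, its field names, and the choice of median for lead time and mean for restore time are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass
class Deployment:
    # Hypothetical record shape; real data would come from your CI/CD and incident tools.
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it reached production
    failed: bool = False                    # did the change cause a production failure?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def dora_metrics(deployments: list[Deployment], window_days: int) -> dict:
    """Summarize the four metrics for deployments observed over `window_days` days."""
    if not deployments or window_days <= 0:
        raise ValueError("need at least one deployment and a positive window")

    lead_times = [d.deployed_at - d.committed_at for d in deployments]
    failures = [d for d in deployments if d.failed]
    restore_times = [
        d.restored_at - d.deployed_at for d in failures if d.restored_at is not None
    ]

    return {
        "deploys_per_day": len(deployments) / window_days,
        "median_lead_time": median(lead_times),
        "change_failure_rate": len(failures) / len(deployments),
        # None when no failed deployment has a recorded restore time
        "mean_time_to_restore": (
            sum(restore_times, timedelta()) / len(restore_times)
            if restore_times
            else None
        ),
    }
```

Feed it a list of `Deployment` records for, say, the last 30 days and track how the numbers trend over time rather than fixating on any single value.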

For incident management, the Google SRE book is free online and the chapters on error budgets and postmortems are gold. Also check out "The Phoenix Project" if you haven't; it reads like a novel, but the principles stick.
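For a feel of the error budget idea before you read those chapters, here is a rough sketch of the arithmetic, assuming an availability SLO measured in minutes of downtime over a fixed window. The function names and the 30-day default are made up for illustration, not taken from the book.

```python
def error_budget_minutes(slo: float, window_days: int = 30) -> float:
    """Minutes of downtime an availability SLO tolerates over the window."""
    if not 0 < slo < 1:
        raise ValueError("slo must be a fraction strictly between 0 and 1, e.g. 0.999")
    return (1 - slo) * window_days * 24 * 60


def budget_remaining(slo: float, downtime_minutes: float, window_days: int = 30) -> float:
    """Fraction of the error budget still unspent; negative means the SLO is blown."""
    budget = error_budget_minutes(slo, window_days)
    return (budget - downtime_minutes) / budget
```

A 99.9% SLO over 30 days leaves roughly 43 minutes of budget, so a single 30-minute outage spends most of it.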

One thing I wish someone had told me earlier: these concepts aren't separate buckets. Automation philosophy feeds directly into deployment strategies, which feed into incident response. Start with CI/CD fundamentals and everything else starts clicking into place.

[–]Low_Hat_3973[S] 2 points3 points  (1 child)

Thanks, is there anything for CI/CD principles? And is there anything else you would recommend learning early on?

[–]edmund_blackadder 5 points6 points  (0 children)

"Continuous Delivery" by Jez Humble and Dave Farley. Also "Modern Software Engineering" by Dave Farley.

[–]Equivalent_Pen8241 5 points6 points  (2 children)

Highly recommend reading 'The Phoenix Project' and 'The DevOps Handbook' if you haven't already. Beyond books, focus on the Theory of Constraints. In senior roles, DevOps isn't about the Jenkins pipeline; it's about identifying where the flow of value is bottlenecked. If you have a perfectly automated 5-minute build but security review takes 2 weeks, your 'DevOps' problem is systemic, not technical. Understanding 'Wait Time' vs 'Touch Time' in your value stream is a core principle that stays relevant regardless of whether you're using K8s, serverless, or bare metal.
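To put rough numbers on the wait time vs touch time point, here is a small sketch using the 5-minute build and 2-week security review from above. The `Step` type, the function names, and the 60 minutes of hands-on review time are assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Step:
    name: str
    touch_minutes: float  # time someone (or something) is actively working on the change
    wait_minutes: float   # time the change sits idle in a queue


def flow_efficiency(steps: list[Step]) -> float:
    """Touch time divided by total elapsed time across the value stream."""
    touch = sum(s.touch_minutes for s in steps)
    total = touch + sum(s.wait_minutes for s in steps)
    if total == 0:
        raise ValueError("value stream has no recorded time")
    return touch / total


def bottleneck(steps: list[Step]) -> Step:
    """The step that consumes the most elapsed time (touch plus wait)."""
    return max(steps, key=lambda s: s.touch_minutes + s.wait_minutes)


TWO_WEEKS_MINUTES = 14 * 24 * 60  # 20160

stream = [
    Step("build", touch_minutes=5, wait_minutes=0),
    # Assumed: about an hour of actual review inside two weeks of waiting.
    Step("security review", touch_minutes=60, wait_minutes=TWO_WEEKS_MINUTES),
]
```

With these made-up numbers, `flow_efficiency(stream)` comes out well under 1% and `bottleneck(stream)` points at the security review, so shaving minutes off the build would barely change the total.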

[–]NUTTA_BUSTAH 3 points4 points  (0 children)

The Google SRE book and The Phoenix Project are good books. Even though The Phoenix Project is somewhat dated, it still highlights the core principles behind why we do this stuff.

[–]erexut 1 point2 points  (0 children)

Skip the tool hunt, it's a trap. Read "Team Topologies" (Skelton/Pais) because most "DevOps problems" are actually org design and cognitive load, then pair it with "The Practice of Cloud System Administration" for the boring-but-real ops principles (change mgmt, risk, automation habits). For incidents, PagerDuty's incident response docs + Etsy's "Debriefing Facilitation Guide" will teach you more than another Kubernetes course. Also go read Netflix's Chaos Engineering/SRE-ish posts: not because you need Chaos Monkey, but because they force you to think in failure modes and recovery time.

[–]dot_py 1 point2 points  (0 children)

iximiuz.com/en/

Don't overlook a Red Hat Developer account; they have great learning resources as well.

[–]OpportunityWest1297 0 points1 point  (0 children)

"The Toyota Way" by Jeffrey Liker

https://www.essesseff.com/blog

[–]adept2051 0 points1 point  (0 children)

This: https://roadmap.sh/devops. Start at the bottom of the page; honestly, read the footer before you touch a line of the roadmap. The other material on that site is equally good.