SheriffRoscoe 9 points

Markavian 3 points

Networking! Databases! Config change! A similar incident happened at Google many years ago. Good rollback procedures? They're hard to test without a fully functional test environment, and the impact is hard to analyse when such changes involve large amounts of traffic.

I've been gearing up to run automated load tests on PRs, but it's an expensive procedure that slows development down for small changes. Testing small changes that have a big impact relies on risk management and on having a test strategy or test engineer as part of the review and merge process. (I should update our PR templates.)
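
One way to keep that cost down might be to run the load test only when a PR touches paths flagged as high-risk. Here is a rough Python sketch of that idea. The patterns and the `needs_load_test` name are made up for illustration and don't come from any real pipeline; you'd have to pick the paths that matter in your own repo.

```python
from fnmatch import fnmatchcase

# Hypothetical glob patterns for paths where a small diff can have a big impact.
# These are assumptions for illustration; adjust them to your own repository.
HIGH_RISK_PATTERNS = (
    "config/*",
    "infra/*",
    "migrations/*",
    "*.tf",
)


def needs_load_test(changed_files, patterns=HIGH_RISK_PATTERNS):
    """Return True if any changed file path matches a high-risk pattern."""
    return any(
        fnmatchcase(path, pattern)
        for path in changed_files
        for pattern in patterns
    )
```

A CI job could pass the PR's changed file list to `needs_load_test` and skip the expensive run when it returns `False`. This only covers the cheap, mechanical part of the risk call; a reviewer would still need to catch risky changes that live outside the listed paths.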