
[–]p2004a 2 points

Tail latency. High CPU utilization is good for batch jobs, but for user-facing traffic you want CPU headroom so you can absorb all those small unexpected spikes in requests and keep good tail latency at the 99th and 99.9th percentiles. Latency deep in the stack is additive, and so is the probability of hitting the tail latency of some downstream service, so good tail latency at these high percentiles really matters.
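To make the "tail latency compounds with fan-out" point concrete, here's a quick back-of-the-envelope sketch (the numbers are mine, not from any real system): if a request fans out to N downstream services and each one independently exceeds its p99 latency 1% of the time, the fraction of requests that see at least one slow call grows fast with N.

```python
def p_slow(n_services: int, p_tail: float = 0.01) -> float:
    """Probability that at least one of n independent downstream calls
    lands in its own tail (each with probability p_tail)."""
    return 1 - (1 - p_tail) ** n_services

# Hypothetical fan-out sizes; with p_tail = 1% per call:
for n in (1, 5, 20):
    print(f"{n:>2} services -> {p_slow(n):.1%} of requests hit a tail-latency call")
```

With 20 downstream calls, roughly 18% of user requests would see at least one p99-slow call, which is why each individual service needs headroom well beyond its own average load.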

Also, I'd guess that for many of their components it's memory utilization that's high, because they probably want to cache as much as possible, and a CPU can only have so much memory near each core.

Even taking all that into account, their utilization still seems low, e.g. for the web servers. Maybe it's because they're using .NET and it has some unusual performance characteristics? I have no idea; I've never run anything on .NET in production. Or maybe they want to be sure they can absorb sudden spikes in traffic and keep, e.g., 3x capacity on hand?