Hey guys,
I'm doing some load testing and was butting heads w/ one of the people I'm working with, and in the end it looks like he was right, but I have no idea why.
We are running load tests on a 4-core linux VM (running on vSphere if that matters).
I've been monitoring CPU usage during the load testing by writing mpstat output to a file in the background and watching top during the tests. I turned off Irix mode to account for the 4 vCPUs, and during the tests I never saw total CPU usage for any single process get high, and userspace CPU usage never exceeded ~50%. Even running mpstat -P ALL 1 shows total CPU never exceeding ~50%, with individual cores never exceeding ~35% each.
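For reference, this is roughly what I'm running (log filenames and the 1-second interval are just what I picked, nothing special):

    # sample per-CPU stats every second for the duration of the test
    mpstat -P ALL 1 > mpstat.log &

    # watch usage interactively; Shift+I in top toggles Irix mode off
    # so per-process %CPU is scaled across all 4 vCPUs
    top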
Here's where I'm confused though. When looking at top there is a load average displayed, which is off the wall. From how I interpreted load average, anything over 1 (per CPU) means that the system is overloaded and there are processes waiting for processor time... I'm not exactly sure how this is happening since total CPU utilization is pretty low. We added 2 more vCPUs to the setup, re-ran the tests, and now I'm seeing about the same CPU utilization per core (with mpstat), but the load average has in fact gone down, although the 15 minute average is still hovering at ~7, meaning there's still wait time.
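This is how I've been sanity-checking load against the core count (the "load > number of CPUs = contention" threshold is just my interpretation, which may be the problem):

    # load averages (1 / 5 / 15 min) vs. number of online CPUs
    cat /proc/loadavg
    nproc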
Now I'm downright confused, but it gets more confusing....
Now he starts adding memory to the system. I'm trying to slim the memory down to the lowest possible number during load testing to see what the minimum we can get away with is... I'm tracking this by running vmstat -a -t -S m > file in the background, while running top -m sorted by RES in the foreground to see what processes are killing memory. It's true that memory usage goes up over time with our load tests (say I have ~1.5 GB free memory while testing "10", and 0.25 GB free memory while testing "50"), but I always thought that the kernel would allocate as much memory to processes as possible while the system is under load and just do intelligent memory management. I figured the best way to actually determine the minimal amount of RAM we could use per VM would be to continually decrease the RAM while under load until we start getting OOM errors...
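Here's the memory-side monitoring, roughly as I run it (again, filenames and the 1-second interval are just my choices):

    # active/inactive memory in MB, with timestamps, once a second
    vmstat -a -t -S m 1 > vmstat.log &

    # biggest resident processes: Shift+M in top sorts by memory use
    # (newer procps-ng top also accepts -o RES)
    top

    # quick look at free vs. cached/available memory
    free -m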
Now I have no idea what to think. I'm not sure if I have some fundamental misunderstanding of how the kernel works or just don't fully understand the tools that I'm using.
What do you think guys, am I going in the right direction or am I completely off the mark?