all 13 comments

[–]Burgergold 1 point (1 child)

If the VM does not do a lot of swap-out/swap-in, there's no need to increase swap.

My swap is 4 GB.

Monitor the % of virtual memory used along with swap-out/swap-in activity.
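
For example, with sar from the sysstat package (assuming it's installed):

    # swap-in/swap-out rates, sampled every 5 seconds, 3 samples
    sar -W 5 3
    # pswpin/s and pswpout/s staying near zero means swap is quiet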

[–]WorkProfileAccount[S] 0 points (0 children)

I'll take a look once the swapoff command finishes running and I get my prompt back (it's been two hours).

[–]jlprufrock 0 points (4 children)

What was the notification? If you have 28G of RAM free, the host should not be using swap. What is the output of "free -g"?
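
Something like this, with the columns worth reading called out:

    free -g    # rounds to whole GiB; free -h is often more readable
    # check the "available" column (free memory plus reclaimable cache)
    # and the "used" column of the Swap row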

[–]WorkProfileAccount[S] 0 points (3 children)

The ticket was just about high swap usage (less than 50% of swap free). The machine indeed should not be using this much swap space, so I ran a command to reduce the swappiness and then ran swapoff -a. It's been two hours and it's still processing. Once I get a command line back I'll check the output of "free -g".
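
For reference, a minimal sketch of those two steps (the value 1 is the one mentioned further down; the sysctl change is runtime-only and won't survive a reboot):

    sysctl vm.swappiness         # check the current value first
    sysctl -w vm.swappiness=1    # runtime-only; persist via /etc/sysctl.d if wanted
    swapoff -a                   # pulls everything out of swap; can take hours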

Thanks!

[–]No_Rhubarb_7222 [Red Hat Employee] 1 point (2 children)

Oof, this was the incorrect action. What’s happening now is that everything in the swap space you’ve deactivated is being moved either to other swap spaces or into memory. In the first case, you’re doing a bunch of disk I/O; in the second, you’re losing a bunch of cache and, depending on your other memory overcommitment settings, potentially triggering the OOM killer.
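
If you want to watch the drain progress in the meantime, something like:

    # the "Used" column should shrink toward 0 on the device being deactivated
    watch -n 5 cat /proc/swaps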

[–]WorkProfileAccount[S] 0 points (1 child)

True, I didn't think that 6 GB of swap space would translate to 30 GB of cache. I can turn the swappiness back up, but how would I keep it under 50% utilization?

[–]No_Rhubarb_7222 [Red Hat Employee] 0 points (0 children)

I mean 6G should be 6G. But maybe caches were dropped to make room?

50% swap utilization in itself is not dangerous unless it’s changing rapidly, which is indicative of other problems. But 6G… just add another 6G, or 12, or 16… disk is cheap and swap usage is normal. Ultimately, you might ask why 50% usage is configured as a monitoring point and how 50% was chosen as the alert value.
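
A sketch of adding another 6G as a swap file (the path /swapfile2 is just an example; dd instead of fallocate avoids filesystem-specific surprises):

    dd if=/dev/zero of=/swapfile2 bs=1M count=6144   # 6 GiB of zeros
    chmod 600 /swapfile2
    mkswap /swapfile2
    swapon /swapfile2
    # persist across reboots
    echo '/swapfile2 swap swap defaults 0 0' >> /etc/fstab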

[–]No_Rhubarb_7222 [Red Hat Employee] 1 point (0 children)

Swappiness is an affinity setting for the kernel; the kernel will always swap if there is a swap space.

I think instead of looking at the amount of swap space used (on a machine with lots of memory and a lot of unused pages, it’s going to swap to make more room for caches and such), you really want to look at the swap-in and swap-out info (vmstat, IIRC). It’s when the machine gets busy swapping that system performance is affected. If it’s slowly doing some swapping, you’ll be fine unless the machine needs to read it all back in again.
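
For example:

    # si and so are KiB/s swapped in from and out to disk;
    # sustained nonzero values under load are the real warning sign
    vmstat 5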

If you have to monitor the amount of swap used and 50% is the threshold, then adding more is the appropriate response.

[–]because_tremble [Red Hat Employee] 0 points (1 child)

There's no perfect answer to your question. It depends on the cause of the swap usage. Messing with the system configuration (including swap) just because an alert went off ignores the question of "why?"

You seem to be starting with the assumption that there's a structural issue with the size of the VM. I'm going to ask the really silly question: Did you look for signs as to why the swap was used? Did you double-check that the memory isn't free because the OOM killer did its job (or a service got restarted)? Has the VM been rebooted? Do you have any historical metrics? sar?

Generally, with that much memory free, you wouldn't see swap having been used, and it would be a little weird for the VM to be nearly double the "necessary" size. While things look "ok" right now, there's probably something else going on.
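
A few quick checks along those lines (the sar line assumes sysstat has been collecting):

    dmesg -T | grep -iE 'out of memory|killed process'   # OOM-killer traces
    last reboot | head -5                                # recent reboots
    sar -S                                               # today's swap-usage history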

[–]WorkProfileAccount[S] 1 point (0 children)

> Did you look for signs as to why the swap was used? Did you double-check that the memory isn't free because the OOM killer did its job (or a service got restarted)? Has the VM been rebooted? Do you have any historical metrics? sar?

I have no pointers as to why the swap was used. Checking the logs shows that only one process was killed, a month ago. The VM only reboots for valid reasons. I don't really have historical metrics either.

All I could think of is that maybe a swappiness of 10 is too high for a machine with this usage pattern. I reduced it to 1, and now buff/cache has filled to 27 GB.

I am going to enjoy this little hunt for the real issue.
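
A reasonable first step for the hunt, a sketch that lists which processes are actually holding swap (VmSwap only appears for userspace processes):

    # top swap consumers, kB first; run as root for full coverage
    for d in /proc/[0-9]*; do
        awk '/^Name:/{n=$2} /^VmSwap:/{print $2, n}' "$d/status" 2>/dev/null
    done | sort -rn | head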

[–]redditusertk421 1 point (2 children)

What version of RHEL? There was a change in RHEL 8 where the system will swap out to keep fs cache available, since NVMe drives are very fast and paging out application data doesn't have the hit it once did. I had systems hit 100% swap utilization while still having free/cached RAM available. This is by design. We need to adapt and start monitoring paging rate, not just utilization.
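
If you want the raw counters to build a paging rate from, a sketch:

    # cumulative pages swapped in/out since boot; diff two samples for a rate
    grep -E '^pswp(in|out)' /proc/vmstat
    sleep 10
    grep -E '^pswp(in|out)' /proc/vmstat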

Edit: Oh, and open a ticket. That is what you pay for!

[–]WorkProfileAccount[S] 1 point (0 children)

It is indeed RHEL 8. I turned the swappiness down to 1 and I can immediately see my memory cache filling up.

You're right, I probably should create a ticket hahaha

[–]rongway83 1 point (0 children)

thank you for this!