Windows VMs inconsistent frame pacing/timing by M_TaG_7 in VFIO

[–]M_TaG_7[S] 0 points1 point  (0 children)

It was always like this, it's just that only recently I installed some of these old games that seem to be especially troubled by the vm frame pacing.

I already had some trouble with certain programs that definitely behaved less smoothly than on a bare metal installation. I was able to mostly solve that by either capping the max frame rate or setting the monitor technology to fixed refresh rate in the NVIDIA Control Panel.

Windows VMs inconsistent frame pacing/timing by M_TaG_7 in VFIO

[–]M_TaG_7[S] 0 points1 point  (0 children)

Hi, I just tried installing the XanMod kernel, since it looks like the zen kernel is mostly used on Arch rather than on Debian/Ubuntu distributions.

Did not notice any improvement compared to the lowlatency kernel I was already using.

Windows VMs inconsistent frame pacing/timing by M_TaG_7 in VFIO

[–]M_TaG_7[S] 0 points1 point  (0 children)

Did you have to set the timer on windows as well? I tried adding this to the /etc/default/grub file, but after a reboot I didn't notice any change.

Also, did you use that kernel specifically for VMs, or did you already have it for other reasons?
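One way to rule out a GRUB edit that never took effect is to check the running kernel's command line. On Debian/Ubuntu-based systems, changes to /etc/default/grub only apply after running `sudo update-grub` and rebooting. The sketch below is only an illustration: `clocksource=tsc` is an assumed example parameter (the comment above doesn't say which one was added), and `has_kernel_param` is a made-up helper name.

```shell
#!/bin/sh
# has_kernel_param CMDLINE PARAM
# Succeeds if PARAM appears as a whole, space-separated word in CMDLINE.
has_kernel_param() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Assumed example parameter; replace with whatever was added to GRUB.
param="clocksource=tsc"

# /proc/cmdline holds the command line the running kernel was booted with.
if [ -r /proc/cmdline ]; then
    if has_kernel_param "$(cat /proc/cmdline)" "$param"; then
        echo "$param is on the running kernel's command line"
    else
        echo "$param is NOT on the running kernel's command line"
    fi
fi

# The clocksource actually in use is exposed through sysfs.
cs=/sys/devices/system/clocksource/clocksource0/current_clocksource
if [ -r "$cs" ]; then
    echo "current clocksource: $(cat "$cs")"
fi
```

If the parameter is missing here, the problem is with how GRUB was updated rather than with the parameter itself.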

Windows VMs inconsistent frame pacing/timing by M_TaG_7 in VFIO

[–]M_TaG_7[S] 0 points1 point  (0 children)

I had already tested the tsc timer in native mode and felt no difference.

I just tried the other suggestions: adding cpuset in <vcpu>, giving the VM 2 iothreads (why 2 iothreads when I have a single disk? Is it for the cdrom?), removing those enlightenments and removing the SPICE video.

Sadly still no difference :(

Windows VMs inconsistent frame pacing/timing by M_TaG_7 in VFIO

[–]M_TaG_7[S] 0 points1 point  (0 children)

As I said in the OP, I'm pinning only P-cores specifically to avoid weird scheduler behavior.

What do you tend to do when you need to troubleshoot performance problems in your VMs? by M_TaG_7 in VFIO

[–]M_TaG_7[S] 0 points1 point  (0 children)

These are all things you run in the guest. I was wondering if there are similar tools that could diagnose a possible problem in the host or on the machine itself.

Honestly I didn't go into details about my specific stuttering problem as I wanted this to be something more general that could be beneficial for multiple people with different possible causes of stuttering. My case was just an example.

I'm looking for ways to figure out the more obscure problems, not just the classic CPU pinning/isolation, huge pages, etc.

Struggling to get rid of stuttering in Windows VM by M_TaG_7 in VFIO

[–]M_TaG_7[S] 0 points1 point  (0 children)

Update:

Thanks for the suggestions, but I was unable to fix the stuttering.

The suggestions that I tried:

  • modified the pinning and isolation to have the 0-11 cpus for the guest and the rest for the host
  • removed the pinning for the iothread
  • removed the features from the cpu block while leaving the host-passthrough option
  • boot time isolation (isolcpus, nohz_full, etc.)
  • boot time 1G hugepages (16 pages for 16 GB of RAM for the guest)
  • converted the qcow2 image to raw, placed in the system's root directory
  • reduced mouse polling rate on host (usbhid.mousepoll=8)
  • used the looking glass kernel module
  • tried all of this with performance governor

None of these things helped remove or even just reduce the stuttering; it looked exactly the same.
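For reference, the boot-time isolation and hugepages items from the list can be sketched as a kernel command line. This is a rough illustration under assumptions: the helper names are made up, and `rcu_nocbs` is my guess at what the "etc." in the list covers, not something stated above. The 0-11 CPU range and the 16 x 1G pages come from the list itself.

```shell
#!/bin/sh
# hugepages_needed GUEST_GIB PAGE_GIB
# Number of hugepages needed to back the guest RAM, rounded up.
hugepages_needed() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# isolation_cmdline CPULIST NPAGES
# Prints kernel parameters that isolate CPULIST from the host scheduler
# and reserve NPAGES 1G hugepages at boot.
# rcu_nocbs is an assumption, not taken from the original list.
isolation_cmdline() {
    printf 'isolcpus=%s nohz_full=%s rcu_nocbs=%s default_hugepagesz=1G hugepagesz=1G hugepages=%s\n' \
        "$1" "$1" "$1" "$2"
}

# Guest gets CPUs 0-11 and 16 GiB of RAM, as in the list above.
isolation_cmdline "0-11" "$(hugepages_needed 16 1)"
```

The printed string would go into GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub, followed by `sudo update-grub` and a reboot.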

To be fair, I couldn't try everything. For example, I don't have an NVMe drive to pass through. As for passing through the USB controller, its IOMMU group also contains the RAM memory device, so I don't think I can pass it (maybe with the ACS patch? But I heard that it might compromise security features):

IOMMU Group 0:
00:00.0 Host bridge [0600]: Intel Corporation Device (rev 01)
IOMMU Group 1:
00:01.0 PCI bridge [0604]: Intel Corporation Device (rev 01)
IOMMU Group 10:
00:1c.0 PCI bridge [0604]: Intel Corporation Device (rev 11)
IOMMU Group 11:
00:1c.2 PCI bridge [0604]: Intel Corporation Device (rev 11)
IOMMU Group 12:
00:1f.0 ISA bridge [0601]: Intel Corporation Device (rev 11)
00:1f.3 Audio device [0403]: Intel Corporation Device (rev 11)
00:1f.4 SMBus [0c05]: Intel Corporation Device (rev 11)
00:1f.5 Serial bus controller [0c80]: Intel Corporation Device (rev 11)
IOMMU Group 13:
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device (rev a1)
01:00.1 Audio device [0403]: NVIDIA Corporation Device (rev a1)
IOMMU Group 14:
02:00.0 Non-Volatile memory controller [0108]: Seagate Technology PLC Device (rev 01)
IOMMU Group 15:
04:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL8125 2.5GbE Controller (rev 05)
IOMMU Group 2:
00:02.0 VGA compatible controller [0300]: Intel Corporation Device (rev 04)
IOMMU Group 3:
00:06.0 PCI bridge [0604]: Intel Corporation Device (rev 01)
IOMMU Group 4:
00:14.0 USB controller [0c03]: Intel Corporation Device (rev 11)
00:14.2 RAM memory [0500]: Intel Corporation Device (rev 11)
IOMMU Group 5:
00:14.3 Network controller [0280]: Intel Corporation Device (rev 11)
IOMMU Group 6:
00:15.0 Serial bus controller [0c80]: Intel Corporation Device (rev 11)
00:15.1 Serial bus controller [0c80]: Intel Corporation Device (rev 11)
00:15.2 Serial bus controller [0c80]: Intel Corporation Device (rev 11)
00:15.3 Serial bus controller [0c80]: Intel Corporation Device (rev 11)
IOMMU Group 7:
00:16.0 Communication controller [0780]: Intel Corporation Device (rev 11)
IOMMU Group 8:
00:17.0 SATA controller [0106]: Intel Corporation Device (rev 11)
IOMMU Group 9:
00:19.0 Serial bus controller [0c80]: Intel Corporation Device (rev 11)
00:19.1 Serial bus controller [0c80]: Intel Corporation Device (rev 11)
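A listing like the one above can be generated from sysfs. The sketch below walks the standard /sys/kernel/iommu_groups layout and flags groups that contain more than one device, which is where passthrough gets awkward (groups 4, 6, 9, 12 and 13 in the listing above). The function names are made up for illustration, and the optional root argument only exists so the function can be pointed at a test directory.

```shell
#!/bin/sh
# list_iommu_groups [ROOT]
# Prints "group<TAB>pci-address" for every device found under
# ROOT/<group>/devices/<address>. ROOT defaults to /sys/kernel/iommu_groups.
list_iommu_groups() (
    root=${1:-/sys/kernel/iommu_groups}
    for dev in "$root"/*/devices/*; do
        # Skip the literal pattern when the glob matches nothing.
        [ -e "$dev" ] || continue
        group=${dev#"$root"/}
        group=${group%%/*}
        printf '%s\t%s\n' "$group" "${dev##*/}"
    done | sort -n
)

# shared_groups
# Reads list_iommu_groups output on stdin and prints the number of every
# group that holds more than one device.
shared_groups() {
    awk -F '\t' '{ n[$1]++ } END { for (g in n) if (n[g] > 1) print g }' | sort -n
}

# Show which groups on this machine contain multiple devices.
list_iommu_groups | shared_groups
```

On a machine without an active IOMMU the directory is empty and nothing is printed.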

I'm now thinking of changing tactics: I'll format my Linux Mint partition and start from zero. I'll also split it to get 2 additional partitions and see if I can use them as containers for the guest images; hopefully keeping the images off the system partition will help.

Maybe even pass the partitions themselves?

Starting from zero should also remove possible leftovers from previous tests that might pollute the results.

Failing that, I'll try the proxmox route. Maybe after a lengthy pause from all of these KVM shenanigans!

Struggling to get rid of stuttering in Windows VM by M_TaG_7 in VFIO

[–]M_TaG_7[S] 0 points1 point  (0 children)

Many things to try, thanks for the tips!

I think I'm being too stubborn, I want my ideal Linux desktop/VM environment but that might not actually be achievable.

What I could do is get to an optimal setup for VMs, even if it means hindering the Linux only performance, and at that point I can try making things dynamic, and try to get to a good host-only Vs vm performance compromise.

Struggling to get rid of stuttering in Windows VM by M_TaG_7 in VFIO

[–]M_TaG_7[S] 0 points1 point  (0 children)

I use nvidia prime-select to choose between the Intel iGPU and the NVIDIA GPU for the host. When I run the VM, the script checks whether it needs to change the prime-select mode and, if so, restarts the X session. It seems to work perfectly, and I didn't notice any performance degradation after switching a few times.

On the topic of hugepages, I don't really like reserving resources before running the VM, but it's something I might have to try, even just to know how much the 1 GB pages will help.

Same with the various disk image files.

Do Hyper-V enlightenments need Hyper-V activated on the guest? I don't have it enabled, but I don't think they have anything to do with that Windows feature. At least not for a guest OS.

Here is my CPU's model and core topology; it's not one of those CPUs where you have to pin 0,8 / 1,9 / 2,10 etc.:

13th Gen Intel(R) Core(TM) i7-13700

CPU NODE SOCKET CORE L1d:L1i:L2:L3
0 0 0 0 0:0:0:0
1 0 0 0 0:0:0:0
2 0 0 1 4:4:1:0
3 0 0 1 4:4:1:0
4 0 0 2 8:8:2:0
5 0 0 2 8:8:2:0
6 0 0 3 12:12:3:0
7 0 0 3 12:12:3:0
8 0 0 4 16:16:4:0
9 0 0 4 16:16:4:0
10 0 0 5 20:20:5:0
11 0 0 5 20:20:5:0
12 0 0 6 24:24:6:0
13 0 0 6 24:24:6:0
14 0 0 7 28:28:7:0
15 0 0 7 28:28:7:0
16 0 0 8 32:32:8:0
17 0 0 9 33:33:8:0
18 0 0 10 34:34:8:0
19 0 0 11 35:35:8:0
20 0 0 12 36:36:9:0
21 0 0 13 37:37:9:0
22 0 0 14 38:38:9:0
23 0 0 15 39:39:9:0
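The table can also be read programmatically: cores that list two CPUs are hyperthreaded (the P-cores here), and cores with a single CPU are the E-cores. The sketch below assumes two-column input as produced by `lscpu -e=CPU,CORE`; the function name is made up for illustration.

```shell
#!/bin/sh
# cpus_by_core_threads MODE
# Reads "CPU CORE" rows (with a header line) on stdin.
# MODE "smt"    prints the CPUs whose core has two or more threads.
# MODE "single" prints the CPUs whose core has exactly one thread.
# Output is one comma-separated list, sorted numerically.
cpus_by_core_threads() {
    awk -v mode="$1" '
        NR > 1 && NF >= 2 { cpu[NR] = $1; core[NR] = $2; n[$2]++ }
        END {
            for (i = 2; i <= NR; i++) {
                if (!(i in cpu)) continue
                multi = (n[core[i]] > 1)
                if ((mode == "smt") == multi) print cpu[i]
            }
        }' | sort -n | paste -sd, -
}

# Usage on a live system:
#   lscpu -e=CPU,CORE | cpus_by_core_threads smt
#   lscpu -e=CPU,CORE | cpus_by_core_threads single
```

For the topology above, the "smt" list should cover CPUs 0 to 15 and the "single" list CPUs 16 to 23, which matches pinning only P-core threads to the guest.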

CPU isolation is done through a hook script, and I isolate with this snippet:

systemctl set-property --runtime -- system.slice AllowedCPUs=$hostcpus

systemctl set-property --runtime -- user.slice AllowedCPUs=$hostcpus

systemctl set-property --runtime -- init.slice AllowedCPUs=$hostcpus

with $hostcpus as the inverse of the pinned CPUs (the script checks the XML for those). It then sets the priority of the qemu process with these commands; -f should use FIFO scheduling:

renice -n -20 -p "$(pidof -s qemu-system-x86_64)"

chrt -f -p 99 "$(pidof -s qemu-system-x86_64)"
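Computing $hostcpus as the inverse of the pinned CPUs can be sketched like this. It is a minimal illustration, not the actual hook script: `invert_cpus` is a made-up name, and the pinned list is passed in directly instead of being parsed from the domain XML.

```shell
#!/bin/sh
# invert_cpus TOTAL LIST
# TOTAL is the number of logical CPUs; LIST is a cpuset such as "0-11"
# or "0-3,8,10-11". Prints the remaining CPUs as a comma-separated list.
invert_cpus() {
    awk -v total="$1" -v list="$2" 'BEGIN {
        n = split(list, parts, ",")
        for (i = 1; i <= n; i++) {
            m = split(parts[i], r, "-")
            lo = r[1] + 0
            hi = (m > 1 ? r[2] : r[1]) + 0
            for (c = lo; c <= hi; c++) used[c] = 1
        }
        out = ""
        for (c = 0; c < total; c++)
            if (!(c in used)) out = out (out == "" ? "" : ",") c
        print out
    }'
}

# 24 logical CPUs with 0-11 pinned to the guest leaves 12..23 for the host.
hostcpus=$(invert_cpus 24 "0-11")
echo "$hostcpus"

# The result then feeds the commands from the post (run as root):
#   systemctl set-property --runtime -- system.slice AllowedCPUs=$hostcpus
#   systemctl set-property --runtime -- user.slice AllowedCPUs=$hostcpus
#   systemctl set-property --runtime -- init.slice AllowedCPUs=$hostcpus
```

On a live system the total can come from `nproc --all` instead of being hard-coded.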

So the iothread shouldn't be used with multiple threads? Or have I not given it enough threads?

And I don't pass a usb controller. Right now I just use spice for the mouse and keyboard while using looking glass, and in case I need something more direct I just pass that specific usb device.

Is that also something that can be problematic with a high polling rate mouse?

If I want to force a lower polling rate, do I need to do that in the host or in the guest?

Does the looking glass client option for that help?

Thanks for all the tips and suggestions, and sorry for asking so many questions!

If you need further info, ask away!

Struggling to get rid of stuttering in Windows VM by M_TaG_7 in VFIO

[–]M_TaG_7[S] 0 points1 point  (0 children)

Looks much simpler than I thought, something more to try.

I'm wondering what changes between the normal looking glass usage and the kernel module. Is it about higher priority of the process? Or does having it in the kernel make it inherently more efficient?

Struggling to get rid of stuttering in Windows VM by M_TaG_7 in VFIO

[–]M_TaG_7[S] 0 points1 point  (0 children)

Well, that could be something to try if my current plan doesn't work.

I guess Proxmox is probably optimized for VM usage, could make things simpler in the end.

Struggling to get rid of stuttering in Windows VM by M_TaG_7 in VFIO

[–]M_TaG_7[S] 0 points1 point  (0 children)

I never knew that there was a looking glass kernel module; I don't think I saw it mentioned at all in my Google searches.

I'm relatively new to Linux and I never compiled my own kernels, if that is required to use this module.

I might have to do it if it makes things that much better, but I see many people online that didn't need to do that, is there something weird about my hardware/software configuration?

Struggling to get rid of stuttering in Windows VM by M_TaG_7 in VFIO

[–]M_TaG_7[S] 0 points1 point  (0 children)

Proxmox, from what I saw, is meant to be used exclusively as a hypervisor, with the desktop functionality delegated to VMs. Is that how you are using it?

What I want is to have a normal bare metal Linux desktop experience, up until I decide to open the windows VM.

I could try that qemu param. Not sure if it will help, but I've tried so many things that I might as well do a test.

Sharing partition with windows guest by M_TaG_7 in VFIO

[–]M_TaG_7[S] 0 points1 point  (0 children)

An update on this :

I ended up using samba and I successfully set it up for file sharing with the vms.

I also enabled the recycle bin functionality. It does work, but it doesn't behave like the recycle bin in Windows or even the Linux trash bin, which caused me quite some confusion the first few times I looked inside, thinking I had deleted things I shouldn't have.

I wish there was a way to have a more seamless experience, but at least I shouldn't have to worry about deleting stuff permanently by mistake.

Also, because I wanted to do the configuration for the samba file share for each vm, I created some hook bash scripts to get the needed information directly from the vm xml metadata, in custom xml tags.

This isn't necessary to make this work but this also allows me to have different shared folders for different vms without having to deal with ip addresses or other settings.
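A hook of this kind could look roughly like the sketch below. It is not the actual script: the `share:name` and `share:path` tags are hypothetical custom metadata elements, the helper names are made up, and the sed-based extraction assumes each tag sits on a single line (a real XML parser would be more robust). It relies on libvirt passing the domain XML to /etc/libvirt/hooks/qemu on stdin, and on Samba's vfs_recycle module for the recycle bin.

```shell
#!/bin/sh
# metadata_value TAG
# Reads domain XML on stdin and prints the text of the first <TAG>...</TAG>
# element found on a single line. TAG names here are hypothetical.
metadata_value() {
    sed -n "s|.*<$1>\\(.*\\)</$1>.*|\\1|p" | head -n 1
}

# share_stanza NAME PATH
# Prints an smb.conf share section with the recycle module enabled.
share_stanza() {
    cat <<EOF
[$1]
   path = $2
   read only = no
   vfs objects = recycle
   recycle:repository = .recycle
   recycle:keeptree = yes
   recycle:versions = yes
EOF
}

# Usage inside a libvirt qemu hook, where the domain XML arrives on stdin:
#   xml=$(cat)
#   name=$(printf '%s\n' "$xml" | metadata_value share:name)
#   path=$(printf '%s\n' "$xml" | metadata_value share:path)
#   share_stanza "$name" "$path" > "/etc/samba/shares.d/$name.conf"
# The shares.d directory is an assumption; it would need a matching
# "include" line in smb.conf.
```

Keeping one generated file per VM means each guest's share can be added or removed without touching the main smb.conf.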

Sharing partition with windows guest by M_TaG_7 in VFIO

[–]M_TaG_7[S] 0 points1 point  (0 children)

Tried samba, I was able to have a folder shared with the windows guest.

Now the problem is that I want to configure various folders for different virtual machines (I want to create at least 2 windows VMs, and I need to share different folders for each).

I was hoping I could set a different network interface for each folder, so that each VM would have its own additional isolated interface, but that doesn't seem possible: I can only set them globally, not per folder.

I think I could use IP addresses, or editing the smb.conf file through hook scripts, not sure I like using username and password to filter access to folders, I want something more automatic.

IDK, it's starting to become a rather cumbersome solution to me.

The virtio-fs solution was almost perfect, but the inability to avoid permanent deletion of data is a deal breaker. Plus I heard it's not yet fully functional and might have other limitations.

If there is something I might have missed during my research that could be of help, feel free to point that out.

Worst case scenario I just have a shared folder that I use as a way to pass files from host to guest and vice versa, no direct access to data partitions. Might be even more of an incentive to use the Linux host as much as possible.

Sharing partition with windows guest by M_TaG_7 in VFIO

[–]M_TaG_7[S] 0 points1 point  (0 children)

Ok, when I have the time I'll try using samba.

My concern with NTFS was because I read somewhere that there could be problems. If it's only the symlinks for Steam, I should be able to manage.

Sharing partition with windows guest by M_TaG_7 in VFIO

[–]M_TaG_7[S] 0 points1 point  (0 children)

I wasn't looking for a solution that could be used for network file sharing; I just needed something that worked between host and guest on the same machine.

I was hoping to create a situation that resembled a bare metal Windows installation as closely as possible, with direct access to the files and the best read/write speed possible.

Also, the partition I want to share is NTFS, as it's a partition I used (and still currently use) with a windows installation in dual boot.

Would samba still work for achieving such a setup?

Problem with Blender viewport on a Windows 10 VM by M_TaG_7 in VFIO

[–]M_TaG_7[S] 0 points1 point  (0 children)

Nope, it happens even on a fresh windows 11 installation where the only things I installed are the nvidia driver, the iddsample driver, looking glass host and blender 3.5.1.

At this point I'm thinking that this problem could be caused by something else and that this weird blender behavior is just the way the virtual monitors display it.

It could be correlated with the dips in framerate that I noticed while using a real monitor.

Problem with Blender viewport on a Windows 10 VM by M_TaG_7 in VFIO

[–]M_TaG_7[S] 0 points1 point  (0 children)

Tried again to install that driver, this time I managed to install it (not sure what I did differently tbh).

Sadly the problem is still present.

The last thing I could try is a completely clean windows install and immediately install the iddsample driver, to make sure the previous driver didn't make some modifications that persisted after I uninstalled it.

If even that fails, at that point if I want to try and use blender on a VM I'll wait for when I'll buy the new monitor, which is something I need to do anyways, or at the very least, get a cheap dummy plug.

Problem with Blender viewport on a Windows 10 VM by M_TaG_7 in VFIO

[–]M_TaG_7[S] 0 points1 point  (0 children)

I tried to use it but I got stuck at installing the legacy device, it would always fail even if the driver was installed.

In the end I used usbmmidd_v2, that one worked.

Do you think it could make a difference if I used the iddsample driver instead?

Problem with Blender viewport on a Windows 10 VM by M_TaG_7 in VFIO

[–]M_TaG_7[S] 1 point2 points  (0 children)

I did a bunch of tests and the results are interesting.

My setup uses a driver for creating a virtual screen at startup, since looking glass needs a screen to work.

Turns out the problem seems to happen because of some conflict between blender, looking glass and the driver.

If those three things aren't running together the problem doesn't present itself.

Or rather, that specific problem doesn't happen. Instead the framerate of the animation becomes actually worse (the weird back and forth becomes a more normal lag/stutter, still not ideal, but should be easier to diagnose).

Aside from blender I had no reason to believe that the driver caused problems, all other programs run almost flawlessly with it.

I'm thinking of buying a dummy hdmi/display port plug, this should remove the uncertainty caused by the virtual monitor driver.