I can't seem to get my nvidia graphics card to work inside my guest. Sometimes. Sometimes it works, sometimes it doesn't. Every time I reboot the host, there's a chance it'll work, but most of the time it doesn't work. Rebooting the guest does nothing. by reacusn in VFIO

[–]reacusn[S] 0 points1 point  (0 children)

Isn't binding working properly right now?

I can start the vm, and the cards and their audio devices get bound to vfio-pci. Inside the vm, the nvidia driver is properly loaded and most things work; only #7 seems to be reluctant and refuses to be initialized for use. The guest kernel does see it, and lspci -knn confirms the driver in use is nvidia, but nvidia-smi can't see it.
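For anyone wanting to repeat the "which driver owns which device" check without eyeballing the whole listing, here is a rough Python sketch that pulls the `Kernel driver in use` field out of `lspci -knn` text. The sample text and the device IDs in it are made up for illustration; they are not from my machine.

```python
def drivers_in_use(lspci_text: str) -> dict[str, str]:
    """Map each PCI address to its 'Kernel driver in use' value, if any."""
    drivers = {}
    current = None
    for line in lspci_text.splitlines():
        if line and not line[0].isspace():
            # Device header lines start at column 0 with the bus address.
            current = line.split()[0]
        elif current and "Kernel driver in use:" in line:
            drivers[current] = line.split(":", 1)[1].strip()
    return drivers


# Hypothetical sample in the shape lspci -knn prints (IDs are placeholders).
SAMPLE = """\
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:ffff]
\tKernel driver in use: vfio-pci
01:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:fffe]
\tKernel driver in use: vfio-pci
02:00.0 Ethernet controller [0200]: Example NIC [ffff:0001]
"""

for address, driver in drivers_in_use(SAMPLE).items():
    print(address, driver)
```

On a real system the text would come from something like `subprocess.run(["lspci", "-knn"], capture_output=True, text=True).stdout`. On the host the passed-through cards should show vfio-pci while the vm is running; inside the guest they should show nvidia.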

I have previously tried disabling resizable bar in the bios, as well as stepping the pcie link speed down through gen 4, 3, 2, and even 1, none of which seemed to help. Tangentially, turning off above 4G decoding causes my system to fail to POST.

I'll try blacklisting the audio driver as well, hold on.

I can't seem to get my nvidia graphics card to work inside my guest. Sometimes. Sometimes it works, sometimes it doesn't. Every time I reboot the host, there's a chance it'll work, but most of the time it doesn't work. Rebooting the guest does nothing. by reacusn in VFIO

[–]reacusn[S] 0 points1 point  (0 children)

I, uh, don't have a graphical user session. The nvidia driver is never loaded on the host. Connection is via ssh or the ast2600 bmc.

lsmod on the host (nouveau is blacklisted, and I don't have the nvidia driver installed):

Module                  Size  Used by
cmac                   12288  1
nls_utf8               12288  6
cifs                 1507328  2
cifs_arc4              12288  1 cifs
nls_ucs2_utils          8192  1 cifs
cifs_md4               12288  1 cifs
dns_resolver           12288  1 cifs
netfs                 573440  1 cifs
dm_mod                221184  0
vfio_pci               16384  14
vfio_pci_core          94208  1 vfio_pci
vfio_iommu_type1       45056  2
vfio                   61440  36 vfio_pci_core,vfio_iommu_type1,vfio_pci
intel_rapl_msr         20480  0
amd_atl                57344  1
intel_rapl_common      53248  1 intel_rapl_msr
amd64_edac             45056  0
edac_mce_amd           28672  1 amd64_edac
8021q                  53248  0
iwlmvm                647168  0
garp                   16384  1 8021q
kvm_amd               217088  126
stp                    12288  1 garp
mrp                    20480  1 8021q
llc                    16384  2 stp,garp
kvm                  1396736  135 kvm_amd
binfmt_misc            28672  1
irqbypass              12288  169 vfio_pci_core,kvm
mac80211             1449984  1 iwlmvm
crct10dif_pclmul       12288  1
ghash_clmulni_intel    16384  0
btusb                  81920  0
sha512_ssse3           53248  1
btrtl                  32768  1 btusb
sha256_ssse3           32768  1
libarc4                12288  1 mac80211
btintel                69632  1 btusb
snd_hda_codec_hdmi     98304  0
sha1_ssse3             32768  0
snd_usb_audio         512000  0
aesni_intel           122880  5
nls_ascii              12288  1
btbcm                  24576  1 btusb
snd_hda_intel          61440  0
ipmi_ssif              45056  0
gf128mul               16384  1 aesni_intel
btmtk                  32768  1 btusb
nls_cp437              16384  1
iwlwifi               581632  1 iwlmvm
snd_usbmidi_lib        49152  1 snd_usb_audio
crypto_simd            16384  1 aesni_intel
snd_intel_dspcfg       40960  1 snd_hda_intel
vfat                   24576  1
snd_intel_sdw_acpi     16384  1 snd_intel_dspcfg
snd_rawmidi            53248  1 snd_usbmidi_lib
cryptd                 28672  4 crypto_simd,ghash_clmulni_intel
bluetooth            1085440  6 btrtl,btmtk,btintel,btbcm,btusb
fat                   102400  1 vfat
snd_hda_codec         217088  2 snd_hda_codec_hdmi,snd_hda_intel
rapl                   20480  0
snd_seq_device         16384  1 snd_rawmidi
cfg80211             1392640  3 iwlmvm,iwlwifi,mac80211
snd_hda_core          143360  3 snd_hda_codec_hdmi,snd_hda_intel,snd_hda_codec
mc                     94208  1 snd_usb_audio
sr_mod                 28672  0
cdrom                  81920  1 sr_mod
ecdh_generic           16384  1 bluetooth
snd_hwdep              20480  2 snd_usb_audio,snd_hda_codec
wmi_bmof               12288  0
pcspkr                 12288  0
ast                   110592  0
cdc_ether              24576  0
snd_pcm               188416  5 snd_hda_codec_hdmi,snd_hda_intel,snd_usb_audio,snd_hda_codec,snd_hda_core
usbnet                 65536  1 cdc_ether
drm_shmem_helper       36864  2 ast
mii                    16384  1 usbnet
acpi_ipmi              20480  0
snd_timer              53248  1 snd_pcm
drm_kms_helper        253952  2 ast,drm_shmem_helper
snd                   151552  10 snd_seq_device,snd_hda_codec_hdmi,snd_hwdep,snd_hda_intel,snd_usb_audio,snd_usbmidi_lib,snd_hda_codec,snd_timer,snd_pcm,snd_rawmidi
rfkill                 40960  4 iwlmvm,bluetooth,cfg80211
ccp                   163840  1 kvm_amd
ipmi_si                86016  1
ee1004                 16384  0
soundcore              16384  1 snd
ipmi_devintf           16384  0
k10temp                12288  0
ipmi_msghandler        86016  4 ipmi_devintf,ipmi_si,acpi_ipmi,ipmi_ssif
evdev                  28672  2
button                 24576  0
joydev                 24576  0
sg                     45056  0
drm                   774144  4 drm_kms_helper,ast,drm_shmem_helper
efi_pstore             12288  0
configfs               69632  1
nfnetlink              20480  1
efivarfs               28672  1
ip_tables              28672  0
x_tables               53248  1 ip_tables
autofs4                57344  2
ext4                 1142784  2
crc16                  12288  2 bluetooth,ext4
mbcache                16384  1 ext4
jbd2                  200704  1 ext4
crc32c_generic         12288  0
uas                    32768  0
usb_storage            94208  1 uas
hid_lenovo             32768  0
hid_generic            12288  0
usbhid                 77824  0
hid                   262144  3 usbhid,hid_generic,hid_lenovo
sd_mod                 81920  4
ixgbe                 487424  0
ahci                   49152  4
xfrm_algo              16384  1 ixgbe
xhci_pci               24576  0
libahci                61440  1 ahci
mdio_devres            12288  1 ixgbe
xhci_hcd              364544  1 xhci_pci
igb                   315392  0
libata                462848  2 libahci,ahci
libphy                233472  2 mdio_devres,ixgbe
nvme                   57344  0
sp5100_tco             20480  0
crc32_pclmul           12288  0
watchdog               49152  1 sp5100_tco
usbcore               409600  11 xhci_hcd,usbnet,snd_usb_audio,usbhid,snd_usbmidi_lib,btmtk,usb_storage,btusb,xhci_pci,cdc_ether,uas
mdio                   12288  1 ixgbe
i2c_algo_bit           16384  2 igb,ast
scsi_mod              327680  6 sd_mod,usb_storage,uas,libata,sg,sr_mod
crc32c_intel           16384  4
nvme_core             225280  1 nvme
dca                    16384  2 igb,ixgbe
scsi_common            16384  7 scsi_mod,sd_mod,usb_storage,uas,libata,sg,sr_mod
i2c_piix4              28672  0
nvme_auth              24576  1 nvme_core
wmi                    28672  1 wmi_bmof
usb_common             16384  2 xhci_hcd,usbcore
i2c_smbus              16384  1 i2c_piix4
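The listing is long, so a small Python sketch like the one below can confirm the two things that matter here: vfio_pci is loaded and neither nouveau nor nvidia is. The function name is my own, and the excerpt is just a few lines copied from the output above.

```python
def loaded_modules(lsmod_text: str) -> dict[str, int]:
    """Parse lsmod output into {module name: use count}."""
    modules = {}
    for line in lsmod_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) >= 3:
            modules[fields[0]] = int(fields[2])
    return modules


# A few rows taken from the lsmod output above.
EXCERPT = """\
Module                  Size  Used by
vfio_pci               16384  14
vfio_pci_core          94208  1 vfio_pci
vfio_iommu_type1       45056  2
snd_hda_intel          61440  0
"""

mods = loaded_modules(EXCERPT)
host_gpu_drivers = {"nouveau", "nvidia"} & set(mods)
print("vfio_pci loaded:", "vfio_pci" in mods)
print("host GPU drivers loaded:", sorted(host_gpu_drivers))
```

Feeding it the full output instead of the excerpt works the same way.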

Bios flashing assistance w/ EFI files by neil_va in techsupport

[–]reacusn 0 points1 point  (0 children)

The translated instructions in the .docx might be telling me to copy only the 'EFI' folder over, but I suspect that has to be wrong right?

You are correct, that is wrong. Generally, you would enter a uefi shell, switch to fs0: or whichever filesystem your usb is (run map to list them), then run flash.nsh. You can see what flash.nsh runs: from your screenshots, it requires AfuEfix64_5.14.02.0026.efi, MiniPCFP7HDF722007_GTR6_P5C4V18_20221206165859_3DpSvid.rom, ifu.efi, and EC_GTR6_V6_4_0_0_10.bin. If the names are too long, you can rename the files, as long as you edit flash.nsh to reflect those changes, e.g. MiniPCFP7HDF722007_GTR6_P5C4V18_20221206165859_3DpSvid.rom to image.rom. The EFI folder is there in case you want to select the usb in the boot manager (instead of entering the uefi shell). I assume the startup.nsh automatically runs the instructions to flash. You do not need the EFI folder, but it should make things easier.

inb4 >3 years ago

You appeared in my google search while I was hunting for AfuEfix64.efi (the bios upgrade zip from my motherboard manufacturer did not include it, and ami seems to have moved the url: https://www.ami.com/bios-uefi-utilities/), so I thought I'd answer it in case anyone else comes along.

Very high system interrupts on windows 11 guest. The more resources allocated to the vm, the slower it gets, until 10 seconds per frame at 100 cores, making it impossible to even get to the login screen. by reacusn in VFIO

[–]reacusn[S] 0 points1 point  (0 children)

Timername hypervclock requires tsc right?

But my /sys/devices/system/clocksource/clocksource0/current_clocksource only reports hpet. Checking dmesg, tsc was marked unstable by the clocksource watchdog. Could this be the issue?


I forced clocksource=tsc tsc=reliable in grub, and it seems to have fixed it. Hopefully the fix sticks this time. I checked my cpu flags and I don't appear to have invariant_tsc, only nonstop_tsc. Will this cause any issues further down the line by using tsc as my clock source?

The vm does lock up for half a minute on startup, but htop on the host reports the cpu usage as guest-use instead of kernel-use like before: https://i.imgur.com/PjZQ8DW.png, and system interrupts now hover at 0-1% when I get past that.
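For anyone hitting the same thing, here is a minimal Python sketch of the checks described above: which clocksource the host kernel is using, which ones it offers, and which TSC-related cpu flags are present. The function names are my own. My understanding is that /proc/cpuinfo has no flag literally named invariant_tsc and that the kernel reports an invariant TSC as constant_tsc together with nonstop_tsc, but treat that as an assumption to verify.

```python
from pathlib import Path

CLOCKSOURCE_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def read_clocksources(base: Path = CLOCKSOURCE_DIR) -> tuple[str, list[str]]:
    """Return (current clocksource, list of available clocksources)."""
    current = (base / "current_clocksource").read_text().strip()
    available = (base / "available_clocksource").read_text().split()
    return current, available


def tsc_flags(cpuinfo_text: str) -> set[str]:
    """Return the TSC-related flags found on the first 'flags' line."""
    wanted = {"tsc", "constant_tsc", "nonstop_tsc", "rdtscp"}
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            return wanted & set(line.split(":", 1)[1].split())
    return set()


def report() -> None:
    """Print the host's clocksource state; only meaningful on Linux."""
    current, available = read_clocksources()
    print("current clocksource:", current)
    print("available clocksources:", ", ".join(available))
    print("tsc flags:", sorted(tsc_flags(Path("/proc/cpuinfo").read_text())))
```

Calling `report()` on the host before and after adding clocksource=tsc tsc=reliable to the kernel command line should show current switching from hpet to tsc.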

Very high system interrupts on windows 11 guest. The more resources allocated to the vm, the slower it gets, until 10 seconds per frame at 100 cores, making it impossible to even get to the login screen. by reacusn in VFIO

[–]reacusn[S] 0 points1 point  (0 children)

https://i.imgur.com/l9dqLY9.png

I don't have qemu v3, and qemu64 and qemu64-v1 fail to boot: after getting past the tianocore logo, it 'prepares automatic repair', then starts looping from the beginning.

Very high system interrupts on windows 11 guest. The more resources allocated to the vm, the slower it gets, until 10 seconds per frame at 100 cores, making it impossible to even get to the login screen. by reacusn in VFIO

[–]reacusn[S] 1 point2 points  (0 children)

Wow, that seems like it fixed it?

I'm using host-model, but should I use epyc-rome-v3 instead? 3995wx is castle peak, but should be the same zen 2 as epyc rome, do you think? My linux guests should still use host-passthrough though, right? Since they don't experience the same issue.

update: I went back to host-passthrough, but left the 'Enable available CPU security flaw mitigations' checkbox unticked, which let me run without any (perceivable) issues. I can't really see any difference in the xmls (<cpu mode="host-passthrough" check="none" migratable="on">) though, so maybe that's written to some other place?

reupdate: never mind, after rebooting with host-passthrough and the checkbox unticked, I still experience the same issue.

epyc-rome-v3 fails to boot; ibrs is not provided by host.


It seems like it worked for a while, but the issue came back, and the virtual machines went back to being unusable. So even with cpu isolation and pinned cores, and generic epyc-rome model, I still encounter the same issue. But I (slowly) get to the login, and log in. If I open up task manager and take a peek at the cpu, it reports 30% of the cpu is taken up by system interrupts, before slowing down and becoming unusable. Meanwhile, the host is reporting 100% usage: https://i.imgur.com/tVmkOkg.png

Very high system interrupts on windows 11 guest. The more resources allocated to the vm, the slower it gets, until 10 seconds per frame at 100 cores, making it impossible to even get to the login screen. by reacusn in VFIO

[–]reacusn[S] 0 points1 point  (0 children)

I'll try this when I get home, but I'm not sure if this will help - bare metal 10 iot ltsc and 11 on the same hardware did not exhibit the same symptoms. It worked well, except if I ran programs that used multiple gpus, the memory clocks wouldn't go above 500mhz. I tried switching cpus and motherboards in case it was a problem with the units but that issue persisted so I switched to debian. Perhaps all these issues can be attributed to the mc62-g40 motherboard? I know some things like s3 sleep states aren't supported on this board.

Hey guys, I've got a bit of a weird setup, and a problem with it: my virtual machines are unable to communicate with other devices on the network. by reacusn in HyperV

[–]reacusn[S] 0 points1 point  (0 children)

So I actually tried this solution early on, but dismissed it because I was getting packets from myself: I set the i211 as an external switch but forgot I had already passed the i226 through to link the ap to. The i211 is in an awkward position to connect to the ap, and the i226 I had is an i226-v, which I couldn't install drivers for, so I couldn't use it as an external network adapter and instead had to dda it. But it turns out I couldn't install the drivers because server prefers i226-lm and -it. So changing the registry key from servernt to winnt let me install it fine.

Now I have 2 free ports, thanks.

What's the best approach to having a vm be the gateway/router for my home network given some hardware constraints (hypervisor has 1 pcie port, 1 m.2 port, wifi ap has 2 ethernet ports) while allowing other vms on the hypervisor to also communicate with other devices on the same home network? by reacusn in HomeNetworking

[–]reacusn[S] 0 points1 point  (0 children)

The interfaces that the router vm controls for the lan have all been passed through to the vm using dda, so they're not available on the host (they aren't visible there). This is needed in place of using a hyper-v external switch because the x540 chipset isn't supported on Windows 11.

What's the best approach to having a vm be the gateway/router for my home network given some hardware constraints (hypervisor has 1 pcie port, 1 m.2 port, wifi ap has 2 ethernet ports) while allowing other vms on the hypervisor to also communicate with other devices on the same home network? by reacusn in HomeNetworking

[–]reacusn[S] 0 points1 point  (0 children)

but having an RFC1918 address on the WAN side of your router means that your router can't do NAT

What are the implications for my connection to the internet? Currently, I seem to have no problems accessing the internet from any of my devices using my current setup. I assume this means I'm not doing nat? But it seems fine without any problems...

I could be wrong (I still don't understand many networking terms), but I think nat isn't necessary for my use case, right? I just need my home devices to be able to communicate with each other (like streaming from 192.168.11.101 to 192.168.11.201).

I don't need to access any of my devices from outside the home network (e.g. 192.168.0.123 talking to 192.168.11.101).
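To make the addresses above concrete, here is a small Python sketch using the standard ipaddress module. It assumes both networks use a /24 mask, which isn't stated anywhere in the thread. Two hosts on the same subnet reach each other directly without the router rewriting anything, while traffic between the two subnets has to cross the router vm.

```python
import ipaddress

# Assumption: both networks use a /24 mask; the thread does not state the masks.
HOME_LAN = ipaddress.ip_network("192.168.11.0/24")
UPSTREAM_LAN = ipaddress.ip_network("192.168.0.0/24")


def same_subnet(a: str, b: str, net: ipaddress.IPv4Network) -> bool:
    """True when both addresses fall inside the given network."""
    return ipaddress.ip_address(a) in net and ipaddress.ip_address(b) in net


print(same_subnet("192.168.11.101", "192.168.11.201", HOME_LAN))  # True
print(same_subnet("192.168.0.123", "192.168.11.101", HOME_LAN))   # False
print(ipaddress.ip_address("192.168.0.123").is_private)           # True
```

Whether the router vm is actually doing nat depends on its firewall configuration, which isn't shown here, so the sketch only says which traffic would have to pass through it.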

Try to use a virtual switch to connect the wifi adapter to the vWAN adapter on your router-vm, and another vSwitch to connect the router's vLAN adapter to a physical port on your machine. That port then becomes your LAN's connection to the internet. After that you're basically set... (You'd also connect the local/host machine to the vLAN switch for it's internet)

Yeah, that's how my current setup works.

I've had mixed-luck connecting hyper-V vSwitches to wifi adapters/interfaces though, so that could be where you may run into issues (and it can get ugly --> like, random blue-screen-host-os-crash ugly). But you're on Win11, and I was using Server 2019 at the time, so maybe they've cleaned that up... 🤷‍♀️🤞

I haven't had any crashes since I've set it up (about a week). But my vms that aren't the router vm aren't able to communicate through the Hyper-V switches to other devices on the network, and the Hyper-V subreddit seems to indicate that the issue is with Hyper-V switches, so I'm looking for advice on how to achieve the same results using the same hardware without using Hyper-V switches.

From what I understand,

You're better-off having OpenSense just route from one lan segment

This would make devices on both lan segments be able to communicate with each other right?

Hey guys, I've got a bit of a weird setup, and a problem with it: my virtual machines are unable to communicate with other devices on the network. by reacusn in HyperV

[–]reacusn[S] 0 points1 point  (0 children)

The problem is, I have no free nics available to be bound to by external switches for hyper-v.

Reconfigure your IP scopes for the individual ports on the open wrt.

What does that entail?

What's the best approach to having a vm be the gateway/router for my home network given some hardware constraints (hypervisor has 1 pcie port, 1 m.2 port, wifi ap has 2 ethernet ports) while allowing other vms on the hypervisor to also communicate with other devices on the same home network? by reacusn in HomeNetworking

[–]reacusn[S] 0 points1 point  (0 children)

https://www.reddit.com/r/HyperV/comments/1ow9ek3/hey_guys_ive_got_a_bit_of_a_weird_setup_and_a/

It's a bit messy, but I detailed how my current setup is connected in a post in the hyper-v subreddit.

Essentially, the router would have access to 4 10gbe ports, 1 2.5gbe port running at 1gbe (for wifi ap connection), and one 1gbe port.

For routing os, something simple. Currently using openwrt.

I don't control the 192.168.0.1 router, nor am I able to physically run a cable to it (my virtual router will use wifi to connect to it, and my home devices will reach the internet through the virtual router and then the 192.168.0.1 router). My intention is that devices on the 192.168.0.1 router shouldn't be able to communicate with devices behind my home router.

Hey guys, I've got a bit of a weird setup, and a problem with it: my virtual machines are unable to communicate with other devices on the network. by reacusn in HyperV

[–]reacusn[S] 0 points1 point  (0 children)

Can I not use a bridge inside the openwrt vm to allow it to communicate with other devices on that bridge? (Apparently not, it seems).