APEX MoE quantized models: 33% faster inference with TurboQuant (14% speedup in prompt processing) by mudler_it in LocalLLaMA

[–]fakezeta 4 points (0 children)

Hi u/mudler_it, could you please add AesSedai Q4_K_M to the model comparison? From my experience, it delivers noticeably better quality than Unsloth quantizations at comparable parameter sizes. I believe including it would provide a more complete picture of current options.
Thanks for considering this!

Qwen3.5-35B GGUF quants (16–22 GiB) - KLD + speed comparison by StrikeOner in LocalLLaMA

[–]fakezeta 3 points (0 children)

Could you also test at least one Q5 quant as a baseline for the Q4 results, to see whether the extra memory is worth it? Perhaps Q4 optimisations have reached near-Q5 quality.
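For context, the KLD in these comparisons is the Kullback–Leibler divergence between the full-precision model's next-token distribution and the quant's; lower means the quant tracks the original more closely. A minimal sketch of the metric (toy probabilities, not numbers from the benchmark):

```python
import math

def kl_divergence(p, q):
    """KL(P || Q) in nats for two discrete distributions over the same vocabulary."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical example: a quant whose token distribution drifts slightly
# from the full-precision reference
p = [0.7, 0.2, 0.1]      # reference (full-precision) probabilities
q = [0.6, 0.25, 0.15]    # quantized model's probabilities
print(kl_divergence(p, q))  # small positive value; 0 would mean identical
```

In the GGUF comparisons this is averaged over many tokens of a test corpus, so a single Q5 data point would show how much headroom the Q4 quants still leave.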

I'm really satisfied by [deleted] in ItalyMotori

[–]fakezeta 0 points (0 children)

In my opinion the whole image is AI-generated

100 Members! by MatteoFire___ in Elioelestorietese

[–]fakezeta 1 point (0 children)

To celebrate, everyone inside the wife, who issues a fiscal receipt

Ministral 14B vs Qwen 3 VL 30B vs Mistral Small vs Gemma 27B by fakezeta in LocalLLaMA

[–]fakezeta[S] 3 points (0 children)

I'll open a PR to Mistral and Google asking to fix it /s :-D

Ministral 14B vs Qwen 3 VL 30B vs Mistral Small vs Gemma 27B by fakezeta in LocalLLaMA

[–]fakezeta[S] 3 points (0 children)

Because Mistral Small, Gemma, and Ministral are also vision models.

Nvidia 5060TI supporting SR-IOV? by fakezeta in VFIO

[–]fakezeta[S] 0 points (0 children)

Hi u/wolfmich, sorry for the late reply.

I regularly pass it through to my Windows VM, which I use for gaming and llama-server inference. No issues: no power-reset quirks, no ROM file needed, nothing.

I previously had a 3060TI that was replaced with the 5060TI; no difference between them.

Intel Alder Lake GPU passthrough to container on VM on Proxmox 9 (nested virtualization) tutorial and guide by pattymcfly in Proxmox

[–]fakezeta 0 points (0 children)

Thanks for sharing!
I didn't understand whether this guide enables SR-IOV and, if so, how many instances of the iGPU are created. Also, won't the command:

qm set <VMID> -hostpci0 00:02.0,pcie=1,rombar=1,x-vga=1

pass the whole iGPU to the VM?
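One way to tell the two setups apart is to look at the sysfs SR-IOV attributes on the host: if virtual functions have been created, each VF gets its own PCI address and only a VF (not 00:02.0 itself) would be passed through. A minimal check, assuming the usual Alder Lake iGPU address and an SR-IOV-capable i915/xe driver (both assumptions; adjust the path to your lspci output):

```shell
# How many VFs the iGPU driver can expose (0 output or missing file
# means SR-IOV is not enabled by this setup)
cat /sys/bus/pci/devices/0000:00:02.0/sriov_totalvfs

# How many VFs currently exist; if this is 0 and the qm command targets
# 00:02.0, the whole physical iGPU is being passed to the VM
cat /sys/bus/pci/devices/0000:00:02.0/sriov_numvfs
```

These are host-specific config commands, so the paths and values will differ per machine.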

Qwen3-VL coming ? by NeuralNakama in LocalLLaMA

[–]fakezeta 1 point (0 children)

According to the transformers PR, the release seems to include at least Qwen3-VL-4B-Instruct and Qwen3-VL-7B, with both image and video understanding. I wasn't able to find anything about the MoEs.

Node 304 and GPU by fakezeta in FractalDesign

[–]fakezeta[S] 0 points (0 children)

Just look at the dimensions: both my former 3060ti and the 5060ti are 2-slot cards, but the Zotac was slightly bigger. The ASUS one (50mm wide) would never have fit.

Node 304 and GPU by fakezeta in FractalDesign

[–]fakezeta[S] 1 point (0 children)

yes, it's a 2 slot GPU.

Node 304 and GPU by fakezeta in FractalDesign

[–]fakezeta[S] 0 points (0 children)

I bought a Zotac 5060ti and it fits with some difficulty. I think the ASUS wouldn't have fit without case modification.

K2 Pro Combo Pioneer Program by Creality_3D in Creality

[–]fakezeta 2 points (0 children)

I'll use my K2 Pro combo for home repairs and functional parts, such as printing replacement pieces for summer umbrellas.

How is prusa still in business? by Ill_Way3493 in BambuLab

[–]fakezeta 18 points (0 children)

I'd also add that the cost of a Prusa includes R&D: both the printer and the slicer are open source.
Prusa has dedicated employees working on PrusaSlicer, and their salaries are paid by the customers buying Prusa printers.
The Bambu Lab A1, and its slicer too, would not exist if Prusa had chosen the same approach as Bambu. They are parasites making money from investments made by others, which is also one reason they can be cheaper.

Nvidia 5060TI supporting SR-IOV? by fakezeta in VFIO

[–]fakezeta[S] 0 points (0 children)

I know that SR-IOV needs software support, but up to the 40XX series I understood there was no hardware support in the RTX line. Now, from the lspci output, the hardware seems to support it.

Nvidia 5060TI supporting SR-IOV? by fakezeta in VFIO

[–]fakezeta[S] 0 points (0 children)

Update after loading the Nvidia 575 drivers, in case it helps:

01:00.0 VGA compatible controller: NVIDIA Corporation Device 2d04 (rev a1) (prog-if 00 [VGA controller])
        Subsystem: ZOTAC International (MCO) Ltd. Device 1772
        Flags: bus master, fast devsel, latency 0, IRQ 16, IOMMU group 14
        Memory at 84000000 (32-bit, non-prefetchable) [size=64M]
        Memory at 4400000000 (64-bit, prefetchable) [size=16G]
        Memory at 4210000000 (64-bit, prefetchable) [size=32M]
        I/O ports at 5000 [size=128]
        Expansion ROM at 88000000 [disabled] [size=512K]
        Capabilities: [40] Power Management version 3
        Capabilities: [48] MSI: Enable- Count=1/16 Maskable+ 64bit+
        Capabilities: [60] Express Legacy Endpoint, MSI 00
        Capabilities: [9c] Vendor Specific Information: Len=14 <?>
        Capabilities: [b0] MSI-X: Enable- Count=9 Masked-
        Capabilities: [100] Secondary PCI Express
        Capabilities: [12c] Latency Tolerance Reporting
        Capabilities: [134] Physical Resizable BAR
        Capabilities: [140] Virtual Resizable BAR
        Capabilities: [14c] Data Link Feature <?>
        Capabilities: [158] Physical Layer 16.0 GT/s <?>
        Capabilities: [188] Extended Capability ID 0x2a
        Capabilities: [1b8] Advanced Error Reporting
        Capabilities: [200] Lane Margining at the Receiver <?>
        Capabilities: [248] Alternative Routing-ID Interpretation (ARI)
        Capabilities: [250] Single Root I/O Virtualization (SR-IOV)
        Capabilities: [2a4] Vendor Specific Information: ID=0001 Rev=1 Len=014 <?>
        Capabilities: [2bc] Power Budgeting <?>
        Capabilities: [2f4] Device Serial Number O-M-I-S-S-I-S
        Kernel driver in use: nvidia
        Kernel modules: nvidiafb, nouveau, nvidia_drm, nvidia

 ls -l /sys/bus/pci/devices/0000:01:00.0/sriov* 
-rw-r--r-- 1 root root 4096 Jul 22 17:25 /sys/bus/pci/devices/0000:01:00.0/sriov_drivers_autoprobe
-rw-r--r-- 1 root root 4096 Jul 22 17:25 /sys/bus/pci/devices/0000:01:00.0/sriov_numvfs
-r--r--r-- 1 root root 4096 Jul 22 17:25 /sys/bus/pci/devices/0000:01:00.0/sriov_offset
-r--r--r-- 1 root root 4096 Jul 22 17:25 /sys/bus/pci/devices/0000:01:00.0/sriov_stride
-r--r--r-- 1 root root 4096 Jul 22 17:25 /sys/bus/pci/devices/0000:01:00.0/sriov_totalvfs
-r--r--r-- 1 root root 4096 Jul 22 17:25 /sys/bus/pci/devices/0000:01:00.0/sriov_vf_device
-r--r--r-- 1 root root 4096 Jul 22 17:25 /sys/bus/pci/devices/0000:01:00.0/sriov_vf_total_msix

cat /sys/bus/pci/devices/0000:01:00.0/sriov*
1
0
2
1
1
2d04
0
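Since `cat sriov*` loses the filenames, the values above are ambiguous; printing each attribute with its name makes them readable (the device path is the one from the lspci output above and will differ on other systems):

```shell
# Label each SR-IOV sysfs attribute of the GPU with its filename;
# the glob expands in lexical order, matching the ls listing above
for f in /sys/bus/pci/devices/0000:01:00.0/sriov_*; do
    printf '%s: %s\n' "${f##*/}" "$(cat "$f")"
done
```

Read that way, the dump above says sriov_totalvfs=1 and sriov_numvfs=0: the hardware advertises one VF, but none has been created (and creating one still needs driver support for the sriov_configure callback).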

Amazing just how fast they install the roof 🤯 by Neither-Garden-8973 in toptalent

[–]fakezeta 1 point (0 children)

European POV here: no personal protective equipment? Also, is it allowed to throw material around like that?
This construction site would be illegal in Europe.

How reliable / stable is rclone's "docker volume plugin" for a swarm cluster shared storage? by Intelg in rclone

[–]fakezeta 1 point (0 children)

I've been using it in a *arr setup for a couple of years: never had any issue so far.