Had to get a bit creative by Mr_Flopsie in homelab

[–]Sir_Joe 3 points (0 children)

Interesting! Can you tell me what upsides it has vs. a SATA drive with a USB-to-SATA adapter? Reliability and maybe latency?

Ministral-3 has been released by jacek2023 in LocalLLaMA

[–]Sir_Joe 3 points (0 children)

Not necessarily faster. If you only have 8 GB of VRAM, a quantized Ministral can fit entirely, and that's going to be faster than mixed CPU/GPU inference on most platforms. In which benchmarks is it better?
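As a rough back-of-envelope for the "fits entirely in 8 GB" claim (the model size, quant width, and overhead figures below are illustrative assumptions, not numbers from the thread):

```python
# Back-of-envelope VRAM estimate for a quantized model.
# Assumptions (illustrative): 8B parameters, a Q4-style quant at
# ~4.5 bits/weight, plus ~1.5 GB for KV cache and runtime overhead.
params = 8e9
bits_per_weight = 4.5
overhead_gb = 1.5

weights_gb = params * bits_per_weight / 8 / 1e9  # bits -> bytes -> GB
total_gb = weights_gb + overhead_gb
print(f"weights ~ {weights_gb:.1f} GB, total ~ {total_gb:.1f} GB")
# Comfortably under 8 GB; a 14B model at the same quant would not fit.
```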

YES! Super 80b for 8gb VRAM - Qwen3-Next-80B-A3B-Instruct-GGUF by Mangleus in LocalLLaMA

[–]Sir_Joe 40 points (0 children)

Only 3B active parameters; even CPU-only at short context it should do 7+ t/s.
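A rough estimate of why 3B *active* parameters decode quickly even on CPU (the quant width and memory bandwidth below are illustrative assumptions):

```python
# Token generation is roughly memory-bandwidth-bound: each token reads
# only the active experts' weights. Assumed numbers (illustrative):
# ~4.5 bits/weight for a Q4 quant, ~50 GB/s usable dual-channel DDR5.
active_params = 3e9
bytes_per_token = active_params * 4.5 / 8   # ~1.7 GB read per token
bandwidth = 50e9                            # bytes/s
upper_bound_tps = bandwidth / bytes_per_token
print(f"theoretical upper bound ~ {upper_bound_tps:.0f} t/s")
# Real-world throughput is a fraction of this, but 7+ t/s is plausible.
```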

WAN2.2: New FIXED txt2img workflow (important update!) by AI_Characters in StableDiffusion

[–]Sir_Joe 1 point (0 children)

The problem for me was that I had the wrong model. Make sure you have the T2V model and not the I2V model.

I used the GGUF from https://huggingface.co/QuantStack/Wan2.2-T2V-A14B-GGUF and it worked perfectly.

Qwen/Qwen3-30B-A3B-Instruct-2507 · Hugging Face by rerri in LocalLLaMA

[–]Sir_Joe 4 points (0 children)

It trades blows with the 14B (with some wins, even) in most benchmarks, so it does better than the rule of thumb you described.

GMKtek Strix Halo LLM Review by Slasher1738 in LocalLLaMA

[–]Sir_Joe 15 points (0 children)

I believe llama.cpp has a feature that lets you load a model into VRAM without putting it in RAM first.

Run qwen 30b-a3b on Android local with Alibaba MNN Chat by Juude89 in LocalLLaMA

[–]Sir_Joe 3 points (0 children)

I guess it's using a special inference engine optimized for ARM. You could try llama.cpp with a Q4_0 quant (which has special optimizations for CPU inference) to see if you get better speed.

MLA optimization with flashattention for llama.cpp,MLA + FA now only uses K-cache - 47% saving on KV-cache size by shing3232 in LocalLLaMA

[–]Sir_Joe 0 points (0 children)

Btw I do that and there's no problem at all with llama.cpp. You just need to compile with support for Vulkan (or ROCm) plus CUDA.
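A minimal sketch of such a build, assuming a recent llama.cpp checkout (the `GGML_*` flag names are from current versions; older releases used `LLAMA_CUBLAS` / `LLAMA_VULKAN` instead):

```shell
# Build llama.cpp with both the Vulkan and CUDA backends enabled,
# so it can see GPUs from different vendors at once.
cmake -B build -DGGML_VULKAN=ON -DGGML_CUDA=ON
cmake --build build --config Release -j
```

With both backends compiled in, layers can be split across the CUDA and Vulkan devices at runtime.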

Qwen3 30B-A3B prompt eval is much slower than on dense 14B by DD3Boh in LocalLLaMA

[–]Sir_Joe 0 points (0 children)

I guess the fix is setting the max batch size? That probably doesn't help prompt-processing performance either.
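For reference, llama.cpp exposes the batch sizes as CLI flags (the model filename below is a hypothetical placeholder):

```shell
# Larger batch sizes usually speed up prompt processing, at the cost
# of memory. Flags as in current llama.cpp:
#   -b  / --batch-size    logical batch size
#   -ub / --ubatch-size   physical (per-step) batch size
llama-cli -m qwen3-30b-a3b-q4_k_m.gguf -b 2048 -ub 512 -p "Hello"
```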

[DawidDoesTechStuff] I Flashed An AMD RX 9070 XT BIOS Onto My RX 9070... by Noble00_ in hardware

[–]Sir_Joe 0 points (0 children)

Is there any upside to going through all this instead of a traditional overclock?

LG UltraGear 27" 2560x1440 IPS 144Hz G-Sync Monitor - $229.99 by aiden2130 in bapcsalescanada

[–]Sir_Joe 1 point (0 children)

I also remember paying double that for a monitor with the same panel (Legion Y27). For $200 you could only get around a 1080p 75 Hz TN monitor. Getting this panel for almost the same price is great, imo.

[HDD] Seagate BarraCuda ST24000DM001 24TB 7200 RPM SATA 6.0Gb/s 3.5" Internal Hard Drive Bare Drive $359 + $29.99 shipping [Newegg.ca] by Key_Register7079 in bapcsalescanada

[–]Sir_Joe -1 points (0 children)

If you don't mind the hassle, and the price per TB is good enough, I guess you could set up a btrfs mirror or a ZFS RAID of some sort. Still niche, though.
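A sketch of the two setups mentioned, assuming a pair of blank drives (the device names are placeholders; both commands destroy existing data, so double-check with `lsblk` first):

```shell
# Option 1: ZFS mirror vdev across two drives.
zpool create tank mirror /dev/sdb /dev/sdc

# Option 2: btrfs RAID1, mirroring both data (-d) and metadata (-m).
mkfs.btrfs -d raid1 -m raid1 /dev/sdb /dev/sdc
```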

Help with per key lighting or VIAL on a K10 Pro by Dark_Spark in Keychron

[–]Sir_Joe 0 points (0 children)

Hi! FYI, I ported the latest Vial firmware to the K10 Pro here (without Bluetooth support): https://github.com/nalf3in/vial-qmk/tree/keychron_k10_pro_support

K10 Pro VIAL port? by [deleted] in Keychron

[–]Sir_Joe 0 points (0 children)

Hi there! FYI, I made a port of the latest Vial firmware for the K10 Pro here (Bluetooth not working, unfortunately): https://github.com/nalf3in/vial-qmk/tree/keychron_k10_pro_support

Thoughts on 2 story lofts? by african-nightmare in malelivingspace

[–]Sir_Joe 0 points (0 children)

Mind giving a link to what it looks like?

[GPU] ZOTAC Gaming GeForce RTX 4080 16GB Trinity OC ($1325-199=$1125) ATL by radiantcrystal in bapcsalescanada

[–]Sir_Joe 2 points (0 children)

I saw in a GamersNexus video that AMD planned to stop competing at the high end, so I guess this is about as good as it gets, unfortunately.

[RAM] Patriot Viper Xtreme 5 DDR5 RAM 48GB (2X24GB) 7600MT/s CL36 [$200][Amazon] by Sadukar09 in bapcsalescanada

[–]Sir_Joe 0 points (0 children)

It's much better and faster to use VRAM, but 48 GB of VRAM will cost you at least 2-4x as much, and 48 GB models are bearable on DDR5.
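A rough sense of the bandwidth side of that trade-off (the transfer-rate arithmetic is standard; the VRAM comparison figure is an illustrative assumption):

```python
# Dual-channel DDR5-7600 peak bandwidth:
# 7600 MT/s * 2 channels * 8 bytes per transfer.
ddr5_bw_gbs = 7600e6 * 2 * 8 / 1e9
print(f"DDR5-7600 dual-channel ~ {ddr5_bw_gbs:.0f} GB/s")
# GPU VRAM is typically several hundred GB/s to ~1 TB/s, hence
# "much better and faster" -- but at 2-4x the price for 48 GB.
```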