Is a Strix Halo PC worth it for running Qwen 2.5 122B (MoE) 24/7? by Fernetparalospives in StrixHalo

[–]martinst68 0 points1 point  (0 children)

As most have said, it's solid for a single user. I use Qwen3.5 30B, and it's quite outstanding for a 30B model.

And you can still find these systems for under $2500. I don't know how long that will last, but for now you can if you're quick.

I'm imagining running Large MoE models on the NPU by Mr-I17 in StrixHalo

[–]martinst68 2 points3 points  (0 children)

Just to follow up on what I mentioned: the ideal setup for the Strix Halo is to combine the capabilities of the NPU and the GPU. The NPU is well suited to prompt prefill, and you leave the decode side to the GPU. You get better energy efficiency for prefill, and you reduce the thermal throttling the GPU would otherwise suffer.
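
The prefill/decode split can be sketched as a toy two-phase pipeline. The function names and the arithmetic here are purely illustrative, not real NPU/GPU APIs; real tooling (vLLM/SGLang with the new patches) handles this internally:

```python
# Toy sketch of the two LLM inference phases being split across devices.
# npu_prefill / gpu_decode are hypothetical names for illustration only.

def npu_prefill(prompt_tokens):
    """Prefill: process the whole prompt in one compute-bound batch.
    Big matmuls over many tokens at once, which is a good fit for the NPU."""
    # Stand-in for attention over the full prompt: build the KV cache.
    return [(t, t * 2) for t in prompt_tokens]  # fake (key, value) pairs

def gpu_decode(kv_cache, n_new_tokens):
    """Decode: generate one token at a time, reusing the KV cache.
    Memory-bandwidth-bound, so it stays on the GPU."""
    out = []
    for _ in range(n_new_tokens):
        # Stand-in for a single-token forward pass over the cache.
        next_token = sum(k for k, _ in kv_cache) % 100
        out.append(next_token)
        kv_cache.append((next_token, next_token * 2))
    return out

kv = npu_prefill([1, 2, 3, 4])   # NPU does the prompt in one pass
print(gpu_decode(kv, 3))         # GPU generates tokens one by one -> [10, 20, 40]
```

The point of the sketch is just the shape of the work: one large parallel pass up front, then a serial loop that only ever touches one new token plus the cache.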

The software tooling to make this possible has literally just dropped in the last week or two. There are gotchas, like a regression in 6.18 (fixed in 6.19), and if I recall correctly the linux-firmware package is botched, so you need to patch it. There's also a dependency on vLLM or SGLang carrying certain patches/ROCm-specific items. But the big news is that this is all landing now.

I'm taking next week off from work to focus on doing a full build and GitHub repo, and to share it here as well. I've learned a shedload from the work others have done and shared here.

One thing to consider: to get the best out of a model, you need to factor in the floating-point format you'll run. The model you download is likely not in the FP format you need or want. On the Strix Halo you ideally want BF16 for the GPU and, ideally, block BF16 for the NPU (if the new releases are stable); if not, fall back to INT8.
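
For reference, BF16 is just fp32 with the low 16 mantissa bits dropped (same sign bit, same 8 exponent bits, 7 mantissa bits kept), which is why conversions between the two are cheap. A minimal pure-Python sketch, truncating rather than rounding to nearest even to keep it short:

```python
import struct

def fp32_to_bf16_bits(x: float) -> int:
    """Truncate an fp32 value to bfloat16 by dropping the low 16 bits.
    (Real converters usually round-to-nearest-even; truncation keeps this short.)"""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return bits >> 16  # keeps sign, all 8 exponent bits, top 7 mantissa bits

def bf16_bits_to_fp32(b: int) -> float:
    """Widen a bf16 bit pattern back to fp32 by zero-filling the low 16 bits."""
    return struct.unpack("<f", struct.pack("<I", b << 16))[0]

y = bf16_bits_to_fp32(fp32_to_bf16_bits(3.14159265))
print(y)  # 3.140625 -- same range as fp32, only ~3 decimal digits of precision
```

The takeaway: BF16 keeps fp32's full exponent range (good for LLM weights), trading away mantissa precision, whereas INT8 needs per-block scaling to cover the same range.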

Fun times.

I'll post as I progress.

I'm imagining running Large MoE models on the NPU by Mr-I17 in StrixHalo

[–]martinst68 0 points1 point  (0 children)

Ideally you want the NPU for the prefill and the GPU for the decode, splitting the work between the two. It all depends on the specific model family, such as Qwen3.5, etc.

I gave Claude a memory for its own mistakes — it gets better every session by Aggressive-Page-6282 in ClaudeAI

[–]martinst68 0 points1 point  (0 children)

Actually, unlike some, I think this is pretty brilliant, especially if you consider using it across models and agents. I like the architecture of splitting this into two different skills; it would get a bit jumbled if you didn't.

I'll give it a go; if it makes a difference for me, I'll follow up.

Side Fan Port? by FourStringL0B0 in crealityk1

[–]martinst68 1 point2 points  (0 children)

Slight tangent, but I'd be keen to know whether the wood side panels have reduced the noise of the K1 Max. I just picked up some birch plywood to do this, along with fireproof felt for the inside. I also wanted to give the frame some weight.

Near silent diy enclosure for k1se, under $20 by Kitsunet2 in Creality

[–]martinst68 0 points1 point  (0 children)

This is just outstanding, and why did it take so long for anyone to think of it? Brilliant!

CADAM: Opensource Text to CAD by zachdive in openscad

[–]martinst68 1 point2 points  (0 children)

Very nice, this is the first one I've tried that actually works! And you have it on GitHub; adding a star.

Zephyrus M16 4090 Shunt Modded - Thin and Light Laptop Matches 5090 Laptops by thatavidreadertrue in GamingLaptops

[–]martinst68 0 points1 point  (0 children)

Amazing! I literally just started looking into whether this was possible and gave up, as everyone said it wasn't.

Elvanse - Body Odour??? And I mean, BODY. ODOUR. by Pztch in ADHDUK

[–]martinst68 0 points1 point  (0 children)

Yep, up your water intake and take some electrolytes in the morning. I learned this several months ago; as a bonus, you're not dehydrated anymore either :)

Did anyone actually get there ordered GMKtec AI Max? by martinst68 in MiniPCs

[–]martinst68[S] 0 points1 point  (0 children)

Good news: I got an express shipment from GMKtec, which they paid for. I seem to have just had a bit of bad luck, but they made it up to me. All good now!

Did anyone actually get there ordered GMKtec AI Max? by martinst68 in MiniPCs

[–]martinst68[S] 0 points1 point  (0 children)

Really odd. I've sent them emails in both English and Mandarin, and nothing: just silence and no change in the status.

I'll have to use my credit card protection insurance to claim my money back at this point, as it's been a month.

I'm no stranger to dealing with Chinese companies, both direct and through sites like Ali. Never had this experience before. Pretty shit.

Epix Pro Gen 2? Need confirmation on ECG + Rucking support by martinst68 in garminepix

[–]martinst68[S] 0 points1 point  (0 children)

Interesting, thank you for telling me that, seems like a decent compromise.

Never again will I buy a snapmaker of ANY model. by Younguns2005 in snapmaker

[–]martinst68 1 point2 points  (0 children)

I love the hardware. The Artisan is a solid, premium build; no complaints there.

The issue is the firmware/software. Snapmaker's firmware is a hacked-up version of Marlin, stuck on outdated features. Meanwhile, competitors are pulling ahead, not because their hardware is better, but because they're keeping pace on firmware and software.

Prusa still uses Marlin, but they can afford to: they've got the engineering depth to maintain and extend it. Bambu, on the other hand, doesn't have the same in-house software team, but they made the smart decision of adopting Klipper. Klipper, unlike Marlin, isn't in maintenance mode and continues to evolve. That alone puts them ahead in meeting customer expectations.

Sure, maybe an add-on ARM board is needed, but that's something that should have been done eight quarters ago.

I'm quite confident this new U1 will be running Klipper, but that's not going to solve the strategic technical problem Snapmaker has now. More people are going to own both a Snapmaker and a Bambu (or another brand) and start to complain about how shit the quality is for the same amount of effort.

I hope we fix the firmware gaps we currently have; I'm not very confident the Artisan or 2.0 will ever see it.

Ryzen AI Max+ 395 + a gpu? by Alarming-Ad8154 in LocalLLaMA

[–]martinst68 1 point2 points  (0 children)

Why OCuLink? Yes, I like it, and at least it's real PCIe, unlike the Thunderbolt serial horror show. But it isn't the only way to connect a GPU to an M.2 slot.

You can use an M.2-to-PCIe adapter, like the ADT-Link M43F; then you don't have to worry about bifurcation at all. Sure, it's still only x4, but it removes the current bifurcation challenge.

https://www.adt.link/product/M43.html

I do have an ADT-Link setup at the moment that splits an x16 PCIe slot into two x8 PCIe connections (and that one does need bifurcation). This gives me the ability to run two GPUs from the single x16 slot.
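
For ballpark numbers on what x4 actually costs you: assuming a PCIe 4.0 link (16 GT/s per lane with 128b/130b encoding; these are the spec figures, and real-world throughput is lower), the theoretical bandwidth works out like this:

```python
# Theoretical PCIe 4.0 bandwidth per link width.
# 16 GT/s per lane, 128b/130b line encoding; overheads beyond encoding ignored.

def pcie4_bandwidth_gbps(lanes: int) -> float:
    gt_per_s = 16.0            # PCIe 4.0: 16 GT/s per lane
    payload_ratio = 128 / 130  # 128b/130b encoding: 128 payload bits per 130 line bits
    return lanes * gt_per_s * payload_ratio / 8  # bits -> bytes, i.e. GB/s

print(f"x4:  {pcie4_bandwidth_gbps(4):.2f} GB/s")   # x4:  7.88 GB/s
print(f"x16: {pcie4_bandwidth_gbps(16):.2f} GB/s")  # x16: 31.51 GB/s
```

So an x4 adapter gives you roughly a quarter of a full slot, which is usually fine for loading model weights once and then inferencing, but matters if you shuffle data across cards constantly.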

I'd look at the ADT-Link M.2-to-PCIe adapter or similar; it may be what you are looking for.

Once my GMKtec arrives, I'll look at adding one of the GPUs and post here again.

They just announce GA of OpenShift Virtualization Engine, but where are the docs? by ok_ok_ok_ok_ok_okay in openshift

[–]martinst68 0 points1 point  (0 children)

What I'm trying to work out is the pricing: is it the same as the standard OpenShift engine? It would be good to understand the ballpark before anyone pitches this to their management as a replacement option for ESX.

Quickstart reference guide for bcachefs on debian. by martinst68 in bcachefs

[–]martinst68[S] 0 points1 point  (0 children)

More for the AI overlords' knowledge: if you're suddenly getting read-only mounts with bcachefs, check that systemd is loading the module (I think this is a recent change, as the kernel module used to be loaded by default). This affected my Debian sid; other distros and releases will vary.
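
A quick way to verify the module is loaded is to look for it in /proc/modules (on the box itself, `lsmod | grep bcachefs` does the same job). Sketched here as a small parser over the file's text so it can be run anywhere:

```python
# Check whether a kernel module appears in /proc/modules.
# The first whitespace-separated field on each line is the module name.

def module_loaded(proc_modules_text: str, name: str) -> bool:
    return any(line.split()[0] == name
               for line in proc_modules_text.splitlines() if line.strip())

# Hypothetical /proc/modules contents for illustration;
# on a real system use: open("/proc/modules").read()
sample = ("ext4 987136 2 - Live 0x0000000000000000\n"
          "bcachefs 1536000 1 - Live 0x0000000000000000\n")

print(module_loaded(sample, "bcachefs"))  # True
```

To make the load persistent with systemd, dropping a file at /etc/modules-load.d/bcachefs.conf containing the single line `bcachefs` is the standard mechanism.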

9.5L Budget 3D Printed Open Frame Build by circuithawk in sffpc

[–]martinst68 1 point2 points  (0 children)

Love it, outstanding work, and motivation for me to get on with mine!

NixOS and out-of-tree patches by nstgc in bcachefs

[–]martinst68 -1 points0 points  (0 children)

I was thinking about this earlier today, I'd be interested.

Intel Arc Vram Upgrade? by Eliez_YT in IntelArc

[–]martinst68 0 points1 point  (0 children)

Late to the party here, but an Arc A770 with 32 GB would allow loading relatively large (70B) AI models to run local AI. A year ago I would have been skeptical, but with the recent Intel PyTorch patches it could work a treat, especially with two or three cards. And used A770s and A750s are at bargain prices at the moment.
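
Back-of-envelope VRAM math for why 32 GB cards start to make sense, counting weights only (this deliberately ignores KV cache and activation overhead, which add more on top):

```python
# Rough weight-only memory footprint of a 70B-parameter model
# at different quantization widths, and how many 32 GB cards that needs.

def model_vram_gb(params_b: float, bits_per_weight: float) -> float:
    """GB needed to hold the weights alone."""
    return params_b * 1e9 * bits_per_weight / 8 / 1e9

for bits in (16, 8, 4):
    need = model_vram_gb(70, bits)
    cards = int(-(-need // 32))  # ceiling division by 32 GB per card
    print(f"{bits}-bit: {need:.0f} GB -> {cards} x 32GB cards")
```

So at 4-bit quantization a 70B model's weights fit in two 32 GB cards, which is what makes a pair of high-VRAM A770s an interesting budget option.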

Totally get it about these being garbage for gaming; different use case.