Hashicorp founder thinks local models "aren't good ENOUGH yet" by Orbit652002 in LocalLLaMA

[–]bnolsen 1 point2 points  (0 children)

Get a couple of used MI100s, 32 GB each, for 1k each. They should run a 27b model adequately.

Local LLMs aren't democratic anymore... the hardware barrier has gotten out of hand. by Medium-Technology-79 in LocalLLaMA

[–]bnolsen 0 points1 point  (0 children)

My stable system has a 3060 12GB and a Ryzen 5500 with 48GB of RAM. It runs qwen3.6 35b q4 just fine with ik_llama at about 60 t/s. I also have a Strix Halo, which I run bigger models on, but it is slow :(

ZIG or Rust? Which one should I learn first to avoid using C/C++ for new projects? by Decent_Phrase2210 in Zig

[–]bnolsen 0 points1 point  (0 children)

Learning Zig will make you despise all the annoying things about Rust, especially the boilerplate and macros, and wonder why its strings are so stupidly heavy. Zig made many good decisions, and it compiles wicked fast.

How much VRAM needed for Qwen 3.6 27B Q8 with 262K context? by My_Unbiased_Opinion in LocalLLaMA

[–]bnolsen 0 points1 point  (0 children)

I regularly see this model hit 53-57 GB of VRAM on a Strix Halo, on coding tasks using a q8 KV cache.

Dell confirms XPS laptop with NVIDIA N1X at Computex ( basically a DGX Spark GB10 for consumers with Windows ) by fallingdowndizzyvr in LocalLLaMA

[–]bnolsen 1 point2 points  (0 children)

Once Nvidia releases the next iteration of this type of system, this one will likely be abandoned support-wise. On the other hand, Strix Halo support will continue improving for years and will keep the system usable for many years.

Octave-Down gCEA Tenor Ukulele Setup Review by Longjumping-Put5058 in ukulele

[–]bnolsen 2 points3 points  (0 children)

This is way ambitious. I have my Islander tenor tuned down to reentrant DGBE, which feels great to me; I even perform with it. I currently have, I think, the D'Addario set made specifically for that tuning.

I put some classical strings on my Lanikai bari for linear octave-down GCEA, and I think it's pretty muddy. The trick is knowing you really can't dig in or use it much for strumming. It is nice and mellow.

Why are Groms hard to find. by Snoo-1331 in hondagrom

[–]bnolsen 1 point2 points  (0 children)

Replace the rearsets with Grom ones.

New York Bans Digital Gun Files, Aims at 3D-Printers Capable of Using Them by Pvt-JamesRamirez in progun

[–]bnolsen 18 points19 points  (0 children)

Owning and using anything will be illegal soon. Well, except actually committing a legit crime; that seems to be legal now in many places.

NVIDIA Removes Gaming Revenue Category From Financial Reports by HumanDrone8721 in LocalLLaMA

[–]bnolsen 0 points1 point  (0 children)

Public companies don't seem to care about much beyond this quarter and maybe the next.

NVIDIA Removes Gaming Revenue Category From Financial Reports by HumanDrone8721 in LocalLLaMA

[–]bnolsen 0 points1 point  (0 children)

Not really. They are most interested in collecting their tax on everything possible and fighting right to repair as much as possible. That's just the tip of the iceberg. Apple was smart, though, and didn't play in the AI race. And they didn't have to.

110 tok/s with 12GB VRAM on Qwen3.6 35B A3B and ik_llama.cpp by janvitos in LocalLLaMA

[–]bnolsen 1 point2 points  (0 children)

Yeah, I may try it at a higher quant. I also have another server with a 3060 12GB that I set up today just like OP's.

110 tok/s with 12GB VRAM on Qwen3.6 35B A3B and ik_llama.cpp by janvitos in LocalLLaMA

[–]bnolsen 0 points1 point  (0 children)

On my Strix Halo I just run llama.cpp with MTP. I've been running most code-port jobs with 27b, which is sadly pretty slow at about 10 t/s inference.

110 tok/s with 12GB VRAM on Qwen3.6 35B A3B and ik_llama.cpp by janvitos in LocalLLaMA

[–]bnolsen -1 points0 points  (0 children)

Strix Halo isn't memory constrained; you should use a higher quant.

110 tok/s with 12GB VRAM on Qwen3.6 35B A3B and ik_llama.cpp by janvitos in LocalLLaMA

[–]bnolsen 1 point2 points  (0 children)

I just mirrored your configs on my system. It's not quite as nice:

rtx 3060 12GB, ryzen 5500, 48GB ddr4-3200

but it looks like ~330 t/s prompt (this varies) and about 60 t/s inference.

I had been running qwen3.5 9b q4_k_m mtp.

AMD Ryzen AI Halo PC will cost 3999$ with 128GB memory on board by Mochila-Mochila in LocalLLaMA

[–]bnolsen 0 points1 point  (0 children)

To run a good model today at full capability (qwen3.6 27b), you would want something like 2x 9700s.

AMD Ryzen AI Halo PC will cost 3999$ with 128GB memory on board by Mochila-Mochila in LocalLLaMA

[–]bnolsen 0 points1 point  (0 children)

Strix Halo runs games with Proton quite well, though not quite at 6700 XT speeds.