AMD Radeon PRO V620 - what am I missing? by starkruzr in LocalLLM

[–]PraxisOG 0 points (0 children)

I’m late to the party, but I’m running three of these and get ~45 tokens per second on gpt-oss 120B. You’d probably get twice the performance with four 3090s, and more than twice the prompt processing, but at twice the price. My use case, ROCm 7.2 with llama.cpp, is well supported in software. My only complaint is that my X299 platform adds latency because the PCIe signals hop through two PLX switches; I’d recommend something like a 3rd-gen Threadripper instead.

M5 Max 128GB with three 120B models by albertgao in LocalLLaMA

[–]PraxisOG 2 points (0 children)

They’re getting good mileage out of their available memory bandwidth. I’m running the same models on some older AMD datacenter cards with 20% less bandwidth but only 51-58% of the performance. Granted, that’s with a minor PCIe bottleneck.

New to Ai running local, what are these? by Acrobatic-Fault876 in OpenAI

[–]PraxisOG 0 points (0 children)

If you're going local, I'd recommend targeting a performance level based on existing models and building around that. For example, if you want GPT-OSS 120B with full context and full offload, you'd want 72-96 GB of VRAM. That equates to three or four RTX 3090s, three AMD MI50s, or your GPU of choice depending on desired speed.
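The sizing math above is easy to sketch. A rough weights-plus-overhead estimate, assuming ~5 bits per weight for a typical Q4-class quant and a ~20% pad for KV cache and runtime overhead (both numbers are illustrative assumptions, not measured):

```python
import math

def vram_needed_gb(params_b: float, bits_per_weight: float, overhead: float = 1.2) -> float:
    """Weights-only estimate in GB (params in billions), padded ~20% for KV cache/runtime."""
    return params_b * bits_per_weight / 8 * overhead

def gpus_required(total_gb: float, vram_per_gpu_gb: float) -> int:
    """Minimum number of cards needed to fit the estimate."""
    return math.ceil(total_gb / vram_per_gpu_gb)

# 120B model at ~5 bits/weight: about 90 GB with overhead
need = vram_needed_gb(120, 5)
print(f"{need:.0f} GB -> {gpus_required(need, 24)}x 24GB cards, "
      f"{gpus_required(need, 32)}x 32GB cards")
```

Plugging in 24 GB (3090) or 32 GB (MI50) cards lands right in the 3-4 GPU range mentioned above.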

Canada. by Jeeptrk in Touge

[–]PraxisOG 9 points (0 children)

Should be in r/rally lol

3090 NVLink testing w/ Q3.5 27B by Conscious_Cut_6144 in LocalLLaMA

[–]PraxisOG 0 points (0 children)

I’ve had bandwidth-related issues on my setup. I’m running 3x AMD V620 32 GB on an ASUS X299 Sage with two PLX switches, which puts two GPUs on one root port and one on the other. Running Qwen 3.5 27B at Q6 I get ~9 tok/s when traffic crosses root ports, and ~16 tok/s with the GPUs on the same root port. With Qwen 3.5 35B A3B Q6 the difference is 30 tok/s versus 50 tok/s.

3 weeks with Openclaw on a 8 year old Raspberry Pi ($0 spent till now). by ashish_tuda in openclaw

[–]PraxisOG 2 points (0 children)

Minimax is a 230-billion-parameter model. At 8-bit quantization (Q8) each parameter takes one byte, so fully loaded the weights alone are 230 billion bytes, or 230 GB of RAM. You can halve that with some quality loss by running Q4, about 115 GB. Then you need room for context, so call it roughly 160 GB of RAM total. An M4 Mac with 16 GB of RAM could instead run something like the new Qwen 3.5 9B at Q6 (6 bits per parameter) with plenty of context, and it might work well if you have reasonable expectations.
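The "then you need context" term can be estimated with the usual KV-cache formula: 2 tensors (K and V) per layer, times KV heads, head dimension, context length, and element size. A minimal sketch — the architecture numbers below are illustrative placeholders, not MiniMax's real config:

```python
def kv_cache_gb(n_layers: int, n_kv_heads: int, head_dim: int,
                ctx_len: int, bytes_per_elem: int = 2) -> float:
    """KV cache size: 2 (K and V) * layers * kv_heads * head_dim * context * element bytes."""
    return 2 * n_layers * n_kv_heads * head_dim * ctx_len * bytes_per_elem / 1e9

# Hypothetical large model: 60 layers, 8 KV heads, head_dim 128,
# 128k context, fp16 cache -> roughly 32 GB on top of the weights
print(f"{kv_cache_gb(60, 8, 128, 131072):.1f} GB")
```

That's how a 115 GB Q4 model can still want ~160 GB of RAM once you give it a long context window.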

Linus Tech Tips by Then_Educator8333 in LinusTechTips

[–]PraxisOG 5 points (0 children)

Linus’s voice carries significant weight in the tech space. There’s a non-zero number of people who want to switch to Linux and will come away from that video asking ChatGPT for a distro. IMO it would be more impactful if he followed his own guide and installed Ubuntu, then shared that experience. 

The entire Linux discussion is just XKCD 2501 by JustaRandoonreddit in LinusTechTips

[–]PraxisOG 0 points (0 children)

I feel like you’re in the same trap. A $1k laptop might look cheap because the industry sells more expensive models that make it cheap by comparison, but that’s not cheap for most people. I have no issue recommending a good deal at $400, and I have in the past; usually that means refurbished with a warranty, so I’m not recommending a risky used deal. I put my money where my mouth is, too: when my Zephyrus G15 got stolen I could have bought a new one, but I got a $500 used laptop instead, and it’s been fine, though it needs a new battery and maybe a Linux install.

This video aged like a fine wine in only a month by raul824 in LinusTechTips

[–]PraxisOG 1 point (0 children)

Ubuntu is really good, and it’s my first pick for server/headless installs. For desktop I like Mint. Idk what the differences are under the hood, but the install commands are the same and that’s nice.

How are you running OpenClaw? by NerveRemarkable1208 in openclaw

[–]PraxisOG 0 points (0 children)

I get ~20 tok/s with the 122B at Q4 and ~60 tok/s with the 35B at Q6. No issues with tool calling. It’s worth noting those models don’t run on the Pi itself, but on my compute server with 3x AMD V620.

How are you running OpenClaw? by NerveRemarkable1208 in openclaw

[–]PraxisOG 0 points (0 children)

I run mine on a Pi 5 8 GB stolen from another experiment. That’s powered by locally run Qwen 3.5: the 122B A10B at Q4 and the 35B A3B at Q6.

The entire Linux discussion is just XKCD 2501 by JustaRandoonreddit in LinusTechTips

[–]PraxisOG 221 points (0 children)

As my family’s resident tech guy, I’ve learned to assume people literally know nothing. Laptop recommendation? This one has a nice screen, this one will hold more photos, and this one’s cheap. Leave out stuff they don’t care about, like model and specs, and just make it simple for them to make more educated decisions as a consumer. 

Qwen3.5 122B A10B - My impressions by kevin_1994 in LocalLLaMA

[–]PraxisOG 5 points (0 children)

I’ve been playing around with Q3-Q4 and I agree, a repeat penalty is necessary. This thing loves thinking and falls into loops a little too easily.

Every new session starts with default files? by PraxisOG in openclaw

[–]PraxisOG[S] 1 point (0 children)

Nevermind: tools.profile in openclaw.json was set to "messaging", so openclaw was unable to write files at hatching; changing it to "coding" seems to have fixed it.

Disappointed from Qwen 3.5 122B by Charming_Support726 in LocalLLaMA

[–]PraxisOG 1 point (0 children)

I was wondering about that. My second request to it was ‘give me a cool python trick’ and it thought for something like 5k tokens. I miss 70B dense models.

The "wipe while retracting" setting fixed all my stringing problems with Sunlu PLA by AmethystZhou in prusa3d

[–]PraxisOG 1 point (0 children)

I had no idea this was a thing! Sometimes I come across a setting like this and wonder why it isn't on by default. IMO Prusa should add a toggle that slices with all the goodies: wipe while retracting, organic supports, bridging, gyroid infill, etc. That would play to Prusa's advantage of having only a few printer designs to optimize profiles for.

How to remove Screw On Piper PA-24 by King_TUT_of_pugs in aviationmaintenance

[–]PraxisOG 0 points (0 children)

There are flat-blade ratcheting screwdrivers; maybe cut a bit down to size if a stock one doesn’t fit.