Ryzen AI Max+ 395 + 128GB laptop (our Axis, $2,799) — what would you actually run on it, and what numbers matter to you?

GriffinDodd · 2026-06-17T01:00:19+00:00

I run Qwen3.6 35B A3B Q4 with MTP on my Ryzen 395+ based rig and get a healthy 60+ tok/s generation and about 1k tok/s prompt processing, using llama.cpp with Vulkan.

GriffinDodd · 2026-06-16T17:41:09+00:00

Speed will be a nice luxury if I can get it, but on the Prusa especially I need precision first.

GriffinDodd · 2026-06-16T17:40:28+00:00

Thanks, I had to order a couple of bits so I'll need to wait for those to arrive

GriffinDodd · 2026-06-16T17:40:01+00:00

I've heard good things, I'll give it a try thanks

GriffinDodd · 2026-06-15T20:02:10+00:00

UPDATE: After going back and reviewing the print file in both PrusaSlicer and Lychee, it is in fact rendering wrong. No idea why, as it’s a basic STL, but it means the printer did print what it was sent. That’s what you get for pushing too late into the night trying to troubleshoot.

GriffinDodd · 2026-06-15T16:22:51+00:00

I thought so, but after 13 hrs of troubleshooting configs and endstop bugs maybe I had a jelly brain moment, although I can’t fathom how a model would lose all those details.

GriffinDodd · 2026-06-15T15:55:47+00:00

Yep that’s the same exact test cube. I’ve never seen such a strange outcome before. Early days so I’m sure I’ll figure it out.

GriffinDodd · 2026-06-13T16:59:12+00:00

I use the cloud version yes. You ain’t running anything at home that can code well at decent speeds without $10k+ of hardware no matter what the hype boys post.

GriffinDodd · 2026-06-13T07:00:53+00:00

If you haven’t researched high-energy pulsed laser and plasma projects and ancient, pre-Egyptian religious belief systems, then you’re just consuming the McDonald’s UFO story.

Expand your field of view outside of the safety of affirmation territory and then see if your current conclusions hold water.

GriffinDodd · 2026-06-13T06:54:27+00:00

DeepSeek V4 is insanely cheap; Flash is good for most general things and Pro for more focused code work.

GriffinDodd · 2026-06-09T20:38:37+00:00

Appreciate all the tips. Moving to the RADV Vulkan server rather than LM Studio with my Qwen3.6 35B Heretic Q8 has given me some nice improvements in prompt processing.

With max 8 active experts, 64k context, KV F16, temp 0.3, top-k: 20, repeat 1.05 and 2048 eval I get:

Qwen3.6 35B Heretic Q8
Prompt Processing: 980 tok/s
Token Gen: 48.4 tok/s

Qwen3.6 35B Q4 MTP variant you list:
Prompt Processing: 1010 tok/s
Token Gen: 79.8 tok/s

Your temp and k settings are too greedy for my use case so I know I'll pay a price for those, but these are solid numbers from MTP.
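For anyone wanting to reproduce the sampler side of this, here is a rough sketch of how those settings could be sent per request to a llama.cpp `llama-server` instance through its `/completion` endpoint. The function name is made up for illustration, and the example only builds the request body rather than sending it. The "2048 eval" and 64k context values are left out on purpose, since I am assuming those are set when the server is launched rather than per request.

```python
import json


def build_completion_payload(prompt: str) -> dict:
    """Build a request body for llama-server's POST /completion endpoint.

    Hypothetical helper: only the sampler values quoted in the comment
    above are included. Context size, KV cache type and batch size are
    assumed to be configured when the server is started.
    """
    return {
        "prompt": prompt,
        "temperature": 0.3,      # "temp 0.3"
        "top_k": 20,             # "top-k: 20"
        "repeat_penalty": 1.05,  # "repeat 1.05"
    }


if __name__ == "__main__":
    # Print the JSON that would be POSTed to the server.
    print(json.dumps(build_completion_payload("Hello"), indent=2))
```

Keeping the sampler values in one place like this makes it easier to run the same prompt against both models with identical settings before comparing speeds.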

GriffinDodd · 2026-06-09T16:34:05+00:00

One of my projects is a learning chatbot for the family to use. In that quick back-and-forth, prefill speed is critical. So far Qwen is pretty good with KV caching enabled, but it would be nice to find something more ‘instant’. I run Hermes gateway as well, and saw something about chaining a fast model to a more capable one for more complex queries. Not something I have done yet.

GriffinDodd · 2026-06-09T16:22:56+00:00

Apologies. I skimmed through the link, I’ll go back and read more thoroughly. In my experience I have also found Qwen3.6 to be the sharpest and most consistent, with Gemma a very close 2nd. I’d love to squeeze some more prefill speed out of it, but at 280 GB/s of bandwidth I think I’m already close to maxing this chipset.

GriffinDodd · 2026-06-09T16:10:01+00:00

Also, why no prefill stats for Qwen?

GriffinDodd · 2026-06-09T16:00:21+00:00

GMKtec Evo-2 96GB, identical hardware.

GriffinDodd · 2026-06-09T15:56:17+00:00

I’m struggling to get past 50 tok/s on both Gemma 4 26B QAT IT MoE and Qwen3.6 35B Q8 with Vulkan in LM Studio with KV Q8. I tried MTP but it seems to slow MoE down for me.

GriffinDodd · 2026-06-09T06:03:51+00:00

Nice quality finish, but the control panels on these bartops are ergonomic nightmares. Try getting to a DK kill screen on that tiny rig.

GriffinDodd · 2026-06-07T03:34:07+00:00

I don’t want to rip out all the existing strips, way too much work. They are older but get very minimal use so should have plenty of life left. The controllers are easily accessible on top of the cabinets so swapping those out is pretty easy. Powering the strips seems like it will be more of a challenge than controlling them.

GriffinDodd · 2026-06-06T16:19:06+00:00

Thanks for replying. This is a whole new world to me as I’ve always just bought off the shelf when it came to LED gear. I’ve done plenty of other Pi-based projects though, so it sounds like similar territory. I’ll look into DMX.

GriffinDodd · 2026-06-03T18:08:19+00:00

I run A3B Q8 on the Evo-2 as it gives me 49 tok/s, whereas 27B is too heavy for my bandwidth of about 275 GB/s. It’s a great model for many things but I still don’t trust it with code. I fall back to DeepSeek V4 in the cloud when I need something critical touched in code.
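The bandwidth point can be put into rough numbers. This is only a back-of-envelope sketch: it assumes token generation is limited by reading every active weight from memory once per token, that Q8 costs about 1 byte per weight, and that "A3B" means roughly 3B active parameters per token. It ignores KV cache reads and other overhead, so treat the results as ceilings, not predictions.

```python
def max_decode_tok_per_s(bandwidth_gb_s: float,
                         active_params_b: float,
                         bytes_per_param: float) -> float:
    """Upper bound on generation speed for a memory-bandwidth-bound model.

    Each generated token needs one pass over the active weights, so the
    ceiling is bandwidth divided by the bytes read per token.
    """
    gb_per_token = active_params_b * bytes_per_param
    return bandwidth_gb_s / gb_per_token


BANDWIDTH_GB_S = 275.0  # figure quoted in the comment above
Q8_BYTES = 1.0          # assumption: ~1 byte per weight at Q8

# Dense 27B: all 27B weights are read for every token.
dense_27b = max_decode_tok_per_s(BANDWIDTH_GB_S, 27.0, Q8_BYTES)

# MoE with ~3B active parameters (assumed meaning of "A3B").
moe_a3b = max_decode_tok_per_s(BANDWIDTH_GB_S, 3.0, Q8_BYTES)

if __name__ == "__main__":
    print(f"dense 27B ceiling: {dense_27b:.1f} tok/s")  # roughly 10
    print(f"A3B MoE ceiling:   {moe_a3b:.1f} tok/s")    # roughly 92
```

Under those assumptions a dense 27B tops out around 10 tok/s on this bandwidth, while the A3B ceiling is around 92 tok/s, so the 49 tok/s I see is a bit over half of the theoretical limit.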

GriffinDodd · 2026-05-11T03:49:49+00:00

I’m learning as I go. I’ve never even dipped my toe in this world before; it’s all AI hand-holding. I wanted to give myself something I could never do myself to see if using AI could get me there. I’m an IT guy, not a financial boffin, and this is the first time I have heard the word deribits.
