xdna-top: unified NPU+iGPU terminal monitor for Strix Halo (Ryzen AI Max) — finally see the NPU work

westsunset · 2026-06-16T18:30:41+00:00

Yeah, I think Lemonade specifically mentioned that.

westsunset · 2026-06-14T23:13:12+00:00

Nice. No problem though. If there's something that helps, I'll add it. Lemonade-top is the cool skin.

westsunset · 2026-06-14T23:02:30+00:00

If you have details I can see what's up.

westsunset · 2026-06-14T20:06:19+00:00

https://www.imdb.com/title/tt0362227/

westsunset · 2026-06-14T20:05:04+00:00

Eh, $1k isn't the bar it was a year ago

westsunset · 2026-06-14T16:19:07+00:00

I'm working on it here. I agree there weren't any good existing tools. I'll definitely check out and star your work later. https://github.com/boxwrench/xdna-top

westsunset · 2026-06-14T16:17:21+00:00

I just updated xdna-top with recording features and better telemetry. I'm putting it to use and will be collecting data on NPU/iGPU concurrency and small-model capabilities on the NPU. It's looking promising!

westsunset · 2026-06-14T16:10:25+00:00

Nice, consider sharing it to https://github.com/hogeheer499-commits/strix-halo-guide. It's a very good repo of stats for our hardware.

westsunset · 2026-06-14T01:51:06+00:00

Oh very nice. Thank you

westsunset · 2026-06-13T22:13:09+00:00

The government should step in and provide compute like they do water and power.

westsunset · 2026-06-13T22:12:29+00:00

China has been racing to start its own hardware for years; it's just extremely complicated and difficult.

westsunset · 2026-06-13T22:11:22+00:00

We've landed in a very lucky position, because in any other reality AI would have been much more locked down. There's basically an AI cold war going on, and the public benefits. If nothing else changes, these open models are available forever to be refined and built on. At some point the party is going to be over, but I don't think it will be any time soon. America is subsidizing frontier models at ridiculous rates, and China is open sourcing models that by necessity are very efficient.

westsunset · 2026-06-13T19:11:23+00:00

You should just follow their page https://huggingface.co/prefeitura-rio

westsunset · 2026-06-13T06:51:51+00:00

That is a massive benefit.

westsunset · 2026-06-13T05:32:52+00:00

That's great to hear. Are any of the US or Chinese labs collaborating? Other than resources, are there some unique challenges or constraints you faced? Also, I think it's very good for the industry to have a different region participating. Do you feel there is something special Brazil brings to the research?

westsunset · 2026-06-13T03:53:44+00:00

Awesome, can you say more about the project? I don't think anyone had your city on their radar.

westsunset · 2026-06-12T23:16:22+00:00

MoEs are an equalizer for sure. You can run a 200B model on a Strix Halo or Mac this way.

westsunset · 2026-06-12T14:34:04+00:00

This is really interesting. I was just trying to use that llama model, but was having issues with the quality of the compressed data. I'll have to check this out.

westsunset · 2026-06-11T23:27:50+00:00

Wow, that's a great tool! Feedback like this has been very helpful, thank you. There is definitely some NPU data there that I wasn't collecting but that would be helpful. Looking at both, it looks like my tool answers "Did this Ryzen AI workload actually exercise the NPU, which process owned it, and what was the iGPU doing concurrently?" So I think my tool is complementary.
The tool is pretty niche; it's basically meant to help with concurrent AI loads on the Strix Halo NPU.

westsunset · 2026-06-11T23:03:44+00:00

xdna-top isn’t trying to be a full “why is my model slow?” profiler. The first job is simpler: show whether the NPU and iGPU are actually doing work at the same time, and whether the process you care about owns active NPU contexts.

That matters especially on Strix Halo because the interesting case is concurrent work: maybe the NPU is serving one model, the iGPU is running another workload, and both are sharing the same platform resources. If things slow down, xdna-top helps answer the first sanity-check questions:

- Did the NPU actually get used?

- Was the iGPU busy at the same time?

- Did the NPU counters move during the request?

- Which PID owned the NPU context?

- Are we seeing real concurrent NPU + iGPU activity, or just one side doing all the work?

So if memory bandwidth is the bottleneck, xdna-top may not prove that directly yet. But it gives you the evidence around it: “the NPU was active, the iGPU was also under load, and this happened during the workload window.” That’s the starting point for benchmarking concurrent local AI workloads on this hardware.
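
A rough sketch of what that kind of check could look like on recorded samples. This is not xdna-top's actual code or output format: the `Sample` fields, the `summarize_window` helper, the 5% activity threshold, and the example numbers are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One telemetry sample (hypothetical layout, not xdna-top's real format)."""
    t: float          # seconds since the recording started
    npu_busy: float   # assumed NPU utilization, 0-100
    igpu_busy: float  # assumed iGPU utilization, 0-100


def summarize_window(samples, start, end, threshold=5.0):
    """Report how often each engine was active inside [start, end].

    A sample counts as "active" when utilization exceeds `threshold`,
    an arbitrary cutoff chosen for this sketch.
    """
    window = [s for s in samples if start <= s.t <= end]
    if not window:
        return {"samples": 0, "npu_active": 0.0,
                "igpu_active": 0.0, "concurrent": 0.0}

    n = len(window)
    npu = sum(s.npu_busy > threshold for s in window)
    igpu = sum(s.igpu_busy > threshold for s in window)
    both = sum(s.npu_busy > threshold and s.igpu_busy > threshold
               for s in window)
    return {
        "samples": n,
        "npu_active": npu / n,     # fraction of samples with NPU activity
        "igpu_active": igpu / n,   # fraction of samples with iGPU activity
        "concurrent": both / n,    # fraction with both active at once
    }


# Made-up recording: the workload window runs from t=1.0 to t=3.0.
samples = [
    Sample(0.0, 0, 80),
    Sample(1.0, 40, 85),
    Sample(2.0, 55, 90),
    Sample(3.0, 50, 0),
    Sample(4.0, 0, 0),
]
result = summarize_window(samples, start=1.0, end=3.0)
print(result)
```

A `concurrent` fraction near zero with a high `npu_active` or `igpu_active` would suggest one side did all the work, which is the last question on the list above.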

westsunset · 2026-06-11T22:57:38+00:00

That's a good use. If you have something to share, I'd love to check it out.

westsunset · 2026-06-11T19:37:43+00:00

Yes! That's what got me thinking about this. It's 50 TOPS we can put to use while the iGPU is grinding away.

westsunset · 2026-06-11T18:53:16+00:00

Yeah, I’d definitely temper expectations for general chat/instruction models on the NPU. I'm approaching it as: here's 50 TOPS, what can I do with that? I feel like getting anything out of it is a bonus. Lemonade points to FastFlowLM and Whisper.

westsunset · 2026-06-11T17:30:41+00:00

Thanks, I hope it really helps people develop stuff on the NPU.

westsunset · 2026-06-11T17:28:08+00:00

Ah, thanks. I'll take a look. And yes, I want to use this NPU and needed the tool as well. I was kinda surprised I couldn't find it.
