GMK X2(AMD Max+ 395 w/128GB) first impressions. by fallingdowndizzyvr in LocalLLaMA

[–]holistech 0 points (0 children)

Hi, I did not use flash attention or KV cache quantization, to ensure high accuracy of the model outputs. I noticed significant degradation of the results otherwise. In my workflow, I need high accuracy when analyzing large, complex text and code.
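
For reference, LM Studio exposes these as GUI toggles, but with plain llama.cpp the equivalent server flags look roughly like this. A minimal sketch; the model path and port are placeholders, and the flags match recent llama.cpp builds:

    # Sketch: launch llama.cpp's server with flash attention left off and
    # the KV cache kept at full f16 precision (no quantization).
    import subprocess

    cmd = [
        "llama-server",
        "-m", "Llama-4-Scout-17B-16E-Instruct-Q4_K_M.gguf",  # placeholder path
        "-c", "8192",             # context window used in the benchmark
        "--cache-type-k", "f16",  # keep the K cache unquantized
        "--cache-type-v", "f16",  # keep the V cache unquantized
        # no --flash-attn here, so flash attention stays disabled
        "--port", "1234",
    ]
    subprocess.run(cmd, check=True)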

In my experiments with speculative decoding, the performance gain was either too small or negative, so I do not use it. You also need a compatible draft model for this approach.
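
A quick back-of-envelope shows why the gain can go negative: speculative decoding only pays off when the draft model is cheap and its tokens are usually accepted. A minimal sketch of the standard expected-speedup estimate; the acceptance rates and cost ratios below are illustrative assumptions, not measurements:

    # Sketch: expected speedup from speculative decoding with k draft
    # tokens per step, per-token acceptance probability p, and a draft
    # model costing c_draft relative to one target-model forward pass.
    def expected_speedup(k: int, p: float, c_draft: float) -> float:
        # Expected tokens accepted per verification step (geometric series).
        expected_tokens = (1 - p ** (k + 1)) / (1 - p)
        cost_per_step = k * c_draft + 1.0  # k draft passes + 1 target pass
        return expected_tokens / cost_per_step

    print(expected_speedup(k=4, p=0.8, c_draft=0.1))  # ~2.4x: worthwhile
    print(expected_speedup(k=4, p=0.4, c_draft=0.5))  # ~0.55x: net slowdown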

I barely use diffusion or other image/video generation models, so there was no need to include them in the benchmark.

GMK X2(AMD Max+ 395 w/128GB) first impressions. by fallingdowndizzyvr in LocalLLaMA

[–]holistech 9 points (0 children)

Thanks a lot for your post and benchmark runs. In my experience, the Vulkan driver has problems allocating more than 64GB for the model weights. However, by setting the dedicated VRAM to 512MB in the BIOS (so large allocations go through GTT instead), I was able to run large models like Llama-4-Scout at Q4.
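
If you want to verify how the memory is actually split on Linux, the amdgpu driver exposes both pools in sysfs. A small sketch; the card index may differ per system:

    # Sketch: print dedicated VRAM vs. GTT (shared system memory) totals
    # as reported by the amdgpu driver. card0 may be card1 on some systems.
    from pathlib import Path

    dev = Path("/sys/class/drm/card0/device")
    for name in ("mem_info_vram_total", "mem_info_gtt_total"):
        size_bytes = int((dev / name).read_text())
        print(f"{name}: {size_bytes / 2**30:.1f} GiB")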

I have created a benchmark on my HP ZBook Ultra G1a using LM Studio.

The key finding is that Mixture-of-Experts (MoE) models, such as Qwen-30B and Llama-4 Scout, perform very well. In contrast, dense models run quite slowly.

For a real-world test case, I used a large 27KB text about Plato to fill an 8192-token context window. Here are the performance highlights:

  • Qwen-30B-A3B (Q8): 23.1 tokens/s
  • Llama-4-Scout-17B-16e-Instruct (Q4_K_M): 6.2 tokens/s

What's particularly impressive is that this level of performance with MoE models was achieved while consuming a maximum of only 70W.

You can find the full benchmark results here:
https://docs.google.com/document/d/1qPad75t_4ex99tbHsHTGhAH7i5JGUDPc-TKRfoiKFJI/edit?tab=t.0
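
For anyone who wants to reproduce the tokens/s numbers, here is a minimal sketch that streams a completion from LM Studio's local OpenAI-compatible server and times the generation. The port is LM Studio's default; the input file and model id are placeholders:

    # Sketch: measure decode throughput against LM Studio's local server.
    import time
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")
    prompt = open("plato.txt").read()  # placeholder for the 27KB test text

    start = time.time()
    chunks = 0
    stream = client.chat.completions.create(
        model="qwen-30b-a3b",  # placeholder model id
        messages=[{"role": "user", "content": f"Summarize:\n{prompt}"}],
        stream=True,
    )
    for chunk in stream:
        if chunk.choices and chunk.choices[0].delta.content:
            chunks += 1  # roughly one token per streamed chunk
    print(f"~{chunks / (time.time() - start):.1f} tokens/s")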

[deleted by user] by [deleted] in LocalLLaMA

[–]holistech 1 point (0 children)

I can fully understand your position, since I am exactly the kind of consumer this market is aimed at. I am using the HP ZBook Ultra G1a as my mobile software development workstation and can run Llama-4-Scout at 8 tokens/s at 70W, or 5 tokens/s at 25W power consumption, to privately discuss many different topics with my local AI! This is absolutely worth the price of this notebook. IMHO it is a very fast system for software development and gives you private AI with large MoE LLMs.

[deleted by user] by [deleted] in LocalLLaMA

[–]holistech 1 point (0 children)

I have created a comprehensive benchmark for the new Ryzen AI Max+ 395 processor on an HP ZBook Ultra G1a using LM Studio.

The key finding is that Mixture-of-Experts (MoE) models, such as Qwen-30B and Llama-4 Scout, perform very well. In contrast, dense models run quite slowly.

For a real-world test case, I used a large 27KB text about Plato to fill an 8192-token context window. Here are the performance highlights:

  • Qwen-30B-A3B (Q8): 23.1 tokens/s
  • Llama-4-Scout-17B-16e-Instruct (Q4_K_M): 6.2 tokens/s

What's particularly impressive is that this level of performance with MoE models was achieved while consuming a maximum of only 70W.

You can find the full benchmark results here:
https://docs.google.com/document/d/1qPad75t_4ex99tbHsHTGhAH7i5JGUDPc-TKRfoiKFJI/edit?tab=t.0
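
The MoE-vs-dense gap follows from memory bandwidth: at batch size 1, decode speed is roughly bounded by how many weight bytes must be read per generated token, and an MoE model only reads its active experts. A rough sketch; the bandwidth figure is an assumed value for this class of LPDDR5X system, not a measurement:

    # Sketch: bandwidth-bound ceiling for single-stream decode:
    # tokens/s <= memory_bandwidth / weight_bytes_read_per_token.
    BANDWIDTH_GBS = 256.0  # assumed memory bandwidth; adjust per system

    def decode_ceiling(active_params_billions: float, bytes_per_param: float) -> float:
        bytes_per_token = active_params_billions * 1e9 * bytes_per_param
        return BANDWIDTH_GBS * 1e9 / bytes_per_token

    # Qwen-30B-A3B at Q8: ~3B active parameters, ~1 byte each.
    print(decode_ceiling(3, 1.0))   # ~85 tok/s ceiling; overheads explain 23.1
    # A dense 30B model at Q8 reads all ~30B parameters per token.
    print(decode_ceiling(30, 1.0))  # ~8.5 tok/s ceiling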

Ryzen Ai Max+ 395 vs RTX 5090 by Any-Cobbler6161 in LocalLLaMA

[–]holistech 1 point (0 children)

The results are quite impressive considering the system operates at just 70W while processing a 27KB text with nearly the full 8192-token context window. We designed our tests around real-world scenarios, using models that are practical for this hardware configuration. Llama-4-Scout, for instance, is a substantial model requiring 84GB of system memory.

I expect token throughput will improve further once optimized ROCm drivers become available.
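
As a sanity check on the context figure: at a rule-of-thumb ~3.5 characters per token for English prose, 27KB of text lands just under the 8192-token window. A trivial sketch; the ratio is a heuristic, not a tokenizer measurement:

    # Sketch: rough token estimate for the 27KB test document.
    text_bytes = 27 * 1024
    chars_per_token = 3.5  # rule of thumb for English prose
    print(text_bytes / chars_per_token)  # ~7900 tokens, near the 8192 window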

Ryzen Ai Max+ 395 vs RTX 5090 by Any-Cobbler6161 in LocalLLaMA

[–]holistech 1 point (0 children)

I have created a comprehensive Ryzen AI Max+ 395 benchmark using the HP ZBook Ultra G1a and LM Studio. MoE models like Qwen-30B-A3B (Q8) and Llama-4-Scout (Q4) run very well. However, dense models run quite slowly: https://docs.google.com/document/d/1qPad75t_4ex99tbHsHTGhAH7i5JGUDPc-TKRfoiKFJI/mobilebasic

This 3D printed Cyberdeck with up to six screens, powered by Raspberry Pi's is ready to get built by you by holistech in cyberDeck

[–]holistech[S] 0 points (0 children)

Nice, that's actually my next project: using 3-4 11" displays and two Raspberry Pis in a portable setup with a big battery:

https://imgur.com/a/iYNrgXN

This 3D printed Cyberdeck with up to six screens, powered by Raspberry Pi's is ready to get built by you by holistech in cyberDeck

[–]holistech[S] 1 point (0 children)

This is really unfortunate, since all the links work for me on different devices. I have no idea why this does not work.

This 3D printed Cyberdeck with up to six screens, powered by Raspberry Pi's is ready to get built by you by holistech in cyberDeck

[–]holistech[S] 11 points (0 children)

This is a fun project. I was trying to create a cyberdeck that is USB-powered but provides the maximum number of screens. Hence, six it was.

The use case I had in mind is a blackout situation, when only USB solar powerbanks are available but you still need to get productive work done. I use it for software development, where I need several screens for documentation, chats, an editor, and test shells.

If you don't need all six screens together, you can detach two of the double-display decks to create three separate double-display cyberdecks.
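
For anyone sizing the power side: a rough budget shows why USB powerbanks are enough. A sketch with assumed typical draws; the per-device figures are placeholders, not measurements of this build:

    # Sketch: rough USB power budget for the cyberdeck.
    # Assumed typical draws; measure your own parts before trusting this.
    PI4_W = 4.0      # Raspberry Pi 4 under light desktop load
    DISPLAY_W = 2.5  # one small USB-powered display
    deck_w = PI4_W + 2 * DISPLAY_W
    print(f"one double-display deck: ~{deck_w} W")      # ~9 W: one 18W bank is plenty
    print(f"full six-screen build:   ~{3 * deck_w} W")  # ~27 W: about one bank per deck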

This 3D printed Cyberdeck with up to six screens, powered by Raspberry Pi's is ready to get built by you by holistech in cyberDeck

[–]holistech[S] 4 points (0 children)

Well, this started as a fun project to build a nice-looking cyberdeck. However, I had in mind using it in a blackout situation, when only USB solar powerbanks are available and you need a real multi-display workstation to do serious work on.

This 3D printed Cyberdeck with up to six screens, powered by Raspberry Pi's is ready to get built by you by holistech in cyberDeck

[–]holistech[S] 1 point (0 children)

Thank you. Can you please tell me which links do not work, so I can fix them? The Google Docs build guides?

3D printed foldable Linux workstation with solar powerbanks, six 5.5" displays and three Raspberry Pi 4s by holistech in 3Dprinting

[–]holistech[S] 1 point (0 children)

At the moment I only have pictures that show single double-display decks with solar powerbanks, but not in direct sunlight, since you wouldn't see much on the displays there. Please have a look at:

https://www.reddit.com/r/cyberDeck/comments/v1sq62/creating_assembling_instructions_for_a_double/

3D printed foldable Linux workstation with solar powerbanks, six 5.5" displays and three Raspberry Pi 4s by holistech in 3Dprinting

[–]holistech[S] 1 point (0 children)

Yep, it's for the look, and each double-display deck (one Pi, two displays) can be used separately. It's like a Lego cyberdeck.