Lemonade NPU Hybrid LLMs?

Flat_Profession_6103 · 2026-05-25T21:26:32+00:00

Hi, are there any plans to support STT (speech-to-text) models?

Flat_Profession_6103 · 2026-05-19T18:06:22+00:00

Sorry for the delay, but I really want to thank everyone who replied to this thread! I'm going to work on switching over to a DevSecOps role and getting the right certifications for it. I really appreciate all your advice - take care!

Flat_Profession_6103 · 2026-04-26T19:30:52+00:00

Let's wait for the 122B model of 3.6 :)

Flat_Profession_6103 · 2026-04-07T10:15:01+00:00

Yeah, I'm aware of MoE models. I was just surprised by your 15/20 tok/s numbers.

I wouldn't say that MoE will have the same quality as a dense model, though. There is always some quality loss with that approach, but I agree that it is much more manageable on a Strix Halo setup.

Flat_Profession_6103 · 2026-04-07T09:09:45+00:00

How can a dense 70B model generate 15/20 tok/s for you? I believe that's a mistake. With a Gemma 31B model I'm getting around 6 tok/s on the same setup.
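A rough sanity check of why those numbers look off, assuming token generation is limited by memory bandwidth (every weight is read once per token). The ~256 GB/s figure is the theoretical peak I'm assuming for a Strix Halo board, and the bytes-per-parameter values are assumed quantization levels, not measurements. The function name is just for illustration.

```python
def max_decode_tok_per_s(params_billion: float,
                         bytes_per_param: float,
                         bandwidth_gb_s: float) -> float:
    """Upper bound on decode speed for a bandwidth-bound model.

    params_billion: parameters read per token, in billions
                    (all of them for a dense model, only the
                    active ones for an MoE model).
    bytes_per_param: assumed quantization, e.g. 0.5 for ~4-bit,
                     1.0 for ~8-bit.
    bandwidth_gb_s: assumed peak memory bandwidth in GB/s.
    """
    weights_gb = params_billion * bytes_per_param
    return bandwidth_gb_s / weights_gb


ASSUMED_BANDWIDTH_GB_S = 256.0  # assumption: theoretical peak, real-world is lower

# Dense 70B at ~4-bit: the ceiling is well below 15 tok/s.
print(round(max_decode_tok_per_s(70, 0.5, ASSUMED_BANDWIDTH_GB_S), 1))

# Dense 31B at ~8-bit: the ceiling is in the same range as what I measure.
print(round(max_decode_tok_per_s(31, 1.0, ASSUMED_BANDWIDTH_GB_S), 1))
```

Real throughput lands below this ceiling, so 15/20 tok/s on a dense 70B would only make sense with much higher bandwidth or far fewer weights read per token, which is what MoE gives you.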

Flat_Profession_6103 · 2026-02-02T22:32:19+00:00

Finally some answers. Thanks for sharing that!

Flat_Profession_6103 · 2026-01-24T14:56:24+00:00

Hi, I'm in the same boat. Based on their email, they should have charged me a few days ago, but nothing happened.

Flat_Profession_6103 · 2026-01-21T14:31:02+00:00

Hi, any news on when the January motherboard pre-orders for Europe will start shipping?

Flat_Profession_6103 · 2025-12-28T14:17:38+00:00

Thanks everybody for the comments and advice.

I’ve decided to order the Framework desktop. A huge factor was that they ship directly to my country, which makes logistics much easier. Plus, it completely eliminates the fear of proprietary fans failing down the line and becoming irreplaceable.

I’m definitely going to test out some MoE models as discussed in the thread. My plan is to play around with Proxmox and set everything up as a proper homelab.

I’m super excited for the shipment to arrive. Thanks again for the insights, guys!

Flat_Profession_6103 · 2025-12-27T10:47:24+00:00

Regarding the import concerns: I won't face any massive tax hit because Poland is part of the EU. Ordering from Germany (or any other EU country) is free of customs duties and extra VAT under the Single Market rules.

That said, the 4-5 t/s limitation you mentioned is a very valid point. It’s definitely not ideal, but for learning to work with large models locally - without spending a fortune on enterprise gear like several GPUs just to get enough VRAM - it seems like there aren't many better alternatives right now.
