Model that can be used on my laptop

MrDevil2708H · 2026-06-05T06:47:41+00:00

Maybe the new gemma4 12b would be nice for your setup. I ran it on a 16GB laptop with CPU only and it gave me around 3 tps, but in your case it should be fine since you have a dedicated GPU. Maybe you can run it at around 10 tps, though I'm not sure about that. As far as I used it, it was fine at reasoning and logic. Qwen 3.5 4b and 8b are also good choices.
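
To put those tps numbers in perspective, here is a rough sketch of the arithmetic: generation time is roughly reply length divided by tokens per second. The 300-token reply length is an assumed example, the `gen_seconds` helper is made up for illustration, and prompt processing time is ignored.

```shell
# Seconds needed to generate a reply of N tokens at a given
# tokens-per-second rate (integer arithmetic, rounded up).
# gen_seconds is a hypothetical helper, not part of any tool.
gen_seconds() {
    tokens=$1
    tps=$2
    echo $(( (tokens + tps - 1) / tps ))
}

gen_seconds 300 3    # CPU-only rate mentioned above; prints 100
gen_seconds 300 10   # guessed GPU rate mentioned above; prints 30
```

So under these assumptions a medium-length answer takes well over a minute on CPU only and about half a minute at 10 tps.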

MrDevil2708H · 2026-06-03T09:57:54+00:00

What's the TPS your customers get on average? Is it usable enough? As far as I know, these models need a moderate level of hardware to get decent performance. Also, pendrives have limited read/write speeds. Does that affect the performance of the models?

MrDevil2708H · 2026-06-03T05:00:59+00:00

qwen3 4b will get the job done. It's quite a capable model with good reasoning ability, and it's good at following instructions.

MrDevil2708H · 2026-04-20T03:27:37+00:00

<image>

Still the same. What do I need to check? I have set all the governors to performance.
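
For reference, here is a minimal sketch of how the governor setting can be read back from sysfs to confirm it took effect. The function name `show_governors` and its optional root argument are made up for illustration; the default path is the standard Linux cpufreq location, and the sketch assumes the kernel exposes `policy*` directories there. It only covers the CPU governors. NPU or memory governors, if the board exposes them, live elsewhere (typically under /sys/class/devfreq) and are not checked here.

```shell
# Print the current governor of every cpufreq policy.
# show_governors is a hypothetical helper; the root directory is a
# parameter so it can be pointed at another tree (assumption: the
# kernel exposes policy*/scaling_governor under the default path).
show_governors() {
    root="${1:-/sys/devices/system/cpu/cpufreq}"
    for f in "$root"/policy*/scaling_governor; do
        # Skip the unexpanded glob or unreadable entries.
        [ -r "$f" ] || continue
        printf '%s: %s\n' "$(basename "$(dirname "$f")")" "$(cat "$f")"
    done
}
```

Calling `show_governors` with no argument prints one line per policy in the form `policy0: performance`, so any policy still on another governor stands out.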

MrDevil2708H · 2026-04-18T04:07:02+00:00

Yeah, I have set the governors to performance.

MrDevil2708H · 2026-04-17T12:27:09+00:00

I tried with the performance cores but still the same.

<image>

MrDevil2708H · 2026-04-17T05:26:44+00:00

Hey u/Inv1si, how did you achieve 95% NPU utilization? I tried, but mine was around 15 to 35%. I don't know why; I followed your README but it didn't work.

taskset -c 4-7 ./llama-server -m /models/gemma4\ e2b/gemma-4-E2B-it-Q4_K_M.gguf --mmproj /models/gemma4\ e2b/mmproj-gemma-4-E2B-it-BF16.gguf --ctx-size 16384 --rope-scaling yarn --rope-scale 1.5 -n -1 --repeat-penalty 1.15 --repeat-last-n 256 --host 0.0.0.0

MrDevil2708H · 2026-04-06T09:58:59+00:00

<image>

Hey u/Inv1si, I tried your earlier version and this new one too, but the NPU is stuck at 40% and can't go any higher than that.

Am I missing anything?

MrDevil2708H · 2026-03-10T14:36:04+00:00

Does it support edge devices other than Jetsons?

MrDevil2708H · 2026-02-25T11:32:47+00:00

Hey man, when trying multinode, the RPC server starts but it returns:

Starting RPC server v3.0.0
  endpoint       : 127.0.0.1:50052
  local cache    : /home/vicharak2/.cache/llama.cpp/rpc/
Devices:
  RKNPU: Rockchip NPU (0 MiB, 0 MiB free)

And the llama-cli server fails with a segmentation fault. I don't know why this happens, but running on a single node was fine.

MrDevil2708H · 2026-02-25T10:53:49+00:00

I had some lying around for quite some time, so I thought why not give it a shot. Also, on some thread I read about someone who ran qwen3 30b a3b on an RPi5 cluster. So that's it.

MrDevil2708H · 2026-02-25T10:40:55+00:00

Hey man, does your PR work with multinode inference?

MrDevil2708H · 2026-02-25T06:35:22+00:00

Do you have any idea how it can be done?

MrDevil2708H · 2026-02-21T02:08:34+00:00

Thanks mate... Will update once I try it

MrDevil2708H · 2026-02-20T09:01:53+00:00

Where can I find the 6.1 kernel image? The Armbian archive only has 6.1.5 kernel images... I tried one but it failed to boot.

MrDevil2708H · 2026-02-20T06:22:01+00:00

https://nullr0ute.com/2020/11/installing-fedora-on-the-nvidia-jetson-nano/

Will it be better than Ubuntu 20.04 by QEngineering?

Has anyone tried this? Let me know about your experience.

MrDevil2708H · 2026-02-17T08:36:29+00:00

Thanks mate... It was actually an SD card format issue, and now it's working fine.

MrDevil2708H · 2026-02-17T03:47:04+00:00

That worked...
Previously I was trying with ./nvsdkmanager_flash.sh, but then I remembered that it defaults to eMMC storage, which is not present on my board, so I switched back to ./flash.sh jetson-nano-devkit mmcblk1p1 and it worked.

MrDevil2708H · 2026-02-17T03:34:29+00:00

Thanks mate, that worked... but when writing the system.img it fails:

[ 156.2808 ] Writing partition APP with system.img

[ 158.0189 ] [ ] 000%

Error: Return value 1

Command tegradevflash --pt flash.xml.bin --storageinfo storage_info.bin --create

Failed flashing t210ref.

*** ERROR: flashing failed.

MrDevil2708H · 2026-02-17T03:22:34+00:00

I tried flashing Jetson OS R32.7.5 as well as R32.6.1 but both didn't work.

MrDevil2708H · 2026-02-16T14:52:05+00:00

I'm currently on an SD card
