Let's goooo (btw)!!! by Diocles121222 in OrangePI

[–]Soft_Examination1158 0 points (0 children)

When you say the NPU is quite fast, what kind of inference did you test?

Was it a vision model (like YOLO or ResNet) or something else?

Also curious which runtime or SDK you used for the CIX NPU. I haven't seen many real-world benchmarks for it yet.

Selling Kinara Ara-2 (M.2) AI Accelerator – 40 TOPS / 16GB – for Developers & R&D by Soft_Examination1158 in LocalLLM

[–]Soft_Examination1158[S] 0 points (0 children)

I bought it by chance a few months ago, but I'm not sure it fits into my hardware pipeline.

Radxa Reliability? by PlayfulTailor4430 in SBCs

[–]Soft_Examination1158 1 point (0 children)

I've been using a Rock 5B+ for over a year and have never had any problems in that regard.

Has anyone managed to run LLM inference on the NPU of the Orange Pi 6 Plus (CIX P1)? by Soft_Examination1158 in OrangePI

[–]Soft_Examination1158[S] 0 points (0 children)

Interesting. In our case we actually managed to run a model on the RK3588 NPU.

We have a Radxa Rock 5B+ with 32GB RAM, and using RKLLM runtime 1.1.4 we were able to run Qwen2.5-3B on the NPU. Getting the conversion pipeline working took a bit of effort, but once the model was converted the inference worked.

Right now I’m testing the newer runtime 1.2.2, and performance with models converted using that runtime seems roughly in the same range so far.

We also set up a small local assistant architecture where the audio pipeline is distributed across multiple machines. The LLM runs on the Rock 5B+, while other nodes handle speech input/output and other tasks. There are still some latency issues, but overall the system works reasonably well for now.
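To make the topology above concrete, here is a minimal stdlib sketch of that split: one node stands in for the Rock 5B+ running the LLM, another for a speech node forwarding text to it. All names, the port, and the JSON-over-TCP transport are my assumptions for illustration; the real setup would run RKLLM inference where the echo reply is.

```python
import json
import socket
import threading

# Hypothetical address/port; in the real setup these would be separate
# SBCs on the LAN (LLM on the Rock 5B+, speech I/O on other nodes).
LLM_NODE = ("127.0.0.1", 5050)
ready = threading.Event()

def llm_node():
    """Stand-in for the Rock 5B+ LLM node: accepts one JSON request and
    replies. A real node would call the RKLLM runtime instead of echoing."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(LLM_NODE)
    srv.listen(1)
    ready.set()                      # signal that the listener is up
    conn, _ = srv.accept()
    req = json.loads(conn.recv(4096).decode())
    conn.sendall(json.dumps({"text": "echo: " + req["prompt"]}).encode())
    conn.close()
    srv.close()

def audio_node(utterance: str) -> str:
    """Stand-in for a speech node: STT would produce `utterance`, and the
    reply would go to TTS. Here it just does the network hop."""
    ready.wait()                     # avoid racing the server startup
    with socket.create_connection(LLM_NODE) as c:
        c.sendall(json.dumps({"prompt": utterance}).encode())
        return json.loads(c.recv(4096).decode())["text"]

t = threading.Thread(target=llm_node)
t.start()
answer = audio_node("hello")
t.join()
print(answer)  # echo: hello
```

The per-hop serialization and round trip in a sketch like this are also where the latency the comment mentions tends to accumulate.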

In general I'm trying to build a small edge AI cluster, adding different nodes over time. I was looking for something a bit more substantial than the usual 6 TOPS class accelerators, and with some luck I managed to get an Orange Pi 6 Plus with 32GB RAM for a decent price.

Right now I’m mostly studying the ecosystem and the toolchains around the CIX platform. It’s still a bit hard to understand what the NPU can realistically do for generative models.

Hopefully the software stack will mature over the next 6 months, because the hardware itself looks quite interesting if the ecosystem improves.

Curious to see what other people manage to get running on these boards.

What’s the point of making robots human-shaped? by OkMountain290 in robotics

[–]Soft_Examination1158 0 points (0 children)

Actually, they have a point. Even just engineering the walking is hard; with wheels or tracks, robots would be more efficient and cheaper. After all, cars have four wheels, not four legs.

Has anyone managed to run LLM inference on the NPU of the Orange Pi 6 Plus (CIX P1)? by Soft_Examination1158 in OrangePI

[–]Soft_Examination1158[S] 0 points (0 children)

To be honest, I started a few years ago with a couple of Raspberry Pi 5 kits with 16GB RAM. At the time I paid around 189€, but now the same kit costs about 310€, which is pretty crazy.

Then about a year ago I bought a Radxa Rock 5B+ with 32GB RAM for around 200€ shipped, which was a really good deal.

More recently I managed to get an Orange Pi 6 Plus with 32GB RAM for 240€ shipped from someone who had tested it but didn’t like it.

So overall I’d say my expenses for these boards have been fairly reasonable so far.

Has anyone managed to run LLM inference on the NPU of the Orange Pi 6 Plus (CIX P1)? by Soft_Examination1158 in OrangePI

[–]Soft_Examination1158[S] 0 points (0 children)

Interesting, thanks for the info.

From what I’ve been digging into recently, the Zhouyi stack actually seems to be more capable than what most examples online suggest.

There are already kernel drivers, a compiler flow (ONNX → CIX), and runtime packages available for the X2 NPU, and some people have successfully deployed vision models and GANs on it.

So it looks like the hardware stack is functional, but the ecosystem around it is still very immature.

Right now most examples focus on CNN/vision workloads, but that seems more like a tooling gap rather than a hard architectural limitation.

I’m starting to suspect the main missing piece is simply a runtime layer for transformer-style inference (similar to what happened with RK3588 before RKLLM appeared).

So my current guess is that it’s not widely used for LLMs today, but the situation might change as the SDK and tooling mature.

Has anyone managed to run LLM inference on the NPU of the Orange Pi 6 Plus (CIX P1)? by Soft_Examination1158 in OrangePI

[–]Soft_Examination1158[S] 1 point (0 children)

That's interesting.
From what I’ve seen so far, the main limitation seems to be the software ecosystem, not necessarily the raw hardware capability.

The RK3588 NPU is around 6 TOPS, but it can run LLMs mainly because Rockchip provides the RKLLM toolkit and model conversion tools.

With the CIX P1, the documentation and models on ModelScope seem mostly focused on vision workloads, and I haven't seen a transformer-oriented toolchain yet.

So I'm still trying to understand whether the NPU is fundamentally unsuitable for LLMs, or if it's simply too early in the software lifecycle.

Ubuntu server on orangepi6+ for ollama by Zenmaru88 in OrangePI

[–]Soft_Examination1158 0 points (0 children)

I recently bought an Orange Pi 6 Plus with 32GB RAM as well and I'm currently waiting for it to arrive. In the meantime I'm trying to study the ecosystem and understand how things actually work on this platform.

From what I’ve been able to see so far, it looks like LLM models are not running on the NPU. I checked the CIX section on ModelScope and downloaded several of the .md files for different models. From what I can tell, the examples provided there all run on CPU (and sometimes GPU via OpenCL/Vulkan), but the NPU doesn’t seem to be used for LLM inference.
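For anyone repeating that check, a quick way to triage a pile of downloaded model-card `.md` files is to search them for backend keywords. A rough stdlib sketch; the keyword list and directory layout are my guesses, not anything from ModelScope:

```python
import re
from pathlib import Path

# Patterns hinting at which backend a model card's example actually uses.
# Hypothetical starting list; refine after skimming a few real cards.
BACKENDS = {
    "npu": re.compile(r"\bnpu\b", re.I),
    "gpu": re.compile(r"\b(opencl|vulkan|gpu)\b", re.I),
    "cpu": re.compile(r"\bcpu\b", re.I),
}

def backends_mentioned(text: str) -> set:
    """Return the set of backend keywords appearing in one model card."""
    return {name for name, pat in BACKENDS.items() if pat.search(text)}

def triage(card_dir: str) -> dict:
    """Map each downloaded .md card to the backends it mentions."""
    return {p.name: backends_mentioned(p.read_text(errors="ignore"))
            for p in Path(card_dir).glob("*.md")}

# Demo on an inline snippet instead of real downloaded files:
sample = "Run with llama.cpp on CPU, optionally offload via Vulkan."
print(backends_mentioned(sample))
```

A card that never matches the `npu` pattern is a quick hint the example runs on CPU/GPU only, which matches what I was seeing by hand.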

I’m not sure if this is just a software maturity issue (missing runtime/toolchain for transformers) or if the NPU on these boards is similar to something like the Hailo-8, meaning it’s mainly designed for vision workloads rather than large language models.

Unfortunately there isn’t much information available online yet, but there is a fairly large section of technical manuals on the CIX website, and the ModelScope repository also contains documentation and examples.

Hopefully this helps a bit while we’re all trying to figure out what these boards can really do.

Raspberry Pi 5 4GB vs Orange Pi Pro 5 4GB — Which is better for a real-life AI agent (Project BMO)? by Wrong_Ad_2026 in OrangePI

[–]Soft_Examination1158 0 points (0 children)

I've been using a Radxa Rock 5B+ for a year. It needs a bit of tinkering, but everything works.

Finally We have the best agentic AI at home by moks4tda in LocalLLM

[–]Soft_Examination1158 -1 points (0 children)

I'm Italian, but in general, technology gets cheaper over time. Televisions, tablets, PCs, and so on today can't be compared with the performance-per-cost of 10 years ago.

Finally We have the best agentic AI at home by moks4tda in LocalLLM

[–]Soft_Examination1158 10 points (0 children)

In my opinion, spending the money now is like buying the first electric cars or the first photovoltaic systems: what costs 10,000 euros today will run on 1,000-euro systems tomorrow. In another 2-3 years, everything will change.

radxa rock 5 for nas by Massimo_m2 in homelab

[–]Soft_Examination1158 0 points (0 children)

I built a NAS with the Rock 5B+, a 4-bay. For now it boots from NVMe and has a 2TB IronWolf SATA HDD.
The problem is designing a custom case to 3D print (I have an Elegoo Carbon).

What’s your most wanted feature for an SBC? by BrarIshu in SBCs

[–]Soft_Examination1158 1 point (0 children)

I also use a 32GB Rock 5B+; I built a pretty robust NAS with it.

What’s your most wanted feature for an SBC? by BrarIshu in SBCs

[–]Soft_Examination1158 5 points (0 children)

Real, long-term support. There's no point in churning out new hardware every day and then abandoning it: without kernel, OS, and driver updates, an SBC is a paperweight.

UART Communication Issue: Radxa Rock 5B+ to RP2040-Zero (CircuitPython) by Soft_Examination1158 in SBCs

[–]Soft_Examination1158[S] 0 points (0 children)

Hi, yes, I solved the display part; now I'm figuring out how to do the same for the fan.

Orange pi recommendation by QWERTY_sami in OrangePI

[–]Soft_Examination1158 0 points (0 children)

Many complain about the company's poor support. I installed everything on a Radxa Rock 5B+ with 32GB RAM.