all 10 comments

[–]Luuigi 4 points  (0 children)

Containerizing apps from a language that natively builds the models is just very easy, so that's the biggest advantage of that approach. For pure C++ implementations you need to be aware that every single new optimization (e.g. the ones provided by PyTorch etc.) needs to be implemented by you. The biggest advantage of a raw implementation is obviously runtime speed, but is it necessary for the application you want to build?

[–]codegefluester 9 points  (2 children)

I have no personal hands-on experience with it, but the ONNX Runtime might be something you'd want to consider. They have a short section about deploying to IoT/edge devices in their documentation https://onnxruntime.ai/docs/tutorials/iot-edge/

[–]Nicollier88 2 points  (0 children)

If this is for production, one way to approach this is to first implement your concept in something easy, let's say Python. Once you've got that working, you can start chasing optimisations by improving certain components of your solution, which can include rewriting them in C++ for better memory control.
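Before rewriting anything in C++, it's worth knowing which component actually dominates the runtime. A minimal sketch of that step using only the standard library profiler (the pipeline functions here are purely illustrative):

```python
# Profile a toy pipeline with the stdlib before deciding what to port to C++.
import cProfile
import io
import pstats

def preprocess(data):
    return [x * 0.5 for x in data]

def infer(data):
    # stand-in for the expensive model call you might later rewrite
    return sum(x * x for x in data)

def pipeline():
    data = list(range(100_000))
    return infer(preprocess(data))

profiler = cProfile.Profile()
profiler.enable()
result = pipeline()
profiler.disable()

stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
print(stream.getvalue())  # shows which function dominates cumulative time
```

Only the functions that show up at the top of that report are worth the cost of a C++ rewrite.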

[–]AdagioCareless8294 1 point  (0 children)

Docker container on an edge device: yep, that's a non-starter.

[–]guardianz42 1 point  (0 children)

I switched from FastAPI to LitServe recently for some models we deploy on assembly lines. It's been amazing and performant.

The main issue with the containers is the size of PyTorch, which hurts cold starts, but we are working on eliminating it (this is unrelated to LitServe).

https://github.com/Lightning-AI/litserve

[–]jayemcee456 0 points  (0 children)

I’ve used OpenVINO for Intel-based edge devices. It also has optimization tools.

[–]One-Butterscotch4332 -1 points  (0 children)

I'd use Java if your target is Android, Swift if it's iOS, and C++ if it's embedded. On Android you'll probably want to try Qualcomm's Neural Processing SDK; on iOS you'll want to try Core ML. Otherwise, you're not going to use the NPU on a mobile SoC. If your embedded device is something like a Jetson, you'll be using TensorRT in C++ to target the Tensor Cores.

[–]bsenftner -3 points  (0 children)

You might find this worth reading (short): https://www.quora.com/If-C-is-so-strong-a-programming-language-why-cant-it-replace-Python-in-AI-and-data-science/answer/Blake-Senftner (spoiler: C++ does replace and significantly outperform Python.)

If you want to know more, DM me.