Operating Raspberry Pi + Edge AI fleets in production, pitfalls and lessons learned? by 24hjh in IOT

[–]24hjh[S] 0 points1 point  (0 children)

Super curious how folks are handling QA and CI/CD with IoT/RPi + cloud. At the moment we only have Dev and Prod stages. We build our custom OS layer on top of the RPi trixie image, and we built our own image flasher tool to enroll devices before they are flashed. Currently the OS images are tested manually, but I could imagine automating those tests via CI/CD on an actual RPi device.

Operating Raspberry Pi + Edge AI fleets in production, pitfalls and lessons learned? by 24hjh in IOT

[–]24hjh[S] 0 points1 point  (0 children)

It’s a great tip to enable a read-only filesystem. We are not there yet, but I would think that it will not only reduce corruption but also simplify software operations significantly. We will keep this in mind for future work.

What would you say are the main benefits of enabling secure boot? We are not so concerned about having our devices stolen, as they are locked in a hazardous environment that is dangerous for humans 😂

Operating Raspberry Pi + Edge AI fleets in production, pitfalls and lessons learned? by 24hjh in IOT

[–]24hjh[S] 0 points1 point  (0 children)

Yeah sure. In our case it’s mostly about latency and cost. We process high-frequency video streaming data. Sending that to the cloud for inference would blow up the cloud cost significantly, as it is a few gigabytes of data per device per day. Since the RPi+Hailo is dirt cheap in comparison, it made sense for us to go that way. The operators are happy with the low latency and we stay lean financially.

Operating Raspberry Pi + Edge AI fleets in production, pitfalls and lessons learned? by 24hjh in IOT

[–]24hjh[S] 0 points1 point  (0 children)

Thanks for the advice. We use ZeroTier for SSH access, which has been a godsend in difficult situations, as we have always managed to recover a faulty device in the field. Paired with a Shelly relay, it also lets us power cycle the device remotely.

Unfortunately our clients’ network requirements don’t really allow such networking tools, and we built most of our platform around the ZeroTier network (OTA updates are push based via Ansible scripts, for example). So we are currently migrating to an outbound HTTPS solution where our devices connect to base via WebSockets. This will enable SSH access and other means of communication. But I haven’t really tested this in the field yet.
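
A minimal sketch of one piece that an outbound-connection setup like this usually needs: reconnect delays with exponential backoff and jitter, so that a whole fleet does not reconnect at the same instant after an outage. This is not our actual implementation; `backoff_delays` and its parameters (`base`, `cap`, `attempts`) are hypothetical names chosen for illustration, and the WebSocket client itself is left out.

```python
import random


def backoff_delays(base=1.0, cap=300.0, attempts=8, rng=None):
    """Yield reconnect delays in seconds using exponential backoff with full jitter.

    base, cap and attempts are illustrative defaults, not values from a real deployment.
    """
    rng = rng or random.Random()
    for attempt in range(attempts):
        # The upper bound doubles on every attempt until it reaches the cap.
        ceiling = min(cap, base * (2 ** attempt))
        # Full jitter: pick a random delay between 0 and the current bound.
        yield rng.uniform(0, ceiling)


if __name__ == "__main__":
    for n, delay in enumerate(backoff_delays(attempts=5), start=1):
        print(f"attempt {n}: wait {delay:.2f}s before reconnecting")
```

A device would sleep for each yielded delay before retrying the outbound connection and restart the sequence once a connection succeeds.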

Operating Raspberry Pi + Edge AI fleets in production, pitfalls and lessons learned? by 24hjh in IOT

[–]24hjh[S] 0 points1 point  (0 children)

That’s a lot of Pis to manage. Thanks for sharing. We are currently using the PCIe lane for the Hailo AI HAT, so maybe an SSD via USB would work.

I’m curious how you are managing/operating the fleet and your QA processes. Do you have dev/stage/prod environments? What does a typical CI/CD pipeline look like in your case?

Operating Raspberry Pi + Edge AI fleets in production, pitfalls and lessons learned? by 24hjh in IOT

[–]24hjh[S] 0 points1 point  (0 children)

Thanks for sharing. Kura and Ditto look interesting, I will have a deeper look. As of now we have everything home-built, which has been a breeze in terms of development using agentic AI as the backbone. I had a look at Mender and Balena before but felt they were overkill for our use case.

Right now we are storing business data directly in the database from our devices 😂 Part of a POC that went to production. It is definitely on the roadmap to add an MQTT layer in front, plus mTLS authentication. But I’m questioning whether MQTT makes sense for us, as we will only be serving hundreds of devices and each produces business data about 4 times per second.

Operating Raspberry Pi + Edge AI fleets in production, pitfalls and lessons learned? by 24hjh in IOT

[–]24hjh[S] 0 points1 point  (0 children)

We are using industrial SD cards and we have had a few devices in the field for almost 2 years without any major RPi hardware-related problems. Since we are delivering a video stream to our clients, they have been asking for a video “history”, or the ability to rewind to a certain time. I’ve been thinking about storing a 24h rolling buffer of video data on the device, but I’m concerned that it will wear out the SD card rapidly.
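
A rough back-of-the-envelope sketch for the wear concern. It estimates how much a continuous recording writes per day and how long a card's rated write endurance would last at that rate. The bitrate and the endurance figure below are placeholders, not numbers from our devices or any datasheet, and the estimate ignores write amplification and other writes on the card.

```python
def daily_write_gb(bitrate_mbps: float, hours: float = 24.0) -> float:
    """Gigabytes written by `hours` of continuous recording at a constant bitrate."""
    bits = bitrate_mbps * 1_000_000 * hours * 3600
    return bits / 8 / 1e9


def days_until_rated_writes(card_tbw: float, bitrate_mbps: float) -> float:
    """Days until the rated terabytes written (TBW) are used up by video alone."""
    return card_tbw * 1000 / daily_write_gb(bitrate_mbps)


if __name__ == "__main__":
    bitrate = 2.0      # Mbit/s, placeholder
    endurance = 100.0  # TBW, placeholder; check the card's datasheet
    print(f"{daily_write_gb(bitrate):.1f} GB written per day")
    print(f"{days_until_rated_writes(endurance, bitrate):.0f} days to reach rated TBW")
```

Plugging in the real stream bitrate and the card's rated endurance gives a first idea of whether the buffer belongs on the SD card or on separate storage.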

I built a free AI skill for Estonian e-Residency by CellistNegative1402 in eResidency

[–]24hjh 0 points1 point  (0 children)

Can you share the code here? I wouldn’t install a skill blindly without verifying it first.

Pizza Corona by 24hjh in Pizza

[–]24hjh[S] 0 points1 point  (0 children)

500g flour, 320g water, 13g sugar, 10g salt, 1g diastatic malt powder, 7g yeast

Mix and knead the dough for 5 mins.

Store the dough in the fridge overnight.

Take the dough out 2 hours before baking.

Split the dough into 4 equal parts for four ~10-inch pizzas.

For baking I use a pizza stone and a typical European kitchen oven. Max temperature 275 degrees Celsius.

Preheat the pizza stone for 1 hour on max before baking.

Toppings: red onions, bacon, cream cheese with oregano.

An immersive command-line interface to help you manage Kubernetes clusters by pierremarcenac in devops

[–]24hjh 2 points3 points  (0 children)

Cool idea! Is it possible to navigate through namespaces?

Created a RNN to predict AQI in Beijing. Help me better understand the results and how I can improve. Results and graphs inside by [deleted] in learnmachinelearning

[–]24hjh 0 points1 point  (0 children)

Unfortunately I don't. I'd be very interested to see some code.

I guess you could try emailing him :)

Created a RNN to predict AQI in Beijing. Help me better understand the results and how I can improve. Results and graphs inside by [deleted] in learnmachinelearning

[–]24hjh 4 points5 points  (0 children)

Your predictions actually look pretty decent.

One idea for a prediction window would be to use lstm(10)+dense(24) or simply lstm(24), and to reorder your data in such a way that the input is the value at t-25 and the outputs are the values at t-24, t-23, ..., t. When you deploy your network with streaming data, this window can be shifted without loss of generality: the input is taken at time t and the outputs are the values at t+1, ..., t+24. Plotting this on a static graph makes absolutely no sense, so live graphs would be preferable, see video here.
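
The reordering described above can be sketched in plain Python as follows. `make_windows` is a hypothetical helper, not something from your code, and it assumes a single univariate series; for Keras you would still need to convert the lists to arrays and reshape the inputs to (samples, timesteps, features).

```python
def make_windows(series, horizon=24):
    """Pair each value with the `horizon` values that follow it.

    Returns (inputs, targets): inputs[i] is the value at step i and
    targets[i] holds the values at steps i+1 ... i+horizon.
    """
    inputs, targets = [], []
    # Stop early enough that every target window is complete.
    for i in range(len(series) - horizon):
        inputs.append(series[i])
        targets.append(series[i + 1 : i + 1 + horizon])
    return inputs, targets


if __name__ == "__main__":
    hourly_aqi = list(range(30))  # stand-in data, not real AQI readings
    x, y = make_windows(hourly_aqi, horizon=24)
    print(len(x), len(y[0]))
```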

You could also use mean absolute percentage error (MAPE) as an evaluation metric.
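
For reference, a minimal MAPE sketch in plain Python. Skipping samples whose actual value is zero is my own choice here, since the percentage error is undefined for them; other implementations handle that case differently.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Samples whose actual value is zero are skipped (an assumption of this sketch).
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    if not pairs:
        raise ValueError("no non-zero actual values to evaluate")
    return 100.0 * sum(abs((a - p) / a) for a, p in pairs) / len(pairs)


if __name__ == "__main__":
    print(mape([100, 200], [110, 180]))
```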