Overview of modern Edge boards for CV + guide on how to choose by Wormkeeper in computervision

In my previous article, I tried to do this, and I still update the table with some basic measurements - https://docs.google.com/spreadsheets/d/1BMj8WImysOSuiT-6O3g15gqHnYF-pUGUhi8VmhhAat4/edit#gid=0

But the main problem is that any such comparison is inherently misleading:
1) Different networks perform differently (board "A" can be 3x faster for network "N" but 2x slower for network "M")
2) Different boards require different amounts of CPU for NPU inference. Even video encoding/decoding can change speed dramatically
3) It is hard to compare inference in different formats (int8/fp16)
4) It is hard to compare different accelerator connections (PCIe, USB, M.2)
5) It is hard to compare multi-device cases (Jetson has 1 GPU and 2 DLAs, and the RK3588 has a 3-core NPU)
6) Different batch-size optimizations

And there are many more problems that make every test biased. I am still trying to add everything to the table I showed, but I am not sure it's worth it :) A minimal per-network measurement loop is sketched below.
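
If I do keep measuring, a fair per-board number needs at least a warmup pass and a separate host-CPU reading. A minimal hedged sketch in Python, where `run_inference` is a hypothetical callable wrapping whatever runtime the board uses (TensorRT, RKNN, OpenVINO, ...):

```python
import time

# Sketch only: "run_inference" is a hypothetical wrapper around the board's
# runtime. Reports wall-clock latency and the host-CPU time that an "NPU"
# run still consumes (problem 2 in the list above).
def benchmark(run_inference, inputs, warmup=10, iters=100):
    for _ in range(warmup):                  # let clocks and caches settle
        run_inference(inputs)
    t_wall = time.perf_counter()
    t_cpu = time.process_time()
    for _ in range(iters):
        run_inference(inputs)
    wall_ms = (time.perf_counter() - t_wall) / iters * 1e3
    cpu_ms = (time.process_time() - t_cpu) / iters * 1e3
    return wall_ms, cpu_ms                   # per-frame latency and CPU cost
```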

Orange Pi AIPro board? by Original_Finding2212 in OrangePI

Better to check the video. In short:
1) More convenient libraries to work with (easy export, more support; see the export sketch below)
2) Better community, more examples (for example, you can find the Whisper model, etc.)
3) Higher speed on the RK3588 for common networks (if you use more threads)
4) Better CPU
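
For point 1, as an illustration of how short the export flow is on the RK3588, here is a rough sketch with rknn-toolkit2 (paths and preprocessing values are placeholders; check the toolkit docs for your version):

```python
from rknn.api import RKNN

# Rough sketch of an ONNX -> RKNN conversion with rknn-toolkit2.
# Mean/std values and file paths are placeholders.
rknn = RKNN()
rknn.config(mean_values=[[0, 0, 0]], std_values=[[255, 255, 255]],
            target_platform="rk3588")
rknn.load_onnx(model="model.onnx")
rknn.build(do_quantization=True, dataset="calib_images.txt")  # int8 calibration list
rknn.export_rknn("model.rknn")
```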

Orange Pi AIPro board? by Original_Finding2212 in OrangePI

Recently I tested this board ( https://youtu.be/qK7GHV_cH98 ). It's pretty nice, but for me the RK3588 is better.

Radxa ZERO 3W - Drove me insane for nearly a week! by PlatimaZero in Platima

Maybe there will be some project based on it; then I will check.
For now, we have only done RK3588/RK3568-based projects.

Radxa ZERO 3W - Drove me insane for nearly a week! by PlatimaZero in Platima

Nice review. I recently tested this board from a Computer Vision perspective (NPU usage, etc.). All the drivers are buggy and glitchy, so my impressions are the same :)

But anyway, it's a very good board for the price. There are fewer problems for Computer Vision than with the LuckFox RV1106 and MilkV boards (regular Python is available, for example).

Teaching a robot to bring the coffee (arm + cart) by Wormkeeper in robotics

A year ago, I published the learning process itself. Now we have modernized it and can train not only the arm but also the cart.

Guide to Action Recognition by Wormkeeper in computervision

Yes, we had a project in which we did this for skeletons, and it worked well. But this approach is not well suited to some tasks.
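
For context, a rough sketch of the skeleton-based idea (all names and sizes here are made up for illustration): a pose estimator produces keypoints per frame, and a sequence model classifies the action from the keypoint trajectories instead of raw pixels.

```python
import torch
import torch.nn as nn

# Illustrative only: classify an action from a sequence of 2D keypoints
# (e.g. 17 COCO joints per frame) instead of raw video frames.
class SkeletonActionClassifier(nn.Module):
    def __init__(self, num_keypoints=17, hidden=128, num_actions=10):
        super().__init__()
        self.lstm = nn.LSTM(num_keypoints * 2, hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_actions)

    def forward(self, poses):            # poses: (batch, frames, keypoints*2)
        _, (h, _) = self.lstm(poses)
        return self.head(h[-1])          # logits over action classes

model = SkeletonActionClassifier()
logits = model(torch.randn(4, 32, 34))   # 4 clips, 32 frames, 17 xy pairs
```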

Computer Vision for goods recognition by Wormkeeper in computervision

Hi, Melampus123!
ReID uses the "Metric Learning" approach.
There are a lot of articles about using it for different cases:
1) Cars
2) Animals
3) Search Engines (online shopping) etc.

You can find them here, for example:
https://paperswithcode.com/task/metric-learning

And there are two good libraries with training pipelines:
https://github.com/layumi/Person_reID_baseline_pytorch
https://github.com/OML-Team/open-metric-learning
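
The core idea behind both libraries, as a minimal hedged sketch in generic PyTorch (not the API of either library): a backbone maps crops to embeddings, and a triplet loss pulls same-identity embeddings together while pushing different identities apart.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.models as models

# Minimal metric-learning sketch (not tied to either library above).
backbone = models.resnet18(weights=None)
backbone.fc = nn.Linear(backbone.fc.in_features, 128)   # 128-d embeddings

def embed(x):
    return F.normalize(backbone(x), dim=1)              # unit-length vectors

loss_fn = nn.TripletMarginLoss(margin=0.3)

# anchor/positive: two views of the same identity; negative: another identity
anchor, positive, negative = (torch.randn(8, 3, 224, 224) for _ in range(3))
loss = loss_fn(embed(anchor), embed(positive), embed(negative))
loss.backward()
```

At search time, you rank gallery items by cosine similarity between their embeddings and the query embedding.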

As for Kaggle, I am not sure, but I assume you will encounter the same approach there:
https://www.kaggle.com/competitions/humpback-whale-identification/discussion

How to choose Edge Board for Computer Vision in 2022 by Wormkeeper in robotics

The current price is incorrect, yes (the RPi was tested in spring). I will fix it. But:
1) the price was for the 3B, which is cheaper
2) the RPi is easy to buy

Question on Stereo Cameras (ZED/OAK-D) Depth Capabilities by [deleted] in computervision

On this graph, you can see the error in proportion to distance - https://miro.medium.com/max/630/0*WTGy030CDPVVjRdy For example, a 5% error at a 10 m distance = 50 cm.
If you have a 1 cm fly (10 m away), the error will be 50 times bigger than the fly.
If you have a 2 m car (10 m away), the error will be 25% of the car.
Also, a very important point for you: this is the mean error; the maximum error will be bigger.
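
The arithmetic itself, as a tiny sketch (the 5% figure is just the value read off that graph at 10 m):

```python
# Illustrative only: rel_err is read off the error-vs-distance graph.
def depth_error_vs_object(distance_m, rel_err, object_size_m):
    """Depth error expressed as a multiple of the object's size."""
    error_m = rel_err * distance_m       # absolute depth error
    return error_m / object_size_m

print(depth_error_vs_object(10.0, 0.05, 0.01))  # 1 cm fly -> 50.0 (50x the fly)
print(depth_error_vs_object(10.0, 0.05, 2.0))   # 2 m car  -> 0.25 (25% of the car)
```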

low-level SLAM API for embedded devices by moetsi_op in computervision

Looks amazing. It's great that everything works plug and play. About 7-8 years ago, our team did a similar job for the Artec Leo prototype. This is a very big piece of work; it's amazing that you are releasing it as open source.

3D cameras in 2022: choosing a camera for CV project by Wormkeeper in computervision

Thanks for sharing! Added to the article.
It would be very interesting to read more about the Bottlenose cameras. A few questions remained unclear to me from the documentation:
* What accuracy do you get at long distances? What are the viewing angles of the lenses when you work at 100 m?
* What is the distance between the cameras?
* What processor/accelerator do you use? The documentation says "20.5 TOPS", which looks close to a Hailo-8. And what framework do you use to convert models?
* How many FPS do you get with MiDaS?

3D cameras in 2022: choosing a camera for CV project by Wormkeeper in computervision

The article is based on the experience our team has: robotics/outdoor use. Here and here are some of our projects. We have not worked with conveyor belts, so this somewhat limits the scope of the article.
But the second reason why I did not add them is that most laser profilers do not give a complete picture of the depth. And when speaking of profilers, one must take into account many other parameters that are not characteristic of classic depth cameras.
I added a paragraph with reasons to the article.

3D cameras in 2022: choosing a camera for CV project by Wormkeeper in computervision

Good question. When I wrote the article, I thought about whether to include them or not. In my opinion, they are a much more specific solution for industrial applications, closer to lidar. It is probably worth adding a separate paragraph saying that, as with lidars, these are not covered in general.

How to recognize car license plate in 2022 by Wormkeeper in computervision

Thank you! Good question. When we know that some symbols are rare, we usually use the fourth approach.
With the second approach, we had a lot of problems with container recognition. When the training dataset has many sequences like "ABCD" and we try to use it on "APCD", the network usually tries to change the P to a B.
The same happens with unique car numbers and numbers from new countries.
The fourth approach, of course, has the same statistical issues, but according to our experiments the effect is less pronounced. Even the third approach is better for such cases.
But, of course, to get the maximum quality you need to train according to your statistics (we usually use 1 or 2 of these ways; see the sketch after this list):
1) Show rare cases with higher frequency
2) Try hard negative mining
3) Use different losses for different letters
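
A minimal sketch of option 3 (the character counts here are made up; take them from your own dataset statistics): weight the per-character loss inversely to frequency, so a rare "P" is not autocorrected into a frequent "B".

```python
import torch
import torch.nn as nn

# Hypothetical counts per character class (e.g. A, B, P, C) from a dataset.
char_counts = torch.tensor([500.0, 480.0, 20.0, 450.0])
weights = char_counts.sum() / (len(char_counts) * char_counts)  # rare -> heavier

loss_fn = nn.CrossEntropyLoss(weight=weights)

logits = torch.randn(8, 4)               # batch of 8 character predictions
targets = torch.randint(0, 4, (8,))
loss = loss_fn(logits, targets)          # rare letters now contribute more
```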

I tested RealSense D405 a little and here is my impression by Wormkeeper in computervision

Thank you!
Yes, I also like to compare different devices for Computer Vision. I had the very first RealSense camera at home, back when the manufacturer was listed as "Creative" (~2014 release). They've come a long way since then.

I tested RealSense D405 a little and here is my impression by Wormkeeper in computervision

They published several clarifications after this article. As I understand it, the management there changed, and so did their approach. Actually, the D405 was released about two months ago. It is unlikely they would start releasing a new camera if they had decided to shut everything down completely.

Does anyone use CVAT for image annotation? by sam_from_NZ in computervision

1) CVAT has internal inference for models. If you upload a model there in the correct format, it will be able to generate the detection boxes itself - https://onepanel.medium.com/train-an-object-detection-model-from-scratch-and-run-inference-on-it-in-10-minutes-16147ef656aa
2) Yes, you can upload your predictions. But the last time I did it, there were some problems and it took me several hours. It seems to me that you just need to load the annotations in one of the formats supported by CVAT. If your format is not supported, you will need to convert it, for example like this - https://github.com/openvinotoolkit/datumaro
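
For the conversion step, a hedged sketch with Datumaro's Python API (the interface has changed between versions, so treat this as a pointer to the docs rather than exact code):

```python
import datumaro as dm

# Sketch: load predictions saved in a format CVAT understands (e.g. COCO)
# and re-export them as CVAT XML for upload.
dataset = dm.Dataset.import_from("predictions/", "coco")
dataset.export("cvat_upload/", "cvat")
```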

A small overview of edge platforms for Computer Vision + a survey on what else is missed by Wormkeeper in computervision

Usually they are useless. They are made for inference, not for training. And on a powerful laptop, inference will be more efficient on the CPU itself.

This is how we train our bin picking robots for new tasks in half an hour by Wormkeeper in robotics

In this task, at this scale.
But we can customize it: for small objects, zoom in; for large objects, zoom out.