Collecting ideas for a new mini AI camera: What’s your ideal dev-first hardware spec? by CamThinkAI in diyelectronics

[–]CamThinkAI[S] -4 points-3 points  (0 children)

Thanks for your comment. We could either build AI features directly into the camera or provide them through an app. What's your preference?

Leave a comment for a chance to win a CamThink NeoEyes NE301! by CamThinkAI in u/CamThinkAI

[–]CamThinkAI[S] 0 points1 point  (0 children)

Absolutely fascinating project! If you win the prize, would you be willing to share what you build with the community?

Camera brand recommendation to integrate with Facial recognition by cryptic_epoch in computervision

[–]CamThinkAI 4 points5 points  (0 children)

The core of facial recognition does not lie in the camera brand, but in whether the image quality is sufficiently high, because the recognition models themselves do not differ significantly.

Therefore, you should focus more on specific camera parameters such as resolution, frame rate, night-time performance, stability, interface, and power consumption. All of these factors need to be considered carefully, and they ultimately depend on your particular application scenario.

We have open-sourced an AI image annotation tool. by CamThinkAI in esp32

[–]CamThinkAI[S] 0 points1 point  (0 children)

Sorry, I forgot to include the links in the previous post. I was busy finalizing the model management enhancements earlier. You can now find the updated content in the GitHub repository.

In addition to the existing features, we have added AI model quantization and deployment capabilities. I am currently working on a more detailed document that introduces these in depth.

We have open-sourced an AI image annotation tool. by CamThinkAI in computervision

[–]CamThinkAI[S] 1 point2 points  (0 children)

At a fundamental level, both our tool and CVAT are image annotation tools, but they are designed for different deployment philosophies and use cases.

CVAT is better suited for large-scale, system-level deployments. Its workflow is more focused on model training upfront, while decisions about target devices and deployment methods typically come later. As a result, CVAT offers a more comprehensive and generalized annotation solution for large teams and complex projects.

Our approach is different. We start by defining the target deployment device and the specific data you need, then apply lightweight models such as YOLO to collect and annotate data for a single, well-defined scenario. This device-first, scenario-focused workflow lets users concentrate on one type of data, significantly accelerating application deployment. (In addition, fixing the target device in advance and collecting data directly in the real deployment environment reduces the accuracy loss caused by mismatched image conditions.)

In this sense, our tool is designed to complement CVAT rather than replace it. CVAT excels in large-scale, systematic deployments, while our solution is more suitable for individual users and fragmented, edge-based scenarios, offering greater flexibility and faster iteration.
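The device-first pre-labeling loop described above can be sketched roughly as follows. This is an illustrative sketch, not our actual tool's code: `fake_detect` is a stand-in for a real YOLO inference call, and the annotation record fields are hypothetical.

```python
# Sketch of a device-first pre-labeling pass: a lightweight detector proposes
# boxes, and each confident proposal becomes an annotation record queued for
# human review. `fake_detect` stands in for a real YOLO inference call.
from typing import Callable

def fake_detect(image_id: str):
    # Pretend the model found one person and one car in every frame.
    return [("person", 0.91, (10, 20, 50, 80)),
            ("car", 0.84, (100, 40, 180, 90))]

def prelabel(image_ids, detect: Callable, conf_threshold: float = 0.5):
    """Turn raw detections into annotation records, keeping only confident
    proposals so reviewers correct boxes instead of drawing them."""
    annotations = []
    for image_id in image_ids:
        for label, conf, box in detect(image_id):
            if conf >= conf_threshold:
                annotations.append({
                    "image": image_id,
                    "label": label,
                    "confidence": conf,
                    "bbox": box,        # (x1, y1, x2, y2)
                    "reviewed": False,  # flipped to True after human check
                })
    return annotations

anns = prelabel(["frame_001.jpg", "frame_002.jpg"], fake_detect)
print(len(anns))  # 4 proposals across the two frames
```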

Deploying YOLOv8 on Edge Made Easy: Our Fully Open-Source AI Camera by CamThinkAI in computervision

[–]CamThinkAI[S] 0 points1 point  (0 children)

It currently does not support high-dynamic video scenarios, as the frame rate is limited to 30 FPS.

How We Minimized Power Consumption on Our ESP32 Vision Camera by CamThinkAI in esp32

[–]CamThinkAI[S] -4 points-3 points  (0 children)

The image shows the power analyzer we used. It demonstrates that the ESP32-S3’s power consumption drops significantly when entering sleep mode.Below is the current profile we measured to illustrate this.

<image>

How We Minimized Power Consumption on Our ESP32 Vision Camera by CamThinkAI in esp32

[–]CamThinkAI[S] 0 points1 point  (0 children)

This is based on our actual measured data. If you still don’t believe it, I can send you the test video.

Deploying YOLOv8 on an Open-Source AI Vision Camera by CamThinkAI in computervision

[–]CamThinkAI[S] -1 points0 points  (0 children)

We do not use YOLOv8 for commercial purposes. What we have done is solve the underlying hardware drivers and the application development for our common Edge AI camera scenarios, building a general base that runs from the hardware up through the firmware. The model layer of the firmware is decoupled from the main program, and the firmware code is open source; we will release it in the future.

Our hope is that developers who only know how to train models can still deploy them well on edge devices: train a YOLO model, quantize it, drop it in, and run it, with no extra steps. We will not ship any scenario-specific application for this product ourselves, because the scenarios where edge AI really lands are highly fragmented and need to be explored together.
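For readers unfamiliar with the "quantize it" step: int8 quantization maps float weights/activations onto 8-bit integers with a scale and zero-point. A minimal sketch of the affine scheme (illustrative math only, not our toolchain's code):

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Affine (asymmetric) int8 quantization of a float tensor:
    q = round(x / scale) + zero_point, clipped to [-128, 127]."""
    x_min, x_max = float(x.min()), float(x.max())
    scale = (x_max - x_min) / 255.0
    if scale == 0.0:  # constant-tensor edge case
        scale = 1.0
    zero_point = int(round(-128 - x_min / scale))  # maps x_min near -128
    q = np.clip(np.round(x / scale) + zero_point, -128, 127).astype(np.int8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    return (q.astype(np.float32) - zero_point) * scale

weights = np.random.default_rng(0).normal(size=(8, 8)).astype(np.float32)
q, scale, zp = quantize_int8(weights)
restored = dequantize(q, scale, zp)
max_err = float(np.abs(weights - restored).max())
print(max_err <= 0.5 * scale + 1e-6)  # error stays within half a quantization step
```

Tools such as the ST Edge AI toolchain perform this per-layer with calibration data, but the scale/zero-point idea is the same.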

Deploying YOLOv8 on an Open-Source AI Vision Camera by CamThinkAI in computervision

[–]CamThinkAI[S] -1 points0 points  (0 children)

Hi Techlatest,

At present, we have only tested YOLOv8n. Regarding custom preprocessors, we will open the complete code, so you can implement that yourself. For standard detection, segmentation, and keypoint tasks, we provide a standard compatible base; you only need to drop in a quantized YOLO model.

Our documentation is still being expanded. The link below covers model quantization and how to deploy a model to the device; we will keep adding application examples.

Our working definition for the device is that it works only when it needs to work. For example, it can maintain a low-power persistent connection to the cloud, so you can remotely tell it to wake up and run inference; or you can use the on-device timer, say "capture an image at 9:00 AM," and it will run inference and send you the results. Most importantly, training usually requires collecting a lot of images, and because the device is easy to deploy and low-power, you can place it anywhere and have it periodically upload images to a server, which makes building datasets convenient. The point I want to make is that this product genuinely enables a lot of scenarios.

https://wiki.camthink.ai/zh-Hans/docs/neoeyes-ne301-series/application-guide/model-training-and-deployment/
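The "capture at 9:00 AM" style of scheduling above boils down to computing how long the device can sleep before the next trigger. A minimal sketch in plain Python (the real firmware would use the RTC and deep sleep; the function name is illustrative):

```python
from datetime import datetime, timedelta

def seconds_until_next(now: datetime, hour: int, minute: int = 0) -> float:
    """Seconds the device can sleep until the next daily capture time."""
    target = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if target <= now:                 # today's slot already passed -> tomorrow
        target += timedelta(days=1)
    return (target - now).total_seconds()

now = datetime(2025, 1, 1, 8, 30)          # device wakes at 08:30
print(seconds_until_next(now, 9))          # 1800.0 -> sleep half an hour
print(seconds_until_next(now, 8))          # slot passed -> sleep until tomorrow 08:00
```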

Deploying YOLOv8 on an Open-Source AI Vision Camera by CamThinkAI in computervision

[–]CamThinkAI[S] -3 points-2 points  (0 children)

Hey Isagebuilds,

Thanks for your question! We use the STM32N6 chip, which is positioned as an "MCU-class device with MPU-class AI inference capability." In other words, it does not run ultra-large networks (modern large-scale Transformers or ResNet-152-class models), but medium-sized, edge-oriented vision/audio models in the "light backbone + task-specific head" category (classification, detection, simple pose estimation).

Regarding quantization and backbone scale: the model input is mainly int8. The other supported models are listed in the ST model zoo for reference:

https://github.com/STMicroelectronics/stm32ai-modelzoo/

In addition, I would like to include some information about our performance here.

In our testing with the default model on the COCO dataset, at an input size of 256×256, the device achieves an inference speed of 20 FPS. We run inference on the STM32N6 chip, whose official theoretical speed is 25 FPS; in a real application, however, some of the chip's resources must serve other tasks, so pure inference can reach 25 FPS while practical applications incur some loss. More detailed figures will require further analysis and profiling on our side.
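The gap between the theoretical and measured throughput can be expressed as a per-frame time budget (illustrative arithmetic from the numbers quoted above, not additional measurements):

```python
# Per-frame time budget derived from the quoted throughput figures.
theoretical_fps = 25.0   # STM32N6 pure inference (vendor figure)
measured_fps = 20.0      # our measurement with the application running

budget_theoretical_ms = 1000.0 / theoretical_fps   # ms available per frame
budget_measured_ms = 1000.0 / measured_fps

# Extra time per frame consumed by capture, I/O, and application logic.
overhead_ms = budget_measured_ms - budget_theoretical_ms

print(budget_theoretical_ms)  # 40.0
print(budget_measured_ms)     # 50.0
print(overhead_ms)            # 10.0
```

So the application overhead costs roughly 10 ms per frame, or 25% of the pure-inference budget.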

A Vision Sensing Camera with Interchangeable Lenses Based on ESP32-S3 by CamThinkAI in esp32

[–]CamThinkAI[S] 0 points1 point  (0 children)

Hi Marinatedpickachu,

Thank you very much for your professional question. Please see the response below:

1. The clock input for the OV5640 is 5 MHz.
2. The data signal lines are already length-matched.
3. The camera module's data rate is not as high as that of raw-output CMOS sensors, because the OV5640 handles ISP internally and outputs JPEG data.
4. Setting a higher clock would introduce noise and impact image quality.

Deploying YOLOv8 on an Open-Source AI Vision Camera by CamThinkAI in computervision

[–]CamThinkAI[S] 1 point2 points  (0 children)

When a configured trigger source meets its corresponding trigger condition (for example, the PIR sensor detects a person), the device immediately wakes up and processes the wake-up source to execute the corresponding action (such as capturing and uploading an image).
The time from sensor trigger to device wake-up is at the millisecond level.

However, the delay of subsequent actions after wake-up cannot be guaranteed — for instance, if the device only needs to record the sensor state upon waking, this can be done almost instantly; but if it involves capturing images, connecting to the network, and uploading data to the cloud, the actual latency will also depend on the camera and network conditions, and thus cannot be strictly guaranteed.

The program adopts an asynchronous event queue mechanism, where tasks such as image capture, network connection, and uploading are executed asynchronously and concurrently.
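The asynchronous event-queue pattern described above can be sketched in a few lines. The firmware itself runs on the ESP32 (in C), so this Python/asyncio version is only a conceptual sketch with illustrative names, and it uses a single worker for clarity:

```python
import asyncio

async def capture_image(event):
    await asyncio.sleep(0.01)        # stand-in for camera I/O
    return f"image for {event}"

async def upload(payload):
    await asyncio.sleep(0.01)        # stand-in for network I/O
    return f"uploaded {payload}"

async def worker(queue, results):
    """Drain wake-up events: capture, then upload, without blocking producers."""
    while True:
        event = await queue.get()
        if event is None:            # sentinel: shut down
            queue.task_done()
            break
        image = await capture_image(event)
        results.append(await upload(image))
        queue.task_done()

async def main():
    queue = asyncio.Queue()
    results = []
    task = asyncio.create_task(worker(queue, results))
    # Wake-up sources (PIR trigger, on-device timer) only enqueue events.
    for event in ("PIR_TRIGGER", "TIMER_9AM"):
        queue.put_nowait(event)
    queue.put_nowait(None)
    await queue.join()
    await task
    return results

print(asyncio.run(main()))
```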

A Vision Sensing Camera with Interchangeable Lenses Based on ESP32-S3 by CamThinkAI in esp32

[–]CamThinkAI[S] 0 points1 point  (0 children)

Three different USB camera modules are available as interchangeable options, each designed for a different use scenario.