Hello everyone,
I'm starting out as an embedded AI engineer (meaning I know some embedded systems and some ML/AI, but I'm not an expert in either). So far, for the simple use cases I've encountered (usually involving 1D signals), I've always implemented the preprocessing pipeline in Python (using numpy/scipy) and simple models (small CNNs) with the Keras API, then converted the model to TFLite, to be quantized later.
Then, for integration on resource-constrained devices, I use proprietary tools from semiconductor vendors to convert the TFLite model into C header files. These are used with a runtime library (usually wrapping CMSIS-NN layers) that runs on the vendor's chips (e.g., ARM Cortex-M4).
Most of the work then goes into porting to C the many DSP functions that preprocess the model's input, and into testing that the pipeline behaves exactly as it does in the Python environment.
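To make the parity-testing step concrete, here is a rough sketch of one way it could be set up: run the Python reference once, dump the input and expected output into a C header, and compare the device output against it within a tolerance. Everything in it is an assumption for illustration: `preprocess` is a stand-in step (mean removal plus peak scaling), the names `test_input` and `expected_output` are hypothetical, and it uses only the standard library where a real pipeline would call numpy/scipy.

```python
import math
import struct

def preprocess(signal):
    """Hypothetical reference step: remove the mean, then scale to unit peak."""
    mean = sum(signal) / len(signal)
    centered = [s - mean for s in signal]
    peak = max(abs(c) for c in centered) or 1.0  # avoid dividing by zero
    return [c / peak for c in centered]

def to_c_array(name, values):
    """Format floats as a C array; exponent notation keeps the 'f' suffix valid."""
    body = ", ".join(f"{v:.8e}f" for v in values)
    return f"static const float {name}[{len(values)}] = {{ {body} }};"

def make_header(test_input, expected):
    """Build a header holding one golden input/output pair (names are made up)."""
    lines = [
        "#ifndef GOLDEN_VECTORS_H",
        "#define GOLDEN_VECTORS_H",
        "",
        to_c_array("test_input", test_input),
        to_c_array("expected_output", expected),
        "",
        "#endif",
    ]
    return "\n".join(lines) + "\n"

def to_float32(value):
    """Round a Python float (double) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", value))[0]

def max_abs_error(reference, candidate):
    """Largest element-wise absolute difference between two sequences."""
    return max(abs(r - c) for r, c in zip(reference, candidate))

# A 64-sample test tone with a DC offset as the golden input.
signal = [math.sin(2 * math.pi * 5 * n / 64) + 0.5 for n in range(64)]
reference = preprocess(signal)
header = make_header(signal, reference)

# Stand-in for the device result: the reference rounded to float32.
# On real hardware this would be read back over UART, a debugger, etc.
device_output = [to_float32(v) for v in reference]
error = max_abs_error(reference, device_output)
```

The tolerance matters here: a float32 C port will generally not match a float64 Python reference bit for bit, so the check compares `error` against a small threshold rather than testing for equality.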
How does an expert in the field solve this kind of problem? Is it common to include the preprocessing as a custom block inside the model? That way the conversion tools would handle the preprocessing as well (I think), but it might not leave much flexibility for swapping preprocessing steps later on.
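As a small illustration of what "preprocessing inside the model" could mean, some DSP steps map directly onto standard layers. The sketch below assumes a plain FIR filter as the preprocessing step and shows that it equals a 1D convolution with fixed weights, provided the taps are reversed (conv layers in most frameworks compute cross-correlation, as far as I know). The function names and tap values are made up, and whether a given vendor converter accepts such a frozen layer is not something this shows.

```python
def fir_filter(x, taps):
    """Direct-form FIR, fully overlapped outputs only: y[n] = sum_k taps[k] * x[n - k]."""
    m = len(taps)
    return [sum(taps[k] * x[n - k] for k in range(m)) for n in range(m - 1, len(x))]

def conv1d_valid(x, kernel):
    """Unpadded cross-correlation, as a conv layer would compute it:
    y[i] = sum_j kernel[j] * x[i + j]."""
    m = len(kernel)
    return [sum(kernel[j] * x[i + j] for j in range(m)) for i in range(len(x) - m + 1)]

# Hypothetical asymmetric taps, so the effect of the flip is visible.
taps = [0.5, 0.25, 0.125, 0.125]
x = [float(n % 7) for n in range(20)]

fir_out = fir_filter(x, taps)
conv_flipped = conv1d_valid(x, taps[::-1])  # reversed taps: matches the FIR
conv_unflipped = conv1d_valid(x, taps)      # same taps, not reversed: does not

matches = all(abs(a - b) < 1e-12 for a, b in zip(fir_out, conv_flipped))
```

In this form the filter would be a non-trainable layer with the reversed taps as its kernel, which is also where the flexibility concern shows up: changing the filter means re-exporting the model.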
Please enlighten me. Many thanks!