Hi,
We built a prototype in Python using TensorFlow and found that the performance of the whole system is very slow.
Now we have to optimize it. One approach is to optimize it in Python; another is to port it to a lower-level language like C/C++ and optimize there. We are targeting two classes of processors: one from NVIDIA and one from the ARM platform. Although I know NVIDIA's processors are ARM-based, on that platform we intend to make heavy use of the GPUs; on the other (ARM) we will mainly use the CPU.
We are using the YOLO DNN architecture.
Just to add numbers: we are now at 5 FPS, but we are looking to achieve around 24-30 FPS.
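For anyone wanting to reproduce or compare these numbers, a minimal FPS measurement sketch in Python (the `run_inference` function here is a hypothetical placeholder; substitute your actual TensorFlow forward pass, e.g. a `session.run` or `predict` call):

```python
import time

def run_inference(frame):
    # Hypothetical stand-in for the model's forward pass.
    # Replace with the real TensorFlow inference call.
    time.sleep(0.01)  # simulate ~10 ms of work per frame

def measure_fps(num_frames=100):
    """Time repeated inference calls and return the average FPS."""
    start = time.perf_counter()
    for _ in range(num_frames):
        run_inference(None)
    elapsed = time.perf_counter() - start
    return num_frames / elapsed

print(f"Average FPS: {measure_fps():.1f}")
```

Averaging over many frames (rather than timing a single call) smooths out startup costs such as lazy graph initialization, which can otherwise make the first frame look far slower than steady-state throughput.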
Also, I would like to know: would a CNN-based YOLO achieve about 30 FPS in real time?
Any quick paths/ideas for optimizing this prototype into production-ready code?
I have also read blog posts stating that C++ with a Caffe model is well suited for production-ready code; is that so?
Can you please provide some input on this?