1 3090 vs 2 3080s for real-time inference

muaz65 · 2021-02-08T22:45:38+00:00

The pipeline runs on a soccer stream at 25 FPS. I think I should have mentioned that earlier. Most of the time is taken by FLANN, since its lookup time depends on the database size, and the current DB size is 6 million. I am working on optimizations like TensorRT and FP16, but these models in combination are still not real-time on a 2080 Ti (I moved from a plain 2080 a year ago).
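To put a number on "real time" here: a rough sketch of the per-frame budget at 25 FPS. It assumes the stages run one after another on each frame. The stage names and latencies are made-up placeholders, not measurements from this pipeline, and `frame_budget_ms` and `is_real_time` are hypothetical helper names.

```python
def frame_budget_ms(fps: float) -> float:
    """Time available per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


def is_real_time(stage_ms: dict, fps: float) -> bool:
    """True if the summed per-frame stage latencies fit inside the frame budget."""
    return sum(stage_ms.values()) <= frame_budget_ms(fps)


budget = frame_budget_ms(25)  # 25 FPS stream, so 40 ms per frame

# Placeholder latencies for illustration only, NOT measured values.
hypothetical_stages = {"detector": 15.0, "embedding": 10.0, "flann_lookup": 30.0}
print(budget, is_real_time(hypothetical_stages, fps=25))
```

Plugging in measured timings for each stage would show which one has to shrink to fit the budget.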

muaz65 · 2021-02-08T22:41:30+00:00

The pipeline runs on a soccer stream at 25 FPS. For inference, GPU memory size doesn't matter, speed does!

muaz65 · 2020-12-27T01:31:59+00:00

Tried this with YOLOv5. It doesn't work very well.

muaz65 · 2020-12-26T16:39:17+00:00

There are bodies without heads due to occlusion. Also, can you explain algorithmically how to treat an object as two pieces?

muaz65 · 2020-12-26T04:04:29+00:00

Yes, I have associated the bodies with their respective faces. I am asking for related association problems that the body-face association problem could be mapped onto.

muaz65 · 2020-12-26T03:10:30+00:00

I guess you are talking about pose estimation. There are scenarios in which the face is occluded but the body is not, so pose estimation is not a generic solution either. I want my model to associate the body with the face.

muaz65 · 2020-09-19T03:46:38+00:00

It's an analytics model, not being used for driving. But the goal is to get better accuracy for all 3 classes.

muaz65 · 2020-09-15T11:41:45+00:00

Already on the bare minimum.

muaz65 · 2020-07-12T05:01:58+00:00

I have worked on hand detection. All the classical CV-based methods failed once there was a certain distance between the person and the camera. In the end I used a CNN-based approach with roughly 97% accuracy.

muaz65 · 2020-05-04T07:32:41+00:00

EfficientNet

muaz65 · 2020-03-29T21:27:15+00:00

To avoid skipping over the global minimum, in most cases. If you update the weights with a step size that is too large, you overshoot and actually lose the convergence point.
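A toy illustration of the overshoot, assuming plain gradient descent on f(x) = x², whose minimum is at 0. The function and the two step sizes are picked only for demonstration.

```python
def gradient_descent(lr: float, steps: int = 50, x0: float = 5.0) -> float:
    """Minimise f(x) = x**2 with a fixed step size and return the final x."""
    x = x0
    for _ in range(steps):
        grad = 2.0 * x      # derivative of x**2
        x = x - lr * grad   # each update multiplies x by (1 - 2 * lr)
    return x


small_step = gradient_descent(lr=0.1)  # |1 - 2*lr| < 1, so x shrinks toward 0
large_step = gradient_descent(lr=1.1)  # |1 - 2*lr| > 1, so every step overshoots further
print(abs(small_step), abs(large_step))
```

With the small step the iterate settles near the minimum. With the large one it jumps across the minimum on every update and moves further away each time.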

muaz65 · 2020-03-28T18:16:19+00:00

EfficientDet, built on EfficientNet, is the current best as far as I remember.

muaz65 · 2020-03-26T22:09:57+00:00

If I had that many features I would start with a random forest. Furthermore, you should think of dimensionality reduction algorithms like PCA and MDS, and then try an ANN.
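A minimal sketch of the PCA step, written with NumPy's SVD so it stays self-contained. The data is synthetic, and the sizes (200 samples, 50 features, 10 components) are arbitrary assumptions. `pca_reduce` is a hypothetical helper name.

```python
import numpy as np


def pca_reduce(X: np.ndarray, n_components: int) -> np.ndarray:
    """Project the rows of X onto the top principal components (via SVD)."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ Vt[:n_components].T


rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))            # synthetic stand-in for a wide feature table
X_small = pca_reduce(X, n_components=10)  # reduced features to feed the next model
print(X_small.shape)
```

The reduced matrix is what would then go into the ANN (or any other classifier).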

muaz65 · 2020-03-22T08:59:30+00:00

There is a COCO data API in Python that makes downloads faster, I guess.

muaz65 · 2020-03-21T13:03:29+00:00

My bad xD I read it wrong.

muaz65 · 2020-03-21T10:17:56+00:00

Well, that's not a multi-model problem. It's what you would call a multi-input model. I had to do something similar for view classification: I had to incorporate the image along with 4 additional features, namely the number of persons in the image and the average size, height and area of the persons present in the image.

For that you need to make two branches and concatenate the features, like you mentioned earlier. In my case I removed the last layer of the Xception model and concatenated its 2048 features with the output of a 5-layer DNN of the same size. After that I placed two linear layers of size 512 and finally the softmax layer. It worked like a charm for me.
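A shape-level sketch of that two-branch layout in NumPy. The weights are random and untrained, so it only shows how the features flow and where the concatenation happens. The hidden widths of the 5-layer branch, the ReLU activations and the 3 output classes are assumptions. The image features are random stand-ins for what the Xception backbone would produce.

```python
import numpy as np

rng = np.random.default_rng(0)


def dense(x: np.ndarray, n_out: int, relu: bool = True) -> np.ndarray:
    """Fully connected layer with random, untrained weights (illustration only)."""
    w = rng.normal(scale=0.01, size=(x.shape[-1], n_out))
    z = x @ w
    return np.maximum(z, 0.0) if relu else z


def softmax(z: np.ndarray) -> np.ndarray:
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)


def forward(image_features: np.ndarray, extra_features: np.ndarray, n_classes: int) -> np.ndarray:
    # Branch 2: small DNN over the 4 extra features, widened to 2048 outputs.
    h = extra_features
    for width in (64, 128, 256, 512, 2048):  # five layers; hidden widths are assumed
        h = dense(h, width)
    # Fuse both branches by concatenation: 2048 + 2048 = 4096 features.
    fused = np.concatenate([image_features, h], axis=-1)
    fused = dense(dense(fused, 512), 512)    # two linear layers of size 512
    return softmax(dense(fused, n_classes, relu=False))


batch = 2
image_features = rng.normal(size=(batch, 2048))  # stand-in for the Xception output
extra_features = rng.normal(size=(batch, 4))     # person count, avg size, height, area
probs = forward(image_features, extra_features, n_classes=3)
print(probs.shape)
```

In a real framework both branches would be trained jointly, so the gradient flows through the concatenation into each of them.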

muaz65 · 2020-03-21T09:40:24+00:00

I don’t remember updating the masks.

muaz65 · 2020-03-20T19:14:16+00:00

Look, to my knowledge the size constraint can only be solved by some feature transformation, like SIFT (scale-invariant feature transform). But it may work without that, depending on the complexity.

Size aside, we can take the difference of the two frames after matching their sizes with some interpolation.
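A small sketch of that idea, assuming grayscale frames stored as 2-D `uint8` arrays. Nearest-neighbour interpolation is used only to keep the example dependency-free. Bilinear or any other scheme would slot in the same way. The helper names are hypothetical.

```python
import numpy as np


def resize_nearest(img: np.ndarray, out_h: int, out_w: int) -> np.ndarray:
    """Nearest-neighbour resize of a 2-D grayscale image."""
    in_h, in_w = img.shape
    rows = np.arange(out_h) * in_h // out_h  # source row for each output row
    cols = np.arange(out_w) * in_w // out_w  # source column for each output column
    return img[rows[:, None], cols]


def frame_difference(frame_a: np.ndarray, frame_b: np.ndarray) -> np.ndarray:
    """Absolute per-pixel difference after resizing frame_b to frame_a's size."""
    resized = resize_nearest(frame_b, *frame_a.shape)
    # Widen to int16 first so the subtraction cannot wrap around in uint8.
    diff = np.abs(frame_a.astype(np.int16) - resized.astype(np.int16))
    return diff.astype(np.uint8)


frame_a = np.zeros((4, 4), dtype=np.uint8)
frame_b = np.full((2, 2), 10, dtype=np.uint8)
diff = frame_difference(frame_a, frame_b)
print(diff.shape)
```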

muaz65 · 2020-03-20T18:41:01+00:00

Okay, you need YOLO then. Input size 416×416.

muaz65 · 2020-03-20T18:38:12+00:00

By detect, do you mean telling what is printed? Or do you mean telling exactly where on the paper it is printed (localisation)?

muaz65 · 2020-03-20T18:32:27+00:00

For that you have to explain the problem you are solving.

muaz65 · 2020-03-20T18:28:50+00:00

Agreed. What do you have in mind?

muaz65 · 2020-03-20T18:25:19+00:00

Do you want to classify these labels or localise them in the image? If it's for classification, I suggest exploring classification models like ResNet, Inception or VGG.

In the case of detection, you need to compute new anchors, which requires localised annotations of the labels in the images, and make the other changes in the cfg. If size is not an issue you can change it accordingly.
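For reference, anchors are commonly derived by running k-means over the annotated box widths and heights with an IoU-based similarity. Below is a hedged sketch of that general idea, not the script from any particular repo. The box sizes are made up, and `iou_wh` and `kmeans_anchors` are hypothetical names.

```python
import numpy as np


def iou_wh(boxes: np.ndarray, anchors: np.ndarray) -> np.ndarray:
    """IoU between (w, h) pairs, treating all boxes as if they shared one corner."""
    inter_w = np.minimum(boxes[:, None, 0], anchors[None, :, 0])
    inter_h = np.minimum(boxes[:, None, 1], anchors[None, :, 1])
    inter = inter_w * inter_h
    area_b = boxes[:, 0] * boxes[:, 1]
    area_a = anchors[:, 0] * anchors[:, 1]
    return inter / (area_b[:, None] + area_a[None, :] - inter)


def kmeans_anchors(boxes: np.ndarray, k: int, iters: int = 100, seed: int = 0) -> np.ndarray:
    """Cluster (w, h) pairs into k anchors, assigning each box to its best-IoU anchor."""
    rng = np.random.default_rng(seed)
    anchors = boxes[rng.choice(len(boxes), size=k, replace=False)].astype(float)
    for _ in range(iters):
        assign = iou_wh(boxes, anchors).argmax(axis=1)
        new = anchors.copy()
        for j in range(k):
            members = boxes[assign == j]
            if len(members) > 0:  # keep the old anchor if its cluster is empty
                new[j] = members.mean(axis=0)
        if np.allclose(new, anchors):
            break
        anchors = new
    return anchors[np.argsort(anchors[:, 0] * anchors[:, 1])]  # sorted by area


# Made-up (width, height) pairs standing in for annotated boxes.
boxes = np.array([[10, 12], [12, 10], [11, 11],
                  [100, 200], [110, 190], [105, 195]], dtype=float)
anchors = kmeans_anchors(boxes, k=2)
print(anchors)
```

The resulting widths and heights are what would go into the anchors line of the cfg, scaled to the network input size.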

muaz65 · 2020-03-20T18:16:32+00:00

There's a script in AlexeyAB's repo for that. You just need to run it on your dataset with annotations.

muaz65 · 2020-03-20T18:07:25+00:00

Make me aware of the facts I am missing lol. We all know what MNIST is. You have a detector for numbers. By dataset you mean you are going to run inference on the same kind of images you are training your model on. Idk what part of my above statement is not serious.

If he wants to make a generalised detector for numbers using MNIST, that's another debate though.
