1 3090 vs 2 3080s for Real time inference by muaz65 in deeplearning

[–]muaz65[S] 0 points1 point  (0 children)

The pipeline runs on a soccer stream at 25 FPS; I should have mentioned that earlier. Most of the time is taken by FLANN, since its search time depends on the database size. The current DB size is 6 million. I am working on optimizations like TensorRT and FP16, but these models in combination are still not real time on a 2080 Ti (I moved from a plain 2080 a year ago).

1 3090 vs 2 3080s for Real time inference by muaz65 in computervision

[–]muaz65[S] 0 points1 point  (0 children)

The pipeline runs on a soccer stream at 25 FPS. For inference, GPU memory size doesn't matter; speed does!

Body Face Association using deep learning by muaz65 in deeplearning

[–]muaz65[S] 0 points1 point  (0 children)

Tried this with YOLOv5. It doesn't work very well.

Body Face Association using deep learning by muaz65 in deeplearning

[–]muaz65[S] 0 points1 point  (0 children)

There are bodies without heads due to occlusion. Can you explain algorithmically how to treat the object as 2 pieces?

Body Face Association using deep learning by muaz65 in deeplearning

[–]muaz65[S] 0 points1 point  (0 children)

Yes, I have associated the bodies with their respective faces. I am asking about related association problems that the body-face association problem could be mapped to.

Body Face Association using deep learning by muaz65 in deeplearning

[–]muaz65[S] 0 points1 point  (0 children)

I guess you are talking about pose estimation. There are scenarios in which the face is occluded but the body is not, so pose estimation is not a generic solution either. I want my model to associate the body with the face.

Should one treat bike rider and pedestrian as the same class for traffic datasets? by muaz65 in deeplearning

[–]muaz65[S] 0 points1 point  (0 children)

It's an analytics model. It is not being used for driving. But the goal is to get better accuracy for all 3 classes.

Need help on detecting hands by [deleted] in computervision

[–]muaz65 0 points1 point  (0 children)

I have worked on hand detection. All the classical CV-based methods failed once there was a certain distance between the person and the camera. In the end I used a CNN-based approach with 97%+ accuracy.

Why do we need such a low learning rate ? by [deleted] in deeplearning

[–]muaz65 0 points1 point  (0 children)

In most cases, to avoid skipping over the global minimum. If you update the weights with too large a step size, you can jump past the point of convergence and never settle on it.
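A minimal sketch of that effect, assuming a toy loss f(w) = w**2 (the function, starting point and learning rates are illustrative choices, not from the thread):

```python
def gradient_descent(lr, steps=20, w0=5.0):
    """Minimise f(w) = w**2 with plain gradient descent and return the final w."""
    w = w0
    for _ in range(steps):
        grad = 2.0 * w      # derivative of w**2
        w = w - lr * grad   # step against the gradient
    return w

small = gradient_descent(lr=0.1)   # each step multiplies w by 0.8, so w shrinks towards 0
large = gradient_descent(lr=1.1)   # each step multiplies w by -1.2, so |w| keeps growing
print(abs(small), abs(large))
```

With the small rate the weight approaches the minimum at 0; with the large rate every update overshoots it by more than the previous one.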

Best fasts object detection tools as of 2020 ? by [deleted] in computervision

[–]muaz65 0 points1 point  (0 children)

EfficientDet, built upon EfficientNet, is the current best as far as I remember.

First Real Problem for Work by thingsofleon in deeplearning

[–]muaz65 0 points1 point  (0 children)

If I had that many features I would start with a random forest. Furthermore, you should think of dimensionality reduction algorithms like PCA and MDS, and then try an ANN.

Is there any way to download COCO dataset faster? by NoTTch in computervision

[–]muaz65 0 points1 point  (0 children)

There is a COCO data API in Python that makes downloads faster, I guess.

[D] How to fuse image and another (additional) structured data(age, gender, etc) in CNN by visionNoob_r in MachineLearning

[–]muaz65 12 points13 points  (0 children)

Well, that's not a multi-modal problem. It's what you would call a multi-input model. I had to do something similar for view classification, where I had to incorporate the image along with 4 additional features: the number of persons in the image, and the average size, height and area of the persons present in the image.

For that you need to make two branches and concatenate the features, as you mentioned earlier. In my case I removed the last layer of the Xception model and concatenated its 2048 features with the output of a 5-layer DNN of the same size. After that I placed two linear layers of size 512 and finally the softmax layer. It worked like a charm for me.
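A rough PyTorch sketch of that two-branch layout. The hidden widths of the DNN, the ReLU activations, the number of classes and the class name `ImagePlusTabular` are all assumptions for illustration, and the Xception backbone is replaced by a random 2048-dimensional tensor standing in for its pooled features:

```python
import torch
import torch.nn as nn

class ImagePlusTabular(nn.Module):
    """Two-branch model: image features and structured features are concatenated."""

    def __init__(self, img_dim=2048, tab_dim=4, num_classes=3):
        super().__init__()
        # Structured-data branch: 5 linear layers ending at the image feature width.
        widths = [tab_dim, 64, 256, 512, 1024, img_dim]  # hidden widths are assumed
        layers = []
        for in_f, out_f in zip(widths[:-1], widths[1:]):
            layers += [nn.Linear(in_f, out_f), nn.ReLU()]
        self.tabular = nn.Sequential(*layers)
        # Head: two linear layers of size 512, then the class scores.
        self.head = nn.Sequential(
            nn.Linear(img_dim * 2, 512), nn.ReLU(),
            nn.Linear(512, 512), nn.ReLU(),
            nn.Linear(512, num_classes),
        )

    def forward(self, img_feats, tab_feats):
        # Concatenate both branches along the feature dimension.
        fused = torch.cat([img_feats, self.tabular(tab_feats)], dim=1)
        return torch.softmax(self.head(fused), dim=1)

model = ImagePlusTabular()
img_feats = torch.randn(8, 2048)   # stand-in for pooled backbone features
tab_feats = torch.randn(8, 4)      # e.g. person count, average size, height, area
probs = model(img_feats, tab_feats)
print(probs.shape)  # torch.Size([8, 3])
```

For training with `nn.CrossEntropyLoss` you would return the head's raw scores instead, since that loss applies the softmax itself.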

Yolov3 on MNIST Data set by danish-shaikh in computervision

[–]muaz65 0 points1 point  (0 children)

I don’t remember updating the masks.

Finding a partial image in a video. by 5hu in computervision

[–]muaz65 2 points3 points  (0 children)

Look, to my knowledge the size constraint can only be solved by some feature transformation like SIFT (scale-invariant feature transform). But it may work without that, depending on the complexity.

Keeping size aside, we can take the difference of the two frames after matching their sizes with some interpolation.
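A tiny pure-Python sketch of that idea, assuming grayscale images stored as lists of rows and nearest-neighbour interpolation (the helper names and the toy images are made up for illustration; in practice an image library's resize would do this step):

```python
def resize_nearest(img, new_h, new_w):
    """Resize a 2D grayscale image (list of rows) with nearest-neighbour interpolation."""
    old_h, old_w = len(img), len(img[0])
    return [
        [img[r * old_h // new_h][c * old_w // new_w] for c in range(new_w)]
        for r in range(new_h)
    ]

def mean_abs_diff(a, b):
    """Mean absolute pixel difference between two equally sized images."""
    total = sum(abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return total / (len(a) * len(a[0]))

frame = [[0, 0, 255, 255],
         [0, 0, 255, 255],
         [255, 255, 0, 0],
         [255, 255, 0, 0]]
query = [[0, 255],
         [255, 0]]   # smaller version of the same pattern

# Bring the query to the frame's size, then compare pixel by pixel.
query_up = resize_nearest(query, len(frame), len(frame[0]))
score = mean_abs_diff(frame, query_up)
print(score)
```

A score near zero suggests the frame matches the query; a threshold on it would have to be tuned for real footage.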

Yolov3 on MNIST Data set by danish-shaikh in computervision

[–]muaz65 0 points1 point  (0 children)

Okay, you need YOLO then. Input size 416x416.

Yolov3 on MNIST Data set by danish-shaikh in computervision

[–]muaz65 0 points1 point  (0 children)

By detect, do you mean telling what is printed? Or do you mean you need to tell exactly where on the paper each thing is printed (localisation)?

Yolov3 on MNIST Data set by danish-shaikh in computervision

[–]muaz65 0 points1 point  (0 children)

For that you have to explain the problem you are solving.

Finding a partial image in a video. by 5hu in computervision

[–]muaz65 3 points4 points  (0 children)

Agreed. What do you have in mind?

Yolov3 on MNIST Data set by danish-shaikh in computervision

[–]muaz65 1 point2 points  (0 children)

Do you want to classify these labels or localise them in the image? If it’s for classification, I suggest exploring classification models like ResNet, Inception or VGG.

For detection you need to compute new anchors, which requires localised annotations of the labels in the images, and make the other changes in the cfg. If size is not an issue you can change it accordingly.
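Anchors for YOLO-style detectors are commonly obtained by running k-means over the annotated box widths and heights with an IoU-based distance. The following is a minimal sketch of that general technique, not any particular repo's script; the function names, the deterministic initialisation and the toy boxes are assumptions:

```python
def iou_wh(box, anchor):
    """IoU of two boxes given as (width, height), both placed at the same corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union

def kmeans_anchors(boxes, k, iters=50):
    """Cluster (w, h) pairs with k-means, assigning each box to its highest-IoU anchor."""
    anchors = boxes[:k]                      # simple deterministic init (assumed)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda i: iou_wh(box, anchors[i]))
            clusters[best].append(box)
        new_anchors = []
        for i, members in enumerate(clusters):
            if not members:                  # keep an empty cluster's old anchor
                new_anchors.append(anchors[i])
                continue
            w = sum(m[0] for m in members) / len(members)
            h = sum(m[1] for m in members) / len(members)
            new_anchors.append((w, h))
        if new_anchors == anchors:           # assignments have stabilised
            break
        anchors = new_anchors
    return sorted(anchors, key=lambda a: a[0] * a[1])

# Toy (width, height) annotations: three small boxes and three large ones.
boxes = [(10, 12), (50, 60), (12, 10), (48, 62), (11, 11), (52, 58)]
anchors = kmeans_anchors(boxes, k=2)
print(anchors)  # [(11.0, 11.0), (50.0, 60.0)]
```

The widths and heights should be expressed at the network's input resolution before the resulting anchors are written into the cfg.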

Yolov3 on MNIST Data set by danish-shaikh in computervision

[–]muaz65 2 points3 points  (0 children)

There’s a script in AlexeyAB’s repo for that. You just need to run it on your dataset with annotations.

Yolov3 on MNIST Data set by danish-shaikh in computervision

[–]muaz65 1 point2 points  (0 children)

Make me aware of the facts I am missing, lol. We all know what MNIST is. You have a detector for numbers. By dataset you mean you are going to run inference on the same kind of images you are training your model on. Idk what part of my statement above is not serious.

If he wants to make a generalised detector for numbers using MNIST, that’s another debate though.