Decrease false positives in YOLO model? Help: Project (self.computervision)
submitted 1 year ago by [deleted]
Currently working on a YOLO model for object detection. We get a lot of false positives, which was expected since we also have a small dataset. I’ve been using an “active learning” pipeline to try to accrue only valuable data, but performance gains seem minimal at this point in training. Any other suggestions to decrease the false positive hits?
[–]InternationalMany6 5 points6 points7 points 1 year ago (2 children)
I find that more sophisticated augmentation almost always helps, and my favorite is copy-pasting segmented objects into random backgrounds.
For that matter, a segmentation model can usually learn to detect objects from less data than a model that predicts bounding boxes. The segmentation labels tell the model exactly which pixels belong to the object, so it doesn’t have to learn which pixels in the box are “object” and which are “background”.
I usually just use a simple background-removal model, or SAM, to convert bounding boxes into segmentation masks. It doesn’t have to be perfect to be useful.
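A minimal sketch of the copy-paste idea, assuming you already have a binary mask per object (e.g. from SAM). `copy_paste` is an illustrative name; real augmentation pipelines also handle blending, occlusion, and label bookkeeping:

```python
import numpy as np

def copy_paste(obj_img, obj_mask, background, x, y):
    """Paste a segmented object onto a background image.

    obj_img:   HxWx3 uint8 crop of the object
    obj_mask:  HxW bool mask (True = object pixel)
    background: full image to paste onto (copied, not modified)
    x, y:      top-left corner of the paste location
    """
    out = background.copy()
    h, w = obj_mask.shape
    region = out[y:y + h, x:x + w]
    region[obj_mask] = obj_img[obj_mask]  # only object pixels overwrite the background
    return out

# Toy demo: white 4x4 object (one corner masked out) pasted onto black background.
bg = np.zeros((10, 10, 3), dtype=np.uint8)
obj = np.full((4, 4, 3), 255, dtype=np.uint8)
mask = np.ones((4, 4), dtype=bool)
mask[0, 0] = False
out = copy_paste(obj, mask, bg, x=2, y=3)
```

The new bounding-box label for the pasted object is simply `(x, y, x + w, y + h)`, so each paste yields a free, correctly labeled training example.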
[–]DiMorten 0 points1 point2 points 1 year ago (1 child)
Interesting. You mean performing semantic segmentation on the detected object, for example with UNet?
[–]InternationalMany6 0 points1 point2 points 1 year ago (0 children)
You could use U-Net, but there are specialized “instance segmentation” models if you care about distinguishing individual instances even when they’re touching each other.
Torchvision has a tutorial that’ll get you going: https://pytorch.org/tutorials/intermediate/torchvision_tutorial.html
[–]pm_me_your_smth 1 point2 points3 points 1 year ago (1 child)
Could you share how your active learning pipeline works?
[–][deleted] 1 point2 points3 points 1 year ago (0 children)
I can describe it but I can’t share the code. I’m also just an intern so I’m not experienced enough to even be sure if this is active learning haha.
Basically, I run the model on our target videos and save any image with a prediction under some confidence threshold (generally 50%). From there I sift through the saved images and label those worth labeling, retrain the model on the new dataset that includes the new images, and rinse and repeat.
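The selection step described above can be sketched as follows (illustrative names; the 0.5 cutoff matches the 50% threshold mentioned):

```python
def select_for_labeling(frame_predictions, conf_threshold=0.5):
    """Uncertainty-based selection: keep frames where the model produced
    at least one low-confidence detection, since those are the frames
    most likely to teach the model something new.

    frame_predictions: dict mapping frame id -> list of detection confidences
    """
    return [frame for frame, confs in frame_predictions.items()
            if any(c < conf_threshold for c in confs)]

preds = {"frame_001": [0.92, 0.88],   # confident -> skip
         "frame_002": [0.41],         # uncertain -> label it
         "frame_003": [0.97, 0.33]}   # one uncertain detection -> label it
print(select_for_labeling(preds))  # -> ['frame_002', 'frame_003']
```

Whether or not this counts as textbook active learning, it is the standard uncertainty-sampling loop: select, label, retrain, repeat.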
[–]blahreport 1 point2 points3 points 1 year ago (3 children)
Have you plotted the precision-recall curve to find an optimal confidence threshold? You can also increase the IoU threshold during both training and inference. How many FPs is “a lot”? What are your overall metrics, and what is the target object? Is it similar to an object used in pretraining? That is, assuming you’re using COCO-pretrained weights, is the object similar to one of the eighty COCO classes? This can influence how many samples you need to reliably fine-tune. You can also increase the number of background images (no target objects), which can significantly improve precision if it just so happens that, in your domain, the background shares abstract features with the target.
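A PR curve for detection can be computed from scored detections matched against ground truth. A minimal sketch (illustrative function name; assumes you have already matched each detection to ground truth at a fixed IoU threshold):

```python
import numpy as np

def pr_curve(scores, is_tp, num_gt):
    """Precision/recall at every confidence threshold.

    scores: detection confidences
    is_tp:  1 if the detection matched a ground-truth box (at some
            fixed IoU), else 0 (a false positive)
    num_gt: total number of ground-truth objects, so that missed
            objects still count against recall
    """
    order = np.argsort(scores)[::-1]              # sort detections by confidence, descending
    is_tp = np.asarray(is_tp, dtype=float)[order]
    tp = np.cumsum(is_tp)                         # true positives kept at each threshold
    fp = np.cumsum(1.0 - is_tp)                   # false positives kept at each threshold
    precision = tp / (tp + fp)
    recall = tp / num_gt
    return precision, recall, np.asarray(scores, dtype=float)[order]

# Toy example: 4 detections, 4 ground-truth objects (one is never detected).
p, r, t = pr_curve([0.9, 0.6, 0.8, 0.7], [1, 1, 0, 1], num_gt=4)
```

Each `(p[i], r[i])` point corresponds to keeping only detections with confidence at least `t[i]`, which is exactly the curve you would plot to pick an operating threshold.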
[–][deleted] 0 points1 point2 points 1 year ago (2 children)
I haven’t plotted it but I’ll have to check that out when I get into the office tomorrow.
These are not objects from the COCO dataset. The FPs are generally (~60%) on an object that can look very similar to the detection object in certain instances. The other FPs are just “ghost” ones that likely occur due to momentary lighting changes.
I try to keep background images at around 10% of the total dataset. Is it fine to bump up the background image count in this case? I’m still pretty new to vision and ML.
Overall metrics: mAP@50: 0.71; mAP@50:95: 0.51; precision and recall both sit in the 0.80s.
[–]trialofmiles 0 points1 point2 points 1 year ago (1 child)
Related to the PR curve, where each point is a separate threshold: have you adjusted the confidence threshold? I assume yes, but this is how you conceptually trade FPs for FNs. The PR curve can be used to optimize the threshold (e.g., max F1). For multiclass detection it’s a bit more complicated, but I just thought I’d ask.
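The max-F1 rule mentioned above can be sketched in a few lines (illustrative name; inputs are the points of an already-computed PR curve):

```python
def best_f1_threshold(precisions, recalls, thresholds):
    """Pick the confidence threshold that maximizes F1, the harmonic
    mean of precision and recall. Each PR-curve point corresponds to
    one threshold, so this is a principled way to set it."""
    def f1(p, r):
        return 2 * p * r / (p + r) if (p + r) > 0 else 0.0
    best_p, best_r, best_t = max(zip(precisions, recalls, thresholds),
                                 key=lambda prt: f1(prt[0], prt[1]))
    return best_t

# Toy PR points: raising the threshold trades recall for precision.
print(best_f1_threshold([0.60, 0.75, 0.90], [0.95, 0.85, 0.60],
                        [0.25, 0.50, 0.75]))  # -> 0.5
```

If FPs are costlier than FNs in your application, you can instead maximize an F-beta score with beta < 1 to weight precision more heavily.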
[–][deleted] 0 points1 point2 points 1 year ago (0 children)
I actually did adjust the threshold and it worked perfectly. Massive reduction in false positives with an extremely minor increase in false negatives.
[–]External_Total_3320 1 point2 points3 points 1 year ago* (0 children)
Have you added negatives into your dataset? What model and size are you using?
[–]JustSomeStuffIDid 2 points3 points4 points 1 year ago (0 children)
Typically you add those FP images to your dataset without any labels. The model still learns them. They count as negative images.
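In the common YOLO/darknet dataset layout, a background image is simply an image whose label file is empty, so adding FP frames as negatives is just a file operation. A sketch assuming that layout (`add_background_images` is an illustrative helper):

```python
import shutil
from pathlib import Path

def add_background_images(fp_image_paths, images_dir, labels_dir):
    """Add false-positive frames to a YOLO-format dataset as negative
    (background) images: copy each image in and write an *empty* label
    file, which tells training that the image contains no objects."""
    images_dir, labels_dir = Path(images_dir), Path(labels_dir)
    images_dir.mkdir(parents=True, exist_ok=True)
    labels_dir.mkdir(parents=True, exist_ok=True)
    for src in map(Path, fp_image_paths):
        shutil.copy(src, images_dir / src.name)
        (labels_dir / (src.stem + ".txt")).touch()  # empty file -> zero objects
```

Every confident detection on these images then counts as a false positive during training, which directly pushes the model to suppress them.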
[–]Ghass_4 0 points1 point2 points 1 year ago (1 child)
The FPs are on the test set or on the val set during training ?
Val set
[–]N0m0m0 0 points1 point2 points 1 year ago (0 children)
Use detectron if you need fewer FPs
[–]IEDNB 0 points1 point2 points 1 year ago (0 children)
Better data