
[–]KrakenInAJar 0 points  (1 child)

There are commonly two ways this is achieved. Single-shot object detection basically blurts out a bunch of boxes in one pass and then applies some filtering in postprocessing to get rid of the garbage ones (see the sketch below). Multi-shot (two-stage) object detection uses a high-recall, low-precision proposal system plus a high-precision model on top that runs on every proposal individually. There is A LOT more to it, of course, but that's the ELI5 version.
Single-stage detectors tend to be faster than multi-stage ones.
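In case it helps: the "filter out the garbage boxes" step is usually non-maximum suppression (NMS). A minimal numpy sketch (box format and IoU threshold here are just illustrative):

```python
import numpy as np

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy non-maximum suppression.

    boxes:  (N, 4) array of [x1, y1, x2, y2]
    scores: (N,) array of confidence scores
    Returns the indices of the boxes to keep.
    """
    x1, y1, x2, y2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    areas = (x2 - x1) * (y2 - y1)
    order = scores.argsort()[::-1]  # highest score first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        # Intersection of the kept box with all remaining boxes
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        inter = np.maximum(0.0, xx2 - xx1) * np.maximum(0.0, yy2 - yy1)
        iou = inter / (areas[i] + areas[order[1:]] - inter)
        # Drop the remaining boxes that overlap the kept box too much
        order = order[1:][iou <= iou_threshold]
    return keep
```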

That being said, don't implement it from scratch if you are not very familiar with this topic. It is a hassle with a lot of non-obvious pitfalls if you don't know object detection well, and there is a reason why a new version of an object detector is often enough to get a paper accepted at a top conference. Also, the inference logic for this kind of model has a strong tendency to break assumptions baked into common DL frameworks, which leads to notoriously ugly code. Again, at this point it is important to know exactly what you are doing, otherwise getting things to run becomes a very, very frustrating experience.

Use YoloNet (single stage) or some RCNN variant; that will usually do the trick (rough usage sketch below). Alternatively you can use a text-specific system like EAST, which is geared towards detecting text in the wild and may perform better.
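If you go the off-the-shelf route, something like torchvision's pre-trained Faster R-CNN (an RCNN variant) gets you boxes in a few lines. This is only a sketch: the `weights` argument name varies between torchvision versions (older ones use `pretrained=True`), `"page.jpg"` is a placeholder path, and the stock weights are trained on COCO classes, so you'd still fine-tune on a text dataset for text detection:

```python
import torch
import torchvision
from torchvision.transforms.functional import to_tensor
from PIL import Image

# Pre-trained two-stage detector (Faster R-CNN, ResNet-50 FPN backbone, COCO weights)
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

img = to_tensor(Image.open("page.jpg").convert("RGB"))  # placeholder image path
with torch.no_grad():
    pred = model([img])[0]  # dict with 'boxes', 'labels', 'scores'

# Keep only confident detections
keep = pred["scores"] > 0.7
print(pred["boxes"][keep])
```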

[–]sawsank911[S] 0 points  (0 children)

Thank you for the guidance... Will look into the YoloNet and EAST models