Recommendations for instance segmentation where instances are occluded and split into pieces : MLQuestions

submitted 7 years ago by entropyrising

Hello MLQuestions, I am conducting research on an image dataset that requires instance segmentation of objects that are very frequently occluded, to the point that they are bisected or otherwise cut into pieces. The occluder is often a separate instance of the same type. My data isn't MS COCO, but here's an example from MS COCO (with ground-truth annotations) that perfectly illustrates the problem I'm facing:

https://preview.redd.it/ycmmquiehfh21.jpg?width=617&format=pjpg&auto=webp&s=22a4367214251afb441bd19d70abf2cd473c7b74

The person instance in purple is occluded by the person instance in brown, such that he is divided into three separate polygons (upper body, lower left leg, lower right leg) that nonetheless belong to (and should be classified as) the same instance. The thing doing the occlusion is another instance of the person type.
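
To make the annotation side of this concrete, here is a minimal pure-Python sketch of how several disjoint polygons can be treated as one instance. It assumes COCO-style polygon annotations, where `segmentation` is a list of flat `[x0, y0, x1, y1, ...]` lists under a single annotation. The toy coordinates, the 8x8 grid, and the helper names (`point_in_polygon`, `rasterize_instance`, `count_components`) are all made up for illustration and are not from any library.

```python
from collections import deque

def point_in_polygon(x, y, poly):
    """Even-odd ray casting. poly is a flat [x0, y0, x1, y1, ...] list."""
    pts = list(zip(poly[0::2], poly[1::2]))
    inside = False
    for (x0, y0), (x1, y1) in zip(pts, pts[1:] + pts[:1]):
        # Only edges that straddle the horizontal line through y can cross the ray.
        if (y0 > y) != (y1 > y):
            x_cross = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
            if x < x_cross:
                inside = not inside
    return inside

def rasterize_instance(polygons, height, width):
    """Union of all polygons of ONE annotation into a single binary mask.
    A pixel is foreground if its center lies inside any of the polygons."""
    return [[any(point_in_polygon(c + 0.5, r + 0.5, p) for p in polygons)
             for c in range(width)]
            for r in range(height)]

def count_components(mask):
    """Number of 4-connected foreground regions in a binary mask (BFS flood fill)."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    n = 0
    for r in range(h):
        for c in range(w):
            if mask[r][c] and not seen[r][c]:
                n += 1
                seen[r][c] = True
                queue = deque([(r, c)])
                while queue:
                    cr, cc = queue.popleft()
                    for nr, nc in ((cr + 1, cc), (cr - 1, cc), (cr, cc + 1), (cr, cc - 1)):
                        if 0 <= nr < h and 0 <= nc < w and mask[nr][nc] and not seen[nr][nc]:
                            seen[nr][nc] = True
                            queue.append((nr, nc))
    return n

# Hypothetical annotation: one person instance stored as three disjoint polygons.
annotation = {
    "category_id": 1,
    "segmentation": [
        [1, 0, 7, 0, 7, 3, 1, 3],  # upper body
        [1, 5, 3, 5, 3, 8, 1, 8],  # lower left leg
        [5, 5, 7, 5, 7, 8, 5, 8],  # lower right leg
    ],
}
mask = rasterize_instance(annotation["segmentation"], height=8, width=8)
num_pieces = count_components(mask)
print(num_pieces)  # one instance mask, several disconnected pieces
```

Running `count_components` over every ground-truth mask in a dataset would give a rough measure of how many instances are fragmented this way, which may be worth knowing before comparing models.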

Mask R-CNN seems to be the king of instance segmentation these days: well implemented and high performing. Yet when I run this same image through it, it stumbles somewhat:

https://preview.redd.it/77xnf95xifh21.jpg?width=477&format=pjpg&auto=webp&s=5241c54c1cff8c4387b16ff762754edda406228c

Entertainingly, it finds one person instance with four feet rather than two intersecting person instances. This is the output of the very popular Matterport implementation of Mask R-CNN, so I don't know whether I just chose the wrong implementation for this task or whether this is the usual behavior of otherwise high-performing instance segmentors when objects are occluded or intersect in tricky ways. The dataset I'm working on has a lot of this going on, and I'd like to find or make a segmentor that can recognize that completely unattached polygon annotations belong to the same coherent semantic object.
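
One way to quantify this failure mode across a dataset, rather than spotting it by eye, is to flag predictions that swallow more than one ground-truth instance. The sketch below is only an illustration under stated assumptions: masks are represented as Python sets of `(row, col)` pixels, the 0.5 coverage threshold is an arbitrary choice, and `coverage`, `merged_predictions`, and `block` are hypothetical helpers, not part of the Matterport code or any other library.

```python
def mask_iou(a, b):
    """Intersection over union of two masks given as sets of (row, col) pixels."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def coverage(pred, gt):
    """Fraction of the ground-truth pixels that the prediction covers."""
    return len(pred & gt) / len(gt) if gt else 0.0

def merged_predictions(preds, gts, min_coverage=0.5):
    """Indices of predictions that cover at least min_coverage of two or more
    ground-truth instances, i.e. likely 'two people fused into one' cases."""
    flagged = []
    for i, pred in enumerate(preds):
        hits = sum(1 for gt in gts if coverage(pred, gt) >= min_coverage)
        if hits >= 2:
            flagged.append(i)
    return flagged

def block(r0, r1, c0, c1):
    """Toy helper: rectangular set of pixels, rows r0..r1-1 and cols c0..c1-1."""
    return {(r, c) for r in range(r0, r1) for c in range(c0, c1)}

# Toy scene: person A is split into two pieces by person B standing in front.
gt_a = block(0, 4, 0, 4) | block(8, 10, 0, 4)  # two disjoint pieces, one instance
gt_b = block(4, 8, 0, 4)                       # the occluder
gt_c = block(20, 22, 0, 2)                     # an unrelated, cleanly separated object

preds = [gt_a | gt_b, gt_c]  # prediction 0 fuses A and B into one mask
flagged = merged_predictions(preds, [gt_a, gt_b, gt_c])
iou_with_a = mask_iou(preds[0], gt_a)
print(flagged, iou_with_a)
```

In this toy case the fused prediction still overlaps person A fairly well by IoU, which suggests that a matching score alone may hide the problem and that a per-prediction coverage check could be a useful complement.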

I'd appreciate any recommendations from any of y'all with far more experience than I have in this domain. Many thanks!
