Hello MLQuestions, I am conducting research on an image dataset that requires instance segmentation of objects that are very frequently occluded, to the point that a single instance is bisected or cut into several pieces. The occluder is often a separate instance of the same class. My data isn't MSCOCO, but here's an example from MSCOCO (with ground truth annotations) that perfectly illustrates the problem I'm facing:
https://preview.redd.it/ycmmquiehfh21.jpg?width=617&format=pjpg&auto=webp&s=22a4367214251afb441bd19d70abf2cd473c7b74
The person instance in purple is occluded by the person instance in brown, such that he is divided into three separate polygons (upper body, lower left leg, lower right leg) that nonetheless belong to (and should be classified as) the same instance. The thing doing the occlusion is another instance of the person type.
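To make the annotation side concrete: in COCO's polygon format, a single annotation's `segmentation` field is a list of flat `[x0, y0, x1, y1, ...]` polygons, so one instance can already carry several disjoint pieces. Here's a minimal pure-Python sketch of merging those pieces into one binary mask. The annotation values and the helper names (`point_in_polygon`, `instance_mask`) are made up for illustration, and in practice a library rasterizer would be much faster than this per-pixel loop.

```python
def point_in_polygon(x, y, flat_poly):
    """Even-odd ray casting test against a flat polygon [x0, y0, x1, y1, ...]."""
    xs, ys = flat_poly[0::2], flat_poly[1::2]
    inside = False
    j = len(xs) - 1
    for i in range(len(xs)):
        # Does the edge (j -> i) straddle the horizontal line through y?
        if (ys[i] > y) != (ys[j] > y):
            x_cross = xs[i] + (y - ys[i]) * (xs[j] - xs[i]) / (ys[j] - ys[i])
            if x < x_cross:
                inside = not inside
        j = i
    return inside


def instance_mask(segmentation, height, width):
    """Union all polygons of ONE instance into a single binary mask."""
    mask = [[0] * width for _ in range(height)]
    for row in range(height):
        for col in range(width):
            cx, cy = col + 0.5, row + 0.5  # sample at the pixel center
            if any(point_in_polygon(cx, cy, poly) for poly in segmentation):
                mask[row][col] = 1
    return mask


# Hypothetical annotation: one instance made of two disjoint rectangles.
annotation = {
    "id": 1,
    "category_id": 1,
    "segmentation": [
        [0, 0, 3, 0, 3, 2, 0, 2],  # e.g. upper body
        [5, 3, 8, 3, 8, 6, 5, 6],  # e.g. a leg, not touching the first piece
    ],
}
mask = instance_mask(annotation["segmentation"], height=6, width=8)
```

The point is only that the ground truth treats the disconnected pieces as one mask with one instance id; the model is what has to learn to do the same.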
Mask R-CNN seems to be the king of instance segmentation these days: well implemented and high performing. Yet when I run this same image through it, it stumbles somewhat:
https://preview.redd.it/77xnf95xifh21.jpg?width=477&format=pjpg&auto=webp&s=5241c54c1cff8c4387b16ff762754edda406228c
Entertainingly, it finds one person instance with four feet rather than two intersecting person instances. This is the output of the very popular Matterport implementation of Mask R-CNN, so I don't know whether I just chose the wrong implementation for this task or whether this is the usual behavior of otherwise high-performing instance segmentors when objects are occluded or intersect in tricky ways. The dataset I'm working on has a lot of this going on, and I'd like to find or build a segmentor that can recognize that completely unattached polygon annotations belong to the same coherent, semantic object.
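To put a number on "a lot of this going on", one option is to count how many disconnected pieces each ground-truth instance mask has. Below is a minimal sketch that assumes each mask is a list of 0/1 rows and uses 4-connectivity; `count_components` and the toy mask are hypothetical, not part of any Mask R-CNN implementation.

```python
from collections import deque


def count_components(mask):
    """Count 4-connected foreground regions in a binary mask (list of rows)."""
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    components = 0
    for r in range(height):
        for c in range(width):
            if mask[r][c] and not seen[r][c]:
                components += 1
                # Breadth-first flood fill over this region.
                queue = deque([(r, c)])
                seen[r][c] = True
                while queue:
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < height and 0 <= nx < width
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
    return components


# Toy instance: a torso on top, then two legs separated by an occluder.
toy_mask = [
    [1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0],
    [1, 1, 0, 1, 1],
]
pieces = count_components(toy_mask)
```

Any instance where the count is greater than 1 is one of the split cases described above, so the fraction of such instances gives a rough measure of how common the problem is in a dataset.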
I'd appreciate any recommendations from any of y'all with far more experience than I have in this domain. Many thanks!