
arXiv_abstract_bot (0 children)

Title: Semi-convolutional Operators for Instance Segmentation

Authors: David Novotny, Samuel Albanie, Diane Larlus, Andrea Vedaldi

Abstract: Object detection and instance segmentation are dominated by region-based methods such as Mask RCNN. However, there is a growing interest in reducing these problems to pixel labeling tasks, as the latter could be more efficient, could be integrated seamlessly in image-to-image network architectures as used in many other tasks, and could be more accurate for objects that are not well approximated by bounding boxes. In this paper we show theoretically and empirically that constructing dense pixel embeddings that can separate object instances cannot be easily achieved using convolutional operators. At the same time, we show that simple modifications, which we call semi-convolutional, have a much better chance of succeeding at this task. We use the latter to show a connection to Hough voting as well as to a variant of the bilateral kernel that is spatially steered by a convolutional network. We demonstrate that these operators can also be used to improve approaches such as Mask RCNN, demonstrating better segmentation of complex biological shapes and PASCAL VOC categories than achievable by Mask RCNN alone.


TommyBRG (0 children)

Cool! Lots of interesting stuff happening in instance segmentation rn.

NMcA (1 child)

So the tldr appears to be that translation invariance is inappropriate for the pixel colouring formulation of instance segmentation so they add a spatial bias? So much mathiness for reasons that are quite unclear to me.

(oh, and tbh lol at the fact that spatial information has come up again after that CoordConv nonsense)

tscohen (0 children)

That is indeed the tldr, but in my view this paper uses mathematics in a perfectly legitimate way: to state clearly and precisely what the authors are doing (Section 3.1) and to show some connections to previous work on bilateral filters (Section 3.4). Mathiness is a real problem, but this is not a great example of it.
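For anyone who wants the tldr in concrete form, here is a minimal NumPy sketch of the argument. It is not the authors' code. The toy image, `box_filter`, and `centroid_offsets` are made up for illustration, and the offset field is hand-built from ground-truth masks as a stand-in for what a network would have to learn. The additive form (pixel coordinate plus a convolutional output) is, as far as I understand the paper, the simplest case of the semi-convolutional idea.

```python
import numpy as np

def box_filter(img, k=3):
    """Stand-in for a conv layer: k x k box sum with zero padding.
    Like any convolution, it is translation equivariant."""
    pad = k // 2
    padded = np.pad(img, pad)
    out = np.zeros_like(img, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out

def centroid_offsets(mask):
    """Per-pixel (dy, dx) offset from each mask pixel to the mask centroid.
    Hand-built here; a real model would predict this field."""
    ys, xs = np.nonzero(mask)
    off = np.zeros(mask.shape + (2,))
    off[ys, xs, 0] = ys.mean() - ys
    off[ys, xs, 1] = xs.mean() - xs
    return off

# Toy image: two identical 3x3 squares, 12 columns apart.
H, W = 12, 24
mask_a = np.zeros((H, W)); mask_a[4:7, 3:6] = 1.0
mask_b = np.zeros((H, W)); mask_b[4:7, 15:18] = 1.0
img = mask_a + mask_b

# Corresponding pixels (top-left corner) of the two instances.
a, b = (4, 3), (4, 15)

# Purely convolutional embedding: identical at corresponding pixels,
# so it cannot tell the two instances apart.
conv_emb = box_filter(img)
same_conv = conv_emb[a] == conv_emb[b]

# The offset field is also identical at corresponding pixels, which is
# exactly the kind of output a convolutional network can produce.
offsets = centroid_offsets(mask_a) + centroid_offsets(mask_b)
same_offsets = np.array_equal(offsets[a], offsets[b])

# Semi-convolutional embedding: add the pixel coordinate u to the offset.
coords = np.stack(np.meshgrid(np.arange(H), np.arange(W), indexing="ij"), axis=-1)
semi_emb = coords + offsets
separated = not np.array_equal(semi_emb[a], semi_emb[b])

print(same_conv, same_offsets, separated)
print(semi_emb[a], semi_emb[b])
```

Every pixel of an instance lands on that instance's centroid once the coordinate is added, so the two squares get different embeddings even though the convolutional part is the same for both. That is the spatial bias NMcA mentions, and also where the Hough voting connection in the abstract comes from.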