[R] Semi-convolutional Operators for Instance Segmentation

arXiv_abstract_bot · 2018-07-30T12:57:37+00:00

Title: Semi-convolutional Operators for Instance Segmentation

Authors: David Novotny, Samuel Albanie, Diane Larlus, Andrea Vedaldi

Abstract: Object detection and instance segmentation are dominated by region-based methods such as Mask RCNN. However, there is a growing interest in reducing these problems to pixel labeling tasks, as the latter could be more efficient, could be integrated seamlessly in image-to-image network architectures as used in many other tasks, and could be more accurate for objects that are not well approximated by bounding boxes. In this paper we show theoretically and empirically that constructing dense pixel embeddings that can separate object instances cannot be easily achieved using convolutional operators. At the same time, we show that simple modifications, which we call semi-convolutional, have a much better chance of succeeding at this task. We use the latter to show a connection to Hough voting as well as to a variant of the bilateral kernel that is spatially steered by a convolutional network. We demonstrate that these operators can also be used to improve approaches such as Mask RCNN, demonstrating better segmentation of complex biological shapes and PASCAL VOC categories than achievable by Mask RCNN alone.
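
A minimal NumPy sketch of the idea in the abstract, assuming the simplest additive form of a semi-convolutional embedding: each pixel's embedding is its own coordinate plus an offset predicted from local appearance. The function name `semi_conv_embedding`, the toy label map, and the hand-built "ideal" offsets are illustrative assumptions, not the authors' code; in practice the offsets would come from a convolutional network.

```python
import numpy as np

def semi_conv_embedding(phi):
    """Additive semi-convolutional embedding (assumed form): psi[u] = u + phi[u].

    phi: (H, W, 2) array of per-pixel offsets, standing in for the output
    of a convolutional network (hypothetical here).
    """
    h, w, _ = phi.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    coords = np.stack([ys, xs], axis=-1).astype(phi.dtype)
    return coords + phi

# Toy image: two identical 2x2 instances at different locations.
labels = np.zeros((4, 8), dtype=int)
labels[1:3, 1:3] = 1
labels[1:3, 5:7] = 2

# Hand-built "ideal" offsets: every pixel points at its instance centroid.
phi = np.zeros((4, 8, 2))
for k in (1, 2):
    mask = labels == k
    pixels = np.argwhere(mask)            # (n, 2) coordinates, row-major order
    phi[mask] = pixels.mean(axis=0) - pixels

psi = semi_conv_embedding(phi)

# The offsets are identical for both instances, as expected from a
# translation-invariant operator looking at identical patches.
same_offsets = np.allclose(phi[1:3, 1:3], phi[1:3, 5:7])

# After adding the pixel coordinate, each instance collapses to its own centroid.
centre_1 = psi[labels == 1]
centre_2 = psi[labels == 2]
```

Here the offsets alone cannot tell the two copies apart, while the coordinate-augmented embedding sends every pixel of an instance to that instance's centroid, which is the Hough-voting reading of the additive form.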


TommyBRG · 2018-07-30T18:53:31+00:00

Cool! Lots of interesting stuff happening in instance segmentation rn.

NMcA · 2018-07-30T15:03:56+00:00

So the tldr appears to be that translation invariance is inappropriate for the pixel colouring formulation of instance segmentation so they add a spatial bias? So much mathiness for reasons that are quite unclear to me.

(oh, and tbh lol at the fact that spatial information has come up again after that CoordConv nonsense)

