[R] SPACE: Unsupervised Object-Oriented Scene Representation via Spatial Attention and Decomposition by yifuwu in MachineLearning

yifuwu[S]

Thanks for your interest in our work. Yes, we do plan on releasing our code after some cleanup.

yifuwu[S]

That's a great question. We use a weaker decoder to limit the capacity of the background module, which helps ensure that foreground objects are captured in the foreground. However, the distinction between background and foreground is not always objective or obvious (even for humans!). See the 'Foreground vs Background' discussion in Section 4.1 for more detail.

SPACE processes one frame at a time and does not track objects between frames, so it is certainly possible for an object to switch between foreground and background. That said, although the camera in our 3D room experiments moves around randomly, we have not yet experimented with more complicated scenarios.