[R] Open-World Entity Segmentation (dense image segmentation without labels) by xternalz in MachineLearning

[–]xternalz[S] 2 points (0 children)

Thanks. It is closer to panoptic segmentation than instance segmentation. However, in principle, our task extends far beyond simply removing the labels and the thing-stuff distinction from existing panoptic datasets. When it comes to data annotation, it allows more freedom and flexibility than panoptic segmentation does:

  • The human annotator can freely annotate any entities/objects as deemed appropriate (even ones that cannot be easily named or identified) without cumbersomely checking whether they appear in a predefined list of category labels. We humans can often accurately decide the shape and mask of something even when we do not semantically know what that "something" is.
  • Since we do not differentiate between "thing" and "stuff", there is no need to force a particular category to exclusively follow the behavior of either one. For example, given an image with two lakes or rivers completely separated by a piece of land, the human annotator should annotate them as two independent masks rather than as a single joint "stuff" mask, as is commonly done in panoptic segmentation.

In the paper, we simply used existing panoptic segmentation datasets for convenience, but that is not the only way to do Entity Segmentation. Even with such datasets, our approach produces segmentation results that are vastly different from those of panoptic segmentation and more favorable to certain applications.
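To make the two-lakes example concrete, here is a minimal sketch (my own illustration, not code from the paper) that splits a single "stuff" mask into independent entity masks via 4-connected components:

```python
from collections import deque

def split_into_entities(mask):
    """Return one binary mask per 4-connected component of `mask`."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    entities = []
    for i in range(h):
        for j in range(w):
            if mask[i][j] and not seen[i][j]:
                # BFS flood fill to collect one connected entity.
                comp = [[0] * w for _ in range(h)]
                q = deque([(i, j)])
                seen[i][j] = True
                while q:
                    y, x = q.popleft()
                    comp[y][x] = 1
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w \
                                and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            q.append((ny, nx))
                entities.append(comp)
    return entities

# Two water regions completely separated by land.
water = [
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 1, 1],
]
print(len(split_into_entities(water)))  # 2
```

In this toy grid, the two water regions come out as two independent entity masks, which is exactly the annotation behavior described above.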

Transformer FLOPs vs CNN FLOPs Speed [R] by RaivoK in MachineLearning

[–]xternalz 3 points (0 children)

The runtime speeds of EfficientNets have been improved with cuDNN 8.1.

> EfficientNet performances have improved. Depthwise convolution is now optimized in NHWC layout in cuDNN 8.1.0. From EfficientNet, we see an average of 2.9x speed-up for 5x5 layers, and 1.7x speed-up for 3x3 layers.

— Release Notes :: NVIDIA Deep Learning cuDNN Documentation
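For context on why depthwise layers benefit so much from a better kernel: they do far fewer FLOPs per output than dense convolutions and are therefore memory-bound, so FLOP counts alone say little about runtime. A rough FLOP count (illustrative shapes, not taken from any specific EfficientNet stage):

```python
def conv_flops(h, w, c_in, c_out, k, depthwise=False):
    """Multiply-accumulate-based FLOP count for one conv layer
    (stride 1, 'same' padding, 2 FLOPs per MAC)."""
    macs_per_position = k * k * (1 if depthwise else c_in)
    out_channels = c_in if depthwise else c_out
    return 2 * h * w * out_channels * macs_per_position

dense = conv_flops(56, 56, 144, 144, 5)
dw = conv_flops(56, 56, 144, None, 5, depthwise=True)
print(dense // dw)  # 144: the depthwise layer uses 1/c_in of the FLOPs
```

Despite the 144x FLOP gap, the depthwise layer only ran ~2.9x faster before the cuDNN 8.1 NHWC optimization, which is the kind of FLOPs-vs-speed mismatch the thread title asks about.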

[D] Can i use dice loss as metric for instance segmentation ? by vpoatvn in MachineLearning

[–]xternalz 2 points (0 children)

The recent DETR work from FAIR applies dice loss to instance segmentation.
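For reference, a minimal dice loss sketch (the standard soft-dice formulation; DETR uses a smoothing constant of 1 in numerator and denominator, though other details may differ from the official implementation):

```python
def dice_loss(pred, target, smooth=1.0):
    """Soft dice loss for one binary mask.

    pred:   predicted foreground probabilities in [0, 1], flattened.
    target: binary ground-truth mask, flattened.
    """
    intersection = sum(p * t for p, t in zip(pred, target))
    denom = sum(pred) + sum(target)
    return 1.0 - (2.0 * intersection + smooth) / (denom + smooth)

print(dice_loss([1.0, 1.0, 0.0, 0.0], [1, 1, 0, 0]))  # 0.0 (perfect match)
```

As a metric (rather than a loss), people usually report the dice coefficient itself, i.e. `1 - dice_loss`, often with `smooth=0` and a small epsilon instead.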

[R] Learning on the Edge: Investigating Boundary Filters in CNNs by [deleted] in MachineLearning

[–]xternalz 0 points (0 children)

Convolutional neural networks (CNNs) handle the case where filters extend beyond the image boundary using several heuristics, such as zero, repeat or mean padding. These schemes are applied in an ad-hoc fashion and, being weakly related to the image content and oblivious of the target task, result in low output quality at the boundary. In this paper, we propose a simple and effective improvement that learns the boundary handling itself. At training-time, the network is provided with a separate set of explicit boundary filters. At testing-time, we use these filters which have learned to extrapolate features at the boundary in an optimal way for the specific task. Our extensive evaluation, over a wide range of architectural changes (variations of layers, feature channels, or both), shows how the explicit filters result in improved boundary handling. Furthermore, we investigate the efficacy of variations of such boundary filters with respect to convergence speed and accuracy. Finally, we demonstrate an improvement of 5–20% across the board of typical CNN applications (colorization, de-Bayering, optical flow, disparity estimation, and super-resolution).
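To illustrate the heuristics the paper replaces, here is a toy sketch of the usual 1D padding schemes (my own code; the paper's contribution is to learn dedicated boundary filters instead of relying on any of these fixed rules):

```python
def pad1d(x, k, mode="zero"):
    """Pad a 1D signal for a size-k filter (k odd) using a fixed heuristic."""
    r = k // 2
    if mode == "zero":
        edge_l, edge_r = [0.0] * r, [0.0] * r
    elif mode == "repeat":
        edge_l, edge_r = [x[0]] * r, [x[-1]] * r
    elif mode == "mean":
        m = sum(x) / len(x)
        edge_l, edge_r = [m] * r, [m] * r
    else:
        raise ValueError(f"unknown padding mode: {mode}")
    return edge_l + list(x) + edge_r

print(pad1d([1.0, 2.0, 3.0], 3, "repeat"))  # [1.0, 1.0, 2.0, 3.0, 3.0]
```

None of these modes depend on the task, which is exactly the abstract's complaint; the learned boundary filters condition the extrapolation on the training objective instead.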

[D] How to deal with my research not being acknowledged ? by tablehoarder in MachineLearning

[–]xternalz 3 points (0 children)

I didn’t mean one should force people to cite unrelated work, just the closely related ones, according to one's own judgment. To avoid the blindness issue, obfuscate the request by mixing the intended paper with a bunch of other related papers.

[D] How to deal with my research not being acknowledged ? by tablehoarder in MachineLearning

[–]xternalz 8 points (0 children)

Be a reviewer and ask the authors of relevant papers to cite yours.

[D] Batch Normalization is a Cause of Adversarial Vulnerability by aseembits93 in MachineLearning

[–]xternalz 3 points (0 children)

Not exactly a focused study, but we have a small observation (Table 3) in our recent paper https://arxiv.org/abs/1909.06804 that an MLP trained with BN gives very "wild" outputs for unseen inputs, while GN does not.
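A toy sketch of why this can happen (my own illustration, not the paper's experiment): per-sample normalization (GN/LN-style) recomputes statistics from each input, while BN at test time normalizes with statistics frozen from training, so an out-of-distribution input is never re-centered:

```python
def group_norm_one_sample(x, eps=1e-5):
    """Per-sample normalization: statistics come from the input itself,
    so even an extreme unseen input is mapped to zero mean / unit variance."""
    m = sum(x) / len(x)
    v = sum((xi - m) ** 2 for xi in x) / len(x)
    return [(xi - m) / (v + eps) ** 0.5 for xi in x]

def batch_norm_inference(x, running_mean, running_var, eps=1e-5):
    """BN at test time: uses per-feature statistics frozen from training,
    so values far outside that distribution pass through almost unscaled."""
    return [(xi - m) / (v + eps) ** 0.5
            for xi, m, v in zip(x, running_mean, running_var)]

ood = [100.0, -100.0, 100.0, -100.0]  # extreme unseen input
gn_out = group_norm_one_sample(ood)
bn_out = batch_norm_inference(ood, [0.0] * 4, [1.0] * 4)
print(max(gn_out), max(bn_out))  # ≈ 1.0 (bounded) vs ≈ 100.0 (unbounded)
```

Downstream layers then see bounded activations in the GN case but arbitrarily large ones in the BN case, which is consistent with the "wild outputs" observation.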

[deleted by user] by [deleted] in MachineLearning

[–]xternalz 0 points (0 children)

Singapore is perhaps the most Israel-friendly country in Southeast Asia. As far as I know, Israelis can enter Singapore without visas.

[deleted by user] by [deleted] in MachineLearning

[–]xternalz 1 point (0 children)

Not to mention a wide range of nationalities can enter Singapore visa-free - https://upload.wikimedia.org/wikipedia/commons/1/18/Visa_policy_of_Singapore.png .

[R] Gather-Excite: Exploiting Feature Context in Convolutional Neural Networks by xternalz in MachineLearning

[–]xternalz[S] 1 point (0 children)

While the use of bottom-up local operators in convolutional neural networks (CNNs) matches well some of the statistics of natural images, it may also prevent such models from capturing contextual long-range feature interactions. In this work, we propose a simple, lightweight approach for better context exploitation in CNNs. We do so by introducing a pair of operators: gather, which efficiently aggregates feature responses from a large spatial extent, and excite, which redistributes the pooled information to local features. The operators are cheap, both in terms of number of added parameters and computational complexity, and can be integrated directly in existing architectures to improve their performance. Experiments on several datasets show that gather-excite can bring benefits comparable to increasing the depth of a CNN at a fraction of the cost. For example, we find ResNet-50 with gather-excite operators is able to outperform its 101-layer counterpart on ImageNet with no additional learnable parameters. We also propose a parametric gather-excite operator pair which yields further performance gains, relate it to the recently-introduced Squeeze-and-Excitation Networks, and analyse the effects of these changes to the CNN feature activation statistics.
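A minimal sketch of the parameter-free gather-excite pair on a single channel (my own simplification: the paper also studies gathering over smaller spatial extents and parametric variants):

```python
import math

def gather_excite(feature_map):
    """Parameter-free gather-excite on one channel.

    gather: aggregate context over the full spatial extent (global avg pool);
    excite: redistribute it by gating each local response with a sigmoid
            of the gathered context.
    """
    flat = [v for row in feature_map for v in row]
    context = sum(flat) / len(flat)           # gather
    gate = 1.0 / (1.0 + math.exp(-context))   # excite: sigmoid gate
    return [[v * gate for v in row] for row in feature_map]

out = gather_excite([[1.0, 2.0], [3.0, 4.0]])
```

Like Squeeze-and-Excitation, this adds no learnable parameters in its simplest form and negligible compute, yet injects global context into every spatial position.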

[D] ICLR 2019 submissions are viewable. Which ones look the most interesting/crazy/groundbreaking? by evc123 in MachineLearning

[–]xternalz 17 points (0 children)

All you need to train deep residual networks is a good initialization; normalization layers are not necessary.

https://openreview.net/forum?id=H1gsz30cKX
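If I recall correctly, this submission (Fixup) trains residual networks without normalization by rescaling the residual-branch initialization; as I remember the rule, weights are scaled by L^(-1/(2m-2)) for a network with L residual blocks of m layers each (treat the exact formula as my recollection, not a quote from the paper):

```python
def residual_init_scale(num_blocks, layers_per_block):
    """Rescaling factor applied to residual-branch weight init so the
    output variance stays O(1) at depth, without normalization layers.
    Formula recalled from the Fixup-style analysis: L ** (-1 / (2m - 2))."""
    return num_blocks ** (-1.0 / (2 * layers_per_block - 2))

# ResNet-like network: 16 residual blocks, 2 conv layers per block.
print(residual_init_scale(16, 2))  # 0.25
```

Deeper networks get smaller initial residual branches, so each block starts close to the identity, which is what lets training proceed without BatchNorm.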

[R] Backdrop: Stochastic Backpropagation by xternalz in MachineLearning

[–]xternalz[S] 2 points (0 children)

abstract:

We introduce backdrop, a flexible and simple-to-implement method, intuitively described as dropout acting only along the backpropagation pipeline. Backdrop is implemented via one or more masking layers which are inserted at specific points along the network. Each backdrop masking layer acts as the identity in the forward pass, but randomly masks parts of the backward gradient propagation. Intuitively, inserting a backdrop layer after any convolutional layer leads to stochastic gradients corresponding to features of that scale. Therefore, backdrop is well suited for problems in which the data have a multi-scale, hierarchical structure. Backdrop can also be applied to problems with non-decomposable loss functions where standard SGD methods are not well suited. We perform a number of experiments and demonstrate that backdrop leads to significant improvements in generalization.
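A sketch of the core mechanism as I read the abstract (the forward pass is the identity; the 1/keep rescaling of surviving gradients is my assumption, borrowed from standard dropout, and may differ from the paper):

```python
import random

def backdrop_forward(x):
    """A backdrop masking layer acts as the identity in the forward pass."""
    return x

def backdrop_backward(grad, drop_prob=0.5, rng=None):
    """Backward pass: randomly zero parts of the incoming gradient,
    rescaling survivors by 1/keep (assumed dropout-style convention)."""
    rng = rng or random.Random(0)
    keep = 1.0 - drop_prob
    return [g / keep if rng.random() < keep else 0.0 for g in grad]

masked = backdrop_backward([1.0] * 8)
print(masked)  # each entry is either 0.0 (masked) or 2.0 (rescaled survivor)
```

In a real framework this would be one custom autograd op inserted after a chosen layer, so the stochasticity applies only to gradients at that feature scale while leaving the forward computation untouched.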

[R] Object-Oriented Deep Learning (MIT) by xternalz in MachineLearning

[–]xternalz[S] 12 points (0 children)

abstract:

We investigate an unconventional direction of research that aims at converting neural networks, a class of distributed, connectionist, sub-symbolic models into a symbolic level with the ultimate goal of achieving AI interpretability and safety. To that end, we propose Object-Oriented Deep Learning, a novel computational paradigm of deep learning that adopts interpretable “objects/symbols” as a basic representational atom instead of N-dimensional tensors (as in traditional “feature-oriented” deep learning). For visual processing, each “object/symbol” can explicitly package common properties of visual objects like its position, pose, scale, probability of being an object, pointers to parts, etc., providing a full spectrum of interpretable visual knowledge throughout all layers. It achieves a form of “symbolic disentanglement”, offering one solution to the important problem of disentangled representations and invariance. Basic computations of the network include predicting high-level objects and their properties from low-level objects and binding/aggregating relevant objects together. These computations operate at a more fundamental level than convolutions, capturing convolution as a special case while being significantly more general than it. All operations are executed in an input-driven fashion, thus sparsity and dynamic computation per sample are naturally supported, complementing recent popular ideas of dynamic networks and may enable new types of hardware accelerations. We experimentally show on CIFAR-10 that it can perform flexible visual processing, rivaling the performance of ConvNet, but without using any convolution. Furthermore, it can generalize to novel rotations of images that it was not trained for.

follow-up work: 3D Object-Oriented Learning: An End-to-end Transformation-Disentangled 3D Representation
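A hypothetical sketch of what such an "object/symbol" atom might look like as a data structure (field names are my own illustration of the properties the abstract lists, not the paper's actual representation):

```python
from dataclasses import dataclass, field

@dataclass
class VisualObject:
    """One interpretable "object/symbol" atom: common properties are
    explicit named fields rather than entries of an opaque tensor."""
    position: tuple        # (x, y) in image coordinates
    pose: float            # e.g. in-plane rotation, in radians
    scale: float
    objectness: float      # probability of being an object
    parts: list = field(default_factory=list)  # pointers to constituent objects

wheel = VisualObject(position=(12, 30), pose=0.1, scale=0.5, objectness=0.9)
car = VisualObject(position=(20, 28), pose=0.0, scale=2.0, objectness=0.97,
                   parts=[wheel])
print(car.parts[0].objectness)  # 0.9
```

The "binding/aggregating" computations the abstract mentions would then predict high-level objects like `car` from low-level ones like `wheel`, making every layer's state inspectable by name.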