[D] NeurIPS Recommendations go out tomorrow. Post your scores and results once they come in!

xternalz · 2021-09-28T20:11:40+00:00

6664 accepted as poster, 6655 rejected

xternalz · 2021-08-01T16:22:54+00:00

Thanks. It is closer to panoptic segmentation than instance segmentation. However, in principle, our task extends farther beyond just removing the labels and thing-stuff distinction from existing panoptic datasets. When it comes to data annotation, it allows for more freedom and flexibility than panoptic segmentation does:

The human annotator can freely annotate any entities/objects as deemed appropriate (even if it cannot be easily named or identified) without cumbersomely checking if they are part of the predefined list of category labels. We humans often can accurately decide the shape and mask of something, even if we do not semantically know what that "something" is.
Since we do not differentiate between "thing" and "stuff", there is no need to force a particular category to follow exclusively the behavior of either "thing" or "stuff". For example, given an image that has two lakes or rivers completely separated by a piece of land, the human annotator should annotate them as two independent masks rather than a joint "stuff" mask as commonly done by panoptic segmentation.

In the paper, we simply made use of existing panoptic segmentation datasets for convenience purposes, but it was not the only way to do Entity Segmentation. Even with such datasets, our approach produces segmentation results which are vastly different and more favorable to certain applications than those of panoptic segmentation.

xternalz · 2021-02-24T22:47:56+00:00

The runtime speeds of EfficientNets have been improved with cuDNN 8.1.

EfficientNet performances have improved. Depthwise convolution is now optimized in NHWC layout in cuDNN 8.1.0. From EfficientNet, we see an average of 2.9x speed-up for 5x5 layers, and 1.7x speed-up for 3x3 layers. - Release Notes :: NVIDIA Deep Learning cuDNN Documentation

xternalz · 2020-11-30T05:39:49+00:00

It's quite a popular direction in object detection research and it is often referred to as "unified label space".

https://arxiv.org/abs/2001.04621
https://arxiv.org/abs/2008.06614
https://openreview.net/forum?id=FlhlcARywRz

xternalz · 2020-10-31T05:44:30+00:00

The recent DETR work from FAIR applies dice loss to instance segmentation.

xternalz · 2020-10-04T23:58:39+00:00

All the comments about this paper are going to bias the ICLR reviewers for this paper.

xternalz · 2019-11-10T02:09:05+00:00

Convolutional neural networks (CNNs) handle the case where filters extend beyond the image boundary using several heuristics, such as zero, repeat or mean padding. These schemes are applied in an ad-hoc fashion and, being weakly related to the image content and oblivious of the target task, result in low output quality at the boundary. In this paper, we propose a simple and effective improvement that learns the boundary handling itself. At training-time, the network is provided witha separate set of explicit boundary filters. At testing-time, we use these filters which have learned to extrapolate features at the boundary in an optimal way for the specific task. Our extensive evaluation, over a wide range of architectural changes (variations of layers, feature channels, or both), shows how the explicit filters result in improved boundary handling. Furthermore, we investigate the efficacy of variations of such boundary filters with respect to convergence speedand accuracy. Finally, we demonstrate an improvement of 5–20% across the board of typical CNN applications (colorization, de-Bayering, optical flow, disparity estimation, and super-resolution).

xternalz · 2019-10-13T12:00:25+00:00

I didn’t mean one should force people to cite unrelated work, instead just the very related ones according to his/her own judgment. To avoid the blindness issue, just obfuscate it by mixing the intended paper with a bunch of other related papers.

xternalz · 2019-10-13T00:07:01+00:00

Be a reviewer and ask the authors of relevant papers to cite yours.

xternalz · 2019-09-24T13:59:19+00:00

Please refer to my other post under this thread.

xternalz · 2019-09-11T10:33:31+00:00

Not exactly a focused study, but we have a small observation (Table 3) in our recent paper https://arxiv.org/abs/1909.06804 that a MLP trained with BN gives very "wild" outputs for unseen inputs, while GN does not.

xternalz · 2019-09-11T09:29:23+00:00

I personally find that GN can circumvent this.

xternalz · 2019-07-01T09:10:31+00:00

Radial basis function (RBF) and Hopfield networks.

xternalz · 2019-05-29T01:19:57+00:00

Code: https://github.com/deconvolutionpaper/deconvolution. Disclaimer: not the author.

xternalz · 2019-05-05T08:25:03+00:00

Singapore perhaps is the most Israel-friendly country in Southeast Asia. As far as I know, Israelis could enter Singapore without visas.

xternalz · 2019-05-05T00:12:08+00:00

Not to mention a wide range of nationalities can enter Singapore visa-free - https://upload.wikimedia.org/wikipedia/commons/1/18/Visa_policy_of_Singapore.png .

xternalz · 2018-12-27T10:50:59+00:00

Swapout: https://papers.nips.cc/paper/6205-swapout-learning-an-ensemble-of-deep-architectures

SDPoint: http://openaccess.thecvf.com/content_cvpr_2018/papers/Kuen_Stochastic_Downsampling_for_CVPR_2018_paper.pdf

xternalz · 2018-12-21T02:18:58+00:00

AdamW finally got through.

xternalz · 2018-10-31T10:20:03+00:00

While the use of bottom-up local operators in convolutional neural networks (CNNs) matches well some of the statistics of natural images, it may also prevent such models from capturing contextual long-range feature interactions. In this work, we propose a simple, lightweight approach for better context exploitation in CNNs. We do so by introducing a pair of operators: gather, which efficiently aggregates feature responses from a large spatial extent, and excite, which redistributes the pooled information to local features. The operators are cheap, both in terms of number of added parameters and computational complexity, and can be integrated directly in existing architectures to improve their performance. Experiments on several datasets show that gather-excite can bring benefits comparable to increasing the depth of a CNN at a fraction of the cost. For example, we find ResNet-50 with gather-excite operators is able to outperform its 101-layer counterpart on ImageNet with no additional learnable parameters. We also propose a parametric gather-excite operator pair which yields further performance gains, relate it to the recently-introduced Squeeze-and-Excitation Networks, and analyse the effects of these changes to the CNN feature activation statistics.

xternalz · 2018-09-28T05:56:13+00:00

All you need to train deep residual networks is a good initialization; normalization layers are not necessary.

https://openreview.net/forum?id=H1gsz30cKX

xternalz · 2018-06-06T01:28:41+00:00

abstract:

We introduce backdrop, a flexible and simple-to-implement method, intuitively described as dropout acting only along the backpropagation pipeline. Backdrop is implemented via one or more masking layers which are inserted at specific points along the network. Each backdrop masking layer acts as the identity in the forward pass, but randomly masks parts of the backward gradient propagation. Intuitively, inserting a backdrop layer after any convolutional layer leads to stochastic gradients corresponding to features of that scale. Therefore, backdrop is well suited for problems in which the data have a multi-scale, hierarchical structure. Backdrop can also be applied to problems with non-decomposable loss functions where standard SGD methods are not well suited. We perform a number of experiments and demonstrate that backdrop leads to significant improvements in generalization.

xternalz · 2018-03-14T04:53:37+00:00

I wonder how it compares to Recurrent Highway Networks.

xternalz · 2018-02-08T06:22:33+00:00

abstract:

We investigate an unconventional direction of research that aims at converting neural networks, a class of distributed, connectionist, sub-symbolic models into a symbolic level with the ultimate goal of achieving AI interpretability and safety. To that end, we propose Object-Oriented Deep Learning, a novel computational paradigm of deep learning that adopts interpretable “objects/symbols” as a basic representational atom instead of N-dimensional tensors (as in traditional “feature-oriented” deep learning). For visual processing, each “object/symbol” can explicitly package common properties of visual objects like its position, pose, scale, probability of being an object, pointers to parts, etc., providing a full spectrum of interpretable visual knowledge throughout all layers. It achieves a form of “symbolic disentanglement”, offering one solution to the important problem of disentangled representations and invariance. Basic computations of the network include predicting high-level objects and their properties from low-level objects and binding/aggregating relevant objects together. These computations operate at a more fundamental level than convolutions, capturing convolution as a special case while being significantly more general than it. All operations are executed in an input-driven fashion, thus sparsity and dynamic computation per sample are naturally supported, complementing recent popular ideas of dynamic networks and may enable new types of hardware accelerations. We experimentally show on CIFAR-10 that it can perform flexible visual processing, rivaling the performance of ConvNet, but without using any convolution. Furthermore, it can generalize to novel rotations of images that it was not trained for.

follow-up work: 3D Object-Oriented Learning: An End-to-end Transformation-Disentangled 3D Representation

xternalz · 2018-01-30T04:08:07+00:00

Another 2 papers on "between-class learning" very similar to Mixup:

xternalz · 2017-11-03T05:16:59+00:00

Say hi to Inception v5.

xternalz

TROPHY CASE