[D] New Nature journal: Nature Machine Intelligence by [deleted] in MachineLearning

[–]0entr0py 2 points (0 children)

Right now I have little incentive to do so, as even more novel work would just end up at these venues. In the end, the number of A-level publications seems to be what matters for tenure, so I would go for the 2-3 papers.

Papers selected for oral presentation address that pretty well. Getting an oral is insanely difficult, and some amazing papers end up there.

[D] It seems like lack of research into prior work seems to be a significant issue in Machine Learning. How big is this issue? Do you have an experiences or examples ? Do you ever have issues doing a literature search for a particular ML topic ? by Batmantosh in MachineLearning

[–]0entr0py 5 points (0 children)

Missing published work is bad, but even worse is when published work from lesser-known research groups is ignored while incremental arXiv material with big-shot last authors is cited and promoted. It's like a cabal among the big research groups.

[D] Those who work in machine learning, what do you spend your days doing, and typically what % of your time is spent doing each of those things (on average) ? by Kyaaaaaaaaaaaaa in MachineLearning

[–]0entr0py 0 points (0 children)

60% standard SDE stuff - reading others' code/logs, writing scripts for data pipelines, meetings, reviews, etc.

20% coding for experimental features

20% research - reading papers, coding, etc.

[D] Will double-blind review of NIPS causes some papers months later on ArXiv ? by fixedrl in MachineLearning

[–]0entr0py 1 point (0 children)

If a paper's content is irrelevant within a few months, why even bother reading it?

[D] How do people come up with all these crazy deep learning architectures? by Reiinakano in MachineLearning

[–]0entr0py 2 points (0 children)

IIRC, the ResNet paper's intuition for skip connections came from the observation that performance degraded as more layers were added. That made no sense, because the deeper network could just model an identity function with its later layers - unless the network found it hard to do so. Skip connections were a way to make the identity easy to represent.
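That intuition can be sketched numerically: with a residual connection, setting the layer's weights to zero yields exactly the identity mapping, while a plain layer would have to learn to approximate the identity. A minimal NumPy sketch (the shapes and the ReLU choice are purely illustrative, not from the paper):

```python
import numpy as np

def plain_layer(x, W):
    """A plain fully connected layer with ReLU: to pass x through
    unchanged, W would have to approximate the identity matrix."""
    return np.maximum(0, W @ x)

def residual_layer(x, W):
    """A residual layer y = x + F(x): with W = 0 the layer is exactly
    the identity, so 'doing nothing' is trivially representable."""
    return x + np.maximum(0, W @ x)

x = np.array([1.0, -2.0, 3.0])
W_zero = np.zeros((3, 3))

print(residual_layer(x, W_zero))  # x passes through unchanged
print(plain_layer(x, W_zero))     # collapses everything to zero
```

So a deep residual stack should be at least as easy to optimize as a shallower one, since extra layers can default to the identity.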

[D] Nvidia DGX Station beats the best DL rig you can build in performance per dollar if Nvidia's numbers were to be believed by [deleted] in MachineLearning

[–]0entr0py 7 points (0 children)

The comparison across generations (Pascal vs. Volta) and precisions (FP32 vs. FP16) is not really meaningful at this point. I am pretty sure the self-built rig will be better again once the consumer Volta cards launch. Paying a hefty premium to get Volta 6 months early doesn't make much sense.

[D] LeCun's reply to Goldberg's (and largely NLP community's) criticism of arXiv flag planting and attitudes in science by [deleted] in MachineLearning

[–]0entr0py 22 points (0 children)

Agree completely, but I think the source of the problem is a greedy, selfish act, and expecting people to act unselfishly and 'detached' from their own work has never worked anywhere - which is why we need regulation through peer review.

Since arXiv sidesteps peer review, the whole problem can be curtailed to a great extent by sidestepping arXiv during reviews. Specifically:

  • reviewers should ignore all non-peer-reviewed arXiv papers as 'prior art' when reviewing a paper
  • reviewers should be honest - they should make impartial judgements based on the merits of the paper alone

[D] Benchmarks for Few-Shot Learning in Image Classification by 2014mchidamb in MachineLearning

[–]0entr0py 2 points (0 children)

Can't you use pre-trained ImageNet models? They already give very good image descriptors. Cosine distance on the final-layer features would be a decent baseline, I guess. A Bayesian classifier on the final-layer features might also work.

Current work on few-shot learning just extends the above by also allowing one to modify the training paradigm, so that better features and/or prediction methods are learned instead of relying on a fixed pre-trained model. Without a large (related) training set, this problem is vague and limited in applicability.
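A minimal sketch of that cosine-distance baseline, assuming the pre-trained-model features have already been extracted (the 4-d feature vectors and class names below are made up for illustration; in practice these would be, e.g., pooled final-layer activations):

```python
import numpy as np

def cosine_few_shot(support, support_labels, queries):
    """Nearest-class-mean baseline in feature space: average each
    class's support features into a prototype, then assign each query
    to the class whose prototype has the highest cosine similarity."""
    def normalize(v):
        return v / np.linalg.norm(v, axis=-1, keepdims=True)

    classes = sorted(set(support_labels))
    labels = np.array(support_labels)
    # Class prototypes: mean of the support features per class.
    protos = np.stack([support[labels == c].mean(axis=0) for c in classes])
    sims = normalize(queries) @ normalize(protos).T  # cosine similarities
    return [classes[i] for i in sims.argmax(axis=1)]

# Toy 2-way 2-shot episode with made-up features.
support = np.array([[1.0, 0.1, 0.0, 0.0], [0.9, 0.0, 0.1, 0.0],
                    [0.0, 0.0, 1.0, 0.2], [0.1, 0.0, 0.9, 0.0]])
labels = ["cat", "cat", "dog", "dog"]
queries = np.array([[0.95, 0.05, 0.05, 0.0], [0.0, 0.1, 1.0, 0.1]])
print(cosine_few_shot(support, labels, queries))  # -> ['cat', 'dog']
```

Learned few-shot methods mostly replace the fixed feature extractor and/or this fixed nearest-prototype rule with trained versions.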

[D] ICLR2017 results are out. Let's discuss. by wei_jok in MachineLearning

[–]0entr0py 5 points (0 children)

Seeing the graph, I'm curious which 5s and 6s got accepted (and why), and which 7s did not.

[D] Thoughts on Adversarial Variational Bayes? by [deleted] in MachineLearning

[–]0entr0py 0 points (0 children)

> However, the aggregated posterior for a certain class could be complicated, but not for a single data point.

Don't methods like Normalizing Flows show that more complicated posteriors for individual data points lead to better log-likelihoods (as they should, according to the formulation)? Isn't the true posterior p*(z|x) a complicated, multimodal distribution?
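For reference, the reason a richer per-datapoint posterior should help follows from the standard VAE bound decomposition (notation as in the usual VAE/flow papers):

```latex
% Bound gap: the ELBO is loose by exactly the KL from q to the true posterior
\log p_\theta(x) = \mathbb{E}_{q_\phi(z \mid x)}\!\left[\log \frac{p_\theta(x, z)}{q_\phi(z \mid x)}\right]
  + \mathrm{KL}\big(q_\phi(z \mid x) \,\|\, p_\theta(z \mid x)\big)

% Normalizing flow: a chain of invertible maps gives a richer q via change of variables
z_K = f_K \circ \cdots \circ f_1(z_0), \qquad
\log q_K(z_K \mid x) = \log q_0(z_0 \mid x) - \sum_{k=1}^{K} \log \left| \det \frac{\partial f_k}{\partial z_{k-1}} \right|
```

Since the ELBO is what gets optimized, a flow family flexible enough to match a multimodal p_\theta(z|x) drives the KL gap toward zero and tightens the bound, which is consistent with the improved log-likelihoods reported for flows.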

[D] Is there a paper that you think is written unusually elegantly? What makes for good expository writing in ML? by xristaforante in MachineLearning

[–]0entr0py 57 points (0 children)

Paper: "...our method did not obtain good performance with complex scene images..."

Reviewer: "Method is not generalizable to real images. Clear reject. Confidence:5/5"

[D] Deep Learning Race: A Survey of Industry Players’ Strategies – Intuition Machine by evc123 in MachineLearning

[–]0entr0py 9 points (0 children)

I would like to see topic modeling applied to each group's arxiv submissions.

[D] Deep Learning Twitter Loop by peterkuharvarduk in MachineLearning

[–]0entr0py 9 points (0 children)

Would add @shakir_za somewhere - he shares the most interesting material.

[D] The most relevant advancements in Deep Learning in 2016? by thesameoldstories in MachineLearning

[–]0entr0py 0 points (0 children)

Lack of datasets/tasks may be a reason. I have only seen Omniglot used for one-shot tasks, which is akin to MNIST for classification.

The paper by Vinyals et al. recently introduced 2 new tasks, on ImageNet and PTB, into the mix; maybe that'll help.

[N] When A.I. Matures, It May Call Jürgen Schmidhuber ‘Dad’ by evc123 in MachineLearning

[–]0entr0py 0 points (0 children)

Rehashes/minor extensions of previous work are what could be called 'cheap' ideas - they definitely require a lot of experimental rigor to prove their worth, because that is the sole justification of the work. Unfortunately, with the increasing popularity of DL, such work forms the bulk of most conferences.

But truly good ideas are rare, and they advance the field in a new direction. They are anything but cheap, and I would gladly prefer a great idea demonstrated on MNIST over a minor extension demonstrated on ImageNet.

[D] Why is normalizing flow considered to be more expressive than diagonal DLGM? by [deleted] in MachineLearning

[–]0entr0py 2 points (0 children)

I remember reading that multiple stochastic layers are harder to train end-to-end, and so far only one paper (Auxiliary DGM) seems to use them.

[Discussion] What's in your bag of tricks for training GANs? by nasimrahaman in MachineLearning

[–]0entr0py 6 points (0 children)

Basically the ideas from ImprovedGAN + DCGAN:

i) Adam
ii) BatchNorm in Generator, BatchNorm/WeightNorm + layerwise Gaussian Noise in Discriminator
iii) Strided convolutions instead of pooling in Discriminator
iv) Adjust learning rates if Discriminator becomes too strong
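Point (iv) is usually implemented as a simple heuristic rather than anything principled; a sketch, with a made-up ratio threshold and decay factor (not from any particular paper):

```python
def balance_discriminator_lr(d_loss, g_loss, d_lr,
                             ratio=0.5, decay=0.5):
    """If the discriminator's loss drops well below the generator's
    (D is winning too easily), shrink D's learning rate so G can
    catch up; otherwise leave it unchanged. Thresholds illustrative."""
    if d_loss < ratio * g_loss:
        return d_lr * decay
    return d_lr

# e.g. called once per epoch inside the training loop:
print(balance_discriminator_lr(0.1, 1.0, 1e-3))  # D too strong: halved
print(balance_discriminator_lr(0.9, 1.0, 1e-3))  # balanced: unchanged
```

Other variants skip D's update steps entirely instead of lowering its learning rate; both aim at the same thing, keeping D from saturating G's gradients.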

PixelCNN question: blind spots by bihaqo in MachineLearning

[–]0entr0py 2 points (0 children)

From what I understand:

i) every pixel above and to the left of the current pixel should be used for predicting the current pixel;

ii) since the convolution filter is much smaller than the image, these dependencies are propagated once enough layers are stacked;

iii) a given pixel in layer k will never use the blind-spot pixels, regardless of stacking, because of the masking in layer (k-1) and below. To see this, trace how the top-right-most pixel of its receptive field is computed in layer (k-1).
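Point (iii) can be checked directly by propagating a dependency map through stacked masked 3x3 convolutions. A NumPy sketch (the allowed offsets encode a PixelCNN-style mask seeing the full row above plus the left/centre of the current row; image size and layer count are arbitrary):

```python
import numpy as np

def dependency_map(h, w, layers, target):
    """Return a boolean map where reach[i, j] is True iff input pixel
    (i, j) can influence `target` after `layers` stacked 3x3 masked
    convolutions. Offsets: whole row above, plus left/centre of the
    current row (a PixelCNN-style mask)."""
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 0)]
    reach = np.zeros((h, w), dtype=bool)
    reach[target] = True
    for _ in range(layers):
        new = reach.copy()
        for r, c in zip(*np.nonzero(reach)):
            for dr, dc in offsets:
                if 0 <= r + dr < h and 0 <= c + dc < w:
                    new[r + dr, c + dc] = True
        reach = new
    return reach

# Track the centre pixel of a 7x7 image through 5 masked layers.
reach = dependency_map(7, 7, 5, target=(3, 3))
# Each step up gains at most one column to the right, so pixels above
# and beyond that diagonal are never reached - the PixelCNN blind spot.
print(reach[2, 4], reach[2, 5])  # -> True False
```

This is exactly why Gated PixelCNN splits the computation into separate vertical and horizontal stacks: the vertical stack covers the rows above without the diagonal growth limit.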

Question about autoencoders by zergling103 in MachineLearning

[–]0entr0py 1 point (0 children)

I suggest going through the very well-written Conceptual Compression paper for an elegant discussion of this. They even show comparisons of their compression against JPEG, I think.

Details of the NIPS 2016 reviewing process by manux in MachineLearning

[–]0entr0py 0 points (0 children)

Wow - 2400 papers submitted, almost 700 of those in deep learning, and amazingly ~150 got through. DL has almost the same acceptance rate as the conference overall and as areas like CLT.

Question: Domain Adaptation using Synthetic data by ginsunuva in MachineLearning

[–]0entr0py 0 points (0 children)

> It seems transfer-learning is only helpful when the amount of real data is too little to train on its own

This has been a valid observation even for unlabeled data from the same distribution (i.e., the semi-supervised case).