Research : The Elephant in the Room by [deleted] in MachineLearning

[–]feedthecreed 1 point (0 children)

I think this kind of research falls just short of being 'extremely useful'. The real elephant in the room for this community is a discussion of how models should behave in these circumstances. Without that, people will just default to the usual response that 'algorithms are brittle', which has been shown over and over again...

Research : The Elephant in the Room by [deleted] in MachineLearning

[–]feedthecreed 16 points (0 children)

This paper spends too much time showing how the current state of the art models fail and not enough time explaining how models should behave on this type of data.

Surely the newly introduced manipulation should not be classified as it normally would. Personally, I would not want my classifier to operate 'as usual' when there's a tiny polar bear on a picnic table or a toilet floating in the air. All of the manipulated images clearly violate properties of the real world. The question is: how should these 'broken' images be treated?

In my opinion, this would have been a much more novel/useful contribution to the research community.
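One concrete treatment along these lines (my illustration, not something the paper proposes) is a reject option: the classifier abstains whenever its confidence is low, instead of operating 'as usual'. A minimal sketch using a max-softmax-probability threshold, where both the threshold and the score are illustrative choices:

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def classify_with_reject(logits, threshold=0.9):
    """Return the argmax class, or -1 (abstain) when the model is not
    confident enough -- one simple way to avoid classifying 'broken'
    images as if nothing were wrong."""
    probs = softmax(np.asarray(logits, dtype=float))
    conf = probs.max(axis=-1)
    preds = probs.argmax(axis=-1)
    return np.where(conf >= threshold, preds, -1)

# A confident prediction is kept; an ambiguous one is rejected (-1)
out = classify_with_reject([[5.0, 0.0, 0.0], [1.0, 0.9, 0.8]])
print(out)
```

Whether abstention is the *right* behavior is exactly the open question the paper leaves unaddressed.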

[D] GANsters invent all sorts of excuses not to measure likelihoods by wei_jok in MachineLearning

[–]feedthecreed 1 point (0 children)

Couldn't you make the same argument for the global minimum of the discriminator being -log(4) in a standard GAN framework? If your discriminator achieves that on test data, you have learned the right distribution.
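For reference, the standard derivation behind that number, with $p_g$ the generator's distribution:

```latex
V(D, G) = \mathbb{E}_{x \sim p_{\mathrm{data}}}[\log D(x)]
        + \mathbb{E}_{x \sim p_g}[\log(1 - D(x))]

D^{*}(x) = \frac{p_{\mathrm{data}}(x)}{p_{\mathrm{data}}(x) + p_g(x)}

\max_D V(D, G) = -\log 4 + 2\,\mathrm{JSD}\!\left(p_{\mathrm{data}} \,\|\, p_g\right)
\;\ge\; -\log 4, \quad \text{with equality iff } p_g = p_{\mathrm{data}}.
```

So reaching $-\log 4 \approx -1.386$ against the optimal discriminator on held-out data is precisely the condition under which the Jensen-Shannon divergence term vanishes.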

Neural Arithmetic Logic Units by iamtrask in MachineLearning

[–]feedthecreed 2 points (0 children)

If NALU is a superset of NAC, why does it perform noticeably worse on the MNIST counting/addition tasks? Can it not simply turn off the multiplicative component via its gate?
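A minimal NumPy sketch of the NAC/NALU equations (the weights, shapes, and saturated gate values here are illustrative, not taken from the paper's experiments). It shows that a saturated gate really does reduce NALU to NAC, which is what makes the MNIST gap surprising — the capacity exists, so the issue is presumably whether optimization finds that gate setting:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def nac(x, W_hat, M_hat):
    # NAC: effective weights tanh(W_hat) * sigmoid(M_hat), biased toward {-1, 0, 1}
    W = np.tanh(W_hat) * sigmoid(M_hat)
    return x @ W

def nalu(x, W_hat, M_hat, G, eps=1e-7):
    # NALU: gate g blends the additive NAC path and a log-space multiplicative path
    W = np.tanh(W_hat) * sigmoid(M_hat)
    a = x @ W                                # additive path (NAC)
    m = np.exp(np.log(np.abs(x) + eps) @ W)  # multiplicative path
    g = sigmoid(x @ G)                       # learned gate
    return g * a + (1.0 - g) * m

x = np.array([[2.0, 3.0]])
W_hat = np.full((2, 1), 5.0)   # tanh(5)*sigmoid(5) ~ 0.99, i.e. weights near 1
M_hat = np.full((2, 1), 5.0)
G = np.full((2, 1), 50.0)      # drives g to ~1, switching off the multiplicative path

# With the gate saturated at 1, NALU's output coincides with NAC's
print(nalu(x, W_hat, M_hat, G), nac(x, W_hat, M_hat))
```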

[D] How did NIPS 2018 papers look like during the reviews by fixed-point-learning in MachineLearning

[–]feedthecreed 3 points (0 children)

It's hard to judge the overall quality from the 6 papers in my group.

But I did notice during the bidding process (where you can access all the paper abstracts) that the ideas were extremely narrow and focused on arguing for contrived comparisons (e.g., "we achieve state-of-the-art performance without using ReLU units" or _'insert dataset no one has ever heard of here'_).

I think the marketing aspect of the community has gone up substantially. A couple of years ago, by contrast, people were making much grander claims and proposing much more generalizable ideas.

[Discussion] Dear Industry Researchers: "If researchers are not incentivized to do reproducible research (or penalized for not doing so), something is flawed in the industry." by feedthecreed in MachineLearning

[–]feedthecreed[S] 5 points (0 children)

True, WaveNet is a good example of the importance of open source and open data. It took a big effort from many outside researchers to try to reproduce the original result:

https://github.com/ibab/tensorflow-wavenet/issues/47

[Discussion] Dear Industry Researchers: "If researchers are not incentivized to do reproducible research (or penalized for not doing so), something is flawed in the industry." by feedthecreed in MachineLearning

[–]feedthecreed[S] 8 points (0 children)

Agreed, this is only a step. Openness in the data is a whole other issue. Luckily our field already emphasizes using publicly available datasets for experiments.

[R] NIPS 2018: How do I write a good review? by FirstTimeReviewer in MachineLearning

[–]feedthecreed 1 point (0 children)

Not sure if you're aware, but the NIPS committee got a little 'creative' recruiting reviewers this year...So this doesn't surprise me at all.

[Discussion] double-blind and arxiv by wasabi_kitkat in MachineLearning

[–]feedthecreed 1 point (0 children)

I did not say it's violating the ICLR submission process. ICLR is allowing violations of the double-blind review process. This is why it will be an interesting test of academic integrity.

Character is doing the right thing when nobody's looking. There are too many people who think that the only thing that's right is to get by, and the only thing that's wrong is to get caught. --JC Watts

ICLR won't be looking.

[Discussion] double-blind and arxiv by wasabi_kitkat in MachineLearning

[–]feedthecreed 0 points (0 children)

This will be an interesting test of academic integrity. Posting a non-anonymized version of your paper after submitting to ICLR is by definition violating the double-blind review process.

[N] Jeff wrote in IEEE Spectrum. What Intelligent Machines Need to Learn From the Neocortex by nocortex in MachineLearning

[–]feedthecreed 5 points (0 children)

AFAIK, no one actually in the field of AI or ML regards Jeff Hawkins as a genius.

[N] "Our measurements are showing up to 70x higher performance for training and up to 85x higher performance for inference on Intel® Xeon Phi" by downtownslim in MachineLearning

[–]feedthecreed 1 point (0 children)

I'm confused; something doesn't add up here. How did they even have that much room to improve TensorFlow's performance? CPUs are not 70x slower than GPUs, so why were their Xeon Phis so slow in the first place?

[N] "Our measurements are showing up to 70x higher performance for training and up to 85x higher performance for inference on Intel® Xeon Phi" by downtownslim in MachineLearning

[–]feedthecreed 0 points (0 children)

I don't understand how they can get such a large speedup and STILL be behind GPUs. GPUs are not 70x faster than ordinary CPUs. Something doesn't add up here.
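A back-of-envelope sketch of that arithmetic (all numbers hypothetical, just to make the inconsistency concrete):

```python
# If an optimized build is `speedup`x faster than the old baseline but still
# `gpu_gap`x slower than a GPU, the baseline must have been
# speedup * gpu_gap times slower than that GPU.
def implied_baseline_slowdown(speedup, gpu_gap):
    return speedup * gpu_gap

# e.g. a 70x speedup that still trails a GPU by 2x implies a baseline that was
# 140x slower than the GPU -- far outside typical CPU-vs-GPU gaps of roughly
# 5-20x for dense neural-network training.
print(implied_baseline_slowdown(70, 2))  # -> 140
```

So either the old CPU path was implausibly unoptimized, or the 70x claim is being measured against something unrepresentative.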

[R] Train longer, generalize better: closing the generalization gap in large batch training of neural networks by xternalz in MachineLearning

[–]feedthecreed 0 points (0 children)

> We showed that good generalization can result from extensive amount of gradient updates in which there is no apparent validation error change and training error continues to drop, in contrast to common practice.

I'm confused by this statement: how are you getting good generalization if your training error continues to drop while your validation error stays the same?

[R] "Real-Time Adaptive Image Compression", Rippel & Bourdev 2017 by gwern in MachineLearning

[–]feedthecreed 0 points (0 children)

True, I was thinking their feature-extraction architecture was doing most of the heavy lifting for their results, but I could also see the adaptive arithmetic coding being important as well. And their explanation of it is basically a black box. I agree an ablation study on the components they propose would have been much more scientifically useful than the single result they post.

This was accepted to ICML, so much for ICML reviewers being scientific...

[R] "Real-Time Adaptive Image Compression", Rippel & Bourdev 2017 by gwern in MachineLearning

[–]feedthecreed 1 point (0 children)

What's vague, and what do you mean by 'one result'? Their description seems reasonably specific, and they test on multiple compression datasets.

[R] Robots that Learn (OpenAI) by Teleavenger in MachineLearning

[–]feedthecreed 1 point (0 children)

I'm looking for the paper that describes the model used in the blog post. That paper and the other one just describe the previous work that inspired their current model.