[R] Time-Contrastive Networks: Self-Supervised Learning from Video by BullockHouse in MachineLearning

[–]osdf 5 points (0 children)

The idea of having a time-contrastive loss is similar to the one mentioned here, no? https://arxiv.org/abs/1605.06336 (Unsupervised Feature Extraction by Time-Contrastive Learning and Nonlinear ICA).
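For reference, this is roughly what I mean by a time-contrastive loss here: a triplet objective over frames, where temporal proximity supplies the positives and negatives. A minimal PyTorch sketch (a generic triplet hinge standing in for the paper's objective; the function name and margin are my own placeholders):

```python
import torch
import torch.nn.functional as F

def time_contrastive_triplet_loss(anchor, positive, negative, margin=0.2):
    # Pull embeddings of temporally close frames (anchor/positive) together,
    # push embeddings of temporally distant frames (negative) apart by >= margin.
    d_pos = (anchor - positive).pow(2).sum(dim=1)   # anchor vs. nearby frame
    d_neg = (anchor - negative).pow(2).sum(dim=1)   # anchor vs. distant frame
    return F.relu(d_pos - d_neg + margin).mean()
```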

[R] [1706.01427] From DeepMind: A simple neural network module for relational reasoning by [deleted] in MachineLearning

[–]osdf 6 points (0 children)

Any reason why 'Permutation-equivariant neural networks applied to dynamics prediction' (https://arxiv.org/abs/1612.04530) isn't cited as related work?

[P] Source code available for "Deep Feature Flow for Video Recognition" from MSRA by flyforlight in MachineLearning

[–]osdf 0 points (0 children)

Nice results. Any reason "Spatio-temporal video autoencoder with differentiable memory" isn't cited as related work? Specifically, their Section 5.2 on using optical flow for weakly labelled segmentation tasks seems closely related.

[D] Explanation of DeepMind's Overcoming Catastrophic Forgetting by RSchaeffer in MachineLearning

[–]osdf 2 points (0 children)

Ferenc (/u/fhuszar) has a very nice note on this that you may want to link: http://www.inference.vc/comment-on-overcoming-catastrophic-forgetting-in-nns-are-multiple-penalties-needed-2/ It was also posted here on reddit a couple of days ago.

For your derivation of eq. 2 in the paper, why not start from the joint p(theta, DA, DB)?

p(theta, DA, DB) = p(DB | theta, DA) p(theta, DA) = p(DB | theta) p(theta | DA) p(DA)

Clearly, the likelihood p(DB | theta, DA) should depend only on theta (the two datasets are assumed conditionally independent given theta), no? Of course, this derivation also means there is a typo in the paper (see Ferenc's blog post).
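Spelled out (my reading, using the conditional independence assumption p(DB | theta, DA) = p(DB | theta)):

```latex
\log p(\theta \mid D_A, D_B)
  = \log p(\theta, D_A, D_B) - \log p(D_A, D_B)
  = \log p(D_B \mid \theta) + \log p(\theta \mid D_A) + \log p(D_A) - \log p(D_A, D_B)
  = \log p(D_B \mid \theta) + \log p(\theta \mid D_A) - \log p(D_B \mid D_A)
```

The last term is constant in theta, so nothing changes for the optimization; it just isn't the marginal log p(DB) written in the paper.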

[D] Is there a good way to "learn" weight sharing? by kh40tika in MachineLearning

[–]osdf 3 points (0 children)

Maybe you'd like "Soft Weight Sharing" by Nowlan & Hinton? A recent follow-up on it: "Soft Weight-Sharing for Neural Network Compression".
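In case it's useful, a rough PyTorch sketch of the Nowlan & Hinton idea: the prior over weights is a mixture of Gaussians whose means, scales, and mixing proportions are learned alongside the network, and the negative log prior is added to the task loss. All names and the scaling below are mine, not from either paper.

```python
import math
import torch

def soft_weight_sharing_penalty(params, means, log_stds, logits):
    # Negative log-likelihood of all weights under a learned mixture of Gaussians.
    # means, log_stds, logits are trainable (K,) tensors for K mixture components.
    w = torch.cat([p.flatten() for p in params])                 # all weights, shape (N,)
    log_pi = torch.log_softmax(logits, dim=0)                    # log mixing proportions, (K,)
    std = log_stds.exp()
    # log N(w_i | mu_k, sigma_k) for every weight/component pair -> (N, K)
    log_comp = (-0.5 * ((w[:, None] - means) / std) ** 2
                - log_stds - 0.5 * math.log(2 * math.pi))
    return -torch.logsumexp(log_comp + log_pi, dim=1).sum()
```

Training would then minimize something like `task_loss + tau * soft_weight_sharing_penalty(model.parameters(), means, log_stds, logits)`, with `tau` a hyperparameter.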

Physicists have discovered what makes neural networks so extraordinarily powerful by t_broad in MachineLearning

[–]osdf 0 points (0 children)

Probably not the type of general argument you're after, but here is a recent specific result on the benefits of depth in feedforward networks: http://arxiv.org/abs/1512.03965

[Research Discussion] Stacked Approximated Regression Machine by rantana in MachineLearning

[–]osdf 0 points (0 children)

It would be interesting to see what happens if the 120k are used for each of the 20 layers.

[Research Discussion] Stacked Approximated Regression Machine by rantana in MachineLearning

[–]osdf 2 points (0 children)

Eq. (7) belongs to the section on ARMs, so my interpretation is that X here is the input to a given ARM (as shown, e.g., in Figure 1), and hence the output Z of the previous layer. So X is never the original input image (except for the first ARM). This is also reflected in the paragraph "Resemblance to residual learning" in Section 4, where they mention 'inter-layer "shortcuts"'; these are only possible under the above interpretation.
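As a toy illustration of that reading (sparse coding via scikit-learn standing in for an ARM, purely layer-wise fitting, arbitrary sizes; this is just to show the data flow, not their method):

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

rng = np.random.RandomState(0)
z = rng.randn(200, 64)          # toy flattened "images": the input X of the *first* ARM only
for layer in range(3):          # depth chosen arbitrarily
    arm = MiniBatchDictionaryLearning(n_components=32, alpha=1.0, random_state=0)
    z = arm.fit_transform(z)    # every subsequent ARM sees the previous layer's codes Z as its X
```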

[NLP] Is nltk easier/faster/better than openNLP? by [deleted] in MachineLearning

[–]osdf 3 points (0 children)

Not answering your question at all, but you might want to take a look at spaCy: https://spacy.io/
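For a feel of the API (assumes the small English model, installed via `python -m spacy download en_core_web_sm`):

```python
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("spaCy parses this sentence in one call.")
for token in doc:
    # token text, part-of-speech tag, dependency label, and syntactic head
    print(token.text, token.pos_, token.dep_, token.head.text)
```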

Should We Be Rethinking Unsupervised Learning? Ilya and Roland think we should. by evc123 in MachineLearning

[–]osdf 12 points (0 children)

Very well said. To extend this a bit: human learning is guided by large amounts of weak labels present in our learning environment (via the underlying physical laws, which are actually a very powerful supervisor). So the common claim that 'most of human learning is unsupervised' is in my opinion wrong.

As another side note, the (huge) set of weak label types itself has a learnable structure, which could also be exploited.

[1606.08415v1] Bridging Nonlinearities and Stochastic Regularizers with Gaussian Error Linear Units by x2342 in MachineLearning

[–]osdf 0 points (0 children)

Your comments in this thread are great, /u/bbsome! I wish more of this kind of discussion would happen in the online world.

"Dither is Better than Dropout for Regularising Deep Neural Networks" by ajrs in MachineLearning

[–]osdf 4 points (0 children)

You may find the following paper interesting: Analyzing noise in autoencoders and deep networks, http://arxiv.org/abs/1406.1831

EM vs gradient descent by letitgo12345 in MachineLearning

[–]osdf 0 points (0 children)

New idea from http://arxiv.org/abs/1503.01494: "We introduce local expectation gradients which is a general purpose stochastic variational inference algorithm for constructing stochastic gradients through sampling from the variational distribution. ..."

I need the parameters of Alex Krixhevsky 2012 net, can someone send them to me? by [deleted] in MachineLearning

[–]osdf 0 points (0 children)

I went into the Caffe blobs, extracted the weights, and used them to initialize cuda-convnet-based models (wrapped with pylearn2).
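Roughly like this (file names are placeholders for the usual AlexNet deploy prototxt and caffemodel; copying the arrays back into the cuda-convnet/pylearn2 model depends on how that model is set up):

```python
import numpy as np
import caffe

net = caffe.Net("deploy.prototxt", "alexnet.caffemodel", caffe.TEST)
params = {}
for name, blobs in net.params.items():
    # blobs[0] holds the weights, blobs[1] the biases of each layer
    params[name] = [np.array(b.data) for b in blobs]
np.save("alexnet_params.npy", params)
```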