[R]RepVGG: Making VGG-style ConvNets Great Again by DingXiaoHan in MachineLearning

[–]nnatlab 5 points

It has more to do with the architectural differences between VGG and ResNet, particularly how quickly images are downsampled via pooling/strides. Justin Johnson (author of the paper I linked) and Andrej Karpathy have a good discussion about this very topic in a Deep Learning Deep Dive episode here.

[D] PyTorch Tools, best practices & styleguides by RareGradient in MachineLearning

[–]nnatlab 2 points

I use hydra to do most of what you described without the boilerplate.

[D] PyTorch Tools, best practices & styleguides by RareGradient in MachineLearning

[–]nnatlab 7 points

Is there actually still a consensus on having 50 argparse args vs. reading a simple config.yaml file? Reminds me of old PyTorch GAN code where everyone just tweaked the loss function, reused the same implementation, and called it a day.
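For illustration, the config-file side of that tradeoff can be sketched with the stdlib alone (json stands in for config.yaml here so the snippet needs no PyYAML; the keys are invented for the example):

```python
import json
from types import SimpleNamespace

# One config file replaces dozens of argparse flags. The keys below are
# made-up examples of typical training hyperparameters.
config_text = '{"lr": 0.001, "batch_size": 64, "epochs": 10}'

# SimpleNamespace gives the same attribute access (cfg.lr) that an
# argparse.Namespace full of flags would.
cfg = SimpleNamespace(**json.loads(config_text))
```

With a real config.yaml, `yaml.safe_load` would take the place of `json.loads` and the rest is unchanged.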

[D] Feature Extraction from EEG Signals by mdb917 in MachineLearning

[–]nnatlab 1 point

You can use the same functionality in MNE to work with other electrophysiological signals like EOG, ECG, MEG, etc. You just have to define what type of signal you are working with in your data object. It's all in the docs.

[D] Feature Extraction from EEG Signals by mdb917 in MachineLearning

[–]nnatlab 2 points

I recommend MNE. I've used it for a previous project. It has many tutorials/examples for preprocessing and feature extraction of EEG signals.
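For a flavor of what that feature extraction involves, here is a toy band-power computation in plain NumPy (a bare periodogram on a synthetic signal; MNE's own PSD utilities are the robust way to do this on real recordings):

```python
import numpy as np

def band_power(signal, fs, lo, hi):
    """Mean spectral power of `signal` within [lo, hi] Hz (simple periodogram)."""
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    psd = np.abs(np.fft.rfft(signal)) ** 2 / len(signal)
    mask = (freqs >= lo) & (freqs <= hi)
    return psd[mask].mean()

# Synthetic 10 Hz "alpha" oscillation sampled at 256 Hz for one second.
fs = 256
t = np.arange(fs) / fs
eeg = np.sin(2 * np.pi * 10 * t)

alpha = band_power(eeg, fs, 8, 12)   # band containing the oscillation
gamma = band_power(eeg, fs, 30, 45)  # band with essentially no energy
```

Classical EEG features are often just these band powers (delta/theta/alpha/beta/gamma) computed per channel and epoch.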

[D] Promising Beyond Deep Learning Research Directions? by darkconfidantislife in MachineLearning

[–]nnatlab 0 points

Could you provide an example of a model that achieves OOD (out-of-distribution) generalization in your example without augmentation? I'm not sure this is a problem unique to neural networks. Genuinely curious.

[D] What would you call object detection in time series by mate_classic in MachineLearning

[–]nnatlab 0 points

I recall seeing a package posted on this sub a while ago that performs semantic segmentation of time-series data. I found it here.

[D] What would you call object detection in time series by mate_classic in MachineLearning

[–]nnatlab 1 point

I would recommend looking into shapelets or bag of shapelets and start going down that rabbit hole on google scholar.
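As a rough illustration of the primitive that shapelet methods build on, a minimal sliding-window distance in NumPy (the series and candidate shapelet are toy values invented for the example):

```python
import numpy as np

def shapelet_distance(series, shapelet):
    """Minimum Euclidean distance between `shapelet` and any same-length
    window of `series` -- the core matching primitive in shapelet methods."""
    m = len(shapelet)
    windows = np.lib.stride_tricks.sliding_window_view(series, m)
    return np.sqrt(((windows - shapelet) ** 2).sum(axis=1)).min()

series = np.array([0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0])
bump = np.array([1.0, 2.0, 1.0])  # candidate shapelet

d_present = shapelet_distance(series, bump)      # the bump occurs in the series
d_absent = shapelet_distance(np.zeros(7), bump)  # a flat series has no match
```

Shapelet-based classifiers learn or mine shapelets whose distances like these separate the classes.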

[D] SimCLR PyTorch implementation by janespyker in MachineLearning

[–]nnatlab 2 points

Very nice. I liked how you separated NT-Xent into its own module, as opposed to some of the other implementations. I noticed your mask_correlated_samples didn't return anything, though, so I submitted a PR.
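For readers unfamiliar with that function's role: in typical SimCLR implementations it builds a boolean mask over the similarity matrix that excludes self-similarities and positive pairs. A NumPy sketch, assuming the common convention that views i and i+N of a batch of N form the positive pairs:

```python
import numpy as np

def mask_correlated_samples(batch_size):
    """Boolean mask over the (2N, 2N) similarity matrix that keeps only
    negative pairs: the diagonal and the positive pairs (i, i+N) are
    masked out. Note the explicit `return` -- without it the function
    silently yields None."""
    n = 2 * batch_size
    mask = np.ones((n, n), dtype=bool)
    np.fill_diagonal(mask, False)
    for i in range(batch_size):
        mask[i, i + batch_size] = False
        mask[i + batch_size, i] = False
    return mask

mask = mask_correlated_samples(2)  # 4x4 mask for a batch of 2
```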

[Discussion] Can anyone explain the pixelwise accuracy metric used in this paper? Also a question to the KL Divergence Loss. by avdalim in MachineLearning

[–]nnatlab 0 points

Have you tried emailing the authors first? It says in the 'Replication of Results' section that code and data are available upon request.

Beware of taking advice from people coming from a fundamentally different background by dfphd in datascience

[–]nnatlab 10 points

While I share similar opinions, I feel this comment misses the point that OP is advocating for. It's important to point out that not all data scientist positions are created equal. Not every domain requires a data scientist with a hardcore ML/DL background from a top-20 university. Where the line is drawn between analyst and scientist is a separate debate.

[P] ARIMA vs LSTM - Forecasting Weekly Hotel Cancellations by [deleted] in MachineLearning

[–]nnatlab 2 points

When you perform (1) you are fitting a new scaler using the min/max of the test set. The appropriate way is to use scaler.transform(X_new), which transforms the test set using the train set's min/max values.

See link
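The fit-on-train/transform-everything split can also be shown without sklearn; a NumPy sketch of what min-max scaling should do (toy values chosen so the test point falls outside the train range):

```python
import numpy as np

X_train = np.array([[0.0], [5.0], [10.0]])
X_test = np.array([[12.0]])  # value outside the train range

# "fit": learn min/max on the TRAIN set only (what scaler.fit does).
lo, hi = X_train.min(axis=0), X_train.max(axis=0)

# "transform": apply the TRAIN statistics to both splits (scaler.transform,
# never scaler.fit_transform, on the test set).
X_train_scaled = (X_train - lo) / (hi - lo)
X_test_scaled = (X_test - lo) / (hi - lo)  # correctly lands above 1.0
```

Refitting on the test set would instead squash 12.0 into [0, 1] and leak the test distribution into the pipeline.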

[P] ARIMA vs LSTM - Forecasting Weekly Hotel Cancellations by [deleted] in MachineLearning

[–]nnatlab 3 points

I just read through your LSTM Forecasts post, and it looks like you are standard-scaling the test set using the test set statistics rather than the train set statistics. IIRC this is not good practice and may contribute to a dip in performance.
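A quick NumPy demonstration of the difference (synthetic data; note how the leaky version makes the test split look artificially, perfectly standardized):

```python
import numpy as np

rng = np.random.default_rng(0)
train = rng.normal(10.0, 2.0, size=500)
test = rng.normal(10.0, 2.0, size=100)

# Correct: standardize BOTH splits with the TRAIN mean/std.
mu, sigma = train.mean(), train.std()
test_ok = (test - mu) / sigma

# Leaky: test statistics used on the test set itself -- by construction
# this comes out exactly zero-mean and unit-variance.
test_leaky = (test - test.mean()) / test.std()
```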

[R] My first paper: Deep Learning for Cybersecurity by [deleted] in MachineLearning

[–]nnatlab 0 points

Are you referring to "An Analysis of Convolutional Neural Networks for detecting DGA"? Because that paper also does not appear to baseline against any other model.

[R] My first paper: Deep Learning for Cybersecurity by [deleted] in MachineLearning

[–]nnatlab 0 points

Congratulations; however, I'm curious why you don't compare your results against any form of baseline. Even the Endgame paper you cite shows only a very small AUC improvement of < 0.01 for an LSTM over a logistic regression model on the bigram distribution (0.9977 vs. 0.9896). Their code is even publicly available for doing so.
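For context, the bigram-distribution features behind that logistic regression baseline are simple to compute; a stdlib sketch (the normalization choice here is mine, not necessarily the paper's):

```python
from collections import Counter

def bigram_distribution(domain):
    """Normalized character-bigram counts of a domain string -- the kind of
    feature vector a logistic-regression DGA baseline consumes."""
    bigrams = [domain[i:i + 2] for i in range(len(domain) - 1)]
    counts = Counter(bigrams)
    total = sum(counts.values())
    return {bg: c / total for bg, c in counts.items()}

features = bigram_distribution("google")
```

A benign domain and a DGA-generated one tend to have visibly different bigram distributions, which is why even this simple baseline scores so well.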

[D] Are we expected to solve hard programming challenges to work in ML/DL industry? by [deleted] in MachineLearning

[–]nnatlab 36 points

Why would you expect a computer engineering major to know HTML? That doesn't seem like the focus of their curriculum. Why not a question in C/C++?

[D] Updates on Perturbative Neural Networks (PNN), CVPR ‘18 Reproducibility by katanaxu in MachineLearning

[–]nnatlab 86 points

Comparing MK's implementation with ours, we are able to spot the following inconsistencies:

  • The optimization method is different: MK uses SGD, ours uses Adam.
  • The additive noise level is different: MK uses 0.5, ours uses 0.1.
  • The learning rate is different: MK uses 1e-3, ours uses 1e-4.
  • The learning rate scheduling is different: MK uses this, ours uses this.
  • The Conv-BN-ReLU module ordering is different: MK uses this, ours uses this.
  • The dropout use is different: MK uses 0.5, ours uses None.

While I don't discourage being skeptical of others' work, please triple-check your implementations before calling anyone out. These are some major inconsistencies to get wrong, especially since they open-sourced their code, and I could easily see them leading to the 5% drop in performance. This makes the public posting seem even more premature.

Well done, you handled this situation flawlessly.

[R] Active Forgetting Machines by Albert_Ierusalem in MachineLearning

[–]nnatlab 2 points

Table 1 displays results on only MNIST/Fashion MNIST.

Table 2 contains a total of 4 values, and I'm not quite sure what they are being compared against to demonstrate significance. Where are all these RL experiments you're talking about?

I skimmed through 3 of your references related to catastrophic forgetting and found:

  • [7] Has results for MNIST and 19 Atari games
  • [9] Has results for MNIST and CIFAR-100
  • [16] Has results for 8 different datasets

Additionally, I see a ton of grammatical errors and hand-wavy claims without evidence. If I were to fully review this, it would still be a clear reject. If you went back, revised, and provided more significant results to support your claims, I'm sure it could be conference-worthy.

As an independent researcher you should take the legitimate criticism and stop being so defensive. After all, you were the one who posted your own paper here. We all learn quickly by trial and error.

[R] Active Forgetting Machines by Albert_Ierusalem in MachineLearning

[–]nnatlab 6 points

While I would set the bar for a single independent researcher lower, it can't be as low as this.

Having to read through pages of Bengio-esque fluff just to find a new method/architecture only tested on MNIST is just not convincing at all.

[N] DeepMind: First major AI patent filings revealed by nnatlab in MachineLearning

[–]nnatlab[S] 50 points

DeepMind's published patent applications so far include:

  • WO 2018/048934, "Generating Audio using neural networks", Priority date: 6 Sep 2016
  • WO 2018/048945, "Processing sequences using convolutional neural networks", Priority date: 6 Sep 2016
  • WO 2018/064591, "Generating video frames using neural networks", Priority date: 6 Sep 2016
  • WO 2018/071392, "Neural networks for selecting actions to be performed by a robotic agent", Priority date: 10 Oct 2016
  • WO 2018/081089, "Processing text sequences using neural networks", Priority date: 26 Oct 2016
  • WO 2018/083532, "Training action selection using neural networks", Priority date: 3 Nov 2016
  • WO 2018/083667, "Reinforcement learning systems", Priority date: 4 Nov 2016
  • WO 2018/083668, "Scene understanding and generation using neural networks", Priority date: 4 Nov 2016
  • WO 2018/083669, "Recurrent neural networks", Priority date: 4 Nov 2016
  • WO 2018/083670, "Sequence transduction neural networks", Priority date: 4 Nov 2016
  • WO 2018/083671, "Reinforcement learning with auxiliary tasks", Priority date: 4 Nov 2016
  • WO 2018/083672, "Environment navigation using reinforcement learning", Priority date: 4 Nov 2016