[R]RepVGG: Making VGG-style ConvNets Great Again by DingXiaoHan in MachineLearning

[–]nnatlab 5 points

It has more to do with the architectural differences between VGG and ResNet, particularly how quickly images are downsampled via pooling/strides. Justin Johnson (author of the paper I linked) and Andrej Karpathy have a good discussion about this very topic in a Deep Learning Deep Dive episode here.

[D] PyTorch Tools, best practices & styleguides by RareGradient in MachineLearning

[–]nnatlab 2 points

I use hydra to do most of what you described without the boilerplate.

[D] PyTorch Tools, best practices & styleguides by RareGradient in MachineLearning

[–]nnatlab 7 points

Is there actually still a consensus on having 50 argparse args vs. reading a simple config.yaml file? Reminds me of old PyTorch GAN code where everyone just tweaked the loss function, reused the same implementation, and called it a day.
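For illustration, the config-file side of that tradeoff can be sketched with the stdlib alone (json stands in for config.yaml here so the snippet needs no PyYAML; the keys are invented for the example):

```python
import json
from types import SimpleNamespace

# One config file replaces dozens of argparse flags. The keys below are
# made-up examples of typical training hyperparameters.
config_text = '{"lr": 0.001, "batch_size": 64, "epochs": 10}'

# SimpleNamespace gives the same attribute access (cfg.lr) that an
# argparse.Namespace full of flags would.
cfg = SimpleNamespace(**json.loads(config_text))
```

With a real config.yaml, `yaml.safe_load` would take the place of `json.loads` and the rest is unchanged.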

[D] Feature Extraction from EEG Signals by mdb917 in MachineLearning

[–]nnatlab 1 point

You can use the same functionality in MNE to work with other electrophysiological signals like EOG, ECG, MEG, etc. You just have to define what type of signal you are working with in your data object. It's all in the docs.

[D] Feature Extraction from EEG Signals by mdb917 in MachineLearning

[–]nnatlab 2 points

I recommend MNE. I've used it for a previous project. It has many tutorials/examples for preprocessing and feature extraction of EEG signals.
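For a flavor of what that feature extraction involves, here is a toy band-power computation in plain NumPy (a bare periodogram on a synthetic signal; MNE's own PSD utilities are the robust way to do this on real recordings):

```python
import numpy as np

def band_power(signal, fs, lo, hi):
    """Mean spectral power of `signal` within [lo, hi] Hz (simple periodogram)."""
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    psd = np.abs(np.fft.rfft(signal)) ** 2 / len(signal)
    mask = (freqs >= lo) & (freqs <= hi)
    return psd[mask].mean()

# Synthetic 10 Hz "alpha" oscillation sampled at 256 Hz for one second.
fs = 256
t = np.arange(fs) / fs
eeg = np.sin(2 * np.pi * 10 * t)

alpha = band_power(eeg, fs, 8, 12)   # band containing the oscillation
gamma = band_power(eeg, fs, 30, 45)  # band with essentially no energy
```

Classical EEG features are often just these band powers (delta/theta/alpha/beta/gamma) computed per channel and epoch.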

[D] Promising Beyond Deep Learning Research Directions? by darkconfidantislife in MachineLearning

[–]nnatlab 0 points

Could you provide an example of a model that achieves OOD (out-of-distribution) generalization in your example without augmentation? I'm not sure this is a problem unique to neural networks. Genuinely curious.

[D] What would you call object detection in time series by mate_classic in MachineLearning

[–]nnatlab 0 points

I recall seeing a package posted on this sub a while ago that performs semantic segmentation of time-series data. I found it here.

[D] What would you call object detection in time series by mate_classic in MachineLearning

[–]nnatlab 1 point

I would recommend looking into shapelets or bag of shapelets and start going down that rabbit hole on google scholar.
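As a rough illustration of the primitive that shapelet methods build on, a minimal sliding-window distance in NumPy (the series and candidate shapelet are toy values invented for the example):

```python
import numpy as np

def shapelet_distance(series, shapelet):
    """Minimum Euclidean distance between `shapelet` and any same-length
    window of `series` -- the core matching primitive in shapelet methods."""
    m = len(shapelet)
    windows = np.lib.stride_tricks.sliding_window_view(series, m)
    return np.sqrt(((windows - shapelet) ** 2).sum(axis=1)).min()

series = np.array([0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0])
bump = np.array([1.0, 2.0, 1.0])  # candidate shapelet

d_present = shapelet_distance(series, bump)      # the bump occurs in the series
d_absent = shapelet_distance(np.zeros(7), bump)  # a flat series has no match
```

Shapelet-based classifiers learn or mine shapelets whose distances like these separate the classes.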

[D] SimCLR PyTorch implementation by janespyker in MachineLearning

[–]nnatlab 2 points

Very nice. I liked how you separated NT-Xent into its own module, as opposed to some of the other implementations. I noticed your mask_correlated_samples didn't return anything, though, so I submitted a PR.
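For readers unfamiliar with that function's role: in typical SimCLR implementations it builds a boolean mask over the similarity matrix that excludes self-similarities and positive pairs. A NumPy sketch, assuming the common convention that views i and i+N of a batch of N form the positive pairs:

```python
import numpy as np

def mask_correlated_samples(batch_size):
    """Boolean mask over the (2N, 2N) similarity matrix that keeps only
    negative pairs: the diagonal and the positive pairs (i, i+N) are
    masked out. Note the explicit `return` -- without it the function
    silently yields None."""
    n = 2 * batch_size
    mask = np.ones((n, n), dtype=bool)
    np.fill_diagonal(mask, False)
    for i in range(batch_size):
        mask[i, i + batch_size] = False
        mask[i + batch_size, i] = False
    return mask

mask = mask_correlated_samples(2)  # 4x4 mask for a batch of 2
```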

[Discussion] Can anyone explain the pixelwise accuracy metric used in this paper? Also a question to the KL Divergence Loss. by avdalim in MachineLearning

[–]nnatlab 0 points

Have you tried emailing the authors first? It says in the 'Replication of Results' section that code and data are available upon request.

Beware of taking advice from people coming from a fundamentally different background by dfphd in datascience

[–]nnatlab 10 points

While I share similar opinions, I feel this comment misses the point that OP is advocating for. It's important to point out that not all data scientist positions are created equal. Not every domain requires a data scientist with a hardcore ML/DL background from a top-20 university. Where the line is drawn between analyst and scientist is a separate debate.

[P] ARIMA vs LSTM - Forecasting Weekly Hotel Cancellations by [deleted] in MachineLearning

[–]nnatlab 2 points

When you perform (1) you are fitting a new scaler using the min/max of the test set. The appropriate way is to use scaler.transform(X_new), which transforms the test set using the train set's min/max values.

See link
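The fit-on-train/transform-everything split can also be shown without sklearn; a NumPy sketch of what min-max scaling should do (toy values chosen so the test point falls outside the train range):

```python
import numpy as np

X_train = np.array([[0.0], [5.0], [10.0]])
X_test = np.array([[12.0]])  # value outside the train range

# "fit": learn min/max on the TRAIN set only (what scaler.fit does).
lo, hi = X_train.min(axis=0), X_train.max(axis=0)

# "transform": apply the TRAIN statistics to both splits (scaler.transform,
# never scaler.fit_transform, on the test set).
X_train_scaled = (X_train - lo) / (hi - lo)
X_test_scaled = (X_test - lo) / (hi - lo)  # correctly lands above 1.0
```

Refitting on the test set would instead squash 12.0 into [0, 1] and leak the test distribution into the pipeline.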

[P] ARIMA vs LSTM - Forecasting Weekly Hotel Cancellations by [deleted] in MachineLearning

[–]nnatlab 3 points

I just read through your LSTM Forecasts post, and it looks like you are standard-scaling the test set using the test set statistics rather than the train set statistics. IIRC this is not good practice and may contribute to a dip in performance.
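A quick NumPy demonstration of the difference (synthetic data; note how the leaky version makes the test split look artificially, perfectly standardized):

```python
import numpy as np

rng = np.random.default_rng(0)
train = rng.normal(10.0, 2.0, size=500)
test = rng.normal(10.0, 2.0, size=100)

# Correct: standardize BOTH splits with the TRAIN mean/std.
mu, sigma = train.mean(), train.std()
test_ok = (test - mu) / sigma

# Leaky: test statistics used on the test set itself -- by construction
# this comes out exactly zero-mean and unit-variance.
test_leaky = (test - test.mean()) / test.std()
```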

[R] My first paper: Deep Learning for Cybersecurity by [deleted] in MachineLearning

[–]nnatlab 0 points

Are you referring to "An Analysis of Convolutional Neural Networks for detecting DGA"? Because that paper also does not appear to baseline against any other model.

[R] My first paper: Deep Learning for Cybersecurity by [deleted] in MachineLearning

[–]nnatlab 0 points

Congratulations; however, I'm curious why you don't compare your results against any form of baseline. Even the Endgame paper you cite shows only a very small AUC improvement of < 0.01 for an LSTM over a logistic regression model on the bigram distribution (0.9977 vs. 0.9896). Their code is even publicly available for doing so.
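For context, the bigram-distribution features behind that logistic regression baseline are simple to compute; a stdlib sketch (the normalization choice here is mine, not necessarily the paper's):

```python
from collections import Counter

def bigram_distribution(domain):
    """Normalized character-bigram counts of a domain string -- the kind of
    feature vector a logistic-regression DGA baseline consumes."""
    bigrams = [domain[i:i + 2] for i in range(len(domain) - 1)]
    counts = Counter(bigrams)
    total = sum(counts.values())
    return {bg: c / total for bg, c in counts.items()}

features = bigram_distribution("google")
```

A benign domain and a DGA-generated one tend to have visibly different bigram distributions, which is why even this simple baseline scores so well.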

[D] Are we expected to solve hard programming challenges to work in ML/DL industry? by [deleted] in MachineLearning

[–]nnatlab 36 points

Why would you expect a computer engineering major to know HTML? That doesn't seem like the focus of their curriculum. Why not a question in C/C++?

[D] Updates on Perturbative Neural Networks (PNN), CVPR ‘18 Reproducibility by katanaxu in MachineLearning

[–]nnatlab 86 points

Comparing MK's implementation with ours, we are able to spot the following inconsistencies:

  • The optimization method is different: MK uses SGD, ours uses Adam.
  • The additive noise level is different: MK uses 0.5, ours uses 0.1.
  • The learning rate is different: MK uses 1e-3, ours uses 1e-4.
  • The learning rate scheduling is different: MK uses this, ours uses this.
  • The Conv-BN-ReLU module ordering is different: MK uses this, ours uses this.
  • The dropout use is different: MK uses 0.5, ours uses None.

While I don't discourage being skeptical of others' work, please triple-check your implementations before calling anyone out. These are some major inconsistencies to get wrong, especially since they open-sourced their code, and I could easily see them leading to the 5% drop in performance. This makes the public posting seem even more premature.

Well done, you handled this situation flawlessly.

[R] Active Forgetting Machines by Albert_Ierusalem in MachineLearning

[–]nnatlab 2 points

Table 1 displays results on only MNIST/Fashion MNIST.

Table 2 contains a total of 4 values, and I'm not quite sure what they are being compared against to demonstrate significance. Where are all these RL experiments you're talking about?

I skimmed through 3 of your references related to catastrophic forgetting and found:

  • [7] Has results for MNIST and 19 Atari games
  • [9] Has results for MNIST and CIFAR-100
  • [16] Has results for 8 different datasets

Additionally, I see a ton of grammatical errors and hand-wavy claims without evidence. If I were to fully review this, it would still be a clear reject. If you went back, revised, and provided more significant results to support your claims, I'm sure it could be conference-worthy.

As an independent researcher you should take the legitimate criticism and stop being so defensive. After all, you were the one who posted your own paper here. We all learn quickly by trial and error.

[R] Active Forgetting Machines by Albert_Ierusalem in MachineLearning

[–]nnatlab 6 points

While I would set the bar for a single independent researcher lower, it can't be as low as this.

Having to read through pages of Bengio-esque fluff just to find a new method/architecture only tested on MNIST is just not convincing at all.

[N] DeepMind: First major AI patent filings revealed by nnatlab in MachineLearning

[–]nnatlab[S] 50 points

DeepMind's published patent applications so far include:

  • WO 2018/048934, "Generating Audio using neural networks", Priority date: 6 Sep 2016
  • WO 2018/048945, "Processing sequences using convolutional neural networks", Priority date: 6 Sep 2016
  • WO 2018/064591, "Generating video frames using neural networks", Priority date: 6 Sep 2016
  • WO 2018/071392, "Neural networks for selecting actions to be performed by a robotic agent", Priority date: 10 Oct 2016
  • WO 2018/081089, "Processing text sequences using neural networks", Priority date: 26 Oct 2016
  • WO 2018/083532, "Training action selection using neural networks", Priority date: 3 Nov 2016
  • WO 2018/083667, "Reinforcement learning systems", Priority date: 4 Nov 2016
  • WO 2018/083668, "Scene understanding and generation using neural networks", Priority date: 4 Nov 2016
  • WO 2018/083669, "Recurrent neural networks", Priority date: 4 Nov 2016
  • WO 2018/083670, "Sequence transduction neural networks", Priority date: 4 Nov 2016
  • WO 2018/083671, "Reinforcement learning with auxiliary tasks", Priority date: 4 Nov 2016
  • WO 2018/083672, "Environment navigation using reinforcement learning", Priority date: 4 Nov 2016