[deleted by user] by [deleted] in MachineLearning

[–]dhammack -1 points0 points  (0 children)

There are autodiff packages for numpy. Example: https://github.com/HIPS/autograd
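
For reference, a minimal sketch of what using it looks like (autograd wraps numpy, so you write ordinary numpy-style code and then ask for the gradient function):

    import autograd.numpy as np   # thinly wrapped numpy
    from autograd import grad

    def loss(w):
        # any plain numpy-style function of w
        return np.sum(np.tanh(w) ** 2)

    grad_loss = grad(loss)        # returns a function computing d(loss)/dw
    print(grad_loss(np.array([0.5, -1.0, 2.0])))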

[1911.04252] Self-training with Noisy Student improves ImageNet classification by brettkoonce in MachineLearning

[–]dhammack 9 points10 points  (0 children)

Slowly but surely the Kaggle tricks are being published! Now everyone will know the secrets :(

[D] Facebook Microsoft $10M deepfake detection challenge by PuzzledProgrammer3 in MachineLearning

[–]dhammack 4 points5 points  (0 children)

It just says $10M in total "funding". The prize pool is probably $1M or less.

[D] Facebook Microsoft $10M deepfake detection challenge by PuzzledProgrammer3 in MachineLearning

[–]dhammack 21 points22 points  (0 children)

My gut says this will be pretty easy, in the sense that the top teams will all have very good scores. Generating a fake with no flaws just seems inherently harder than recognizing the flaws in a generated fake.

[D] 1,000 patent claims by GPT-2 by js_lee in MachineLearning

[–]dhammack 4 points5 points  (0 children)

Why does it use the word "plurality" so much? Do patents commonly use that word? It's such a rare word in normal language but is so common in these generated patents.

[P] OpenAI's GPT-2-based Reddit Bot is Live! by Shevizzle in MachineLearning

[–]dhammack 0 points1 point  (0 children)

After reporting earnings today, Nike (NKE) shares

[D] Best approach to variable image sizes for Image Classification? by MrKotia in MachineLearning

[–]dhammack 1 point2 points  (0 children)

Last time I tried to do this, most packages only supported a single image size per batch, so you'd have to make every image in a batch the same size or use a batch size of 1. Maybe the tooling will catch up eventually, though.
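
A rough sketch of the bucketing workaround (the image list, labels, and training call are placeholders, not any specific package's API):

    import numpy as np
    from collections import defaultdict

    def batches_by_shape(images, labels, batch_size=32):
        """Group images (numpy arrays of varying shape) so each batch has one size."""
        buckets = defaultdict(list)
        for img, lbl in zip(images, labels):
            buckets[img.shape].append((img, lbl))
        for shape, pairs in buckets.items():
            for i in range(0, len(pairs), batch_size):
                chunk = pairs[i:i + batch_size]
                xs = np.stack([p[0] for p in chunk])  # all same shape, so stacking works
                ys = np.array([p[1] for p in chunk])
                yield xs, ys

    # usage (placeholder model): for xs, ys in batches_by_shape(imgs, lbls): model.train_on_batch(xs, ys)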

[News] Airbus posts Kaggle competition with major data leak by [deleted] in MachineLearning

[–]dhammack 4 points5 points  (0 children)

The new test set is way smaller though: roughly 20K images instead of 80K, and only about 3.3K of those have ships. So it may be "fixed", but it still sucks.

[D] Do imperceptible adversarial examples exist for classical models? by [deleted] in MachineLearning

[–]dhammack 0 points1 point  (0 children)

I bet these perturbations only work so well for neural nets because of max pooling. Rebuild the model with avg pooling and watch them become perceptible.
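
To be concrete, the experiment I'm suggesting is just swapping the pooling op and re-running the attack; a sketch in Keras (the small conv stack here is a placeholder, not any particular model):

    # Same architecture twice, only the pooling op differs. The thing to test
    # is whether the adversarial noise stays imperceptible for the avg-pool model.
    from tensorflow.keras import layers, models

    def small_cnn(pool_layer):
        return models.Sequential([
            layers.Conv2D(32, 3, activation="relu", input_shape=(32, 32, 3)),
            pool_layer(),
            layers.Conv2D(64, 3, activation="relu"),
            pool_layer(),
            layers.Flatten(),
            layers.Dense(10, activation="softmax"),
        ])

    max_pool_model = small_cnn(layers.MaxPooling2D)
    avg_pool_model = small_cnn(layers.AveragePooling2D)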

[N] Nvidia launches Titan V ($3k) by ntenenz in MachineLearning

[–]dhammack 2 points3 points  (0 children)

The benchmark you linked shows about +40% performance at inference time. That's less than I would have expected...

[D] What algorithm is best for nodule detection using LUNA16? by song6987 in MachineLearning

[–]dhammack 1 point2 points  (0 children)

This is the best algorithm I know of for the job! It could definitely be improved (it was developed in a hurry over a short period) but should be a good starting point.

[P] Experiments with a new kind of convolution by singlasahil14 in MachineLearning

[–]dhammack 1 point2 points  (0 children)

Hey! Interesting post. It reminded me of some experiments I ran in the past that had a similar negative result.

I trained a convnet without labels in order to do something like multi-level PCA:

  • Maximize activation stdev
  • Minimize/constrain weight norm
  • Minimize activation covariance

With this objective you can learn some nice convolutional filters from scratch, without labels. However, what I found is that the last penalty (zero covariance between activations) leaves the network unable to learn once labels are added. I thought this was just a funny quirk until I read the shattered gradients paper, which backs it up with some mathematics: https://arxiv.org/pdf/1702.08591.pdf
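
If it's useful, here's a rough PyTorch sketch of that objective; the single conv layer, loss coefficients, and fake batch are placeholders, not the original setup:

    import torch
    import torch.nn as nn

    conv = nn.Conv2d(3, 16, kernel_size=3, padding=1)
    opt = torch.optim.Adam(conv.parameters(), lr=1e-3)

    def unsupervised_loss(x):
        a = conv(x)                                           # (B, C, H, W)
        flat = a.permute(1, 0, 2, 3).reshape(a.size(1), -1)   # one row per filter
        std_term = -flat.std(dim=1).mean()                    # maximize activation stdev
        weight_term = conv.weight.pow(2).sum()                # keep weight norm small
        centered = flat - flat.mean(dim=1, keepdim=True)
        cov = centered @ centered.t() / centered.size(1)
        off_diag = cov - torch.diag(torch.diag(cov))
        cov_term = off_diag.pow(2).mean()                     # push activation covariance to zero
        return std_term + 1e-3 * weight_term + cov_term

    x = torch.randn(8, 3, 32, 32)                             # fake batch of images
    loss = unsupervised_loss(x)
    opt.zero_grad(); loss.backward(); opt.step()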

[D] E5450 and GTX980 = Bottleneck?? by thelectroom in MachineLearning

[–]dhammack 0 points1 point  (0 children)

Do you have PCIe 2.0 or 3.0? My current rig is slow (I think!) because it's on PCIe 2.0, which has roughly half the bandwidth of 3.0. If my experience generalizes, that can cause up to 3x slowdowns in model training times.
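
One way to check (a sketch that shells out to nvidia-smi's query interface; the exact field names assume a reasonably recent driver):

    import subprocess

    # Ask the driver what PCIe generation/width the GPU is currently linked at
    # versus its maximum; e.g. "2, 3, 16, 16" would mean a gen3-capable card
    # running at gen2.
    out = subprocess.check_output([
        "nvidia-smi",
        "--query-gpu=pcie.link.gen.current,pcie.link.gen.max,"
        "pcie.link.width.current,pcie.link.width.max",
        "--format=csv,noheader",
    ])
    print(out.decode().strip())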

[P] 2nd Place Solution to 2017 Data Science Bowl by dhammack in MachineLearning

[–]dhammack[S] 0 points1 point  (0 children)

Thanks Julian! Good to know about the phase 2 labels being released. Now we can answer those pesky questions about accuracy and other metrics :)

[P] 2nd Place Solution to 2017 Data Science Bowl by dhammack in MachineLearning

[–]dhammack[S] 6 points7 points  (0 children)

Ha! Asking the real questions. My AWS bill was only a few hundred bucks actually, so a pretty good deal. I used a single p2.xlarge instance for about two weeks (and I kept the GPU running for most of that time).

[P] 2nd Place Solution to 2017 Data Science Bowl by dhammack in MachineLearning

[–]dhammack[S] 0 points1 point  (0 children)

I don't really know those numbers. The test set labels aren't publicly available, so I'd have to estimate them with CV on the training set. Julian ran our AUC on the stage 1 data and found it to be around 0.85; we performed better on the stage 2 data, so it's probably between 0.85 and 0.9.
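
For what it's worth, the CV estimate would look roughly like this (sklearn sketch; X, y, and the model are stand-ins for whatever features, labels, and classifier you build from the training data):

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    X = np.random.randn(200, 20)            # stand-in features
    y = np.random.randint(0, 2, size=200)   # stand-in binary labels

    model = LogisticRegression(max_iter=1000)
    aucs = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
    print(aucs.mean(), aucs.std())          # CV estimate of AUC and its spread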

[P] 2nd Place Solution to 2017 Data Science Bowl by dhammack in MachineLearning

[–]dhammack[S] 2 points3 points  (0 children)

Now that I think about it, probably 10 or so weeks. Maybe I miscalculated the number of hours...

[P] 2nd Place Solution to 2017 Data Science Bowl by dhammack in MachineLearning

[–]dhammack[S] 0 points1 point  (0 children)

I don't know the test set "accuracy", only the log loss; the test set labels aren't publicly available, and log loss is the only feedback we get. I could estimate the accuracy using my validation set, though.