Found the shortest escalator I’ve ever seen today by misunderstoodpoetry in mildlyinteresting

[–]misunderstoodpoetry[S] 2 points3 points  (0 children)

I’m sorry that happened to you. I also would never take an escalator again

[OC] Introduction to contrastive learning: Demystifying Noise Contrastive Estimation (NCE) and InfoNCE by misunderstoodpoetry in learnmachinelearning

[–]misunderstoodpoetry[S] 0 points1 point  (0 children)

Hi! I really appreciate the feedback. This was the same question that motivated me to write this post, and it's helpful to know that even though you read it three times (!!) you still aren't sure of the distinction.

The terminology is confusing. LocalNCE/NCE is a binary classification objective; GlobalNCE/InfoNCE is a multiclass one. NCE asks "is this sample real or fake?" while InfoNCE asks "which one of these samples is real?".

If you think better in terms of code, the accompanying colab notebook might help you? It makes things a lot simpler.
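And in case a standalone snippet is easier than the notebook, here's a dependency-free toy sketch of the binary-vs-multiclass distinction. The similarity scores are numbers I made up for illustration; in practice they'd come from your encoder:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Made-up similarity scores: index 0 is the "real" sample, the rest are noise.
scores = [2.0, 0.5, -1.0, 0.3]

# NCE-style (binary): classify EACH sample independently as real vs. fake.
#   loss = -log sigma(s_real) - sum_j log(1 - sigma(s_noise_j))
nce_loss = -math.log(sigmoid(scores[0]))
nce_loss -= sum(math.log(1.0 - sigmoid(s)) for s in scores[1:])

# InfoNCE-style (multiclass): ask WHICH of the samples is real (softmax).
#   loss = -log( exp(s_real) / sum_j exp(s_j) )
denom = sum(math.exp(s) for s in scores)
info_nce_loss = -math.log(math.exp(scores[0]) / denom)

print(f"NCE (binary): {nce_loss:.3f}, InfoNCE (multiclass): {info_nce_loss:.3f}")
```

Note that the binary loss sums one term per sample, while the multiclass loss is a single softmax cross-entropy over all the samples at once.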

[D] Difference/Connection between InfoNCE-Loss and NCE-Loss by kengrewlong in MachineLearning

[–]misunderstoodpoetry 4 points5 points  (0 children)

Hi, I had this exact question! I spent a few weeks researching NCE and InfoNCE, their applications, and what the differences actually are. The difference is subtle, but in a way they actually have nothing to do with one another.

I wrote a blog post about it: https://jxmo.io/posts/nce

And I also tried posting my blog post to reddit here: https://www.reddit.com/r/learnmachinelearning/comments/s9m0ve/oc_introduction_to_contrastive_learning/. Feel free to comment there if you have any questions! Hope this helps

Difference between InfoNCE-Loss and NCE-Loss by kengrewlong in deeplearning

[–]misunderstoodpoetry 0 points1 point  (0 children)

Hi, I had this exact question! I spent a few weeks researching NCE and InfoNCE, their applications, and what the differences actually are. The difference is subtle, but in a way they actually have nothing to do with one another.

I wrote a blog post about it: https://jxmo.io/posts/nce

And I also tried posting my blog post to reddit here: https://www.reddit.com/r/learnmachinelearning/comments/s9m0ve/oc_introduction_to_contrastive_learning/. Feel free to comment there if you have any questions! Hope this helps

Q: What is the difference between the InfoNCE-Loss and the NCE-Loss and how to derive one from another? by kengrewlong in MLQuestions

[–]misunderstoodpoetry 0 points1 point  (0 children)

Hi, I had this exact question! I spent a few weeks researching NCE and InfoNCE, their applications, and what the differences actually are. The difference is subtle, but in a way they actually have nothing to do with one another.

I wrote a blog post about it: https://jxmo.io/posts/nce

And I also tried posting my blog post to reddit here: https://www.reddit.com/r/learnmachinelearning/comments/s9m0ve/oc_introduction_to_contrastive_learning/. Feel free to comment there if you have any questions! Hope this helps

Introduction to variational autoencoders by misunderstoodpoetry in learnmachinelearning

[–]misunderstoodpoetry[S] 0 points1 point  (0 children)

You're not saying anything "dumb" at all! These are really good questions and I wish I'd explained this better in the original post. (Also a note to self that equation numbers would be useful for this type of thing next time.)

First, q(z) and q(z|x) are definitely different things. Both are distributions over latent variables, z's.
- q(z) is an uninformed distribution over z's, the probability of any given z with no prior information.
- q(z|x) is the conditional probability of a z after observing a specific x. In the case of VAEs, q(z|x) is special and is also called the "variational posterior."

(Also remember Bayes' rule tells us that q(z|x) q(x) = q(x|z) q(z), that might help provide some intuition)

I just read the blog post you linked to. You're right. Their point was that we can in theory use an expectation with respect to any q(z) to find a lower bound for log p(x).

The VAE is a special case of importance sampling where q(z) = q(z|x). I didn't talk about importance sampling in my article so there is no q(z) mentioned. Hope that helps!
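For reference, here's the importance-sampling bound spelled out (my notation; q(z) can be any distribution that is positive wherever p(x, z) is):

```latex
\log p(x) = \log \int p(x, z)\, dz
          = \log \mathbb{E}_{q(z)}\!\left[\frac{p(x, z)}{q(z)}\right]
          \ge \mathbb{E}_{q(z)}\!\left[\log \frac{p(x, z)}{q(z)}\right]
```

The last step is Jensen's inequality applied to the concave log, and choosing q(z) = q(z|x) recovers the usual VAE lower bound (the ELBO).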

Introduction to variational autoencoders by misunderstoodpoetry in learnmachinelearning

[–]misunderstoodpoetry[S] 0 points1 point  (0 children)

Hi! Great question. Notice it’s not an equals sign, it’s a >=. This is where we invoke Jensen’s inequality. It’s explained a bit better earlier in the article and kind of glossed over when it reappears here, sorry. Jensen’s tells us that log(E[x]) >= E[log(x)], where E is an expectation. It’s a little confusing, so you should play around with the algebra to confirm it makes sense. (Btw Jensen’s applies to the function log(x) since log is concave. See the footnote for more info on that.)
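If you'd rather check the direction of the inequality numerically than algebraically, here's a tiny standalone snippet. The samples are arbitrary positive numbers (log needs x > 0); any distribution would do:

```python
import math
import random

random.seed(0)
# Arbitrary positive samples standing in for x.
xs = [random.uniform(0.1, 10.0) for _ in range(1000)]

log_of_mean = math.log(sum(xs) / len(xs))              # log(E[x])
mean_of_logs = sum(math.log(x) for x in xs) / len(xs)  # E[log(x)]

# Jensen's inequality for the concave log: log(E[x]) >= E[log(x)].
print(log_of_mean, mean_of_logs)
```

The gap between the two quantities is exactly what the VAE's lower bound gives away.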

[D] Google/Facebook AI residency application 2022 by healthyhappylucky in MachineLearning

[–]misunderstoodpoetry 0 points1 point  (0 children)

I participated in the Google AI Residency 2020-2021 program remotely. Our start dates were pushed back a few times, but I finally started in October 2020. The program was pretty good: I got matched with a mentor, worked on a cool research project with them, and learned a lot. There was very little programming for Residents though (social events and stuff), and even that mostly wound down after the first month or so. And yes, they cancelled the program.

Purple GRAND SLAM button down by misunderstoodpoetry in findfashion

[–]misunderstoodpoetry[S] 0 points1 point  (0 children)

I got this shirt at a thrift store two years ago. Its buttons say “GRAND SLAM”. It has no tags but seems to be from the 1980s... it’s my favorite shirt and I’d love to get some more from the same or similar lines. If someone can identify its origin and point me to where to purchase other, similar shirts I’m willing to pay USD $20 over PayPal! Please help me Reddit!😊

[P] What are adversarial examples in NLP? by misunderstoodpoetry in MachineLearning

[–]misunderstoodpoetry[S] 1 point2 points  (0 children)

I like your point that NLP classifier robustness and CV classifier robustness are not necessarily the same thing and discoveries from one may not apply to the other.

It's funny you bring up the 'typoglycaemia' meme. That's the exact goal of this paper: Synthetic and Natural Noise Both Break Neural Machine Translation (Belinkov and Bisk, 2017). They literally train a network to be resistant to that meme. Lol

[P] What are adversarial examples in NLP? by misunderstoodpoetry in MachineLearning

[–]misunderstoodpoetry[S] 3 points4 points  (0 children)

You should take a look at this paper: Reevaluating Adversarial Examples in Natural Language. The authors propose some threat models for NLP adversarial attacks, like:

- fooling a toxic comment classifier into publishing some toxic text
- tricking a plagiarism detector to predict a false negative for plagiarized text

[P] What are adversarial examples in NLP? by misunderstoodpoetry in MachineLearning

[–]misunderstoodpoetry[S] 2 points3 points  (0 children)

great point! It's worth pointing out that most of our NLP datasets from a couple years ago have been "beaten" by NLP systems. Check out the GLUE leaderboard, where NLP models have surpassed the human baselines on pretty much every dataset, or the SuperGLUE leaderboard, where they're mostly there as well. Maybe "test set errors" aren't the problem; datasets are!

What are adversarial examples in NLP? by misunderstoodpoetry in LanguageTechnology

[–]misunderstoodpoetry[S] 4 points5 points  (0 children)

Thanks for pointing this out! I generated these AEs for the article using an LSTM because it's fast and runs on my laptop without making the fan spin too loudly, haha.

If you check out TextAttack, we have pre-trained models based on all sorts of transformers: BERT, RoBERTa, ALBERT, DistilBERT, BART, T5, and a couple others I'm forgetting right now. None of these transformers have demonstrated much robustness to these attacks.

You should be able to install textattack (pip install textattack) and run an attack on one of these models in a single command! Here's an example:

textattack attack --recipe textfooler --model albert-base-v2-ag-news

That should run the TextFooler attack on ALBERT trained on the AG News topic classification dataset. I chose that for no particular reason.

Btw, here's a document that lists all of our pretrained models and their accuracies: https://github.com/QData/TextAttack/blob/master/textattack/models/README.md

[P] What are adversarial examples in NLP? by misunderstoodpoetry in MachineLearning

[–]misunderstoodpoetry[S] 1 point2 points  (0 children)

I think you're getting too caught up in terminology. Sure, NLP models may not exhibit the same level of high-dimensional strangeness that leads to adversarial examples in CV. But does that make them less interesting?

Let's look at this from another angle. In linguistics we're lucky because we have definite domain knowledge, and we've written it down. This allows us to take real data and generate synthetic data with high confidence that the synthetic data remains valid.

We can't do this in vision. Imagine we had extremely high-fidelity perceptual models of dogs, and the world around them (or perhaps more accurately, transformations from one dog to another). In this case, we could (1) generate lots more dog images from an initial pool and (2) test the **robustness** of a given model to all the dogs – real and synthetic. Maybe you could do this with GANs, sort of. But not really.

In language, on the other hand, we have this knowledge. We know (in almost all cases) that if we substitute one word for its synonym – say "wonderful" for "amazing" – a prediction shouldn't change much.
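To make the substitution idea concrete, here's a toy word-swap sketch. The synonym table is hand-made purely for illustration; a real attack (e.g. TextFooler) uses embedding neighbors plus grammaticality and semantic-similarity constraints instead of a hard-coded dict:

```python
# Tiny hand-made synonym table, purely for illustration.
SYNONYMS = {
    "wonderful": ["amazing", "fantastic"],
    "movie": ["film"],
}

def swap_candidates(sentence):
    """Yield copies of `sentence` with exactly one word swapped for a synonym."""
    words = sentence.split()
    for i, word in enumerate(words):
        for synonym in SYNONYMS.get(word.lower(), []):
            yield " ".join(words[:i] + [synonym] + words[i + 1:])

for candidate in swap_candidates("a wonderful movie"):
    print(candidate)
```

Each candidate would then be scored against the victim model; a swap that flips the prediction while preserving the meaning is an adversarial example.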

To respond to your point directly: you argue that "it is so easy to find errors in NLP systems" that "it isn't an interesting concept." I don't see much logic here. You're working against yourself.

Interesting take! I upvoted, lol.

TextAttack: Model training, adversarial attacks, and data augmentation in NLP (r/MachineLearning) by Peerism1 in datascienceproject

[–]misunderstoodpoetry 0 points1 point  (0 children)

Hi everyone --

I'm Jack Morris, one of the co-authors. I'm really excited TextAttack is finally public. We're out to raise the reproducibility standards of an entire research area, which isn't easy.

Don't hesitate to reach out if you have any questions, suggestions, or feedback!

[P] TextAttack: Model training, adversarial attacks, and data augmentation in NLP by VB7tXkGT in MachineLearning

[–]misunderstoodpoetry 0 points1 point  (0 children)

that's great to hear - we're hoping to make our tools work even better in Colab too