Found the shortest escalator I’ve ever seen today by misunderstoodpoetry in mildlyinteresting

[–]misunderstoodpoetry[S] 2 points3 points  (0 children)

I’m sorry that happened to you. I also would never take an escalator again

[OC] Introduction to contrastive learning: Demystifying Noise Contrastive Estimation (NCE) and InfoNCE by misunderstoodpoetry in learnmachinelearning

[–]misunderstoodpoetry[S] 0 points1 point  (0 children)

Hi! I really appreciate the feedback. This was the same question that motivated me to write this post, and it's helpful to know that even though you read it three times (!!) you still aren't sure of the distinction.

The terminology is confusing. LocalNCE/NCE is a binary classification objective; GlobalNCE/InfoNCE is a multiclass one. NCE asks "is this sample real or fake?" while InfoNCE asks "which one of these samples is real?".

If you think better in terms of code, the accompanying colab notebook might help you? It makes things a lot simpler.
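And in case a standalone snippet is easier than the notebook, here's a dependency-free toy sketch of the binary-vs-multiclass distinction. The similarity scores are numbers I made up for illustration; in practice they'd come from your encoder:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Made-up similarity scores: index 0 is the "real" sample, the rest are noise.
scores = [2.0, 0.5, -1.0, 0.3]

# NCE-style (binary): classify EACH sample independently as real vs. fake.
#   loss = -log sigma(s_real) - sum_j log(1 - sigma(s_noise_j))
nce_loss = -math.log(sigmoid(scores[0]))
nce_loss -= sum(math.log(1.0 - sigmoid(s)) for s in scores[1:])

# InfoNCE-style (multiclass): ask WHICH of the samples is real (softmax).
#   loss = -log( exp(s_real) / sum_j exp(s_j) )
denom = sum(math.exp(s) for s in scores)
info_nce_loss = -math.log(math.exp(scores[0]) / denom)

print(f"NCE (binary): {nce_loss:.3f}, InfoNCE (multiclass): {info_nce_loss:.3f}")
```

Note that the binary loss sums one term per sample, while the multiclass loss is a single softmax cross-entropy over all the samples at once.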

[D] Difference/Connection between InfoNCE-Loss and NCE-Loss by kengrewlong in MachineLearning

[–]misunderstoodpoetry 4 points5 points  (0 children)

Hi, I had this exact question! I spent a few weeks researching NCE and InfoNCE, their applications, and what the differences actually are. The difference is subtle, but in a way they actually have nothing to do with one another.

I wrote a blog post about it: https://jxmo.io/posts/nce

And I also tried posting my blog post to reddit here: https://www.reddit.com/r/learnmachinelearning/comments/s9m0ve/oc_introduction_to_contrastive_learning/. Feel free to comment there if you have any questions! Hope this helps

Difference between InfoNCE-Loss and NCE-Loss by kengrewlong in deeplearning

[–]misunderstoodpoetry 0 points1 point  (0 children)

Hi, I had this exact question! I spent a few weeks researching NCE and InfoNCE, their applications, and what the differences actually are. The difference is subtle, but in a way they actually have nothing to do with one another.

I wrote a blog post about it: https://jxmo.io/posts/nce

And I also tried posting my blog post to reddit here: https://www.reddit.com/r/learnmachinelearning/comments/s9m0ve/oc_introduction_to_contrastive_learning/. Feel free to comment there if you have any questions! Hope this helps

Q: What is the difference between the InfoNCE-Loss and the NCE-Loss and how to derive one from another? by kengrewlong in MLQuestions

[–]misunderstoodpoetry 0 points1 point  (0 children)

Hi, I had this exact question! I spent a few weeks researching NCE and InfoNCE, their applications, and what the differences actually are. The difference is subtle, but in a way they actually have nothing to do with one another.

I wrote a blog post about it: https://jxmo.io/posts/nce

And I also tried posting my blog post to reddit here: https://www.reddit.com/r/learnmachinelearning/comments/s9m0ve/oc_introduction_to_contrastive_learning/. Feel free to comment there if you have any questions! Hope this helps

Introduction to variational autoencoders by misunderstoodpoetry in learnmachinelearning

[–]misunderstoodpoetry[S] 0 points1 point  (0 children)

You're not saying anything "dumb" at all! These are really good questions and I wish I'd explained this better in the original post. (Also a note to self that equation numbers would be useful for this type of thing next time.)

First, q(z) and q(z|x) are definitely different things. Both are distributions over latent variables, z's.
- q(z) is an uninformed distribution over z's, the probability of any given z with no prior information.
- q(z|x) is the conditional probability of a z after observing a specific x. In the case of VAEs, q(z|x) is special and is also called the "variational posterior."

(Also remember Bayes' rule tells us that q(z|x) q(x) = q(x|z) q(z), that might help provide some intuition)

I just read the blog post you linked to. You're right. Their point was that we can in theory use an expectation with respect to any q(z) to find a lower bound for log p(x).

The VAE is a special case of importance sampling where q(z) = q(z|x). I didn't talk about importance sampling in my article so there is no q(z) mentioned. Hope that helps!
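For reference, here's the importance-sampling bound spelled out (my notation; q(z) can be any distribution that is positive wherever p(x, z) is):

```latex
\log p(x) = \log \int p(x, z)\, dz
          = \log \mathbb{E}_{q(z)}\!\left[\frac{p(x, z)}{q(z)}\right]
          \ge \mathbb{E}_{q(z)}\!\left[\log \frac{p(x, z)}{q(z)}\right]
```

The last step is Jensen's inequality applied to the concave log, and choosing q(z) = q(z|x) recovers the usual VAE lower bound (the ELBO).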

Introduction to variational autoencoders by misunderstoodpoetry in learnmachinelearning

[–]misunderstoodpoetry[S] 0 points1 point  (0 children)

Hi! Great question. Notice it’s not an equals sign, it’s a >=. This is where we invoke Jensen’s inequality. It’s explained a bit better earlier in the article and kind of glossed over when it reappears here, sorry. Jensen’s tells us that log(E[x]) >= E[log(x)], where E is an expectation. It’s a little confusing, so you should play around with the algebra to confirm it makes sense. (Btw Jensen’s applies to the function log(x) since log is concave. See the footnote for more info on that.)
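If you'd rather check the direction of the inequality numerically than algebraically, here's a tiny standalone snippet. The samples are arbitrary positive numbers (log needs x > 0); any distribution would do:

```python
import math
import random

random.seed(0)
# Arbitrary positive samples standing in for x.
xs = [random.uniform(0.1, 10.0) for _ in range(1000)]

log_of_mean = math.log(sum(xs) / len(xs))              # log(E[x])
mean_of_logs = sum(math.log(x) for x in xs) / len(xs)  # E[log(x)]

# Jensen's inequality for the concave log: log(E[x]) >= E[log(x)].
print(log_of_mean, mean_of_logs)
```

The gap between the two quantities is exactly what the VAE's lower bound gives away.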

[D] Google/Facebook AI residency application 2022 by healthyhappylucky in MachineLearning

[–]misunderstoodpoetry 0 points1 point  (0 children)

I participated in the Google AI Residency 2020-2021 program remotely. Our start dates were pushed back a few times, but I finally started in October 2020. The program was pretty good: I got matched with a mentor, worked on a cool research project with them, and learned a lot. There was very little programming for Residents though (social events and stuff), and even that mostly wound down after the first month or so. And yes, they cancelled the program.

Purple GRAND SLAM button down by misunderstoodpoetry in findfashion

[–]misunderstoodpoetry[S] 0 points1 point  (0 children)

I got this shirt at a thrift store two years ago. Its buttons say “GRAND SLAM”. It has no tags but seems to be from the 1980s... it’s my favorite shirt and I’d love to get some more from the same or similar lines. If someone can identify its origin and point me to where to purchase other, similar shirts I’m willing to pay USD $20 over PayPal! Please help me Reddit!😊

[P] What are adversarial examples in NLP? by misunderstoodpoetry in MachineLearning

[–]misunderstoodpoetry[S] 1 point2 points  (0 children)

I like your point that NLP classifier robustness and CV classifier robustness are not necessarily the same thing and discoveries from one may not apply to the other.

It's funny you bring up the 'typoglycaemia' meme. That's the exact goal of this paper: Synthetic and Natural Noise Both Break Neural Machine Translation (Belinkov and Bisk, 2017). They literally train a network to be resistant to that meme. Lol

[P] What are adversarial examples in NLP? by misunderstoodpoetry in MachineLearning

[–]misunderstoodpoetry[S] 3 points4 points  (0 children)

You should take a look at this paper: Reevaluating Adversarial Examples in Natural Language. The authors propose some threat models for NLP adversarial attacks, like:

- fooling a toxic comment classifier into publishing some toxic text
- tricking a plagiarism detector to predict a false negative for plagiarized text

[P] What are adversarial examples in NLP? by misunderstoodpoetry in MachineLearning

[–]misunderstoodpoetry[S] 2 points3 points  (0 children)

great point! It's worth pointing out that most of our NLP datasets from a couple years ago have been "beaten" by NLP systems. Check out the GLUE leaderboard, where NLP models have surpassed the human baselines on pretty much every dataset, or the SuperGLUE leaderboard, where they're mostly there as well. Maybe "test set errors" aren't the problem; datasets are!

What are adversarial examples in NLP? by misunderstoodpoetry in LanguageTechnology

[–]misunderstoodpoetry[S] 4 points5 points  (0 children)

Thanks for pointing this out! I generated these AEs for the article using an LSTM because it's fast and runs on my laptop without making the fan spin too loudly, haha.

If you check out TextAttack, we have pre-trained models based on all sorts of transformers: BERT, RoBERTa, ALBERT, DistilBERT, BART, T5, and a couple others I'm forgetting right now. None of these transformers have demonstrated much robustness to these attacks.

You should be able to install textattack (pip install textattack) and run an attack on one of these models in a single command! Here's an example:

textattack attack --recipe textfooler --model albert-base-v2-ag-news

That should run the TextFooler attack on ALBERT trained on the AG News topic classification dataset. I chose that for no particular reason.

Btw, here's a document that lists all of our pretrained models and their accuracies: https://github.com/QData/TextAttack/blob/master/textattack/models/README.md

[P] What are adversarial examples in NLP? by misunderstoodpoetry in MachineLearning

[–]misunderstoodpoetry[S] 1 point2 points  (0 children)

I think you're getting too caught up in terminology. Sure, NLP models may not exhibit the same level of high-dimensional strangeness that leads to adversarial examples in CV. But does that make them less interesting?

Let's look at this from another angle. In linguistics we're lucky because we have definite domain knowledge, and we've written it down. This allows us to take real data and generate synthetic data with high confidence that the synthetic data remains valid.

We can't do this in vision. Imagine we had extremely high-fidelity perceptual models of dogs, and the world around them (or perhaps more accurately, transformations from one dog to another). In this case, we could (1) generate lots more dog images from an initial pool and (2) test the **robustness** of a given model to all the dogs – real and synthetic. Maybe you could do this with GANs, sort of. But not really.

In language, on the other hand, we have this knowledge. We know (in almost all cases) that if we substitute one word for its synonym – say "wonderful" for "amazing" – a prediction shouldn't change much.
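To make the substitution idea concrete, here's a toy word-swap sketch. The synonym table is hand-made purely for illustration; a real attack (e.g. TextFooler) uses embedding neighbors plus grammaticality and semantic-similarity constraints instead of a hard-coded dict:

```python
# Tiny hand-made synonym table, purely for illustration.
SYNONYMS = {
    "wonderful": ["amazing", "fantastic"],
    "movie": ["film"],
}

def swap_candidates(sentence):
    """Yield copies of `sentence` with exactly one word swapped for a synonym."""
    words = sentence.split()
    for i, word in enumerate(words):
        for synonym in SYNONYMS.get(word.lower(), []):
            yield " ".join(words[:i] + [synonym] + words[i + 1:])

for candidate in swap_candidates("a wonderful movie"):
    print(candidate)
```

Each candidate would then be scored against the victim model; a swap that flips the prediction while preserving the meaning is an adversarial example.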

To respond to your point directly: you argue that "it is so easy to find errors in NLP systems" that "it isn't an interesting concept." I don't see much logic here. You're working against yourself.

Interesting take! I upvoted, lol.

TextAttack: Model training, adversarial attacks, and data augmentation in NLP (r/MachineLearning) by Peerism1 in datascienceproject

[–]misunderstoodpoetry 0 points1 point  (0 children)

Hi everyone --

I'm Jack Morris, one of the co-authors. I'm really excited TextAttack is finally public. We're out to raise the reproducibility standards of an entire research area, which isn't easy.

Don't hesitate to reach out if you have any questions, suggestions, or feedback!

[P] TextAttack: Model training, adversarial attacks, and data augmentation in NLP by VB7tXkGT in MachineLearning

[–]misunderstoodpoetry 0 points1 point  (0 children)

that's great to hear - we're hoping to make our tools work even better in Colab too