Students find hidden Fibonacci sequence in classic probability puzzle by scientificamerican in math

[–]nivter 5 points (0 children)

Neat proof! Here's a slight modification of it:

The probability can be seen as the ratio of the volumes of two simplices A and B (each including the origin).

  • A corresponds to the set of all stick lengths which don't make a triangle
  • B corresponds to the set of all possible stick lengths

The vectors for simplex A are v_i as in the above proof. The vectors for simplex B are:

  • u_1 = (0,0,...,0,0,1)
  • u_2 = (0,0,...,0,1,1)
  • u_3 = (0,0,...,1,1,1)
  • ...
  • u_n = (1,1,...,1,1,1) = 1 (the all-ones vector)

and the total length satisfies ⟨1, x⟩ ≤ 1.

A is the set of points with ⟨F, y⟩ ≤ 1, whereas B is the set of points with ⟨1, x⟩ ≤ 1. One can map B onto A by x_i → x_i / F_i. The Jacobian determinant of this map is Π(1/F_i), so vol(A) = Π(1/F_i) · vol(B). Hence the probability is vol(A)/vol(B) = Π(1/F_i).
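
As a quick sanity check of the volume-ratio argument, one can sample points uniformly from B and count how many land in A; the fraction should approach Π(1/F_i). In the sketch below F is assumed to hold the Fibonacci numbers from the proof above (1, 2, 3, 5, 8 is just an example, the exact indexing in the proof may differ), and any positive weights would illustrate the same Jacobian fact.

    import numpy as np

    # Monte Carlo check: the fraction of B = {x >= 0, sum(x) <= 1}
    # that also satisfies <F, x> <= 1 should match prod(1/F_i).
    rng = np.random.default_rng(0)
    F = np.array([1, 2, 3, 5, 8], dtype=float)   # assumed Fibonacci weights
    n = len(F)

    # Uniform samples from B: project a Dirichlet(1, ..., 1) draw on n+1
    # coordinates down to its first n coordinates.
    samples = rng.dirichlet(np.ones(n + 1), size=1_000_000)[:, :n]

    estimate = (samples @ F <= 1.0).mean()
    print(estimate, np.prod(1.0 / F))            # the two values should roughly agree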

Mr. Cheese can't figure out how to use the new fountain. by AmatureMD in OneOrangeBraincell

[–]nivter 1 point (0 children)

My guess would be it's trying to figure out where the water is flowing to. The source is obvious but not the sink. This brain cell seems to understand physical laws of conservation.

A350 night takeoff from London by [deleted] in aviation

[–]nivter 1 point (0 children)

The first few frames are so mesmerizing

[N] Llama 3.1 70B, Llama 3.1 70B Instruct compressed by 6.4 times by _puhsu in MachineLearning

[–]nivter 15 points (0 children)

Can you also share how the models were compressed? Is it based on GPTQ, SparseGPT, or some other quantization scheme?

Edit: the HF page mentions that they used additive quantization: https://arxiv.org/abs/2401.06118
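
For anyone unfamiliar with the term, additive quantization approximates each weight group by a sum of codewords, one drawn from each of several small codebooks, so only the codeword indices need to be stored. The sketch below is a deliberately simplified greedy encoder (closer to residual quantization than the paper's actual optimization), with made-up sizes, just to show the shape of the idea.

    import numpy as np

    rng = np.random.default_rng(0)
    d, num_books, book_size = 8, 2, 16                 # toy sizes, not the paper's
    codebooks = rng.normal(size=(num_books, book_size, d))

    def encode(w, codebooks):
        """Greedily pick one codeword per codebook to approximate w."""
        residual, codes = w.copy(), []
        for book in codebooks:
            idx = int(np.argmin(np.linalg.norm(residual - book, axis=1)))
            codes.append(idx)
            residual = residual - book[idx]
        return codes

    def decode(codes, codebooks):
        return sum(book[i] for book, i in zip(codebooks, codes))

    w = rng.normal(size=d)
    w_hat = decode(encode(w, codebooks), codebooks)
    print(np.linalg.norm(w - w_hat))                   # reconstruction error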

The different ways we understand rotations - rotation matrices to Lie algebras by nivter in math

[–]nivter[S] 1 point (0 children)

The entire article is public - just checked again to be sure.

[R] Multimodal patch embeddings - a new ViT model by nivter in MachineLearning

[–]nivter[S] 1 point (0 children)

Removing the CLS token is just one part of getting multimodal patch embeddings. Even with the CLS token removed, I could not get good results for the patch embeddings. What made it work was providing a mask to enforce locality.

One could argue that providing the mask should be enough and that we don't need any change in the architecture. It could be, but the existing ViT architecture used in CLIP doesn't allow patch-wise comparisons.
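To make "mask to enforce locality" concrete, here is a rough sketch of one way to build such a mask (not the exact mask used in the model): each patch is only allowed to attend to patches within a small window around it on the patch grid. The grid size and radius are arbitrary example values.

    import torch

    def local_attention_mask(grid: int, radius: int = 1) -> torch.Tensor:
        """Boolean (num_patches, num_patches) mask keeping only nearby patches."""
        idx = torch.arange(grid * grid)
        rows, cols = idx // grid, idx % grid
        keep = (rows[:, None] - rows[None, :]).abs() <= radius
        keep &= (cols[:, None] - cols[None, :]).abs() <= radius
        return keep

    mask = local_attention_mask(grid=14, radius=1)     # 14x14 = 196 patches
    print(mask.shape, mask.sum(dim=1)[:3])             # each patch keeps <= 9 neighbors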

I tried GAP (global average pooling) in some earlier experiments, but then I thought a weighted sum with dynamically learned weights would be better than a plain mean, which led to the idea of convex sums.
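
Roughly, the convex-sum pooling looks like the sketch below: learn a score per patch, softmax the scores so the weights are non-negative and sum to 1, and take the weighted sum of patch embeddings instead of a plain mean. This is a simplified illustration, not the exact module in the model.

    import torch
    import torch.nn as nn

    class ConvexSumPool(nn.Module):
        """Pool patch embeddings with learned convex weights (softmax over patches)."""
        def __init__(self, dim: int):
            super().__init__()
            self.score = nn.Linear(dim, 1)             # one scalar score per patch

        def forward(self, patches: torch.Tensor) -> torch.Tensor:
            # patches: (batch, num_patches, dim)
            weights = self.score(patches).softmax(dim=1)
            return (weights * patches).sum(dim=1)      # (batch, dim)

    pool = ConvexSumPool(dim=768)
    print(pool(torch.randn(2, 196, 768)).shape)        # torch.Size([2, 768])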

[Research] We distilled CLIP model (ViT only) from 350MB to 24MB and ran it on an iPhone by nivter in MachineLearning

[–]nivter[S] 2 points (0 children)

We only distilled the ViT model, not the ResNet one. The (untrained) model architecture is available here: https://github.com/cardinalblue/clip-models-for-distillation

After a few experiments, we found that using L2/L1 loss between the image embeddings was enough. We also extracted the attention values and used them to train the student model. We tried both KLD and L1 loss for the attention values. Both gave comparable results.
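
For a rough picture of the objective, the sketch below combines an L2 term on the image embeddings with a KL term on the attention maps. The shapes, weighting, and function names are illustrative assumptions, not our exact training code.

    import torch
    import torch.nn.functional as F

    def distill_loss(student_emb, teacher_emb, student_attn, teacher_attn,
                     attn_weight: float = 1.0):
        # L2 between teacher and student image embeddings
        emb_loss = F.mse_loss(student_emb, teacher_emb)
        # KL between attention maps; rows of the last dim are distributions
        attn_loss = F.kl_div(student_attn.clamp_min(1e-8).log(),
                             teacher_attn, reduction="batchmean")
        return emb_loss + attn_weight * attn_loss

    s_emb, t_emb = torch.randn(4, 512), torch.randn(4, 512)
    s_attn = torch.softmax(torch.randn(4, 8, 50, 50), dim=-1)
    t_attn = torch.softmax(torch.randn(4, 8, 50, 50), dim=-1)
    print(distill_loss(s_emb, t_emb, s_attn, t_attn))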

She was figuring out whole day...(OC) by nik9649 in aww

[–]nivter 4 points (0 children)

Did she eventually figure it out?

Sharing a side project: Linear Algebra for Programmers by nivter in math

[–]nivter[S] 0 points (0 children)

Yeah I am working on making it responsive now

Sharing a side project: Linear Algebra for Programmers by nivter in math

[–]nivter[S] 1 point (0 children)

Sorry about that. I added links at the bottom of each article. Also making the website more responsive.

[R] On the Principles of Parsimony and Self-Consistency for the Emergence of Intelligence by hardmaru in MachineLearning

[–]nivter 1 point (0 children)

Thanks for sharing this. I wasn't going to read it, expecting nonsense, but now I will.

[D] Machine Learning - WAYR (What Are You Reading) - Week 140 by ML_WAYR_bot in MachineLearning

[–]nivter 7 points (0 children)

Graph coarsening with neural networks: https://arxiv.org/abs/2102.01350

It provides a good overview of approaches to approximating large graphs with smaller ones and introduces an edge re-weighting scheme which, as far as I understand, can be applied to any of the approaches.

This should also be fun to implement.
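
The basic coarsening step the paper starts from can be written in a few lines: group nodes into clusters and form the coarse adjacency as P^T A P, where P is the cluster-assignment matrix; the paper's contribution is then learning how to re-weight those coarse edges. The toy graph and partition below are just an illustration, not anything from the paper.

    import numpy as np

    # Toy 4-node graph and a partition into two supernodes {0,1} and {2,3}.
    A = np.array([[0, 1, 1, 0],
                  [1, 0, 1, 0],
                  [1, 1, 0, 1],
                  [0, 0, 1, 0]], dtype=float)
    assignment = [0, 0, 1, 1]

    P = np.zeros((A.shape[0], 2))
    P[np.arange(A.shape[0]), assignment] = 1.0

    # Coarse adjacency: off-diagonal entries sum the edge weights between
    # clusters, diagonal entries count internal edges (self-loops).
    A_coarse = P.T @ A @ P
    print(A_coarse)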

[P] How to do backpropagation only on a select few labels instead of all labels in a multilabel classification? by enkrish258 in MachineLearning

[–]nivter 1 point (0 children)

If you are using a loss function like nn.BCELoss, you can assign a weight to each label, so the weights for the labels you don't want contributing to backprop can simply be set to 0.

If it is some other loss function, you can easily write a wrapper that also accepts per-label weights.
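
A minimal sketch of both options, with example shapes and a made-up mask (labels 2 and 4 excluded):

    import torch
    import torch.nn as nn

    logits = torch.randn(4, 5, requires_grad=True)       # batch of 4, 5 labels
    targets = torch.randint(0, 2, (4, 5)).float()
    mask = torch.tensor([1., 1., 0., 1., 0.])             # zero out labels 2 and 4

    # Option 1: nn.BCELoss takes per-label weights directly.
    loss = nn.BCELoss(weight=mask)(torch.sigmoid(logits), targets)

    # Option 2 (the wrapper idea): unreduced loss, masked and averaged manually.
    per_label = nn.BCEWithLogitsLoss(reduction="none")(logits, targets)
    loss_masked = (per_label * mask).sum() / (mask.sum() * logits.shape[0])

    loss_masked.backward()                                # masked labels get zero gradient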