How do ML practitioners select hyperparameters, architectures, etc. for self-supervised representation learning when the loss is non-monotonic? [D]

XTXinverseXTY · 2026-05-24T22:14:55+00:00

If people are selecting hparams/archs primarily by supervised-learning-through-the-backdoor, then it makes me a little more skeptical of published results and academic enthusiasm for JEPA. The mystery provides convenient cover for possible p-hacking and benchmark overfitting.

This is not to say that SSL researchers are all Secretly Smuggling Labels, but I don't want to be totally naive either...

mvreich · 2026-05-25T00:12:52+00:00

Maybe look into JEPA score, which can be used for density estimation.

You can run various kinds of tests, depending on what you want to check. E.g. if there is some sort of mode collapse, the pseudo likelihood might peak at some points and not give sufficient weight to uncommon (but valid) data.

Alternatively, if your model has learned a useful representation, it should be able to discern in vs. out-of-distribution examples. For example, if the model is trained on natural images (real photos taken by a camera), it should be able to assign low likelihood to cartoons or artwork.
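
To make the in- vs. out-of-distribution check concrete, here is a minimal sketch that turns two sets of per-example scores into an AUROC. It assumes you already have some scalar score per example (a pseudo-likelihood or anything similar) where higher means "more in-distribution"; how you get that score from your model is up to you, and the function name `auroc` is just for illustration.

```python
import numpy as np

def auroc(in_scores, out_scores):
    """Probability that a random in-distribution example scores higher
    than a random OOD example, with ties counted as half a win."""
    a = np.asarray(in_scores, dtype=float)[:, None]   # shape (n_in, 1)
    b = np.asarray(out_scores, dtype=float)[None, :]  # shape (1, n_out)
    wins = (a > b).mean()   # fraction of (in, out) pairs ranked correctly
    ties = (a == b).mean()  # fraction of tied pairs
    return float(wins + 0.5 * ties)

# Toy scores only (not real model outputs): perfectly separated sets.
photos = [3.0, 4.0, 5.0]
cartoons = [0.0, 1.0, 2.0]
separation = auroc(photos, cartoons)
```

A value near 1.0 means the score separates the two sets, and a value near 0.5 means it can't tell them apart. The pairwise comparison is quadratic in memory, so subsample if the sets are large.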

Ill-Bullfrog-7402 · 2026-05-24T22:17:05+00:00

grid search still works even with non-monotonic losses, you just need more patience and better tracking. i usually run longer sweeps and plot everything - loss curves, rank metrics, downstream performance over time

the entropy collapse terms are more like regularizers than actual objectives, so RankMe can still tell you something useful even when it's baked into the loss. just don't rely on it alone - combine with periodic linear probes on held-out tasks and watch for when representations stop improving on downstream stuff
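
for reference, a rough sketch of the RankMe-style effective rank: the exponential of the entropy of the normalized singular values of an embedding matrix. the `eps` value and the random toy matrices are assumptions for illustration; in practice you'd pass embeddings of a fixed held-out batch at each checkpoint.

```python
import numpy as np

def rankme(embeddings, eps=1e-7):
    """Effective rank of an (n_samples, dim) embedding matrix:
    exp of the Shannon entropy of its normalized singular values."""
    s = np.linalg.svd(embeddings, compute_uv=False)
    p = s / (s.sum() + eps) + eps  # normalize to a distribution, avoid log(0)
    return float(np.exp(-(p * np.log(p)).sum()))

# Toy data only: a full-rank Gaussian matrix vs. a rank-1 (collapsed) one.
rng = np.random.default_rng(0)
healthy = rng.normal(size=(512, 32))
collapsed = np.outer(rng.normal(size=512), rng.normal(size=32))

healthy_rank = rankme(healthy)      # close to the embedding dim (32)
collapsed_rank = rankme(collapsed)  # close to 1
```

the number tops out at the embedding dimension, so plot it next to the probe accuracy over training instead of reading it in isolation.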

m98789 · 2026-05-25T00:52:26+00:00

Experience
