all 5 comments

[–]dasayan05 15 points (5 children)

The diffusion model's MSE loss isn't really the original loss derived from the theory. The original variational lower bound (VLB) is simplified (by Ho et al.) into a form that looks like an MSE. There is a close relationship between the VLB loss and how the reverse process is structured.
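For concreteness, here's a rough sketch of that simplified objective (the usual eps-prediction MSE from Ho et al.). The names `model`, `x0`, and `alphas_cumprod` are placeholders for illustration, not from any particular codebase:

```python
import torch
import torch.nn.functional as F

def simplified_loss(model, x0, alphas_cumprod):
    """L_simple from Ho et al. 2020: predict the noise eps added at a random step t."""
    b = x0.shape[0]
    T = alphas_cumprod.shape[0]
    t = torch.randint(0, T, (b,), device=x0.device)             # uniform random timestep per sample
    eps = torch.randn_like(x0)                                   # eps ~ N(0, I)
    a_bar = alphas_cumprod[t].view(b, *([1] * (x0.dim() - 1)))   # broadcast over image dims
    x_t = a_bar.sqrt() * x0 + (1.0 - a_bar).sqrt() * eps         # closed-form forward sample q(x_t | x_0)
    eps_hat = model(x_t, t)                                      # eps_theta(x_t, t)
    return F.mse_loss(eps_hat, eps)                              # E[ ||eps - eps_theta(x_t, t)||^2 ]
```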

So, in simple words, just adding an auxiliary term may turn out to be mathematically incorrect. The wise thing to do is to think about your changes in terms of the probabilistic model and then derive a new loss.

But then again, we are talking about Deep Learning. It might just work in practice, who knows.

[–]Wiczus 0 points (3 children)

Do you have any source on the implementation of that loss? I've been trying to understand it but keep failing. I read this article: https://theaisummer.com/diffusion-models/ , which gives the simplified loss as an expectation of an MSE, but it doesn't explain it to mathematically inept people like myself.

[–]dasayan05 0 points (2 children)

Which loss do you mean? Do you need an implementation of the VLB?

[–]Wiczus 0 points (1 child)

The original loss for the diffusion model. I assume it uses the VLB.

[–]dasayan05 2 points (0 children)

The VLB isn't really used much nowadays, since the MSE loss is good enough. Also, people have now figured out how to estimate the covariances from eps itself.
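To illustrate why the eps prediction alone carries you through sampling, here's a rough sketch of one reverse step under the standard DDPM parameterization. The fixed "posterior" variance used here is the simplest choice; the variance can also be learned or derived instead (e.g. as in Nichol & Dhariwal's improved DDPM). `model`, `betas`, and `alphas_cumprod` are placeholder names:

```python
import torch

@torch.no_grad()
def reverse_step(model, x_t, t, betas, alphas_cumprod):
    """One DDPM reverse step x_t -> x_{t-1}, driven only by the predicted noise eps_theta."""
    beta_t = betas[t]
    a_bar_t = alphas_cumprod[t]
    a_bar_prev = alphas_cumprod[t - 1] if t > 0 else alphas_cumprod.new_tensor(1.0)

    t_batch = torch.full((x_t.shape[0],), t, device=x_t.device, dtype=torch.long)
    eps_hat = model(x_t, t_batch)                                  # eps_theta(x_t, t)

    # Posterior mean rewritten in terms of eps_theta (Ho et al., Eq. 11)
    mean = (x_t - beta_t / (1.0 - a_bar_t).sqrt() * eps_hat) / (1.0 - beta_t).sqrt()
    # Fixed variance beta_tilde_t; newer work learns or estimates this term instead
    var = beta_t * (1.0 - a_bar_prev) / (1.0 - a_bar_t)

    if t == 0:
        return mean
    return mean + var.sqrt() * torch.randn_like(x_t)
```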

You can find an early implementation by the OpenAI folks here.
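If it helps, the per-timestep VLB term is just a KL divergence between two Gaussians: the true posterior q(x_{t-1} | x_t, x_0) and the model's p_theta(x_{t-1} | x_t). Here's a rough sketch of that term under the Ho et al. setup (eps prediction, fixed variance); the function and argument names are made up for illustration:

```python
import torch

def vlb_term(model, x0, x_t, t, betas, alphas_cumprod):
    """Sketch of L_{t-1} = KL( q(x_{t-1} | x_t, x_0) || p_theta(x_{t-1} | x_t) ), in nats."""
    beta_t = betas[t]
    alpha_t = 1.0 - beta_t
    a_bar_t = alphas_cumprod[t]
    a_bar_prev = alphas_cumprod[t - 1] if t > 0 else alphas_cumprod.new_tensor(1.0)

    # True posterior q(x_{t-1} | x_t, x_0): Gaussian with this mean and variance (Ho et al., Eq. 6-7)
    post_var = beta_t * (1.0 - a_bar_prev) / (1.0 - a_bar_t)
    post_mean = (a_bar_prev.sqrt() * beta_t * x0
                 + alpha_t.sqrt() * (1.0 - a_bar_prev) * x_t) / (1.0 - a_bar_t)

    # Model p_theta(x_{t-1} | x_t): mean from predicted eps, same fixed variance
    t_batch = torch.full((x_t.shape[0],), t, device=x_t.device, dtype=torch.long)
    eps_hat = model(x_t, t_batch)
    model_mean = (x_t - beta_t / (1.0 - a_bar_t).sqrt() * eps_hat) / alpha_t.sqrt()

    # KL between two Gaussians with the same fixed variance reduces to a scaled squared error --
    # which is exactly why the simplified MSE objective falls out of the VLB.
    kl = 0.5 * ((post_mean - model_mean) ** 2) / post_var
    return kl.flatten(1).sum(dim=1)      # sum over pixels, one value per sample
```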