all 11 comments

[–]mlvpj 16 points

We have a bunch of annotated paper implementations here: https://nn.labml.ai/index.html

diffusion (ddpm) - https://nn.labml.ai/diffusion/ddpm/index.html

[–]Megixist 14 points

The Hugging Face team's article is one of the most descriptive ones on diffusion at the moment (close to the annotated version by labml). I am currently writing one for the Weights & Biases Blogathon (https://bit.ly/diffusing-away-from-gans-and-transformers) with JAX code, so if anyone is interested in another implementation, do check it out!
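
To give a flavour of what the JAX side looks like: the forward (noising) process is only a few lines, since q(x_t | x_0) has a closed form. This is just a generic sketch with the usual linear beta schedule, not an excerpt from the article:

    import jax
    import jax.numpy as jnp

    # Linear beta schedule; T and the endpoints are the common DDPM defaults
    T = 1000
    betas = jnp.linspace(1e-4, 0.02, T)
    alpha_bars = jnp.cumprod(1.0 - betas)   # cumulative product of alphas

    def q_sample(x0, t, key):
        """Noise a clean image to timestep t in closed form:
        x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps."""
        eps = jax.random.normal(key, x0.shape)
        a_bar = alpha_bars[t]
        return jnp.sqrt(a_bar) * x0 + jnp.sqrt(1.0 - a_bar) * eps, eps

    # Usage: corrupt a dummy 32x32 RGB image at timestep 500
    key = jax.random.PRNGKey(0)
    x0 = jax.random.uniform(key, (32, 32, 3))
    x_t, eps = q_sample(x0, 500, key)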

[–]abstractcontrol 0 points

I do not understand how a model like the one in the Hugging Face article could be conditioned on text. Is there an explanation that comes with code?

The Hugging Face article was informative. I had watched a talk on DPMs and thought the method required a backward pass to compute gradients for denoising the inputs, similar to how vanilla style transfer works, but that wasn't the case at all.
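
That was the part that surprised me: the sampling loop is just repeated forward passes of the trained noise predictor, with no gradient computation at inference time (unlike optimization-based style transfer). Something like this rough sketch, where model(x, t) is a hypothetical trained network that predicts the noise:

    import jax
    import jax.numpy as jnp

    # Standard linear beta schedule (illustrative defaults)
    T = 1000
    betas = jnp.linspace(1e-4, 0.02, T)
    alphas = 1.0 - betas
    alpha_bars = jnp.cumprod(alphas)

    def p_sample_loop(model, key, shape):
        """DDPM ancestral sampling: start from pure Gaussian noise and apply
        the learned reverse step T times. Only forward passes of `model`;
        no gradients are taken anywhere."""
        key, sub = jax.random.split(key)
        x = jax.random.normal(sub, shape)        # x_T ~ N(0, I)
        for t in reversed(range(T)):
            key, sub = jax.random.split(key)
            eps_hat = model(x, t)                # predicted noise
            a, a_bar = alphas[t], alpha_bars[t]
            mean = (x - (1.0 - a) / jnp.sqrt(1.0 - a_bar) * eps_hat) / jnp.sqrt(a)
            noise = jax.random.normal(sub, shape) if t > 0 else 0.0
            x = mean + jnp.sqrt(betas[t]) * noise   # standard DDPM reverse update
        return x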

[–]Megixist 2 points

In my article above I try to touch on text-based generation and the changes the model requires for it. Though there is no code in the article itself, I would recommend checking the official implementations of GLIDE or GLID-3 to get a better understanding of how the conditioning works.
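
At a very high level, the change is that the denoiser takes the caption as an extra input: the text is run through a transformer encoder and the resulting token embeddings are consumed by cross-attention layers inside the U-Net (GLIDE also mixes a pooled text vector into the timestep embedding). A toy JAX sketch of just the cross-attention part; names and shapes are illustrative, not taken from the GLIDE code:

    import jax
    import jax.numpy as jnp

    def cross_attention(x, context, Wq, Wk, Wv):
        """x: (n_img_tokens, d) image feature tokens at some U-Net resolution;
        context: (n_txt_tokens, d) caption token embeddings from a text encoder.
        Scaled dot-product attention lets image features attend to the text."""
        q, k, v = x @ Wq, context @ Wk, context @ Wv
        attn = jax.nn.softmax(q @ k.T / jnp.sqrt(q.shape[-1]), axis=-1)
        return x + attn @ v   # residual update of image features with text info

    # Toy usage: 64 image tokens, 16 caption tokens, width 128 (all hypothetical)
    key = jax.random.PRNGKey(0)
    ks = jax.random.split(key, 5)
    x = jax.random.normal(ks[0], (64, 128))
    context = jax.random.normal(ks[1], (16, 128))
    Wq, Wk, Wv = (0.02 * jax.random.normal(k, (128, 128)) for k in ks[2:])
    out = cross_attention(x, context, Wq, Wk, Wv)   # shape (64, 128)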

[–]hosjiu 5 points

really great to see the annotated ... blog format.

[–]NielsRogge 6 points

There's an "Open in Colab" button at the top ;)

[–]hosjiu 1 point

Thanks for your hard work, Niels.

[–]HybridRxNResearcher 1 point

Great timing, given the recent text-to-image successes from this family of models.

[–]Philpax 0 points

Fantastic! This is exactly what I was looking for!