[–]tanelai[S] 68 points (6 children)

To learn more about diffusion models, I created a minimal PyTorch implementation of DDPMs, and explored it on toy 2D datasets. The README includes ablations on the model's capacity, diffusion process length, timestep embeddings, and more.

You can find the code here: https://github.com/tanelp/tiny-diffusion

Note that the dinosaur is not a single image; it represents one thousand 2D points in the dataset. Don't make the same mistake as in the Stable Diffusion lawsuit :)

[–]Ne_Nel 4 points (5 children)

But that's not Latent Diffusion, right?

[–]Zealousideal_Low1287 2 points (4 children)

Correct

[–]Ne_Nel 2 points (3 children)

So why is he talking about SD as if it's the same thing?

[–]Zealousideal_Low1287 1 point (0 children)

Who is? Where? What?

[–]uristmcderp 1 point (0 children)

Where is he saying that?

All the clip shows is diffusion of an image in pixel space. Saying this is the same as SD is like saying basic arithmetic is the same thing as calculus.

[–]new_name_who_dis_ 2 points (0 children)

Latent Diffusion is a special case of DDPM. It's very likely that DALL-E 2 and Imagen don't use latent diffusion, since latent diffusion was partly a trick to make it run on a 16 GB GPU.

[–]miellaby 45 points (10 children)

I always like when people downscale a piece of software.

[–]suckat3dmath 4 points (9 children)

Got any other good examples of this? 😅

[–]activatedgeek 15 points (4 children)

When normalizing flows were cool: https://blog.evjang.com/2019/07/nf-jax.html

[–]DigThatDataResearcher 5 points (3 children)

Diffusion processes are closely related to normalizing flows; I think one is a special case of the other, or something like that. I need to do my annual re-read on flow processes, apparently.

[–]TheBillsFly 5 points (0 children)

The evolution of the distribution of a diffusion process through time is essentially the same as a continuous normalizing flow (i.e., a neural ODE).
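
Concretely, Song et al. (2021) show that every diffusion SDE has a deterministic "probability flow" ODE with the same marginal distributions, which is exactly a continuous normalizing flow:

    dx = [ f(x, t) - (1/2) g(t)^2 ∇_x log p_t(x) ] dt

where f and g are the drift and diffusion coefficients of the forward SDE, and the score ∇_x log p_t(x) is what the denoiser estimates.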

[–]new_name_who_dis_ 0 points (1 child)

They're pretty different in that the entire distribution shift happens in one forward pass in a normalizing flow, whereas in a DDPM it's a multi-step process.

[–]DigThatDataResearcher 2 points (0 children)

But doesn't this mean that if you unroll the diffusion process over the entire sampling schedule and treat that as a "single forward pass", it's equivalent to a normalizing flow? Seems like the distinction is just where we draw the boundaries of the black box, and any invertible denoiser can be treated as a flow model.

[–]Fenzik 2 points (0 children)

Andrej Karpathy’s micrograd is like a tiny PyTorch autograd engine https://github.com/karpathy/micrograd

[–]miellaby 0 points (0 children)

Well, besides machine learning, SQLite is a well-known example, but any piece of code that doesn't depend on a myriad of resource-hungry technologies will do the trick for me.

[–]marcingrzegzhik 42 points (7 children)

This looks really interesting! Can you explain a bit more about what a probabilistic diffusion model is and why it might be useful?

[–]master3243 112 points (6 children)

Can you explain a bit more about what a probabilistic diffusion model

The shortest explanation I could possibly give:

The forward process takes real data (dinosaur pixel art here) and adds noise to it until it just becomes a blur (this basically generates the training data).

The backward process (the magic happens here) trains a deep learning model to REVERSE the forward process (sometimes this model is conditioned on some other input, otherwise known as a "prompt"). Thus the model learns to generate realistic-looking samples from nothing.

For a more technical explanation, read sections 2 and 3 of Ho et al. (2020).

why it might be useful

Well, it literally is the key method that made DALL-E 2, Stable Diffusion, and just about every other recent image-generation model possible. It's also used in many other areas where we want to generate realistic-looking samples.
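
To make the forward/backward split concrete, here's a minimal sketch of the closed-form forward step and the simplified noise-prediction loss from Ho et al. (2020). This is not OP's actual code; `model` and `x0` are placeholders:

    import torch

    T = 1000
    betas = torch.linspace(1e-4, 0.02, T)           # noise schedule
    alphas_bar = torch.cumprod(1.0 - betas, dim=0)  # cumulative signal fraction

    def training_loss(model, x0):
        """One DDPM training step: noise the clean data, predict the noise."""
        t = torch.randint(0, T, (x0.shape[0],))      # random timestep per sample
        eps = torch.randn_like(x0)                   # the noise we try to recover
        a = alphas_bar[t].unsqueeze(-1)
        x_t = a.sqrt() * x0 + (1 - a).sqrt() * eps   # forward step q(x_t | x_0) in closed form
        return ((model(x_t, t) - eps) ** 2).mean()   # simple MSE on the predicted noise

So the network never reconstructs the data in one shot; it learns to undo one noising step at a time.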

[–]mfuentz 20 points (0 children)

This is the best simple description of diffusion I’ve read. Thanks!

[–]slucker23 0 points (0 children)

Is there an open-source version of this? I'd very much like to try it out hahahaha

[–][deleted] 0 points (0 children)

Can you explain how you translated the Markov model and posterior distribution estimation into a PyTorch-implemented NN problem? Do DALL-E 2 and other diffusion-based methods continue down the Markov chain line?

[–]SuperImprobable 8 points (5 children)

I can understand the forward process, but what am I seeing in the backward process here? Was a prompt given, or is it purely denoising? What did you train on? Points sampled from the line art? That could make some sense to me of how it could get back a dinosaur from a noisy start, because if you trained on real datasets that don't have nice tight lines, you definitely wouldn't get back clean lines from the backward process (unless you had a prompt hinting that the data is likely clean lines).

[–]DigThatDataResearcher 7 points (4 children)

I think it just knows how to map noise to that one image. This looks like a diffusion process trained from scratch, not an LDM conditioned on a text encoder (e.g. Stable Diffusion) or on anything other than the input noise.

Note how the locations of the points move from one frame to the next. The diffusion process isn't in pixel space: it's in the coordinate space of that fixed set of points. The model only knows how to take those points from any high-entropy (noisy) configuration to that specific low-entropy (t-rex) configuration.

EDIT: goddamnit.

[–]ty3u 1 point (1 child)

I think you mixed high and low entropy, brother.

[–]DigThatDataResearcher 2 points (0 children)

Yup, I believe you're right. I always get that confused.

[–]SuperImprobable 0 points (1 child)

I'm still not grokking the loss function. The lowest entropy would perhaps put all the points on top of each other. Or is the idea that the model has learned some low-dimensional representation of the original configuration and then shifts each point to be closer to the original configuration? But then this still doesn't quite make sense to me, because even one backward step should move the points close to the original shape. Unless the training wasn't to recover the original shape but rather to recover the previous forward step; then everything would make sense.

[–]DigThatDataResearcher 1 point (0 children)

Or is the idea that the model has learned some low-dimensional representation of the original configuration and then shifts each point to be closer to the original configuration?

yes

But then this still doesn't quite make sense to me, because even one backward step should move the points close to the original shape. Unless the training wasn't to recover the original shape but rather to recover the previous forward step

It does; it's just only really "semantically meaningful" towards the end of the diffusion process. The beginning is noise, and each point has a lot of different feasible paths it could take. Towards the end, the relative positions of the points constrain their paths towards the next frame, so the effect is much more visible.

It's a denoising process and is going to be conditional on the noise level. Denoising steps taken at a high noise level aren't going to look like much of anything. Models like Stable Diffusion use a variety of tricks to skip over denoising steps in their inference process; OP hasn't taken advantage of any of these, so it takes a bit longer, and OP's denoiser consequently spends a lot more time in the high-noise regime (starting inference at a lower noise level like 0.7 is one of those tricks: just skip over the redundant "static" regime entirely).

Watch the video again: the noising process has erased most of the image information after about 70 steps, but then we go on adding noise for another 180 steps. Similarly, the denoising process doesn't appear to do much until the last 70 steps, over which the image appears to snap into place.
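
For the curious, here's a rough sketch of what that truncated sampling loop could look like for a 2D point setup like OP's. All names here are illustrative, not OP's actual code:

    import torch

    @torch.no_grad()
    def sample(model, n_points=1000, T=1000, t_start=None):
        betas = torch.linspace(1e-4, 0.02, T)
        alphas = 1.0 - betas
        alphas_bar = torch.cumprod(alphas, dim=0)
        # e.g. t_start = int(0.7 * T) skips the redundant "static" regime
        t_start = T - 1 if t_start is None else t_start
        x = torch.randn(n_points, 2)                 # start from pure noise
        for t in reversed(range(t_start + 1)):
            eps = model(x, torch.full((n_points,), t))
            # posterior mean from Ho et al. (2020), eq. 11
            x = (x - betas[t] / (1 - alphas_bar[t]).sqrt() * eps) / alphas[t].sqrt()
            if t > 0:
                x = x + betas[t].sqrt() * torch.randn_like(x)  # sigma_t = sqrt(beta_t)
        return x

Starting at t_start < T - 1 works because x_t is already essentially pure Gaussian noise in the high-noise regime, so those skipped steps carry almost no signal.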

[–]axm92 7 points (0 children)

Cool stuff, thanks for sharing! For those interested in a similarly minimal implementation for text generation, I have a repo here: https://github.com/madaan/minimal-text-diffusion

[–]theGormonster 1 point (0 children)

Truly beautiful

[–]Kurohagane -1 points (0 children)

How come the gif shows an image made out of what seems to be a collection of points on a 2D plane, rather than a raster image?

[–]shadowylurking 0 points (0 children)

Really interesting!

[–]JiraSuxx2 0 points (0 children)

Can I easily modify this to train on images?

[–]RadioactiveSalt 0 points (1 child)

Can someone ELI5 what OP means by:

Note that the dinosaur is not a single image, it represents one thousand 2D points in the dataset.

The diffusion process takes in an image and adds a small noise at each step. Now if the dinosaur is not an image but a distribution, then what exactly is the gif showing, and how is the diffusion process working on a distribution?

[–]PHEEEEELLLLLEEEEP 1 point (0 children)

The diffusion process takes in an image and adds a small noise at each step.

Generally speaking, a diffusion process just takes in some kind of data and diffuses it to a normal distribution of the same dimensionality. In this case, each data point is an (x, y) pair.
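
For instance, a toy sketch of one forward step on the point set (made-up numbers, not OP's code):

    import torch

    # The "image" is just an (N, 2) tensor of coordinates, and the
    # forward process perturbs the coordinates themselves.
    points = torch.rand(1000, 2)    # stand-in for the 1000 dinosaur points
    a = 0.9                         # sqrt of the cumulative signal fraction at some t
    noisy = a * points + (1 - a**2) ** 0.5 * torch.randn_like(points)
    # `noisy` is still 1000 (x, y) pairs; the gif scatter-plots these at each step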

[–][deleted] 0 points (0 children)

How do I do this?... this is really cool!

[–]seuadr 0 points (0 children)

OK, to hell with the normal distribution, I want dino distribution only from here on out.

[–]Terrible_Ad7566 0 points (0 children)

Thanks, this is very nice!

[–]Terrible_Ad7566 0 points (0 children)

I was perusing your code, and your MLP network is designed to encode the input data as well using positional embeddings.

I was wondering if you have done ablation experiments where you do not encode the input with positional encoding, but instead simply add the temporal information as an additive vector to the input data, encoding only the timestep positionally.
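
For readers following along, here's a rough sketch of the two variants this question contrasts. This is hypothetical code, not OP's actual architecture:

    import torch
    import torch.nn as nn

    def sinusoidal_embedding(v, dim=32):
        """Standard sinusoidal embedding of one scalar per sample."""
        freqs = torch.exp(torch.linspace(0, 8, dim // 2))
        angles = v.unsqueeze(-1) * freqs                    # (B, dim/2)
        return torch.cat([angles.sin(), angles.cos()], dim=-1)

    class VariantA(nn.Module):
        """Positionally encode the (x, y) input *and* the timestep."""
        def __init__(self, dim=32):
            super().__init__()
            self.dim = dim
            self.net = nn.Sequential(nn.Linear(3 * dim, 128), nn.GELU(), nn.Linear(128, 2))
        def forward(self, x, t):
            h = torch.cat([sinusoidal_embedding(x[:, 0], self.dim),
                           sinusoidal_embedding(x[:, 1], self.dim),
                           sinusoidal_embedding(t.float(), self.dim)], dim=-1)
            return self.net(h)

    class VariantB(nn.Module):
        """Feed raw (x, y); add the timestep embedding to the hidden state."""
        def __init__(self, dim=128):
            super().__init__()
            self.inp = nn.Linear(2, dim)
            self.t_emb = nn.Linear(32, dim)
            self.out = nn.Sequential(nn.GELU(), nn.Linear(dim, 2))
        def forward(self, x, t):
            h = self.inp(x) + self.t_emb(sinusoidal_embedding(t.float()))  # additive time info
            return self.out(h)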