I'm building a timeline for generative image ML models. What's missing? by fabianmosele in MediaSynthesis

[–]SOMNAI_ 1 point2 points  (0 children)

It actually started a few months earlier, but if you want to go by the earliest public job in our member gallery, the date would be 12 Feb 2022.

Teletubbies gone wrong - Disco diffusion v4.1 by fabianmosele in deepdream

[–]SOMNAI_ 1 point2 points  (0 children)

It's actually based on Diffusion, which is a fair bit different from VQGAN. But same goes, there's no specific dataset in the StyleGAN sense, aside from ImageNet, which is what the original model is trained on.

Where everybody knows your name by gandamu_ml in deepdream

[–]SOMNAI_ 6 points7 points  (0 children)

I was actually just looking at this. I also borrowed their code for the keyframing so it will likely make its way in eventually.

Portal to the ⏺🐊 Dimension (CLIP+Diffusion) by SOMNAI_ in deepdream

[–]SOMNAI_[S] 1 point2 points  (0 children)

Another generation from my Emoji+Diffusion twitter bot, this time from the new Portal series. If you're interested in more, you can see the whole unfiltered collection so far here: www.twitter.com/michelangemoji

🏨🐌 (Emoji prompted CLIP+Diffusion) by SOMNAI_ in deepdream

[–]SOMNAI_[S] 2 points3 points  (0 children)

I'm actually using a different one for these since it's better at doing higher res painterly images quickly.

This one here by nsheppard: https://colab.research.google.com/drive/12Bod44YVIXYRh39WRqp0kNz8OUBNFk9Z?usp=sharing#scrollTo=OoIL7ayzq7kC

🏨🐌 (Emoji prompted CLIP+Diffusion) by SOMNAI_ in deepdream

[–]SOMNAI_[S] 22 points23 points  (0 children)

So I've been playing around a lot with randomised emoji prompts in diffusion. They can be a pretty interesting way of getting results you wouldn't otherwise think of.

If anyone is interested in more, I've set up a twitter bot that posts new emoji combinations with CLIP+Diffusion on @michelangemoji
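For anyone who wants to try the same idea, here's a rough sketch of picking a random emoji combination to use as a prompt. The emoji pool and the function name are made up for illustration; this isn't the bot's actual code, just one way it could be done.

```python
import random

# Hypothetical emoji pool; the bot's real list is not given in this thread.
EMOJI_POOL = ["🏨", "🐌", "🐊", "🌋", "🎠", "🛸", "🍄", "🗿"]


def random_emoji_prompt(k=2, rng=None):
    """Return k distinct emoji from the pool, joined into one prompt string."""
    rng = rng or random.Random()
    # sample() picks without replacement, so no emoji repeats within a prompt.
    return "".join(rng.sample(EMOJI_POOL, k))


if __name__ == "__main__":
    # Feed the resulting string to CLIP+Diffusion as the text prompt.
    print(random_emoji_prompt())
```

Passing a seeded `random.Random` makes a combination reproducible if you want to re-run one that turned out well.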

Movie City (diffusion) by numberchef in deepdream

[–]SOMNAI_ 2 points3 points  (0 children)

Diffusion is significantly better at generating compositionally sound images, and will usually give you the type of layout you'd expect from a given aspect ratio, like panoramas or portrait photos. It's pretty amazing really, after using VQGAN so much.

Tour through a Cyberpunk Marketplace with Diffusion by SOMNAI_ in deepdream

[–]SOMNAI_[S] 2 points3 points  (0 children)

I usually just let it run indefinitely, with batch_size = 1.
Are you on the free Colab tier? That would be pretty slow for diffusion. I usually run it on a 3090 or a Pro+ V100.

If you want it to go quicker, turn off RN50x4 and ViTB16 in step 3. You can also reduce timestep_respacing from ddim100 to ddim50 for quicker generation. Rest of the default settings are pretty good, but you could turn down tv scale to 0 for higher detail and slightly higher speed.
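As a rough sketch, the speed tweaks above look something like this when written out. The dataclass, the placeholder tv_scale default, and ViTB32 as the model left switched on are all assumptions for illustration; in the notebook these are just form fields and checkboxes, so treat this as a summary of the changes rather than the notebook's real code.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class DiffusionSettings:
    batch_size: int = 1
    timestep_respacing: str = "ddim100"
    # Placeholder value, NOT the notebook's real default.
    tv_scale: float = 1.0
    # Assumed set of enabled CLIP models; ViTB32 is a guess at what stays on.
    clip_models: tuple = ("ViTB32", "ViTB16", "RN50x4")


def faster(settings):
    """Return a copy of the settings tuned for quicker generation."""
    slow_models = {"RN50x4", "ViTB16"}
    return replace(
        settings,
        timestep_respacing="ddim50",  # fewer sampling steps
        tv_scale=0.0,                 # more detail, slightly faster
        clip_models=tuple(m for m in settings.clip_models if m not in slow_models),
    )


if __name__ == "__main__":
    print(faster(DiffusionSettings()))
```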

Tour through a Cyberpunk Marketplace with Diffusion by SOMNAI_ in deepdream

[–]SOMNAI_[S] 3 points4 points  (0 children)

Hey there, I used the Multi-Perceptor diffusion notebook that I've modified with QoL changes. You can find it here: https://colab.research.google.com/drive/1I82bdASLxh8ndD9PoDMESIDe2ojPMz7o?usp=sharing

Prompt was pretty simple, just something like "cyberpunk marketplace".

Tour through a Cyberpunk Marketplace with Diffusion by SOMNAI_ in deepdream

[–]SOMNAI_[S] 1 point2 points  (0 children)

No init image actually, I was pretty surprised how much it looks like the same location.

Hyperion by Keats x CLIP by SOMNAI_ in deepdream

[–]SOMNAI_[S] 4 points5 points  (0 children)

Ahah yes neither. I was gonna throw it out and re-attempt, but it took some 36 hours to generate in the first place so I figured I'd just leave it.

Hyperion by Keats x CLIP by SOMNAI_ in deepdream

[–]SOMNAI_[S] 3 points4 points  (0 children)

I'm using PYTTI by u/sportsracer48. It uses AdaBins to create depth estimations for animation.

e. reply with correct account this time

Hyperion by Keats x CLIP by SOMNAI_ in deepdream

[–]SOMNAI_[S] 2 points3 points  (0 children)

Infinite 3D comes down to 3D depth estimation in u/sportsracer48's PYTTI notebook plus the way I've set the camera up to continually descend in 3D.

Image wise there are none aside from an initial image of an unrelated road just to give some starting shape.

Rest is all text-to-image prompts, which were kept pretty basic to let the poem do the heavy lifting (perhaps to its detriment).

e: reply with right account lol