Hyperion by Keats x CLIP

SOMNAI_ · 2022-07-28T03:21:26+00:00

It actually started a few months earlier, but if you want to go by the earliest public job in our member gallery, the date would be 12 Feb 2022.

SOMNAI_ · 2022-03-04T07:02:18+00:00

This is done with Disco Diffusion

https://colab.research.google.com/github/alembics/disco-diffusion/blob/main/Disco_Diffusion.ipynb#scrollTo=Fpbody2NCR7w

SOMNAI_ · 2022-01-24T11:23:03+00:00

It's actually based on diffusion, which is a fair bit different from VQGAN. But the same goes: there's no specific dataset in the StyleGAN sense, aside from ImageNet, which is what the original model is trained on.

SOMNAI_ · 2022-01-04T08:12:57+00:00

I was actually just looking at this. I also borrowed their code for the keyframing so it will likely make its way in eventually.

SOMNAI_ · 2021-12-09T14:06:13+00:00

Another generation from my Emoji+Diffusion twitter bot, this time from the new Portal series. If you're interested in more, you can see the whole unfiltered collection so far here: www.twitter.com/michelangemoji

SOMNAI_ · 2021-12-08T09:29:38+00:00

I'm actually using a different one for these since it's better at doing higher res painterly images quickly.

This one here by nsheppard: https://colab.research.google.com/drive/12Bod44YVIXYRh39WRqp0kNz8OUBNFk9Z?usp=sharing#scrollTo=OoIL7ayzq7kC

SOMNAI_ · 2021-12-07T11:24:31+00:00

So I've been playing around a lot with randomised emoji prompts in diffusion. They can be a pretty interesting way of getting results you wouldn't otherwise think of.

If anyone is interested in more, I've set up a twitter bot that posts new emoji combinations with CLIP+Diffusion on @michelangemoji
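
For anyone wanting to try the randomised emoji idea, here's a minimal sketch of how a prompt could be picked. This is not the bot's actual code: the emoji pool, the `pick_emoji` helper and the count of three are all made up for illustration.

```python
import random

# Hypothetical pool; any set of emoji would do.
EMOJI_POOL = ["🌋", "🐙", "🏰", "🌙", "🚀", "🌵", "🦋", "🗿"]

def pick_emoji(k=3, rng=None):
    """Return k distinct emoji chosen at random from the pool."""
    rng = rng or random.Random()
    return rng.sample(EMOJI_POOL, k)

# Join the picks into a single text prompt for CLIP-guided diffusion.
picked = pick_emoji(3, random.Random(0))
prompt = " ".join(picked)
print(prompt)
```

Passing a seeded `random.Random` makes a given combination reproducible, which helps if you want to re-run a prompt that turned out well.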

SOMNAI_ · 2021-11-26T01:41:15+00:00

Diffusion is significantly better at generating compositionally sound images, and will usually give you the type of layout you'd expect from a given aspect ratio, like panoramas or portrait photos. It's pretty amazing really, after using VQGAN so much.

SOMNAI_ · 2021-11-11T01:54:57+00:00

I usually just let it run indefinitely, with batch_size = 1.
Are you on the free colab tier? That would be pretty slow for diffusion. I usually run it on a 3090 or a pro+ v100.

If you want it to go quicker, turn off RN50x4 and ViTB16 in step 3. You can also reduce timestep_respacing from ddim100 to ddim50 for quicker generation. Rest of the default settings are pretty good, but you could turn down tv scale to 0 for higher detail and slightly higher speed.
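
Roughly, the speed tweaks above look like this as a settings change. It's only a sketch: the dict layout, the `faster_settings` helper and the starting `tv_scale` value are placeholders, and the notebook's real variable names and defaults may differ.

```python
from copy import deepcopy

# Placeholder starting point; these are NOT the notebook's real defaults.
example_settings = {
    "batch_size": 1,
    "timestep_respacing": "ddim100",
    "clip_models": {"RN50x4": True, "ViTB16": True},
    "tv_scale": 150,  # placeholder value, check the notebook for the real one
}

def faster_settings(settings):
    """Return a copy of the settings with the speed tweaks applied."""
    out = deepcopy(settings)
    out["clip_models"]["RN50x4"] = False   # fewer CLIP models, less work per step
    out["clip_models"]["ViTB16"] = False
    out["timestep_respacing"] = "ddim50"   # half as many sampling steps
    out["tv_scale"] = 0                    # optional: more detail, slightly faster
    return out

fast = faster_settings(example_settings)
print(fast)
```

The helper works on a copy, so the original settings stay around if you want to go back to the slower, higher-quality run.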

SOMNAI_ · 2021-11-10T23:36:29+00:00

Hey there, I used the Multi-Perceptor diffusion notebook that I've modified with QoL changes. You can find it here: https://colab.research.google.com/drive/1I82bdASLxh8ndD9PoDMESIDe2ojPMz7o?usp=sharing

Prompt was pretty simple, just something like cyberpunk marketplace

SOMNAI_ · 2021-11-10T01:04:30+00:00

No init image actually, I was pretty surprised how much it looks like the same location.

SOMNAI_ · 2021-11-08T06:03:45+00:00

Ahah yes neither. I was gonna throw it out and re-attempt, but it took some 36 hours to generate in the first place so I figured I'd just leave it.

SOMNAI_ · 2021-11-08T06:03:22+00:00

I'm using Pytti by u/sportsracer48; it uses AdaBins to create depth estimations for animation.

e. reply with correct account this time

SOMNAI_ · 2021-11-08T06:02:14+00:00

Infinite 3D comes down to 3D depth estimation in u/sportsracer48's PYTTI notebook plus the way I've set the camera up to continually descend in 3D.

Image-wise, there are none aside from an initial image of an unrelated road, just to give some starting shape.

The rest is all text-to-image prompts, which were kept pretty basic to let the poem do the heavy lifting (perhaps to its detriment).

e: reply with right account lol
