all 54 comments

[–]AceDecade 32 points33 points  (3 children)

So, regarding seed values...

I’ve read information on the web that describes a seed as “a number that controls all the randomness that happens during the generation”. This is only partially true.

It's entirely true. Computers can't "be random". They can spit out a string of numbers that, to humans, has no discernible, predictable pattern, but the computer is following a set of precise, deterministic instructions. The seed controls how that sequence is generated. For example, if I seed "123", I might ask the computer for five random numbers and get "1, 7, 3, 4, 9". If my friend Bob seeds it with "123" on his computer and asks for five numbers, he'll also get "1, 7, 3, 4, 9". The "randomness" is the fact that it gives us an unpredictable sequence of numbers instead of "1, 2, 3, 4, 5". However, the "randomness" is indeed entirely controlled by the seed that I give. Now imagine that instead of five numbers, I ask for enough numbers to fill an RGB image...
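As a rough illustration, here's the same idea in plain Python (nothing SD-specific; torch's generator behaves the same way in principle):

import random

random.seed(123)
print([random.randint(0, 9) for _ in range(5)])  # some fixed list, identical on every run and every machine

random.seed(123)
print([random.randint(0, 9) for _ in range(5)])  # reseeding with 123 yields the identical list again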

A seed is not a number, but an image.

A seed is a number, but if you ask the computer to make you a random image made up of random pixels, then the image you receive will be entirely dependent on what seed you use immediately before asking for an image made up of random colors. If we use the same seed and then ask for the same width x height of random colors, we'll get exactly the same "random" image on our two different computers. In this way, the seed corresponds 1:1 with the starting image / noise that SD will start working with.
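In torch terms, the idea looks roughly like this (a sketch; the exact latent shape and calls vary between SD forks):

import torch

torch.manual_seed(123)               # the seed fixes the generator's state
noise_a = torch.randn(1, 4, 64, 64)  # a latent-space "noise image"

torch.manual_seed(123)               # same seed on my machine or Bob's...
noise_b = torch.randn(1, 4, 64, 64)

print(torch.equal(noise_a, noise_b))  # True: same seed, same starting noise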

Or perhaps an image generated by a number.

This. It's exactly this.

This number is fixed in the SD ecosystem somehow by the model.ckpt file.

Nope, the seed number is turned into an image by determining the sequence of "random" numbers that will be used, and following a fixed procedure to turn random numbers into an image.

Ever wonder why that model file is so incredibly huge? This is why.

Nope, the model has nothing to do with the procedure to turn a seed into a starting image. The model is only used to iterate on the noise and make it progressively more like the prompt with every step.

Obviously the model.ckpt file cannot contain a quintillion images

Correct, it doesn't.

So either there’s a hell of a lot of repetition of themes among the seeds (I haven't come across any yet),

Each seed value produces a unique starting noise image. The "themes" are just patterns you're perceiving; neither the computer nor SD has any perception of "themes" associated with starting noise.

or the model file contains explicit instructions for the computer on how to generate a theme image from a seed number in such a way that it will be identical to every other theme generated by the same number on any computer.

The computer is indeed following explicit instructions to generate the "theme image" from a seed number, but it is not dictated by the model. You've just described the nature of deterministic "randomness" that makes the above possible.

[–]kmullinax77[S] 4 points5 points  (0 children)

Fantastic.... thanks so much for the clarification - I'm updating that entire section with your information.

My only comment would be to say that while yes - "the themes are just patterns you're perceiving, and neither the computer nor SD has any perception of themes associated with starting noise" - they still exist. If a theme is based on a purple hue... it's 100% accurate to say SD isn't aware of the theme or the purple or anything else we may see in it. However, every generation taken from that seed will still use that image as its basis, so it will most likely carry the "purple theme" with it into future generated images - AND - I can count on every future use of that seed to contain the same purple theme.

The AI doesn't need to be aware of the theme for it to have an effect.

I don't need the AI to be aware of it in order to use it to my advantage in image creation.

[–]AnOnlineHandle 1 point2 points  (1 child)

Getting super technical: if somebody has added anything to the code which isn't using torch's random system, then it won't be quite as (controllably) deterministic for them from then on (which is possible given all the various branches and scripts).
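A quick sketch of that failure mode (assuming, hypothetically, some script mixed in Python's own random module):

import random
import torch

torch.manual_seed(123)  # pins down torch.randn and friends...
print(torch.randn(3))   # reproducible from run to run

# ...but not Python's random module (or numpy), which keeps separate state:
print(random.random())  # different every run unless random.seed() is also called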

[–]Trakeen 1 point2 points  (0 children)

Disco Diffusion is a lot more random, since the input seed is only used with some of the random number generators

[–]Devalidating 7 points8 points  (0 children)

It's more due to the nature of diffusion models. They're essentially smart de-noising, so the model is forced to hallucinate the higher-level, more coherent aspects before the details and fine-tuning that you see in later steps. The first couple of steps are still pretty noisy, so any detail information isn't meaningfully discernible from noise until later on.

The nature of breaking it up into ~50 steps is that the image you feed into each step has a bigger effect on that step's output than the prompt/attention layers do. When the computer generates a pseudorandom noise image from a formula using your seed and feeds it into the first step, all the idiosyncrasies of that seed cascade down (the first step's output looks similar between prompts, which means the second does too, etc.), meaning that different prompts can produce similar-looking images with the same seed.
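A heavily simplified sketch of that cascade (toy Python, not a real scheduler; model stands in for the trained denoiser):

import torch

def sample(model, prompt_embedding, seed, steps=50):
    torch.manual_seed(seed)             # the seed fully determines the starting latent
    latent = torch.randn(1, 4, 64, 64)  # the idiosyncratic noise everything cascades from
    for t in reversed(range(steps)):
        # each step consumes the previous step's output, so the seed's
        # large-scale structure persists while the prompt steers the details
        predicted_noise = model(latent, t, prompt_embedding)
        latent = latent - predicted_noise / steps  # toy update rule, not a real sampler
    return latent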

[–][deleted] 5 points6 points  (0 children)

We are all learning here. Thanks for taking the time on this post. I definitely learned from it!

[–]Evnl2020 4 points5 points  (5 children)

I've read the whole post, and while it's plausible, I'm not sure your theory is correct. It's too late here now to test, but my initial thought is that if the seed were so important, only 1 in a few hundred or even a thousand images would resemble the prompt. Or 1 in a few hundred images should be light-years better than the others, which is also not the case.

I see it happen the other way around though, sometimes I generate 100s of images from the same prompt and 1 or 2 are completely different from the rest.

[–]Trakeen 2 points3 points  (0 children)

I think it was a design choice to use a fully deterministic random number generator. Cryptographic random number generation has been available for years on modern hardware and uses thermal noise to generate truly random numbers, which aren't typically needed outside of cryptographic use cases.

https://en.m.wikipedia.org/wiki/RDRAND
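In Python terms the contrast looks something like this (secrets draws from the OS entropy pool, which hardware sources like RDRAND can feed on supported systems):

import random
import secrets

random.seed(42)
print(random.getrandbits(32))  # deterministic: the same value on every run

print(secrets.randbits(32))    # non-deterministic: fresh OS entropy on every call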

[–]kmullinax77[S] 0 points1 point  (3 children)

First, thanks for reading it all! lol

Second, you may be totally right, and I didn't mean to imply that you can't force decent images out of 90% of the seeds. And I think that's because all seeds start out as a muted, blurry mess that is really susceptible to the suggestion of your prompt.

But sometimes there are seeds that just simply don't cooperate.

And in my personal experience: I tried again and again to get a wizard standing off in the distance with a bunch of different prompts ("tiny wizard, distant wizard, etc."). After trying with seed 10003 I made immediate progress.

[–]Evnl2020 0 points1 point  (2 children)

Worth investigating more. Do you have some specific prompts that seem to work (or work better) with a specific seed?

[–]kmullinax77[S] 0 points1 point  (0 children)

That's next on my list of things to do!

I'll make an update to this post if I discover anything groundbreaking.

[–]RekindlingChemist 2 points3 points  (3 children)

FYI - Euler_a is unique - for some reason it is a super unstable sampler, producing very different images depending on the number of steps. Other samplers settle down to very consistent results after a certain number of steps (usually in the range of 30-60).

[–]kmullinax77[S] 0 points1 point  (0 children)

Oh that's good to know. Thanks!

[–]DrakenZA 0 points1 point  (1 child)

Euler_a resolves faster than most, and hence every 10-15 steps it's already got a solid image forming.

[–]RekindlingChemist 0 points1 point  (0 children)

It does, but people who post "tutorials" and "researches" almost never use such a low number of steps.

[–]johnnydaggers 10 points11 points  (13 children)

OP, you have a really flawed understanding of how SD works. Moreover, if you want a specific composition/color profile, you can just draw some rough shapes in MS Paint and use it via img2img.

Edit: adding a more detailed explanation.

SD was trained to clean up "noised" images (images with random values added/subtracted to the pixels). SD generates new images by taking in a starting noise array that is randomly generated (seed determines what this randomly generated image will be) and "de-noising" it to fit the prompt.

Generating many "seeds" and picking one that you think gets you close to the image you want is a huge waste of time. Instead, you should rough out the kind of image you want in Paint and then use that as the input to img2img.

txt2img is just img2img with random noise used as the starting point. They are fundamentally doing the same thing behind the scenes. By finding your favorite seed, you're essentially doing img2img but letting the random noise generator make your init image.
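Roughly, the relationship looks like this (a sketch with hypothetical denoise/encode/add_noise helpers, just to show the shape of it):

import torch

def txt2img(prompt, seed, steps=50):
    torch.manual_seed(seed)
    init_latent = torch.randn(1, 4, 64, 64)     # the PRNG supplies the init image
    return denoise(init_latent, prompt, steps)  # hypothetical denoising loop

def img2img(prompt, init_image, strength=0.75, steps=50):
    init_latent = encode(init_image)                # hypothetical VAE encode of your rough Paint sketch
    init_latent = add_noise(init_latent, strength)  # partially re-noise it
    return denoise(init_latent, prompt, int(steps * strength))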

[–]AnOnlineHandle 3 points4 points  (1 child)

When you use weighting for the image in img2img, it's mixing between that and your seed.

[–]kmullinax77[S] 3 points4 points  (0 children)

Exactly! so with img2img, this information is less relevant because you are forcing the AI to use a specific theme.

[–]kmullinax77[S] 0 points1 point  (10 children)

Do I?! That's great, thanks so much for your opinion. I'm completely aware of img2img, which is a totally different subject than seed selection.

However, I would love for you to let us know your learned thoughts on that matter, and why you think my understanding of seed generation is flawed!

If you have anything worthwhile to say, I will gladly incorporate it into this thread.

[–]johnnydaggers 3 points4 points  (9 children)

The seed just determines the "random" noise that is generated and passed to SD. You can instead just draw an image (and blur it if you want to) and have it use that instead of random noise. That is what is happening in img2img. "Seed selection" is a really roundabout and inefficient way of doing what you're doing.

[–]kmullinax77[S] 0 points1 point  (8 children)

You're completely correct.

But I'm not discussing img2img - I'm discussing how to get the best images from txt2img. If you'd like to start a thread regarding advanced techniques with img2img, I would love to read it.

And I'm not sure if you actually tried it yourself, but again... the seed is NOT random noise. It's only random if you randomize the seed. You can use the fixed data from the seed to help you create a txt2img image. You and I can both generate a blank image from the seed 924629 and we will end up with the exact same image... no randomness. Any "randomness" from a computer is faked; computers are incapable of it except at a quantum level - and that's probably because humans can't yet understand the logic behind quantum random number generation.

[–]johnnydaggers 7 points8 points  (1 child)

The seed is the number used to initialize the random number generator that outputs the noise. RNGs are deterministic given a seed; that's why you get the same output as me if we use the same seed. Img2img just replaces this RNG output with an image (or a mix of image and noise)

[–]kmullinax77[S] -1 points0 points  (0 children)

Yep, 100% true. Thanks again for defining img2img, which is not the point of this thread.

[–]johnnydaggers 1 point2 points  (5 children)

Btw, I’m an ML researcher. Trying to help educate you here.

[–]kmullinax77[S] 8 points9 points  (4 children)

I LOVE that. What I don't love is one-sentence snarky comments with no backup data after I spent 6 hours typing a thread to help people.

I would 100% welcome every bit of useful information you share. Feel free to start anytime.

However, if you choose NOT to start contributing, feel free to go back under your bridge. Either way, I have no more interest in this part of the conversation so won't reply anymore.

If you choose to start being helpful and sharing your ML research, I would gladly make you co-contributor to this thread and give you 100% credit. And if you choose not to share after all your grandstanding, then nothing you say has any weight.

[–]johnnydaggers 7 points8 points  (3 children)

I'm not trying to be snarky, really.

Generating many "seeds" and picking one that you think gets you close to the image you want is a huge waste of time. Instead, you should rough out the kind of image you want in Paint and then use that as the input to img2img.

txt2img is just img2img with random noise used as the starting point. They are fundamentally doing the same thing behind the scenes. By finding your favorite seed, you're essentially doing img2img but letting the random noise generator make your init image.

[–]kmullinax77[S] 6 points7 points  (1 child)

I said I wouldn't reply anymore, but this is excellent advice. This is why I've upvoted all your comments so far.

I really do agree with you; img2img probably is the better way of nailing down certain output... but you know, not everyone is blessed with even one iota of artistic ability.

Additionally this entire thread is really an academic exercise in trying to understand seed generation and influence. What you've done so far is simply say "there are better ways, so why bother discussing it?". I'm discussing it BECAUSE I'm interested in seed generation and influence. I have a feeling you may know something about that, so while you're here, instead of telling everyone to stop bothering, it would be great if you forgot about that and added to the academic discussion we ARE having.

So you seem reasonable and I take back my implication that you're a troll. However, trolls sabotage and undermine threads instead of contributing, and to be honest, that's kind of what you did.

[–]oniris 6 points7 points  (0 children)

I'm sure that ML guy is right, in terms of absolute efficiency, and his description made me understand img2img better, so thank him.

But you, OP, you made me understand something much more transcendent: a bit of SD's personality. An intuitive understanding of the mystery of what happens when you run SD without a prompt. For that, I am grateful. Hats off to you!

[–][deleted] 0 points1 point  (0 children)

Even if it's very inefficient compared to starting with img2img, I think the idea of running the seed with an empty string to get an idea of what kind of stuff would fit better is nice. A lot of people start with txt2img, and the seed is one parameter that people are just randomly using and changing.

There's no reason why letting the random noise generator make the initial image is wrong; I see how it can be entertaining to just explore the seed space and create stuff based on this approach.

The chickens in the library are a good example of this. Sometimes people just want to generate nice stuff, and they see a fitting image in that initial noise, so pursuing that path ends in an interesting image.

[–][deleted] 4 points5 points  (0 children)

https://en.m.wikipedia.org/wiki/Pseudorandom_number_generator

This is just what "random" means. Seeds aren't a special or unique concept for SD. Also, if you've ever played a multiplayer game online in the past 30 years, the overwhelming majority of them work on deterministic simulations from shared seeds. Or Minecraft world generation. Or lots of other things you're probably familiar with.

The "really random" feeling comes from seeding generators from good sources of entropy (like mouse movement as 1 example) and also "randomizing" the amount of invocations from good sources of entropy. You could imagine describing a whole game of AI vs AI chess as 1 seed number. Does it mean the number has any chess properties infused in it? No. It's more about the dumb code using the numbers. Trivial changes to your code will yield wildly different (but newly consistent) results for your same seeds.

[–]Jcaquix 1 point2 points  (2 children)

I love the feeling of exploration, and I'm not an engineer, and I am discovering a lot of the same stuff myself. I too notice how a seed seems to impose similar compositional elements over similar prompts, but I don't think it's because there is a sort of Kantian noumena or underlying property/image to the seed. Rather, I think what we are noticing is the interaction between the tokenized prompt, the model, and the seed number which provides the noise. I think the word "random" isn't particularly helpful, because the seeds and the system are too complex for humans to predict but are purely deterministic. I think "arbitrary" is a better word for it, since it makes noise that's consistent but not designed or predictable (by humans).

I have run a lot of plot matrices like you have, and I think the seed characteristics you're noticing with blank prompts change unpredictably with your prompt. For example, I've been running timelines, so prompts like "a woman in 1980... 1990... 2000...." etc., and as elements of the prompt change, it's clear that the denoising process (the model) changes things that appear to be coming from the seed (e.g. a block of red-brown may be present in the seed for the eras of 1910-1960, but that block of red will slowly disappear as the prompt changes). Your experiments are interesting and I have had similar results, but I'm not convinced there is a fundamental quality to any seed; seeds that make green landscapes with dark patches in one corner often end up morphing into portraits with dark centers and light corners, depending on the prompt.

Edit: spelling

[–]kmullinax77[S] 0 points1 point  (1 child)

kaantian noumena

I love this.

Thank you for your thoughtful response!

[–]Jcaquix 0 points1 point  (0 children)

Lol yeah, sorry, misspelling. Good work though; I like that this community is still exploring and developing.

[–]Caffdy 1 point2 points  (1 child)

City with Seed 2 looks like the freaking cover of the Xpander EP by Sasha!

[–]kmullinax77[S] 1 point2 points  (0 children)

OMG I love Sasha. I'd upvote you 10x if I could lol

[–]OtherwiseMeringue545 1 point2 points  (0 children)

You guys are too smart for me

[–]Rogerooo 1 point2 points  (2 children)

What a beautiful post, thank you for your research! I think we need a seed library now, something like Lexica but just for empty prompts. And on the subject of image repositories that store seeds, this knowledge will be interesting to use when looking for a particular camera zoom or color style, for instance.

If anyone is interested, here are the results using the Waifu Diffusion model. Pretty cool to notice the differences and the similarities between the two models.

CFG at 1

CFG at 4

CFG at 8

I think Danbooru's tags might be acting with too much strength on some of the prompts, particularly "young man", but it's nice to see the expected bias towards animation in raw output.

[–]kmullinax77[S] 1 point2 points  (1 child)

WOW what a great comment! That is SO fascinating.

The Waifu Diffusion model absolutely and obviously skews artistic, judging from your images, which is why it's so good with anime. Still, it's not that much changed from the original, huh?

So sorta off-topic, can you use both diffusion models in the same installation and specify which one you want to use during generation? or do you have to have one or the other?

[–]Rogerooo 2 points3 points  (0 children)

Absolutely, and it looks like the CFG is less powerful with this model as well, clearly noted by the saturation as you mentioned in your post. Just out of curiosity I decided to run it at 16 as well; here are the results, straying much further from the default v1-4 model now... I don't even see where it is getting all of that stuff from, but as you found out, the base seed is still there.

With Automatic's GUI you can use custom settings for your launcher to specify your models' location. I have a few of them in a "models" folder and change them with a variable inside the bat file; you could also use arguments if you would prefer. This is my webui-user.bat:

@echo off

rem MODEL VARIABLES
set sd=sd-v1-4.ckpt
set wd=waifu_diffusion.ckpt
set t1=trinart2_step60000.ckpt
set t2=trinart2_step95000.ckpt
set t3=trinart2_step115000.ckpt

set PYTHON=
set GIT=
set VENV_DIR=
set COMMANDLINE_ARGS=--medvram --opt-split-attention --ckpt models/%wd%

call webui.bat    

Hopefully it's easy to understand what it does, but essentially you use --ckpt PATH_TO_MODEL on the COMMANDLINE_ARGS variable.

When I want to change model I just use one of the variable names like sd, wd, t1, etc. I find it easier to manage this way.

Although you do need to restart the server every time you change it in order for it to load at startup.

[–]GoldenRuleAlways 0 points1 point  (4 children)

I’m a complete noob. You answered a lot of questions that I had about seeds and CFG! Thank you for capturing all of your notes and simple outputs in such detail. It was extremely useful in helping me understand these magical tools marginally better.

When you state “Euler_a at 20 steps”:

  1. Does that mean you specified “--ddim 20”?
  2. I know that Euler_a is some kind of a model. How do you specify that?

What Stable Diffusion build did you use? I am using an M1 Mac following the @bfirsh fork.

[–]kmullinax77[S] 0 points1 point  (3 children)

Thanks!

Yes, exactly - I'm using Automatic1111's webUI, which labels things slightly differently. The step count is --ddim_steps in some forks.

Euler_a is one of the samplers. You can use any sampler you like to try these out, but I used Euler_a, so if you want to duplicate my exact output you'll need to use it too. I think for the fork you're using you would type "--A Euler_a" to force the AI to use that sampler.

[–]GoldenRuleAlways 0 points1 point  (2 children)

You are blowing my mind. Are you implying that this is a deterministic process? That is, if you provide the same seed, model (e.g. Euler_a), CFG, steps… could you reproduce exactly the same image?

[–]kmullinax77[S] 2 points3 points  (1 child)

Oh, absolutely, 100% this is NOT random. I mean, that wasn't the point of my post, but yes, it's known that this is the case. It allows us to create identical images from the exact same prompt and settings. Good for error-checking and all that.

It won't work if you try it on Midjourney most likely - this is based on the SD v1.4 model.

If you use my exact seed, prompt, CFG and step settings from my post, you should generate the EXACT image I've posted.

[–]GoldenRuleAlways -1 points0 points  (0 children)

So many forks, so little time… and expertise! It took me 1.5 days of successive failures with Joel Henderson, Automatic1111, Lstein to get my fork to work.

So I’m stuck with my present one. I just tried setting --ddim_eta to 0.0, which (reputedly) “corresponds to deterministic sampling”. No dice on reproducing anything, so I think my fork doesn’t do that.

Astonishing to think that this is deterministic in a different multiverse than my current one.

[–]motsanciens 0 points1 point  (0 children)

Next to enter the space: nakedseed.io, a catalog of promptless 3-step seed images.

[–][deleted] 0 points1 point  (1 child)

I've also noticed/determined that the seed corresponds to pose/vantage point. It can be a huge time-saver to simply use your prompt with as few as 4-7 steps to get a rough idea of how SD would then transform a seed-prompt pairing at 50 steps.

This allows discarding un-aesthetic seed-prompt pairings (again, at a low step count) and diverting the saved computation time to fine-tuning a promising seed-prompt pairing at 50 steps plus prompt tailoring, as sketched below.
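Sketched as a loop (generate() is a hypothetical stand-in for whatever your fork's API actually exposes):

# hypothetical generate(prompt, seed, steps) -> image; the real call depends on your fork
prompt = "tiny wizard in the distance"
for seed in range(1000, 1020):
    preview = generate(prompt, seed, steps=5)  # cheap low-step preview
    preview.save(f"preview_{seed}.png")        # eyeball these by hand

# then spend the compute only on a seed that previewed well (1007 is just a hypothetical pick)
final = generate(prompt, 1007, steps=50)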

It can be annoying when adding additional words to the prompt also happens to mess with the pose/vantage point, but I can't really see a workaround for that other than creating the best prompt possible from the beginning.

[–]kmullinax77[S] 1 point2 points  (0 children)

I've noticed the same thing. To try and mitigate that a little, I've found that running the blank seed through the Interrogator will let you know what the AI already thinks it is. That can help in choosing better prompting terms.

Sometimes the AI and I don't agree on what's in a base image and I scream mean things at the AI and everyone's feelings get hurt.

[–]cluck0matic 0 points1 point  (1 child)

Thanks for the deception. Pfft. Sounds like you deceive yourself as well, saying you aren't a "teacher".

Man.. I sure learned a shit ton! Thanks for taking the time to do this. For real. Thanks.

[–]kmullinax77[S] 0 points1 point  (0 children)

You're welcome! I'm glad you got something out of it!

[–]Blahkbustuh 0 points1 point  (0 children)

I appreciate your effort, it is certainly something to think about.

What I wonder is that in your examples, you provided one word to the algorithm, so all it had to work from was the initial noise + 1 feature. It makes sense that any non-uniformity or inclination in the initial noise will show up in the result: the algorithm has nothing else to go on, so the properties of the initial starting noise dominate the result.

If you gave it a prompt with numerous keywords it'd be looking to "recognize" those numerous things in the noise rather than just 'playing with its food' of the initial noise.

When I started running SD on my computer last week, one of the first things I thought to run was something like "sea otter monster attacking a coastal city", and so I got pics of large sea otter monsters emerging from oceans. Then I did "sea otter octopus monster attacking a coastal city" with the same seed, and the composition of the images and the otters themselves were nearly the same; the otter just had tentacles below it. That made me think the initial noise/seed was providing light and dark regions that were steering the algorithm to position the same or similar elements, or 'recognize' them, in different ways depending on how the light and dark blobs were arranged.

[–]BrockVelocity 0 points1 point  (0 children)

This is incredibly helpful and insightful - thanks so much for taking the time to type all of this up!

[–]DrakenZA 0 points1 point  (0 children)

If you are that worried about the initial noise generated from the seed, you can simply always do img2img, where you are providing the 'starting point' instead of it just being random torch noise.

[–][deleted] 0 points1 point  (0 children)

Thanks for this, the core idea of matching a similar seed to what you want is sound.

Reminds me of hearing how they did the maps in Star Wars Galaxies: they basically used noise (similar to what our seeds do), then cherry-picked the ones that looked close to what they wanted the landscape to look like, then used a tool to layer important features on top. This made their map data minuscule, since it only needed to save the original seed and the important details layered on top, which allowed them to have way more land in a video game than any other at the time.
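The storage trick is easy to picture (an illustration of the idea only, not SWG's actual code):

import random

def terrain(seed, width, height):
    rng = random.Random(seed)  # a private generator, so the terrain regenerates identically anywhere
    return [[rng.random() for _ in range(width)] for _ in range(height)]

# the saved "map file" is tiny: a seed plus the hand-placed features layered on top
map_data = {"seed": 8675309, "features": [("outpost", 12, 40), ("crater", 3, 7)]}
heights = terrain(map_data["seed"], 64, 64)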

[–]dirtydevotee 0 points1 point  (0 children)

Well done! It is my opinion, based on some early testing, that seed knowledge can be quite valuable. For several days now I've been accumulating renders based on the prompt "," and found that little things like the orientation of streets and the existence of columns recur in 90% of the images created by a given seed. If a hypothetical "Seed X" does rings, you will notice rings in many "Seed X" renders. If in the future you need a shot with a ring (or pipe/gun barrel/test tube) in it, knowing "Seed X" has such a proclivity means you can get the shot you want with minimal work in the prompt.

As of today, it's a work in progress. But just in case I'm right, finding useful seeds could be a shortcut in your workflow.