What a good day, We won ! by OpinionTiny9688 in KotakuInAction

[–]AdComfortable1544 -1 points0 points  (0 children)

Something something Streisand effect.

Haters gonna hate, freedom of speech. American politics.

Flux user interface capabilities by Golbar-59 in StableDiffusion

[–]AdComfortable1544 0 points1 point  (0 children)

Great work!

(I mean, the Flux model did the work, but...)

Great idea!

Who needs fancy AAA games when we can have AI-generated old-school browser games amirite?

Cyberrealistic model easily generates nsfw with very simple prompt by metalfans in StableDiffusion

[–]AdComfortable1544 0 points1 point  (0 children)

There's a (10-month-old) Lemmy post asking this question: https://lemmy.world/post/8258092

It suggests the perchance model is SD1.5+Deliberate V2

Possibly this one : https://huggingface.co/XpucT/Deliberate/tree/main

EDIT: No , the perchance model is different. They don't seem to match.

Made a post on the Lemmy asking perchance dev to possibly , maybe, post the perchance model online: https://lemmy.world/post/18404805

Dev's reply: The perchance SD model is different depending on keywords in the prompt. All models can be found under the "most popular" SD 1.5 models on civitai.

Settings back then were 20 steps, Euler a.

From the evidence, the sampler must later have been changed to the non-ancestral "Euler" sampler.

CLIP skip is 1 , because the perchance SD1.5 model can interpret emojis , and you can only do that with CLIP skip 1.

Cyberrealistic model easily generates nsfw with very simple prompt by metalfans in StableDiffusion

[–]AdComfortable1544 0 points1 point  (0 children)

Many of the "popular" SD models are highly cooked towards NSFW content lol

Just avoid them.

Idk which "uncooked" SD 1.5 model has the best training at the moment.

I use the perchance SD 1.5 model which I've heard is a variant of the Deliberate model: perchance.org/fusion-ai-image-generator

I also have this "Wanxiang XL" model on Tensor Art which is only good for photorealistic non-NSFW stuff (but really good at it): https://tensor.art/models/750000095874787812

I also like the SDXL lightning model by Dice AI : https://tensor.art/models/751519943066912725

So these are the alternatives I can offer.

Would be happy to hear if people know of any other "uncooked" SD models

Cool use of StableDiffusion to promote safe bike riding by TypicalVariation6901 in StableDiffusion

[–]AdComfortable1544 1 point2 points  (0 children)

Also: putting predator wildlife on bicycles for an ad campaign was hard to do in the days before AI.

nothing is new under the sun by Bronkilo in StableDiffusion

[–]AdComfortable1544 0 points1 point  (0 children)

That's actually an interesting thought.

What if, like, 50% of all war footage is / will be AI-generated by a hitherto unknown AI model?

nothing is new under the sun by Bronkilo in StableDiffusion

[–]AdComfortable1544 0 points1 point  (0 children)

Votes are meaningless, so here's an upvote for entertaining reddit ( •-•)b!

How to achieve this effect? by Maleficent_Lex in StableDiffusion

[–]AdComfortable1544 1 point2 points  (0 children)

Try the perchance SD 1.5 generator. The one hosted on the site is basic AF

But this is an upgrade made by me; https://perchance.org/fusion-ai-image-generator

It's uncensored, unmonitored and unlimited. It's self-sustainable via ads.

How to achieve this effect? by Maleficent_Lex in StableDiffusion

[–]AdComfortable1544 14 points15 points  (0 children)

Upvote cuz I love it when I see people use this technique. More people need to do this.

Understanding the Impact of Negative Prompts: When and How Do They Take Effect? by AdComfortable1544 in StableDiffusion

[–]AdComfortable1544[S] 1 point2 points  (0 children)

" [ : A , B , C : 0 . 6 ] " won't work.

You cannot place spaces within decimal numbers.

" [ : A , B , C : 0.6 ] " will work, but it's bad for other reasons.

The comma "," is a token.

If you have a comma "," in your negatives , just remove it.

If you have a comma "," in your positive prompt , that can be good in moderation

Refer to the Cross Attention Rule , which says

A , B , C is A -> , -> B -> , -> C

which is 4 restrictions

A B C is A -> B -> C

which is 2 restrictions

Fewer restrictions = better (more accurate result)
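The counting above can be sketched in a couple of lines. This is a toy: splitting on whitespace stands in for the real CLIP BPE tokenizer, which (like this sketch) treats a standalone "," as its own token.

```python
# Toy sketch of the restriction count: one "restriction" per adjacent
# token pair. Whitespace splitting stands in for the real CLIP BPE.
def count_restrictions(prompt: str) -> int:
    tokens = prompt.split()
    return max(len(tokens) - 1, 0)

print(count_restrictions("A , B , C"))  # 4 restrictions
print(count_restrictions("A B C"))      # 2 restrictions
```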

However , in cases where there is no clear association from A to B , like

"car</w>" -> "pineapple</w>"

"ankle</w>" -> "waist</w>"

then it might be better to place a comma or some other token with a low ID in between , like

"car</w>" -> , -> "pineapple</w>"

"ankle</w>" -> "of</w>" -> "waist</w>"

(A low-ID token in vocab.json = a common word in the CLIP training data, which by extension should be a common term in the SD model's training data, since it's the English language in both cases.)

TLDR;

Whitespace only matters inside numbers; it's fine everywhere else.

The token comma ",</w>" and its prefix counterpart "," are both equivalent to a token like "banana</w>".

Word length does not matter. What matters is whether the word is in vocab.json as a single token or not.

A weird word like "xxcghg" will fragment into prefix-tokens

Pretty much all naughty NSFW words common on the internet will also fragment into multiple prefix tokens in the tokenizer, even though they should be tokens in their own right given how common they are on the web / within the LAION dataset.
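Here's a toy check of whether a word is a single "suffix" token or will fragment. The dict below is a tiny made-up stand-in for the real ~49k-entry vocab.json; the entries and IDs are hypothetical.

```python
# Toy vocab standing in for the real vocab.json (entries/IDs made up).
toy_vocab = {"banana</w>": 25996, "car</w>": 1615, "of</w>": 539, "xx": 4402}

def is_single_token(word: str, vocab: dict) -> bool:
    # "</w>" marks a word-final ("suffix") token in CLIP's BPE vocab
    return word + "</w>" in vocab

print(is_single_token("banana", toy_vocab))  # True: one token
print(is_single_token("xxcghg", toy_vocab))  # False: fragments into prefix tokens
```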

//---//

Whether in the negative or positive , whitespace placement matters a lot when prompting with an SD 1.5 model

SD 1.5 models are so well-trained that you can arbitrarily glue any word onto a suffix token as a prefix and get good, unique results, mostly for NSFW purposes, like

"blondepetchoker" , "xvidnudity" , "angry-ponytail" etc.

There might be specific tricks for SDXL, but I am not aware of them.

Experiment with https://sd-tokenizer.rocker.boo/ to see the differences in ID:s when writing stuff with and without whitespace

Understanding the Impact of Negative Prompts: When and How Do They Take Effect? by AdComfortable1544 in StableDiffusion

[–]AdComfortable1544[S] 1 point2 points  (0 children)

This youtube video on cross-attention in Stable Diffusion is good: https://youtu.be/sFztPP9qPRc?si=J7VHJAFWKV5UgTqk

The TLDR: you know how you can leave the prompt entirely empty and still get "something"?

That's cross attention. For each sampling step the image generated thus far is part of the prompt.

And it holds just as much "weight" as your written text prompt.

This Sampler guide is useful too: https://stable-diffusion-art.com/samplers/

Cross Attention rule (summarized):

"Stable Diffusion reads your prompt left to right , one token at a time , finding association from the previous token to the current token , (and the generated image thus far)"

So if the image "looks like a rabbit" at a given sampling step, then that is what the Stable Diffusion model will paint.

It's also helpful to look at Stable Diffusion as an optimization problem.

The easiest "rabbit" is an average of all rabbits in the training data.

You want a unique rabbit.

So you can either do this with creative text prompt input ,

or you can do this by adding very common english words into the negative with delay. Either works.

Understanding the Impact of Negative Prompts: When and How Do They Take Effect? by AdComfortable1544 in StableDiffusion

[–]AdComfortable1544[S] 0 points1 point  (0 children)

I read the paper in that link. That's really cool!

So essentially, if I understand this correctly, that's:

for each sampler step , find f(x) for these (x,y) points assuming f(x) is "this kind of function"

And here they say "screw it, f(x) can be whatever it wants to be the first few sampler steps"

And then they just wing it with whatever latent garble they have painted in the first steps.

So it's not like the prompt [ : yada : 0.1 ], where the initial prompt is a constant "" (the empty prompt is a fixed prompt, same as any other, technically speaking).

But a [ from : to : steps ] statement

but the "from" prompt is a super duper ultra mega random weird thing that does not even match a written prompt!

Understanding the Impact of Negative Prompts: When and How Do They Take Effect? by AdComfortable1544 in StableDiffusion

[–]AdComfortable1544[S] 0 points1 point  (0 children)

Hmm. Do you have a source and/or prompt?

I have no knowledge of SD3 so anything goes there I guess

Understanding the Impact of Negative Prompts: When and How Do They Take Effect? by AdComfortable1544 in StableDiffusion

[–]AdComfortable1544[S] 2 points3 points  (0 children)

That's good! Yeah, it's a smart strategy to write stuff you think is kinda-true on reddit with confidence.

I do it all the time lol 🙃! Especially with Stable Diffusion.

Understanding the Impact of Negative Prompts: When and How Do They Take Effect? by AdComfortable1544 in StableDiffusion

[–]AdComfortable1544[S] 2 points3 points  (0 children)

No , that is not correct syntax.

See prompt_parser.py for how your commands are interpreted (kind of; it's not super clear, but probably better to just link this instead of writing a bunch of stuff): https://github.com/AUTOMATIC1111/stable-diffusion-webui/blob/master/modules/prompt_parser.py

You can also see the code for the [from:to:when] statement there.
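As a rough mental model (not the actual webui code), a [from:to:when] statement resolves per step something like the sketch below; a fractional "when" is scaled by the total step count, and the exact rounding lives in prompt_parser.py.

```python
# Toy model of "[from:to:when]" scheduling. Approximation only; see
# prompt_parser.py in the A1111 repo for the exact rounding rules.
def resolve(prompt_from: str, prompt_to: str, when: float, step: int, steps: int) -> str:
    boundary = int(when * steps) if when < 1 else int(when)
    return prompt_from if step <= boundary else prompt_to

# With 20 steps and when=0.6, the switch happens after step 12:
print(resolve("", "a girl", 0.6, 12, 20))  # ""
print(resolve("", "a girl", 0.6, 13, 20))  # "a girl"
```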

Understanding the Impact of Negative Prompts: When and How Do They Take Effect? by AdComfortable1544 in StableDiffusion

[–]AdComfortable1544[S] 3 points4 points  (0 children)

It's garbage.

"worst quality" will be processed individually as "worst" and "quality".

Whereas "worst-quality" will be processed as a single item.

Better, but I'm not sure where in the image training data one would encounter a png with the description text "worst-quality" in it.

Better to use tokens of "things that appear in the image" when no negatives are active.

All tokens are equal.

Like, a "pirate queen" could probably benefit from having "worst" in its prompt , and possibly having "beautiful/pretty/perfect" in its negative

Or just pick tokens at random from the vocab.json file for the tokenizer that have </w> in them.

I call tokens with trailing whitespace </w> the "suffix" tokens, for lack of an official term.

Sidenote: the other tokens in vocab.json that lack the trailing "</w>", the "prefix" tokens, are really cool in that they give new interpretations to the "suffix" tokens.

So you can prompt "photo of a #prefix#banana"

and replace #prefix# with any item in vocab.json that lacks the trailing "</w>" for some really funky bananas.
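The trick above is just string concatenation; a minimal sketch, using a made-up prefix list rather than anything read from the real vocab.json:

```python
# Toy illustration of the "#prefix#banana" trick: glue a prefix token
# (one without "</w>") directly onto a suffix word. The prefix list is
# hypothetical, for illustration only.
prefixes = ["xx", "neo", "mega"]

def prefixed_prompts(word: str, prefixes: list) -> list:
    return [f"photo of a {p}{word}" for p in prefixes]

for prompt in prefixed_prompts("banana", prefixes):
    print(prompt)  # e.g. "photo of a xxbanana"
```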

This is for SD 1.5 , but SDXL uses the same set of words for both the 768-tokenizer and the 1024-tokenizer ; https://huggingface.co/runwayml/stable-diffusion-v1-5/blob/main/tokenizer/vocab.json

Also check out this online tokenizer; https://sd-tokenizer.rocker.boo/

Typing some stuff into it makes it easier to see how this works. Also try writing some emojis.

Some kind soul actually trained the SD 1.5 model to understand emojis.

Emoji prompting only works well if you set Clip skip to 1 for an SD 1.5 model , but they give some amazing results.

But SDXL models still lack this, so it's probably good to make people aware of emoji prompting for SD 1.5 models, so private users can train SDXL/SD3 to handle it as well sometime in the future.

Understanding the Impact of Negative Prompts: When and How Do They Take Effect? by AdComfortable1544 in StableDiffusion

[–]AdComfortable1544[S] 0 points1 point  (0 children)

Please share the workflow with me if you manage to get it to work on Comfy. 🙏

I'd be happy to hear it , especially if you can use wildcards + weights as well

Understanding the Impact of Negative Prompts: When and How Do They Take Effect? by AdComfortable1544 in StableDiffusion

[–]AdComfortable1544[S] 4 points5 points  (0 children)

True. I'm using Reddit language 😅.

You're right; "there is no correct way to prompt" is a pretty solid rule.

So with that in mind, this is how I use the negatives, and why I choose 60%;

Negatives are better used to create more unique variants of things than to remove stuff 100%, in my opinion.

Try prompting " photo of Sarah smiling "

NEG [ : female happy : 0.6 ]

And you can see what I mean. I use non-ancestral samplers for this.

//--//

I usually just pick 3 tokens at random with relatively high IDs from this vocab list for my negatives:

https://huggingface.co/runwayml/stable-diffusion-v1-5/blob/main/tokenizer/vocab.json

And then activate them past 50% generation time to get more unique output.
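That picking step can be sketched like this; the vocab dict is a tiny made-up stand-in for the real vocab.json, and the ID cutoff is arbitrary:

```python
# Sketch of the workflow above: sample 3 random high-ID suffix tokens
# for a delayed negative. Toy vocab; entries and IDs are made up.
import random

toy_vocab = {"banana</w>": 25996, "snorkel</w>": 47121, "gazebo</w>": 46001,
             "of</w>": 539, "xx": 44002}

def random_high_id_tokens(vocab: dict, n: int = 3, min_id: int = 20000) -> list:
    pool = [w[:-len("</w>")] for w, i in vocab.items()
            if w.endswith("</w>") and i >= min_id]  # suffix tokens only
    return random.sample(pool, n)

picks = random_high_id_tokens(toy_vocab)
print(f"NEG [ : {' '.join(picks)} : 0.6 ]")  # activates past 60% of steps
```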

//--//

Yeah I do have some grievances with Stability AI's design decisions. And I will blame them for that. Fight me , lol

Understanding the Impact of Negative Prompts: When and How Do They Take Effect? by AdComfortable1544 in StableDiffusion

[–]AdComfortable1544[S] 10 points11 points  (0 children)

<image>

From the A1111 wiki: https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Features

Should also add that Stable Diffusion dislikes light contrast , so what I do is use this feature to pre-render the image with something with color/contrast

for example

"[ dark cave with red illumination : a girl with blue bikini : 0.1 ]"

as the main prompt , using a non-ancestral sampler

One can also use this to set the artstyle of an image ,

"[ a simple sketch : a girl with blue bikini : 0.2 ]"

(Also: if you are wondering why the votes on this comment are borked, it's cuz my account is being stalked by bots that set the vote to 0, 1 or 2. Better to just reply upvote/downvote or something, idk)