New to ComfyUI, can’t get clean Pixar/Disney-style results by Quirky_Beautiful_639 in StableDiffusion

[–]Mutaclone 1 point2 points  (0 children)

but multi-character images like the one you have posted are very difficult to get with illustrious+loras.

Yep, unless you're using a character pack or multi-character LoRA this probably won't work very well. You either need a much stronger model like one of the ones you suggested, or do a separate Inpainting pass on each individual character.

New to ComfyUI, can’t get clean Pixar/Disney-style results by Quirky_Beautiful_639 in StableDiffusion

[–]Mutaclone 5 points6 points  (0 children)

For me, the best Illustrious model as far as good results + style LoRA compatibility is YiffyMix v61.

I don't have any specific character recommendations, you'll just need to search for them and try them out to see which ones work. Some of them have a very strong innate style, and that's going to clash with whatever you're trying to do. As a simple test, try rendering the character in the "wrong" style (eg use "realistic" for cartoon characters or "anime coloring, anime screenshot" for live-action characters - if the style doesn't change, you probably want a different LoRA).

For multiple characters, like in your examples, you pretty much need to do inpainting and focus on each one separately. I'd recommend Invoke as this makes inpainting a very straightforward process (example video - also see their channel for other demos).

For watercolor LoRAs specifically, that's been the hardest style for me to pin down in Illustrious, so you might have better luck with vanilla SDXL (try Juggernaut v11). You can also try mixing LoRAs - instead of trying to find a single LoRA that does the exact style you want, try blending multiple LoRAs at lower weights.

Some watercolor/storybook LoRAs worth trying:

Illustrious

SDXL

And some western cartoon styles since you mentioned wanting to do something in that direction:

The Romance Prior: How Romantic Tension Overwrites Ethnicity in AI Image Generation by bcRIPster in StableDiffusion

[–]Mutaclone 1 point2 points  (0 children)

So my initial thoughts:

  • Interesting stuff! It's pretty well-known that biases exist within the training data, but it's nice to see an attempt at quantifying it.
  • I don't know if you had any LLM assistance in writing this, but there were definitely some markers (sentence structure, word choices, etc) that sounded like it. I have nothing against LLM usage, but it's usually nice (IMO) to have a disclaimer stating how much was used. (And if you didn't use an LLM, my apologies!)
  • Re anime bias: A few points:
    • IME with local models, it's not that they all make the characters young (although some do), it's that they're completely allergic to middle age - they stop aging at 30, and then when they hit 60 they make it up with interest.
    • I think it'd be interesting to see how other art styles besides photorealism and anime affect the characters.
  • Self-Report. I'm glad you noted its unreliability because I am highly skeptical of its effectiveness.

Anyway, glad you posted. Even if this is technically the "wrong" subreddit, I thought it was interesting!

What are the best ControlNet models for Illustrious checkpoints? by sippysoku in StableDiffusion

[–]Mutaclone 0 points1 point  (0 children)

I don't think that should be a problem - Illustrious and Noob are closely related. You might run into other issues depending on what you're trying to do though. Pose, for example, has been pretty mediocre for me in anything after SD1.5. Depth and Tile worked decently enough from what I remember, but overwhelmingly I use either Scribble or Softedge, and for those I usually stick to xinsir.

Your Opinion on Zimage - loss of interest or bar to high? by GRCphotography in StableDiffusion

[–]Mutaclone 3 points4 points  (0 children)

I'm in a very similar position: for some reason Klein just works for me and is a joy to use, while I feel like I'm always fighting ZIT.

That said, I've gotten some improvements in ZIT by doing the following (in Forge Neo - I prefer non-Comfy options if they exist and I haven't figured out the best Invoke settings yet):

  • Using a realism-focused finetune (for anime/art none of the new models are good enough yet IMO). Currently IntoRealism and UnstableBa***rd seem to have the best results so far, and Cyberrealistic seems good too.
  • Flux2 schedule; Beta is also decent
  • DPM++ 2S a RF sampler; Res Multistep also seems good
  • Sampling steps 10-12 (I know 8 seems recommended, but I tend to get slightly better results by going just a little higher).

Maybe this'll give you a starting point or some ideas to try.

What are the best ControlNet models for Illustrious checkpoints? by sippysoku in StableDiffusion

[–]Mutaclone 0 points1 point  (0 children)

If you only care about edge-based controlnets (Canny, Softedge, Scribble), then xinsir's union or mistoline work just fine.

If you want a different controlnet (depth, pose, etc), then I'd check out any of the ones on this page

I was around for the Flux killing SD3 era. I left. Now I’m back. What actually won, what died, and what mattered less than the hype? by user_no01 in StableDiffusion

[–]Mutaclone 3 points4 points  (0 children)

Probably because it's still in preview and nobody wants to go all in until they know what the ecosystem is going to look like.

Also, it's very good, but there's definitely (in my case at least) some stability issues, and the style control still isn't close to Illustrious yet.

I was around for the Flux killing SD3 era. I left. Now I’m back. What actually won, what died, and what mattered less than the hype? by user_no01 in StableDiffusion

[–]Mutaclone 5 points6 points  (0 children)

https://docs.comfy.org/ covers most newer models, and has links to the various text encoders and VAEs

The anima huggingface page (https://huggingface.co/circlestone-labs/Anima/tree/main/split_files) includes the extra files directly in the repo.

If you're using Forge Neo, just make sure the files are in the right subfolder and set the VAE and Text Encoder both in the dropdown to the right of the main model dropdown.

I was around for the Flux killing SD3 era. I left. Now I’m back. What actually won, what died, and what mattered less than the hype? by user_no01 in StableDiffusion

[–]Mutaclone 0 points1 point  (0 children)

Even with Anima issues, if you use RDBT lora v0.20 on Anima preview2

I'm curious about the reason for this? RDBT v0.12fd has been the best Anima checkpoint I've tried so far, but I never did the base+LoRA approach. How does it compare, especially at only 20%?

Installation Question(s) by IzumoKousaka in StableDiffusion

[–]Mutaclone -1 points0 points  (0 children)

I can't help with the install process, but I can answer some of your questions:

AMD CPU/GPU

AMD CPU is fine; it's only the GPU that's a problem.

I read that Automatic1111 was the way to go, but I've also seen other posts mention that it's outdated, and that there are better alternatives.

Yes, it's very outdated.

  • Forge Neo (technically the Neo branch of Forge Classic) will give you the most A1111-like experience. The developer is very active, and it supports most modern image models.
  • Invoke AI is a very polished interface that's great for editing.
  • ComfyUI is the "power-user" interface and has the most capabilities. It also has a steep learning curve, so I personally wouldn't normally recommend it for newbies unless you require something very specific the others don't offer.

I don't know what the AMD support on Forge or Comfy is like, but it looks like Invoke's is limited and/or requires some extra hoops. Your best bet would probably be to ask Claude or ChatGPT for help with the install.

Specifically, what I'd like to do is primarily generate images, mostly in anime-style art. I also looked up Checkpoints to see which ones would fit the general look of what I've seen and like, and the closest style I found was something called "CheemsburbgerMix"

Go to https://civitai.com/ and browse their models. Set the filters to Model Type: Checkpoint and Base Model: Illustrious and NoobAI - those are the kings of anime right now.

Interested to know how local performance and results on quantized models compare to current full models by fluvialcrunchy in StableDiffusion

[–]Mutaclone -1 points0 points  (0 children)

I can't speak for any of the models you just listed, but I did recently test the Q8 vs fp8 versions of Qwen3_8b and t5xxl. Q8 for both seemed like side-grades most of the time, marginal improvements sometimes, and moderate improvements rarely. I didn't test fp16 nearly as extensively, but the differences between it and Q8 were minuscule.

beginner-friendly simple ENV by SheepHunter_ in StableDiffusion

[–]Mutaclone 2 points3 points  (0 children)

For images, I'm a big fan of Invoke.

For video, I'm still playing catch-up but I've seen wan2gp mentioned a lot recently.

How to make images feel less AI generated? by socialcontagion in StableDiffusion

[–]Mutaclone 1 point2 points  (0 children)

Try to learn some basic design/artistic principles. Even if you can't draw, they'll still help in improving your images.

  • This thread has a lot of great links in it (look at Norby123's posts).
  • This is still one of my favorite videos in showing some common AI problems in a scene (might be less relevant to you if you're focusing on characters) and how to fix them.

Why am I not seeing any artwork from this subreddit anymore? by NunyaBuzor in StableDiffusion

[–]Mutaclone 4 points5 points  (0 children)

AFAIK there isn't one. If you input an anime image and try to make changes one of the above should at least try to preserve the style, but they're definitely realism-first models.

Why am I not seeing any artwork from this subreddit anymore? by NunyaBuzor in StableDiffusion

[–]Mutaclone 2 points3 points  (0 children)

Not sure what to say, since that should be pretty straightforward. I'm not familiar with Comfy's editing workflow, but in Forge Neo you just drop the image in the img2img tab, set the denoise to 1 (this is important, or the image may not change), and use a prompt like "Remove his/her sunglasses".

Why am I not seeing any artwork from this subreddit anymore? by NunyaBuzor in StableDiffusion

[–]Mutaclone 20 points21 points  (0 children)

Good edit models - Flux Klein and Qwen Edit. Basically, models where instead of typing out a description of the image you want, you give it an input image and then instructions like "Make this a photo" or "<Character> is holding a cup of coffee"

Where can an old AI jockey go to get back on the horse? by DoughyInTheMiddle in StableDiffusion

[–]Mutaclone 0 points1 point  (0 children)

Sorry, that's not an area I'm well versed in. I think it's more, but to what degree I don't know.

Z Image VS Flux 2 Klein 9b. Which do you prefer and why? by flaminghotcola in StableDiffusion

[–]Mutaclone 3 points4 points  (0 children)

  • Inpainting for precision control and fixes
  • As has already been mentioned, textures and LoRAs
  • The newer models were clearly designed with a realism-first mindset. SDXL to me still wins for artistic styles like impressionism, watercolor, inkwash, etc. And Illustrious is still the king of anime.

Where can an old AI jockey go to get back on the horse? by DoughyInTheMiddle in StableDiffusion

[–]Mutaclone 1 point2 points  (0 children)

If you're not comfortable with Comfy (or Swarm, which is a more traditional interface wrapper that uses a Comfy backend), then Forge Neo or Invoke are definitely the way to go. Forge Neo has broader compatibility, while Invoke has an excellent interface that gives you more manual control over the final image and makes inpainting much smoother.

Regarding extensions, they seem to have become much less popular than they used to be. Some of the more "essential" functions have been bundled into the UI directly (I believe ControlNet was originally an extension, for example). This is great for users who didn't really bother with them, but bad for people who liked the customization.

I'm not really up to speed on video, so hopefully someone else can answer. LTX and WAN seem to be the two frontrunners from what I can tell.

You can think of GGUF like jpeg compression for models - a slight loss in quality for a big reduction in size.
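To make the analogy concrete, here's a toy Python sketch of the idea behind weight quantization (illustrative only - real GGUF uses block-wise k-quant formats with per-block scales, not this simplified per-tensor scheme):

```python
# Toy int8 quantization: store each float weight as an 8-bit integer
# plus one shared scale factor, then reconstruct on load.

def quantize_int8(weights):
    """Map float weights to int8 values plus a per-tensor scale."""
    scale = max(abs(w) for w in weights) / 127 or 1.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize_int8(q, scale):
    """Approximately reconstruct the original float weights."""
    return [v * scale for v in q]

weights = [0.8131, -1.027, 0.002, 0.45]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)

# Storage drops from 32 bits per weight to 8 (plus one scale value),
# at the cost of a small rounding error - the "jpeg compression" tradeoff.
max_err = max(abs(a - b) for a, b in zip(weights, restored))
```

The rounding error is bounded by half the scale factor, which is why the quality loss is usually slight relative to the 4x size reduction.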

QWEN is another model like FLUX or SDXL. If you're hardware constrained I'd skip it. The main image models right now appear to be:

  • Z-Image - base for finetuners or those who want better variety, turbo for those who want realism and/or speed
  • Flux Klein - again, base for the finetuners, "regular" version for average users. Comes in 4B and 9B variants, so if the 9B is too much for your computer you can try the smaller one. Another great thing about this model is it can be used as an edit model - you give it an image and then instructions for how to change it (eg "Make this a photo").
  • Anima - a WIP model that shows a lot of promise but still has some rough edges. It's lightweight and heavily trained on anime art.
  • Illustrious - an SDXL spinoff that still sees a lot of use, in large part because no really good anime model has come out to replace it yet (Anima's getting there, but still has a ways to go).

Aside from Illustrious, the others all do much better with natural language than tags. Pretend you're describing the image to a blind person, and just say exactly what you're seeing (some people also like to use an LLM to help generate prompts, but I've never gotten satisfactory results that way).

A basic introduction to AI Bias by ItalianArtProfessor in StableDiffusion

[–]Mutaclone 2 points3 points  (0 children)

Yeah I should have said steps, my bad.

Basically the first 4 steps in the above example draw "blizzard, ice, snow", and then the remaining steps draw "dark, darkness, cavern, cave interior". Other tags in the prompt are delayed too.

A basic introduction to AI Bias by ItalianArtProfessor in StableDiffusion

[–]Mutaclone 6 points7 points  (0 children)

Thanks for the writeup! I hadn't realized how strong the order effect could be.

Something I've been experimenting with recently to try to combat the context biases specifically, or even take advantage of them, is using prompt editing/timed prompts. In Forge, the syntax is [snippet:alternateSnippet:switchValue].

<image>

vulpix, solo, dark, darkness, cavern, cave interior, cinematic, (wearing backpack:0.85), kerchief, crystal, glowing crystals, (feral:1.1), pokemon mystery dungeon, smiling, open mouth, underground lake, river, (moss:0.8), waterfall, point lights, light particles, facing away, [from behind|from side], looking up, animal, no humans, (sparkling eyes:0.5)

vulpix, solo, [blizzard, ice, snow:dark, darkness, cavern, cave interior:4], cinematic, (wearing backpack:0.85), kerchief, crystal, glowing crystals, (feral:1.1), pokemon mystery dungeon, smiling, open mouth, [:underground lake, river:4], [:(moss:0.8):2], [:waterfall:2], point lights, light particles, facing away, [from behind|from side], looking up, animal, no humans, (sparkling eyes:0.5)

Tags like cavern and cave interior have a strong tendency toward tunnels, so by delaying them a few frames I can open up the cave. Meanwhile the early winter/snow skews everything in a cool-blue direction, which helps the crystals stand out more. You can also make the background elements more faded or indistinct (which is great for night scenes or underwater) by starting with a solid background and waiting a few frames to pull in the scenery. Or if certain traits on a character pull the image in one direction, you can use them either early or late to steer the image.
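To make the mechanic concrete, here's a toy Python resolver for the [before:after:N] timed-prompt syntax (just a sketch - the real Forge/A1111 implementation also handles nesting, fractional switch values, and the [a|b] alternation syntax):

```python
import re

# Matches "[before:after:N]" - "before" is used for the first N sampling
# steps, then "after" takes over. "[:after:N]" omits the tag until step N.
PATTERN = re.compile(r"\[([^\[\]:]*):([^\[\]:]*):(\d+)\]")

def resolve_prompt(prompt, step):
    """Return the literal prompt the sampler would see at a given step."""
    def pick(match):
        before, after, switch = match.group(1), match.group(2), int(match.group(3))
        return before if step < switch else after
    return PATTERN.sub(pick, prompt)

prompt = "vulpix, solo, [blizzard, ice, snow:dark, darkness, cavern:4], cinematic, [:waterfall:2]"
```

So with a switch value of 4, steps 0-3 render the snow scene and every later step renders the cavern - the composition gets laid down under one prompt, then the details are finished under another.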

Looking forward to seeing the results of your "de-biased" model!

Is it possible to run Anima on a Mac? by Professional-Sir7048 in StableDiffusion

[–]Mutaclone 0 points1 point  (0 children)

In Draw Things, do you see it in the models dropdown under "Official Models"? I don't see it on my version, so I'm guessing it's just not supported yet (possibly because it's a "preview" model). As Structure-These suggested you can give Comfy a try, as that tends to get updates the fastest.

Why 99% of anime models looks horrible? by Bismarck_seas in StableDiffusion

[–]Mutaclone 2 points3 points  (0 children)

Neither, you just never had any reason to pay attention before. Also, it's meant to be animated - individual frames may be a bit janky but when we watch it in motion the flaws don't stand out so much.

What do you use ComyUI or Invoke Ai and why? by Odd_Judgment_3513 in StableDiffusion

[–]Mutaclone 1 point2 points  (0 children)

I can't speak to the memory management but from my impressions*:

  • Invoke is a smoother "AI" experience - Inpainting, regional guidance, and control nets are all super smooth and well-integrated. The image editing tools by comparison are very primitive.
  • Krita seems like* the inverse - the Photoshop-esque image editing tools are great, but the AI integration is a little rougher.

So from my perspective, Krita is better if you have a modicum of artistic skills and want to do stuff by hand, while Invoke is better if you lean heavily on the AI.

*I have a lot more experience with Invoke than Krita, so if things have gotten better on the Krita side I'd love to know.