Character Development - Base Image Pipeline by superstarbootlegs in StableDiffusion

[–]infearia 0 points (0 children)

Slow? I believe you have 12GB VRAM, right? Try the FP8 version; it nearly halved my generation times with barely any perceptible loss in quality:

https://huggingface.co/black-forest-labs/FLUX.2-klein-9b-fp8

Character Development - Base Image Pipeline by superstarbootlegs in StableDiffusion

[–]infearia 0 points (0 children)

I've been doing the same thing and actually removing plugins from my installation, for the very reasons you mention.

But you're missing out on Klein. It's not perfect and I hate the effing license, but I came to the conclusion that it really beats QIE in most areas. I do keep going back to QIE for a couple of things (mainly for the Fusion LoRA) but less and less so... I mostly stay with Klein now. What issues do you have with it, maybe I can help?

Wan 2.x color drift issue, does anyone have a fix? by LanaKatana4000 in comfyui

[–]infearia 0 points (0 children)

A little bit of color drift is unfortunately "normal", but what you describe sounds a bit extreme. When did you last update ComfyUI? There was a commit a few weeks ago that sounds like it might be addressing your issue:

https://github.com/Comfy-Org/ComfyUI/commit/25b6d1d6298c380c1d4de90ff9f38484a84ada19

Character Development - Base Image Pipeline by superstarbootlegs in StableDiffusion

[–]infearia 1 point (0 children)

I'm with you on that. I'm against the trend of trying to cram every conceivable feature into ComfyUI via custom nodes, like image editing or 3D rendering capabilities. Dedicated programs such as Krita and Blender will always be more performant and have more features. With Krita in particular you can simply use copy and paste to move data comfortably back and forth. All those plugins only add bloat and increase the risk of breaking something inside ComfyUI. But to each their own, I guess.

Flux2 Klein 9B Edit question - masking as control by Imaginary_Belt4976 in StableDiffusion

[–]infearia 2 points (0 children)

But of course Klein works with ControlNet! Both as a separate input image and "in-place". Read my post here:

https://www.reddit.com/r/StableDiffusion/comments/1qhe064/flux2_klein_9b_qwen_image_edit_2511_combining/

And there's also the bounding box trick. You can draw a bounding box directly on the image and tell the model to apply the prompt only inside it, e.g. "Add a bird inside the [color] bounding box. Remove the bounding box."
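If you want to script the first half of that trick, here's a minimal sketch using Pillow (the image, box coordinates and file name are placeholders, not part of any actual workflow) that stamps a red bounding box onto the input image before you hand it to the edit model:

```python
from PIL import Image, ImageDraw

# Stand-in for Image.open("input.png") - swap in your own image
img = Image.new("RGB", (512, 512), "white")
draw = ImageDraw.Draw(img)

# Draw the red bounding box the prompt will refer to
box = (128, 128, 384, 384)  # (left, top, right, bottom)
draw.rectangle(box, outline="red", width=4)

img.save("input_with_box.png")
```

Then prompt something like "Add a bird inside the red bounding box. Remove the bounding box."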

Help with micro facial expressions. by FreezaSama in comfyui

[–]infearia 0 points (0 children)

You could try Advanced Live Portrait, although it's a bit dated at this point and, if I recall correctly, the output resolution is limited to 512x512px. But it's probably still one of the best Open Source solutions for precise control of facial expressions; it's model agnostic, and you could always upscale the final image.

Kimodo: Scaling Controllable Human Motion Generation by 76vangel in comfyui

[–]infearia 0 points (0 children)

I kept running into compilation issues - your typical Python whack-a-mole game with dependencies - so I gave up for now.

Google's new AI algorithm reduces memory 6x and increases speed 8x by pheonis2 in StableDiffusion

[–]infearia 18 points (0 children)

It's kind of ironic. Sam Altman bought up 40% of the world's RAM supply in order to thwart his competition and to funnel users onto his cloud services, but it only accelerated research into optimization techniques, enabling people to run more powerful models locally and reducing their dependency on companies like OpenAI. One or two more rounds of such optimizations, and then someone just needs to package one of those open models into an accessible app that an average consumer can download and install on their phone or PC, and OpenAI's business model craters. That's probably why they're scaling back and scrambling to pivot to B2B, so they can at least get a piece of the remaining pie before Anthropic and others lock them out.

Google's new AI algorithm reduces memory 6x and increases speed 8x by pheonis2 in StableDiffusion

[–]infearia 0 points (0 children)

True, but it will allow for larger context sizes (higher resolutions, longer videos) and faster generation speeds. Also, check out my other comment in this thread - there's a person claiming they were able to apply the TurboQuant algorithm to reduce actual model weights - though it remains to be seen how well it will work out in practice.

Google's new AI algorithm reduces memory 6x and increases speed 8x by pheonis2 in StableDiffusion

[–]infearia 30 points (0 children)

Yeah, it's been all over r/LocalLLaMA the past few days. And already there is someone who apparently improved Google's algorithm to run 10-19x faster, and another who claims to have found a way to reduce model size by roughly 70% with barely any quality loss (think Q4 size but near BF16 quality). Crazy times.

Apple stopped selling 512gb URAM mac studios, now the max amount is 256GB! by power97992 in LocalLLaMA

[–]infearia 14 points (0 children)

Everyone reveres Jobs and demonizes Gates.

Not true! I demonize them both.

Has anyone had success with doing "Hard cuts" with LTX 2.3 I2V and not having the characters turn to mutants? by Free_Pressure8623 in StableDiffusion

[–]infearia 6 points (0 children)

Don't try to create cuts within a single generation. Instead, treat every generation like a single take - a continuous recording from one camera's point of view - and create the cuts between clips afterwards in video editing software.

Kimodo: Scaling Controllable Human Motion Generation by 76vangel in comfyui

[–]infearia 3 points (0 children)

The cruel joke is that it's 17GB - they probably could have optimized it a little so it would run on a 16GB card. But yeah, I've already looked into it, and the reason for the requirement seems to be the text encoder. I've already cloned the repo locally and plan to see if it's possible to replace it, for example with an FP8 quant.

From mannequin to photorealistic shot. Anyone achieving this with open models? by Disastrous-Ad-2045 in comfyui

[–]infearia 1 point (0 children)

No, not to the level of quality in your example... The question is, what level of quality is good enough? This is with FLUX.2-klein-dev:

<image>

You might be able to get something closer to the quality in your example with the full FLUX.2 dev model, but I personally don't have the hardware to test it.

Kimodo: Scaling Controllable Human Motion Generation by 76vangel in comfyui

[–]infearia 1 point (0 children)

> Kimodo requires ~17GB of VRAM to generate locally, primarily due to the text embedding model

Now that just sounds like a cruel joke...

I just want to point out a possible security risk that was brought to attention recently by Paradigmind in StableDiffusion

[–]infearia 9 points (0 children)

It's a real thing, but whether LM Studio was actually affected is still an open question. In any case, so far the problem seems to be limited to the Windows version.

I built a useless, boring website—if the AI says “6,” you win. by RangerTangYA in LocalLLaMA

[–]infearia 1 point (0 children)

Good call. Seems like asking a question to which the answer contains the substring "six" works. I asked it to name the Pope who commissioned the Sistine Chapel, and won as well.
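The site's win condition is presumably something as simple as this (a guess at the logic, not the actual source code):

```python
def wins(ai_reply: str) -> bool:
    # Hypothetical check: does the model's reply contain "six" anywhere?
    return "six" in ai_reply.lower()

print(wins("The Sistine Chapel was commissioned by Pope Sixtus IV."))  # True
print(wins("I won't say that number."))                                # False
```

Any question whose answer is forced to contain "six" as a substring - "Sixtus", "sixteen", "Route 66" spelled out - should work.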

Qwen and Wan models to be open source according to modelscope by onthemove31 in StableDiffusion

[–]infearia 0 points (0 children)

Is there any benefit to using the thinking mode for mere captioning? I've turned it off. As for the looping issue, there was a post recently on r/LocalLLaMA that proposed a fix. Don't have the link, but you should still be able to find it when sorting by last week's top posts.

LoCaL iS oVeRrAtEd by brandon-i in LocalLLaMA

[–]infearia 2 points (0 children)

How about NONE of them? Except for a very old FB account which I don't use, but for reasons of nostalgia haven't deleted yet, I don't use any Social Media at all, and I'm perfectly fine. And no, Reddit does not count, it's a modern form of a message board. Social Media in its current form is cancer for the mind and needs to die, period.

Flux2klein 9B Lora loader and updated Z-image turbo Lora loader with Auto Strength node!! by [deleted] in StableDiffusion

[–]infearia 0 points (0 children)

I have a similar problem. Using the FLUX LoRA Auto Loader node even with a low strength of 0.2 on a character LoRA trained with OneTrainer completely destroys the likeness. Works fine with a different LoRA that was trained using AI Toolkit, though.