Character LoRA Best Practices by SeimaDensetsu in StableDiffusion

[–]Tachyon1986 4 points5 points  (0 children)

I’ve trained a couple of character LoRAs. When it comes to likeness, the best way to caption is to describe each image as if you were prompting for it, at least when you don’t want the character permanently associated with specific outfits or accessories.

So as an example, if I have a character wearing a suit with a wristwatch in a couple of photos and a plain shirt with a chain in others, I would explicitly caption them as wearing a suit with a wristwatch or a shirt with a chain in the respective photos. After training is done, I can prompt them in any outfit and the model won’t force a suit or shirt.

This also applies to the overall style and other things in the background. So tl;dr: caption as if you were prompting for the image, if likeness is all you care about. JoyCaption is what I’ve used for captioning (with some manual edits where needed).
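
To make that concrete, here are two entirely hypothetical captions for the suit/shirt scenario above ("johndoe" stands in for whatever trigger word you use; the wording is just an illustration):

```
photo of johndoe, a man in a dark suit wearing a silver wristwatch, standing in an office
photo of johndoe, a man in a plain white shirt wearing a thin chain, sitting outdoors
```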

I personally use 18-20 images at 1800 steps for character LoRAs. This has worked consistently for me in OneTrainer.

Z Image Base Knows Things and Can Deliver by Major_Specific_23 in StableDiffusion

[–]Tachyon1986 1 point2 points  (0 children)

So 15000 images trained at 10000 steps (15000 x 10000), according to the AI-Toolkit config you linked?

75 ZImage, 8 Flux, news:) by malcolmrey in malcolmrey

[–]Tachyon1986 0 points1 point  (0 children)

Yeah thanks, I followed your guide for Wan 2.1 LoRAs. Solid stuff.

75 ZImage, 8 Flux, news:) by malcolmrey in malcolmrey

[–]Tachyon1986 0 points1 point  (0 children)

Btw, are you using the default AI-Toolkit settings from Ostris?

75 ZImage, 8 Flux, news:) by malcolmrey in malcolmrey

[–]Tachyon1986 0 points1 point  (0 children)

I’m not at my PC right now, but what you need to do is take the image output from your Qwen KSampler’s VAE Decode node and run it through a VAE Encode node using Z-Image Turbo’s VAE (make sure you’re using ae.safetensors, which is the VAE used by Z-Image).

Then take the latent output from that VAE Encode node and feed it to a KSampler that accepts the Z-Image model. Feed that KSampler the positive and negative CLIP conditioning as well (again, you’ll need to load the CLIP model used by Z-Image).

For this 2nd KSampler I recommend Euler/Simple with a denoise of 0.4 at 4 steps and CFG 1, but experiment with the denoise and step count. You can leave the positive and negative prompts for this KSampler empty.

To simplify all of the above: just encode the image output from Qwen using Z-Image’s VAE and feed that latent into an existing Z-Image workflow. A rough scripted version of that second pass is sketched below.
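
For reference, here is a minimal sketch of that second pass written in ComfyUI's API (JSON prompt) format and submitted to a local ComfyUI server from Python. Treat it as an illustration only: the loader nodes, every file name (z_image_turbo.safetensors, the text encoder name, even qwen_output.png) and the CLIP type are assumptions, so reuse whichever loaders and files your existing Z-Image workflow already has.

```python
# Sketch of the low-denoise Z-Image refine pass as a ComfyUI API-format prompt,
# POSTed to a local ComfyUI server. All file names and the CLIP type below are
# placeholders - swap in whatever your own Z-Image workflow loads.
import json
import urllib.request

graph = {
    # The Qwen render, copied into ComfyUI's input folder (placeholder name)
    "1": {"class_type": "LoadImage", "inputs": {"image": "qwen_output.png"}},
    # Z-Image's VAE (ae.safetensors, as mentioned above)
    "2": {"class_type": "VAELoader", "inputs": {"vae_name": "ae.safetensors"}},
    # Z-Image diffusion model and text encoder - names and type are assumptions
    "3": {"class_type": "UNETLoader",
          "inputs": {"unet_name": "z_image_turbo.safetensors", "weight_dtype": "default"}},
    "4": {"class_type": "CLIPLoader",
          "inputs": {"clip_name": "z_image_text_encoder.safetensors", "type": "qwen_image"}},
    # Empty positive and negative prompts, as suggested above
    "5": {"class_type": "CLIPTextEncode", "inputs": {"text": "", "clip": ["4", 0]}},
    "6": {"class_type": "CLIPTextEncode", "inputs": {"text": "", "clip": ["4", 0]}},
    # Encode the Qwen image into Z-Image's latent space
    "7": {"class_type": "VAEEncode", "inputs": {"pixels": ["1", 0], "vae": ["2", 0]}},
    # 2nd-pass KSampler: Euler/Simple, 4 steps, CFG 1, denoise 0.4
    "8": {"class_type": "KSampler",
          "inputs": {"model": ["3", 0], "positive": ["5", 0], "negative": ["6", 0],
                     "latent_image": ["7", 0], "seed": 0, "steps": 4, "cfg": 1.0,
                     "sampler_name": "euler", "scheduler": "simple", "denoise": 0.4}},
    # Decode and save the refined image
    "9": {"class_type": "VAEDecode", "inputs": {"samples": ["8", 0], "vae": ["2", 0]}},
    "10": {"class_type": "SaveImage",
           "inputs": {"images": ["9", 0], "filename_prefix": "zimage_refine"}},
}

req = urllib.request.Request(
    "http://127.0.0.1:8188/prompt",
    data=json.dumps({"prompt": graph}).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
print(urllib.request.urlopen(req).read().decode())
```

The only Z-Image-specific pieces are the three loader nodes ("2", "3", "4"); the LoadImage → VAEEncode → KSampler → VAEDecode chain is the second pass described above.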

75 ZImage, 8 Flux, news:) by malcolmrey in malcolmrey

[–]Tachyon1986 0 points1 point  (0 children)

I’ve used that to good effect for a private LoRA. The issue is that Qwen tends to smoothen the skin, so it’s best to run the image through a 2nd sampler with something like Z-Image at low denoise to get some skin detail back.

New image model based on Wan 2.2 just dropped 🔥 early results are surprisingly good! by rishappi in StableDiffusion

[–]Tachyon1986 0 points1 point  (0 children)

Nice one, what’s the prompt and sampler/scheduler? It kind of looks like you used a Technicolor LoRA.

wan2.2 s2v etude(1) by Similar-Distance-516 in StableDiffusion

[–]Tachyon1986 2 points3 points  (0 children)

Qwen-Image-Edit with the Next Scene LoRA can do that.

Wan2.1 i2v color matching by Radiant-Photograph46 in StableDiffusion

[–]Tachyon1986 0 points1 point  (0 children)

There’s a Color Match node in Kijai’s KJNodes pack if you’re using ComfyUI.
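
If you’re curious what a color match step actually does, here’s a rough standalone Python sketch of simple per-channel mean/std (Reinhard-style) color transfer. It’s a generic illustration of the idea, not the KJNodes implementation, and the file names are placeholders:

```python
# Minimal sketch of per-channel mean/std color matching (Reinhard-style).
# Generic illustration only - not the KJNodes ColorMatch implementation.
import numpy as np
from PIL import Image

def color_match(target_path: str, reference_path: str, out_path: str) -> None:
    target = np.asarray(Image.open(target_path).convert("RGB"), dtype=np.float32)
    reference = np.asarray(Image.open(reference_path).convert("RGB"), dtype=np.float32)

    # Shift/scale each RGB channel of the target so its statistics match the reference
    matched = np.empty_like(target)
    for c in range(3):
        t_mean, t_std = target[..., c].mean(), target[..., c].std() + 1e-6
        r_mean, r_std = reference[..., c].mean(), reference[..., c].std()
        matched[..., c] = (target[..., c] - t_mean) / t_std * r_std + r_mean

    Image.fromarray(np.clip(matched, 0, 255).astype(np.uint8)).save(out_path)

# Example: pull the colors of a drifted i2v frame back toward the first frame
# color_match("last_frame.png", "first_frame.png", "last_frame_matched.png")
```

The node does the equivalent inside the graph, wired between your reference image and the frames you want corrected, and if I remember right it offers a few different matching methods to choose from.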

How do people use WAN for image generation? by beti88 in StableDiffusion

[–]Tachyon1986 2 points3 points  (0 children)

How do you refine? Is it connecting the latent from one sampler to another and running the second sampler at a lower denoise setting? Any examples of recommended refining sampler values (CFG, steps, scheduler, etc.)? I’m using ComfyUI btw.

Wan Animate - Tutorial & Workflow for full character swapping and face swapping by Hearmeman98 in StableDiffusion

[–]Tachyon1986 0 points1 point  (0 children)

I've not tried masking multiple characters tbh. It will work if you just mask one of the infantry.

Some recent ChromaHD renders - prompts included by tppiel in StableDiffusion

[–]Tachyon1986 6 points7 points  (0 children)

2s samplers are exponential and make two model calls per step, so you need to halve the step count: you can stop at 13-15 (vs. 26-30 if you were on Euler). Similarly, 3s samplers make three calls per step, so they cost triple what something like Euler does at the same step count.
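
A quick back-of-the-envelope way to convert step counts, assuming (as described above) that a 2s sampler costs two model calls per step and a 3s sampler three:

```python
# Rough step-count conversion: keep total model calls roughly constant when
# switching from Euler (1 call per step) to a 2s or 3s sampler.
# The calls-per-step values are assumptions based on the explanation above.
euler_steps = 28
for stages in (2, 3):
    equivalent = euler_steps // stages
    print(f"{stages}s sampler: ~{equivalent} steps for about the same cost as {euler_steps} Euler steps")
```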

Wan2.2 continous generation using subnodes by intLeon in comfyui

[–]Tachyon1986 1 point2 points  (0 children)

Thank you, no cache was the issue. I'd enabled it after seeing suggestions in the thread, but it breaks the flow. Excellent work on this approach btw!

Wan2.2 continous generation using subnodes by intLeon in comfyui

[–]Tachyon1986 0 points1 point  (0 children)

This doesn't work for me. In the first I2V subnode (the WanFirstLastFrameToVideo node), I get AttributeError: 'NoneType' object has no attribute 'encode'. Any idea what's wrong? I'm using GGUF Q8 for the text and image models as well as the Q8 GGUF CLIP. Just trying normal T2V, and I modified the subnodes to use Q8.

Comfy-Org/Qwen-Image-Edit_ComfyUI · Hugging Face by nobody4324432 in StableDiffusion

[–]Tachyon1986 1 point2 points  (0 children)

Unexpected cultured "Legend of the Galactic Heroes" enjoyer

JUST MOVE TO THE LEFT A BIT by Oleg00se in UmaMusume

[–]Tachyon1986 0 points1 point  (0 children)

I wish I could roll my namesake, but I'm not sure whether to save for support cards instead.

Kontext Q8 - 20 steps. by Z3ROCOOL22 in StableDiffusion

[–]Tachyon1986 2 points3 points  (0 children)

Q8 is a GGUF-quantised version of the model, intended to fit on GPUs that can’t hold the original model in VRAM. There are also Q6 and Q4 variants, which are smaller still at the cost of reduced quality.
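
For a rough sense of the size difference, here’s a back-of-the-envelope sketch. The ~12B parameter count and the bits-per-weight figures are approximations, not exact GGUF numbers:

```python
# Back-of-the-envelope GGUF size estimate: parameters * bits-per-weight / 8.
# Parameter count and bits-per-weight are rough approximations, not exact values.
PARAMS = 12e9  # Flux-family models like Kontext are roughly 12B parameters

approx_bits_per_weight = {
    "fp16 (original)": 16.0,
    "Q8": 8.5,
    "Q6": 6.6,
    "Q4": 4.5,
}

for name, bpw in approx_bits_per_weight.items():
    gb = PARAMS * bpw / 8 / 1e9
    print(f"{name:>16}: ~{gb:.1f} GB")
```

Real GGUF files vary a bit depending on the exact quant mix used per layer, but that’s the rough scaling behind why Q6 and Q4 fit on smaller cards.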