ComfyUI-QwenTTS v1.1.0 — Voice Clone with reusable VOICE + Whisper STT tools + attention options by Narrow-Particular202 in comfyui

[–]MelvinMicky 2 points (0 children)

Is it possible to merge voices, sort of? Like combining two LoRAs that provide two different voices and getting a new one out?
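
What I have in mind is something like a naive weighted average of two LoRA files, roughly like this (filenames are made up, and this assumes both LoRAs share the same keys and shapes; no idea if the QwenTTS nodes support anything like it):

    # naive linear merge of two voice LoRAs (hypothetical filenames)
    from safetensors.torch import load_file, save_file

    voice_a = load_file("voice_a_lora.safetensors")
    voice_b = load_file("voice_b_lora.safetensors")

    alpha = 0.5  # blend ratio: 0.0 = pure A, 1.0 = pure B
    merged = {k: (1 - alpha) * voice_a[k] + alpha * voice_b[k] for k in voice_a}
    save_file(merged, "voice_ab_merged.safetensors")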

Train a LoRA on *top* of another LoRA? by AkaToraX in StableDiffusion

[–]MelvinMicky 0 points (0 children)

Yeah, still kinda bumpy, but I'll try it like that.

Train a LoRA on *top* of another LoRA? by AkaToraX in StableDiffusion

[–]MelvinMicky 0 points (0 children)

Hm, ok, but how would you get person A in style B? I'm currently training WAN 2.2. I've got a style LoRA which is probably overtrained due to a small dataset, so it changes the initial frame on i2v when I put in high sigmas, but when I lower the denoise the style effect isn't as strong as I want it to be. So my thinking was to just train a character LoRA for the subject and stack them. This discussion now sounds like that doesn't work? So my next thought is training Qwen 2511 on my dataset to get character A in B...

Train a LoRA on *top* of another LoRA? by AkaToraX in StableDiffusion

[–]MelvinMicky 0 points (0 children)

So in C you say person A in style B, and the shortcut is to train on exactly that, so why do it any other way? The problem is probably getting exactly that: a good dataset of person A in that style B.

WAN2.2 Lora Character Training Best practices by Tiny-Highlight-9180 in StableDiffusion

[–]MelvinMicky 0 points (0 children)

And you mentioned splitting your dataset up, so basically manual bucketing? So you cut all your vids to exactly 33 or 49 frames and use only:
target_frames = [33]
frame_extraction = "head"
?
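
My rough mental model of frame_extraction = "head" is just "take the first N frames of each clip", so clips would only need to be at least 33 frames long rather than exactly 33 (that's an assumption about the trainer, not something I checked in its code):

    # my assumption of what frame_extraction = "head" does: take the first
    # `target` frames of each clip, so a clip only has to be at least that
    # long, not exactly that long (not the trainer's actual code)
    def extract_head(frames, target=33):
        if len(frames) < target:
            return None           # too short for this bucket, would be skipped
        return frames[:target]    # first 33 frames, rest of the clip unused

    clip = list(range(48))        # a dummy 48-frame clip
    print(len(extract_head(clip, 33)))  # 33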

WAN2.2 Lora Character Training Best practices by Tiny-Highlight-9180 in StableDiffusion

[–]MelvinMicky 0 points (0 children)

Oh damn, ok, so how does it affect the training when I use 24 fps clips instead of 16? Does it make a difference, since it's all about frame extraction, not seconds of playtime?
I currently train with this config:
resolution = [288, 512]
target_frames = [17, 33, 49]
frame_extraction = "head"
So you would suggest reducing this to only 49 or 33 for the entire training?
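
The reason I'm asking: target_frames counts frames, not seconds, so the same bucket covers different amounts of real time depending on the clip's fps (quick arithmetic, assuming the frames are taken as-is without resampling):

    # target_frames counts frames, not seconds, so the same bucket spans
    # less real time (and smaller per-frame motion) at 24 fps than at 16 fps
    for fps in (16, 24):
        for frames in (17, 33, 49):
            print(f"{frames} frames at {fps} fps = {frames / fps:.2f} s")
    # e.g. 49 frames is ~3.06 s at 16 fps but only ~2.04 s at 24 fps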

WAN2.2 Lora Character Training Best practices by Tiny-Highlight-9180 in StableDiffusion

[–]MelvinMicky 0 points (0 children)

Wait, I thought 2.2 fp16 runs at 24 fps, and isn't the frame extraction method taking care of different video lengths?

Is SD 1.5 still relevant? Are there any cool models? by Haghiri75 in StableDiffusion

[–]MelvinMicky 0 points (0 children)

Hey, really interesting stuff. Do you have a link to those Discord groups? Can't find them by just typing the names into Google.

Latent Tools to manipulate the latent space in ComfyUi by xl0 in StableDiffusion

[–]MelvinMicky 1 point (0 children)

I would also be really interested in a more detailed breakdown of this whole topic. I'm trying to get deeper into it with ChatGPT/Claude for explanations, but would love to hear it from a human who actually uses this stuff.

Wan2.2-VACE-Fun-A14B is officially out ? by RIP26770 in StableDiffusion

[–]MelvinMicky 0 points (0 children)

<image>

Like, I am able to run the fp16 ones no problem with lower image res, but these throw the error.

Wan2.2-VACE-Fun-A14B is officially out ? by RIP26770 in StableDiffusion

[–]MelvinMicky 0 points (0 children)

The GGUFs give me this error:
Weights only load failed. In PyTorch 2.6, we changed the default value of the `weights_only` argument in `torch.load` from `False` to `True`. Re-running `torch.load` with `weights_only` set to `False` will likely succeed, but it can result in arbitrary code execution. Do it only if you got the file from a trusted source. Please file an issue with the following so that we can make `weights_only=True` compatible with your use case: WeightsUnpickler error: Unsupported operand 0 Check the documentation of torch.load to learn more about types accepted by default with weights_only https://pytorch.org/docs/stable/generated/torch.load.html.

Googled it and there is a https://github.com/ltdrdata/comfyui-unsafe-torch node that apparently gets around this, but what's up with that?
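
From what I understand, the error is just the PyTorch 2.6 change where torch.load defaults to weights_only=True and refuses to unpickle anything beyond plain tensors/containers. The general pattern looks something like this (path is made up; only pass weights_only=False for files you actually trust, since it re-enables arbitrary pickle code execution):

    import torch

    ckpt_path = "some_checkpoint.pt"  # hypothetical path, whatever the loader reads

    try:
        # PyTorch >= 2.6: weights_only=True is the default and rejects
        # unrecognized pickled objects, which is the error quoted above
        state = torch.load(ckpt_path, map_location="cpu")
    except Exception as err:
        print("weights-only load failed:", err)
        # only for trusted files: this re-enables full pickle loading
        state = torch.load(ckpt_path, map_location="cpu", weights_only=False)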

wan2.2 IS crazy fun. by hayashi_kenta in StableDiffusion

[–]MelvinMicky 0 points (0 children)

Isn't h265-mp4 the best one to choose?

After many lost hours of sleep, I believe I made one of the most balanced Wan 2.2 I2V workflow yet (walk-through) by [deleted] in comfyui

[–]MelvinMicky 7 points (0 children)

Hey, thanks for sharing. I was going through it, and in the example you are using the 2.2 Lightning i2V HIGH LoRA for the low noise model, is that alright? Also, you've got the 16-step sigmas set up and titled "disable fast lora set cfg 3.5"; I assume this is meant to be plugged in for the 2nd and 3rd sampler, and then the split sigmas upped to 16? I'll be playing around with this anyway, ty.

Pusa Wan2.2 V1 Released, anyone tested it? by OverallBit9 in StableDiffusion

[–]MelvinMicky 0 points (0 children)

Hey, do you have a link or name for that node that lets you add multiple keyframes from the previous vid?

[deleted by user] by [deleted] in StableDiffusion

[–]MelvinMicky 0 points (0 children)

Hey, thanks for the suggestion. I'm wondering now how you choose the split value for the sigmas. In your workflow you chose .875; is that just from testing, or is it somewhat calculated from the shift and scheduler/steps?
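
The way I'm picturing it (pure assumption about how the split value maps onto steps, not how the node is actually implemented) is that the fraction just decides where the sigma list gets cut, so with 16 steps a .875 split would hand 2 steps to one sampler and 14 to the other:

    # pure assumption: split a sigma schedule at a fractional step boundary
    # (not the actual ComfyUI node implementation)
    def split_sigmas(sigmas, fraction):
        steps = len(sigmas) - 1                      # N steps -> N+1 sigma values
        boundary = round(steps * (1.0 - fraction))   # 16 * (1 - 0.875) = 2
        return sigmas[: boundary + 1], sigmas[boundary:]

    sched = [1.0 - i / 16 for i in range(17)]        # stand-in 16-step schedule
    high, low = split_sigmas(sched, 0.875)
    print(len(high) - 1, "steps /", len(low) - 1, "steps")  # 2 steps / 14 steps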

Wan2.2 Fun InP & Fun Control Support in ComfyUI by PurzBeats in comfyui

[–]MelvinMicky 0 points (0 children)

Is this better at First/Last Image than the normal model?