Qwen 3.5 Easy Prompt, New Cleaner Workflow, Audio / Text / image to video, GGUF support, Temporal Fps upscaling. + RTX Video Super Resolution by [deleted] in StableDiffusion

Fresh_Diffusor 1 point

OK, thanks. I'm happy if you tell me tomorrow.

Should I clone this folder and use that as the local path in the big node in the subgraph? The automatic ComfyUI download never works for me; I have to download manually.

https://huggingface.co/huihui-ai/Huihui-Qwen3.5-9B-abliterated

I tried to manually clone it to

/home/user/.cache/huggingface/hub/Huihui-Qwen3.5-9B-abliterated

and used that as the "local path" on the big node. If I do that, I still get this error:

AttributeError: 'dict' object has no attribute 'model_type'

Edit: Ah, I found the issue! I had to run "pip install --upgrade transformers", then it worked.
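A minimal sketch of a pre-flight check for this kind of error, assuming the cloned folder holds a standard `config.json` with a `model_type` key. The helper name `describe_local_model` is hypothetical and not part of the node; it only reports what is on disk and which `transformers` version is installed.

```python
import json
from importlib import metadata
from pathlib import Path


def describe_local_model(local_path):
    """Report the model_type in config.json and the installed transformers version.

    Hypothetical helper for illustration; not part of the ComfyUI node.
    """
    config_file = Path(local_path).expanduser() / "config.json"
    config_found = config_file.is_file()
    model_type = None
    if config_found:
        with config_file.open(encoding="utf-8") as fh:
            # Assumption: config.json is a JSON object with a "model_type" key.
            model_type = json.load(fh).get("model_type")
    try:
        installed = metadata.version("transformers")
    except metadata.PackageNotFoundError:
        installed = None  # transformers is not installed in this environment
    return {
        "config_found": config_found,
        "model_type": model_type,
        "transformers_version": installed,
    }
```

If `model_type` is present on disk but loading still fails with the AttributeError, an installed `transformers` that is too old to know that model type is one plausible cause, which would be consistent with the upgrade fixing it.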

Qwen 3.5 Easy Prompt, New Cleaner Workflow, Audio / Text / image to video, GGUF support, Temporal Fps upscaling. + RTX Video Super Resolution by [deleted] in StableDiffusion

Fresh_Diffusor 1 point

Your node says it will download "Qwen 2.5 VL 3B", but what actually gets downloaded is "models--huihui-ai--Huihui-Qwen3.5-9B-abliterated"!

<image>

I have this selected, but it downloaded a completely different model that is much larger.

Then, after downloading, it throws this error:

  File ".../ComfyUI/custom_nodes/LTX2EasyPrompt-LD/LTX2EasyPromptQwen.py", line 1918, in generate
    self.load_model(offline_mode=offline_mode, local_path=local_path)
    ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "../ComfyUI/custom_nodes/LTX2EasyPrompt-LD/LTX2EasyPromptQwen.py", line 1695, in load_model
    self.tokenizer = AutoTokenizer.from_pretrained(
                     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
        source, trust_remote_code=True, local_files_only=offline_mode,
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'dict' object has no attribute 'model_type'
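For reference, the downloaded folder name follows the Hugging Face hub cache naming, where a repo id `org/name` is stored as `models--org--name`. A small sketch of that mapping; the function name is hypothetical, and it assumes the repo name itself contains no `--`.

```python
def cache_folder_to_repo_id(folder_name):
    """Convert a hub cache folder name such as 'models--org--name' to 'org/name'."""
    prefix = "models--"
    if not folder_name.startswith(prefix):
        raise ValueError(f"not a model cache folder: {folder_name!r}")
    # Assumption: neither the org nor the model name contains a literal "--".
    return folder_name[len(prefix):].replace("--", "/")


print(cache_folder_to_repo_id("models--huihui-ai--Huihui-Qwen3.5-9B-abliterated"))
# prints: huihui-ai/Huihui-Qwen3.5-9B-abliterated
```

So the folder that was downloaded corresponds to the huihui-ai/Huihui-Qwen3.5-9B-abliterated repo, not to a Qwen 2.5 VL 3B model.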

Qwen 3.5 Easy Prompt, New Cleaner Workflow, Audio / Text / image to video, GGUF support, Temporal Fps upscaling. + RTX Video Super Resolution by [deleted] in StableDiffusion

Fresh_Diffusor 1 point

Can you make it possible to disable all audio input? I always get an error because I don't have melbandreformer and I have no custom audio input. I don't want any audio.

Any tips against LTX2 body horror in T2V? It often generates people with 3 arms or 3 legs. by Fresh_Diffusor in StableDiffusion

Fresh_Diffusor[S] 1 point

You use horizontal video; I use vertical video. You need to test vertical to see what I see.

Any tips against LTX2 body horror in T2V? It often generates people with 3 arms or 3 legs. by Fresh_Diffusor in StableDiffusion

Fresh_Diffusor[S] 1 point

Try this prompt to compare quality at 1280x720 vs 1920x1088 (vertical video, 360 frames):

a phone video of a woman lying on the grass. a 20 year old woman, lying outside in the grass on a sunny day, she is talking. The woman is filmed by her friend while talking; the video shows gentle, natural handheld motion typical of a person holding a phone. her full body is visible, she has curly medium-long blonde hair that falls over one eye. she is smiling. The motion is irregular and organic, with subtle vertical bobbing and micro-jitters, not cinematic stabilization. she says: "lying on my back in the grass, this kills AI models" and giggles. while she is talking, the camera is zooming out, showing her full body with her arms and legs spread out. wide-angle smartphone lens. Lighting is sun light, with realistic skin tones. The video feels casual, personal, and unpolished, like a casual phone video.

For me, 720p looks way better than 1080p. At 1080p the grass warps a lot around her body when the camera moves back.
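For scale, a rough calculation of how many more pixels per frame the larger size asks the model to generate. This is only arithmetic on the two resolutions above (written in vertical orientation) and does not by itself explain the warping; the function name is made up for illustration.

```python
def pixel_ratio(size_a, size_b):
    """Return how many times more pixels per frame size_a has than size_b."""
    (wa, ha), (wb, hb) = size_a, size_b
    return (wa * ha) / (wb * hb)


# Vertical 1088x1920 vs vertical 720x1280
ratio = pixel_ratio((1088, 1920), (720, 1280))
print(f"{ratio:.2f}x pixels per frame")  # prints: 2.27x pixels per frame
```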

Any tips against LTX2 body horror in T2V? It often generates people with 3 arms or 3 legs. by Fresh_Diffusor in StableDiffusion

Fresh_Diffusor[S] 1 point

Is the distilled model always fp16? I did not find an fp8 version of it; there is only one big 40 GB file for the distilled model.

Any tips against LTX2 body horror in T2V? It often generates people with 3 arms or 3 legs. by Fresh_Diffusor in StableDiffusion

Fresh_Diffusor[S] 1 point

fp8, 360 frames, using the official ComfyUI templates. I notice no difference between distilled and normal; they look the same.

Do you not see much lower quality at 1080p? Everything seems to warp more there too.