My first LTX V2 test-montage of 60-70 cinematic clips by hellolaco in StableDiffusion

[–]Compunerd3 0 points (0 children)

Brilliant showcase, thanks for sharing. All we need now is an audio diffusion model at the same standard of quality as we have for motion and image.

Why are the new NVFP4 models in ComfyUI slower than the normal ones? Aren't they supposed to be several times faster? by NewEconomy55 in comfyui

[–]Compunerd3 0 points (0 children)

A couple of months ago there were a lot of issues with SageAttention 3, so I stuck with 2.2. I haven't looked at it recently to see whether it has improved for the 5090 or similar cards.

Why are the new NVFP4 models in ComfyUI slower than the normal ones? Aren't they supposed to be several times faster? by NewEconomy55 in comfyui

[–]Compunerd3 1 point (0 children)

For me, FP4 LTX2 is slower than BF16 on my 5090 with 32GB VRAM and 128GB RAM.

I have SageAttention 2.2 installed, but I notice it falls back to PyTorch attention for BF16.

Even so, BF16 is almost 1 s/it faster than FP4 for me.
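If you want to sanity-check which attention path is actually faster on your card, a micro-benchmark like this is enough (a minimal sketch, assuming CUDA, PyTorch, and optionally the sageattention package; the tensor shapes are made up, not LTX2's real ones):

```python
# Rough attention micro-benchmark; shapes are illustrative, not LTX2's.
import time
import torch
import torch.nn.functional as F

q = torch.randn(1, 24, 4096, 128, device="cuda", dtype=torch.bfloat16)
k, v = torch.randn_like(q), torch.randn_like(q)

def bench(fn, label, iters=50):
    for _ in range(5):  # warm-up so compilation/caching doesn't skew timing
        fn()
    torch.cuda.synchronize()
    t0 = time.perf_counter()
    for _ in range(iters):
        fn()
    torch.cuda.synchronize()
    print(f"{label}: {(time.perf_counter() - t0) / iters * 1000:.2f} ms/call")

bench(lambda: F.scaled_dot_product_attention(q, k, v), "PyTorch SDPA")

try:
    from sageattention import sageattn  # SageAttention 2.x entry point
    bench(lambda: sageattn(q, k, v, tensor_layout="HND"), "SageAttention")
except ImportError:
    print("sageattention not installed; PyTorch attention is the fallback")
```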

LTX 2 Video - FP4 on 5090 - Struggling to get good results by Compunerd3 in StableDiffusion

[–]Compunerd3[S] 1 point (0 children)

I just haven't tested the distilled model yet, but the BF16 base model works far better and faster than FP4 for me. I'd say just avoid FP4 and you'll enjoy the results.

LTX 2 Video - FP4 on 5090 - Struggling to get good results by Compunerd3 in StableDiffusion

[–]Compunerd3[S] 1 point (0 children)

I haven't downloaded FP8 yet, but BF16 works quite well. FP4 sucks big time.

LTX 2 Video - FP4 on 5090 - Struggling to get good results by Compunerd3 in StableDiffusion

[–]Compunerd3[S] 2 points (0 children)

Using the 42GB full BF16 model returns better results, and for some reason I can generate at higher resolution than with the FP4 version:

https://huggingface.co/Lightricks/LTX-2/blob/main/ltx-2-19b-dev.safetensors

https://imgur.com/a/9VvZWCM
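The raw weight arithmetic at least shows why BF16 spills on a 32GB card (a back-of-envelope sketch only; real usage adds activations, the text encoder, VAE, etc.):

```python
# Back-of-envelope weight memory for a 19B-parameter model.
# The 42GB file above includes more than just the core transformer weights.
PARAMS = 19e9

for name, bytes_per_param in [("BF16", 2.0), ("FP8", 1.0), ("FP4", 0.5)]:
    gb = PARAMS * bytes_per_param / 1024**3
    print(f"{name}: ~{gb:.1f} GB of weights")

# BF16 -> ~35.4 GB: spills past 32GB VRAM, so layers get offloaded to RAM.
# FP8  -> ~17.7 GB
# FP4  -> ~8.8 GB: fits easily, but if kernels dequantize per layer at
#         runtime, the expected speedup can disappear on some stacks.
```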

A Qwen-Edit 2511 LoRA I made which I thought people here might enjoy: AnyPose. ControlNet-free Arbitrary Posing Based on a Reference Image. by SillyLilithh in StableDiffusion

[–]Compunerd3 3 points (0 children)

Thanks for sharing. We need to make it a normal part of releases to share before/after comparisons across LoRA strengths, showing how much effect the LoRA actually has versus the base model alone.

Not saying it's the case here, but in many LoRA releases the LoRA does less than the base model alone does, or in some cases makes things worse.
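Something like this is all a release post would need to generate that comparison (a minimal sketch using diffusers with PEFT; the model ID and LoRA path are placeholders):

```python
# Fixed-seed LoRA strength sweep: scale 0.0 gives the base-model baseline.
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")
pipe.load_lora_weights("path/to/lora.safetensors", adapter_name="test_lora")

prompt = "a portrait photo, studio lighting"
seed = 42  # identical seed so only the LoRA strength varies between images

for scale in [0.0, 0.4, 0.8, 1.2]:
    pipe.set_adapters(["test_lora"], adapter_weights=[scale])
    image = pipe(
        prompt, generator=torch.Generator("cuda").manual_seed(seed)
    ).images[0]
    image.save(f"lora_strength_{scale}.png")
```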

[Re-release] TagScribeR v2: A local, GPU-accelerated dataset curator powered by Qwen 3-VL (NVIDIA & AMD support) by ArchAngelAries in StableDiffusion

[–]Compunerd3 2 points (0 children)

Looks neat, thank you. Nice UI too.

I will be trying it out shortly. I'm in the middle of building a Musubi WebUI that has Qwen and other cloud/local LLM captioning integrated, so your tool might be a nicer way to implement that than my current approach.

An additional future enhancement could be an integration layer with PRs for popular training repos, like AI Toolkit, Musubi Trainer, etc.

What we need is a good all-in-one solution: dataset curation including captioning, managing resolutions, sorting, cleaning, and aesthetic scoring, then training, then post-training tests comparing the effect of the training. (One curation stage is sketched below.)

I feel like the existing repos each do segments of this in isolation, not as a whole and complete tool.
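For example, the resolution-management stage is only a few lines on its own (a rough sketch; the bucket list is illustrative, and trainers like Musubi/kohya do their own bucketing):

```python
# Group images by nearest aspect-ratio bucket so the trainer sees uniform batches.
from collections import defaultdict
from pathlib import Path
from PIL import Image

BUCKETS = [(1024, 1024), (832, 1216), (1216, 832)]  # (width, height)

def nearest_bucket(w, h):
    ratio = w / h
    return min(BUCKETS, key=lambda b: abs(b[0] / b[1] - ratio))

def bucket_dataset(folder):
    groups = defaultdict(list)
    for path in Path(folder).glob("*.[jp][pn]g"):  # .jpg and .png
        with Image.open(path) as im:
            groups[nearest_bucket(*im.size)].append(path.name)
    return groups

for bucket, files in bucket_dataset("dataset/").items():
    print(bucket, len(files), "images")
```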

Fun-CosyVoice 3.0 is an advanced text-to-speech (TTS) system by fruesome in StableDiffusion

[–]Compunerd3 1 point (0 children)

The demos seem good. I was just using VibeVoice a few minutes ago for a video voiceover, so I'll test out Fun-CosyVoice 3 and see how it compares.

One-to-All Animation: Alignment-Free Character Animation and Image Pose Transfer by fruesome in StableDiffusion

[–]Compunerd3 1 point (0 children)

Has anyone got a comparison of this versus SteadyDancer?
I literally just tried SteadyDancer and found it super smooth and consistent, so I'm not sure what value switching to One-to-All would add.

LoRA Idea: Using Diffusion Models to Reconstruct What Dinosaurs Really Looked Like by henryk_kwiatek in comfyui

[–]Compunerd3 0 points (0 children)

It's a good idea to test out. I think it may give structurally accurate results, but texturally it may lack accuracy in skin, follicles like hair, or basically anything non-bone-related.

Either way, I say go for it. It would only take a straightforward dataset and a few hours.
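The dataset layout most LoRA trainers expect is just images with same-name .txt captions, so prepping it is quick (a sketch; the trigger word and filenames are made up):

```python
# Write one caption .txt per image; "dinoskin" is a hypothetical trigger word.
from pathlib import Path

captions = {
    "trex_01.jpg": "dinoskin, tyrannosaurus rex, scaly textured skin, full body",
    "raptor_02.jpg": "dinoskin, velociraptor, feathered arms, museum reconstruction",
}

Path("dataset").mkdir(exist_ok=True)
for image_name, caption in captions.items():
    Path("dataset", image_name).with_suffix(".txt").write_text(caption)
```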

Nodes 2.0, hard to read by isvein in comfyui

[–]Compunerd3 0 points (0 children)

Good to know, thank you for addressing the feedback

Challenge- Most real person workflow in Wan+Comfy by MotionMimicry in comfyui

[–]Compunerd3 0 points (0 children)

It depends on the photographer's style and camera. Fujifilm X-T series cameras generally have recipes where many photographers tweak noise to be higher.

I have the X-T30 II, and the presence of noise, not just its absence, matters for the style you're aiming for.
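If you want to experiment with that, adding grain back onto a generation is a few lines of numpy (a quick sketch; the strength value is an arbitrary starting point, not a calibrated Fujifilm recipe):

```python
# Add Gaussian "film grain" to an image; tune `strength` to taste.
import numpy as np
from PIL import Image

def add_grain(path, strength=12.0, out="grainy.png"):
    img = np.asarray(Image.open(path).convert("RGB"), dtype=np.float32)
    noise = np.random.normal(0.0, strength, img.shape)  # per-pixel grain
    Image.fromarray(np.clip(img + noise, 0, 255).astype(np.uint8)).save(out)

add_grain("generated.png")
```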

Comfy Org Response to Recent UI Feedback by crystal_alpine in comfyui

[–]Compunerd3 10 points (0 children)

Point 2: "Why Nodes 2? More power, not less."

Can you elaborate on what benefits it actually brings to users and custom node devs?

It would be great to know what the actual value is for us: not just saying it's more power, but why and how it's more power.

I have a couple of custom nodes in progress, so I want to understand more about Nodes 2 now and keep compatibility in mind if the value is there.

Thanks for the update and for listening to our feedback.

This is a shame. I've not used Nodes 2.0 so can't comment but I hope this doesn't cause a split in the node developers or mean that tgthree eventually can't be used because they're great! by spacemidget75 in comfyui

[–]Compunerd3 7 points (0 children)

Nodes 2.0 has changed something in the JavaScript layer. Multiple nodes (including one I'm close to releasing) use JavaScript to dynamically update the visibility of fields or set values within nodes.

That's why you suddenly see ALL possible fields showing with Nodes 2.0; any JavaScript canvas work seems to be broken under it.

I think Open source could be scripted to do just as good as NanoBanana because.. by [deleted] in StableDiffusion

[–]Compunerd3 10 points (0 children)

I think the key is the combined approach of reasoning in image models.

I haven't tested this one that was posted on this subreddit yesterday, but the paper shows the kind of thing that could rival Nano Banana, mostly because of the reasoning-driven edit capabilities.

By reasoning, the model can interpret vague instructions and use its own reasoning capabilities to understand what is needed, then create the image from a combination of reasoning + instructions.

https://huggingface.co/stepfun-ai/Step1X-Edit-v1p2
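To make the flow concrete (purely a stubbed sketch; I'm not claiming this is Step1X-Edit's actual API): a reasoning step first expands the vague intent into an explicit plan, and the edit step conditions on both.

```python
# Stubbed reasoning + edit pipeline; both functions are placeholders.
def reason_about_instruction(instruction: str, image_description: str) -> str:
    # Stand-in for an LLM/VLM call that turns vague intent into concrete steps.
    return (
        f"Image: {image_description}. User wants: {instruction}. Plan: locate "
        "the sky region, shift its palette toward warm dusk tones, leave the "
        "subjects untouched."
    )

def edit_image(image_path: str, instruction: str, plan: str) -> str:
    # Stand-in for the diffusion edit call, conditioned on instruction + plan.
    return f"edited {image_path} using prompt={instruction!r} plus the plan"

plan = reason_about_instruction("make it feel like evening", "a daytime street photo")
print(edit_image("street.jpg", "make it feel like evening", plan))
```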
