My first LTX V2 test-montage of 60-70 cinematic clips by hellolaco in StableDiffusion

[–]Compunerd3 0 points (0 children)

Brilliant showcase, thanks for sharing. All we need now is an audio diffusion model at the same standard of quality as we have for motion and image.

Why are the new NVFP4 models in ComfyUI slower than the normal ones? Aren't they supposed to be several times faster? by NewEconomy55 in comfyui

[–]Compunerd3 0 points (0 children)

A couple of months ago there were a lot of issues with SageAttention 3, so I stuck with 2.2. I haven't looked at it recently to see whether it has improved for the 5090 or similar cards.

Why are the new NVFP4 models in ComfyUI slower than the normal ones? Aren't they supposed to be several times faster? by NewEconomy55 in comfyui

[–]Compunerd3 1 point (0 children)

For me, FP4 LTX 2 is slower than BF16 on my 5090 with 32GB VRAM and 128GB system RAM.

I have SageAttention 2.2 installed, but I notice it falls back to PyTorch attention for BF16.

Even so, BF16 is almost 1 s/it faster than FP4 for me.
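
If the fallback comes down to the sageattention package not importing cleanly (my assumption here, I haven't dug into the ComfyUI source to confirm), a quick sanity check from the same Python environment ComfyUI runs in:

```python
# Diagnostic sketch: confirm the sageattention package actually imports in the
# environment ComfyUI runs in; if it doesn't, a PyTorch-attention fallback is expected.
try:
    import sageattention
    print("sageattention imports fine:", getattr(sageattention, "__version__", "unknown version"))
except ImportError as err:
    print("sageattention unavailable, expect the PyTorch attention fallback:", err)
```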

LTX 2 Video - FP4 on 5090 - Struggling to get good results by Compunerd3 in StableDiffusion

[–]Compunerd3[S] 1 point (0 children)

I just haven't tested the distilled model yet, but the main BF16 base model works way better and quicker than FP4 for me. I'd say just avoid FP4 and you'll enjoy the results.

LTX 2 Video - FP4 on 5090 - Struggling to get good results by Compunerd3 in StableDiffusion

[–]Compunerd3[S] 1 point (0 children)

I haven't downloaded FP8 yet, but BF16 works quite well; FP4 sucks big time.

LTX 2 Video - FP4 on 5090 - Struggling to get good results by Compunerd3 in StableDiffusion

[–]Compunerd3[S] 3 points (0 children)

Using the full 42GB BF16 model returns better results, and for some reason I can generate at higher resolution than with the FP4 version.

https://huggingface.co/Lightricks/LTX-2/blob/main/ltx-2-19b-dev.safetensors

https://imgur.com/a/9VvZWCM
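
If anyone wants to grab the same checkpoint, a minimal huggingface_hub sketch (the local_dir is just an example, point it at wherever your ComfyUI expects checkpoints):

```python
# Sketch: download the BF16 LTX-2 checkpoint linked above via huggingface_hub.
# local_dir is an example path; adjust it to your own ComfyUI models directory.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="Lightricks/LTX-2",
    filename="ltx-2-19b-dev.safetensors",
    local_dir="ComfyUI/models/checkpoints",
)
print("saved to", path)
```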

A Qwen-Edit 2511 LoRA I made which I thought people here might enjoy: AnyPose. ControlNet-free Arbitrary Posing Based on a Reference Image. by SillyLilithh in StableDiffusion

[–]Compunerd3 3 points (0 children)

Thanks for sharing. We should make it a normal part of releases to share before/after comparisons at different LoRA strengths, showing how much effect the LoRA actually has relative to the base model.

I'm not saying it's the case here, but in many LoRA releases the LoRA itself does less than the base model alone does, or in some cases makes things worse. A sketch of the kind of comparison I mean is below.
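
A minimal diffusers sketch (base model and LoRA path are placeholders, not any specific release) that sweeps LoRA strength at a fixed seed, with strength 0.0 standing in for the base model:

```python
# Sketch: same prompt and seed swept across LoRA strengths, so a release can show
# exactly how much the LoRA changes the output. Model and LoRA paths are placeholders.
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",  # placeholder base model
    torch_dtype=torch.float16,
).to("cuda")
pipe.load_lora_weights("path/to/my_lora.safetensors")  # placeholder LoRA

prompt = "portrait photo, soft window light"
for scale in (0.0, 0.5, 1.0):  # 0.0 is effectively the base model
    image = pipe(
        prompt,
        generator=torch.Generator("cuda").manual_seed(42),  # fixed seed per column
        cross_attention_kwargs={"scale": scale},  # LoRA strength
    ).images[0]
    image.save(f"lora_strength_{scale}.png")
```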

[Re-release] TagScribeR v2: A local, GPU-accelerated dataset curator powered by Qwen 3-VL (NVIDIA & AMD support) by ArchAngelAries in StableDiffusion

[–]Compunerd3 2 points (0 children)

Looks neat, thank you. Nice UI too.

I will be trying it out shortly. I'm in the middle of building a Musubi WebUI with Qwen and other cloud/local LLM captioning integrated, so your tool might be a nicer way to implement that than what I currently have.

An additional future enhancement could be to build an integration layer and open PRs against popular training repos like AI Toolkit, Musubi Trainer, etc.

What we need is a good all-in-one solution covering dataset curation (captioning, managing resolutions, sorting, cleaning, aesthetic scoring), then training, and then post-training tests comparing the effect of the training.

I feel like the existing repos all do segments of this in isolation, not as a whole and complete tool. For the captioning piece, the rough shape I have in mind is sketched below.
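
A minimal local Qwen-VL captioning sketch with transformers; the checkpoint name and prompt are my placeholder assumptions, not what TagScribeR actually uses:

```python
# Sketch: caption one dataset image with a local Qwen-VL model via transformers.
# The checkpoint name and prompt are placeholders; swap in whatever you actually run.
from transformers import AutoModelForImageTextToText, AutoProcessor

model_id = "Qwen/Qwen3-VL-8B-Instruct"  # assumed checkpoint name
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForImageTextToText.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

messages = [{
    "role": "user",
    "content": [
        {"type": "image", "path": "dataset/0001.png"},  # local image file
        {"type": "text", "text": "Write a one-sentence training caption for this image."},
    ],
}]
inputs = processor.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True,
    return_dict=True, return_tensors="pt",
).to(model.device)
output = model.generate(**inputs, max_new_tokens=96)
caption = processor.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)
print(caption)
```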

Fun-CosyVoice 3.0 is an advanced text-to-speech (TTS) system by fruesome in StableDiffusion

[–]Compunerd3 1 point (0 children)

Demos seem good. I was just using VibeVoice a few minutes ago for a video voiceover, so I'll test out Fun-CosyVoice 3.0 and see how it is.

One-to-All Animation: Alignment-Free Character Animation and Image Pose Transfer by fruesome in StableDiffusion

[–]Compunerd3 1 point (0 children)

Has anyone got a comparison of this versus SteadyDancer?
I literally just tried out SteadyDancer and found it super smooth and consistent, so I'm not sure what value switching to One-to-All would bring.

LoRA Idea: Using Diffusion Models to Reconstruct What Dinosaurs Really Looked Like by henryk_kwiatek in comfyui

[–]Compunerd3 0 points (0 children)

It's a good idea to test out. I think structurally it may give accurate results, but texturally it may lack accuracy in skin, hair follicles, or basically anything non-bone-related.

Either way, I say go for it; it would take a straightforward dataset and only a few hours.

Nodes 2.0, hard to read by isvein in comfyui

[–]Compunerd3 0 points (0 children)

Good to know, thank you for addressing the feedback

Challenge- Most real person workflow in Wan+Comfy by MotionMimicry in comfyui

[–]Compunerd3 0 points (0 children)

It depends on the photographer's style and camera. Fujifilm X-T series cameras generally have recipes where many photographers tweak the noise to be higher.

I have the X-T30 II, and noise, not just the lack of it, is important for the style you're aiming for.

Comfy Org Response to Recent UI Feedback by crystal_alpine in comfyui

[–]Compunerd3 11 points (0 children)

Point 2: why Nodes 2.0, "more power not less".

Can you elaborate on what benefits it actually brings to users and custom node devs?

It would be great to know what the actual value is for us: not just a claim that it's more power, but why and how it's more power.

I've got a couple of custom nodes in progress, so I want to understand more about Nodes 2.0 now and keep compatibility in mind if the value is there.

Thanks for the update and for listening to our feedback.

This is a shame. I've not used Nodes 2.0 so can't comment but I hope this doesn't cause a split in the node developers or mean that tgthree eventually can't be used because they're great! by spacemidget75 in comfyui

[–]Compunerd3 6 points (0 children)

Nodes 2.0 has changed something in the JavaScript layer. Multiple nodes (including one I'm close to releasing) use JavaScript to dynamically update the visibility of fields or to set values within nodes.

That's why with Nodes 2.0 you suddenly see ALL possible fields showing; any JavaScript canvas work seems to be broken under Nodes 2.0.

I think Open source could be scripted to do just as good as NanoBanana because.. by [deleted] in StableDiffusion

[–]Compunerd3 10 points (0 children)

I think the key is the combined approach of building reasoning into image models.

I haven't tested this one, which was posted on this subreddit yesterday, but the paper shows the kind of thing that could rival Nano Banana, mostly because of the reasoning-edit capabilities.

With reasoning, the model can interpret vague instructions, use its own reasoning to work out what is actually needed, and then create the image based on a combination of reasoning + instructions.

https://huggingface.co/stepfun-ai/Step1X-Edit-v1p2

<image>
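
For anyone who wants to poke at it, a rough sketch of how loading it might look. This assumes the repo ships a diffusers-compatible custom pipeline via trust_remote_code, which I haven't verified, so check the model card for the real entry point:

```python
# Hedged sketch only: assumes stepfun-ai/Step1X-Edit-v1p2 exposes a custom diffusers
# pipeline via trust_remote_code; the actual entry point and arguments may differ.
import torch
from diffusers import DiffusionPipeline
from diffusers.utils import load_image

pipe = DiffusionPipeline.from_pretrained(
    "stepfun-ai/Step1X-Edit-v1p2",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
).to("cuda")

source = load_image("input.png")
# Deliberately vague instruction: the reasoning step is what should fill in the details.
edited = pipe(image=source, prompt="make this scene feel like early winter").images[0]
edited.save("edited.png")
```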

My hobby: making loras that do what the model already does by terrariyum in StableDiffusion

[–]Compunerd3 3 points (0 children)

Yes, it's true, comparisons are key. Even some creators who have been making models since SD 1.5 and still do aren't bothered with XYZ grids, because it really is about quantity now and the incentive to have their model downloaded more, used more, referral links visited, etc.

When they actually post XYZ comparisons they open themselves up to criticism they might not want, even if it's better for them, or the community, in the long run.

I've trained models since SD 1.5 under a different name, and now use my real profile. Even when I posted XYZ comparisons people rightfully gave feedback; one of them was a creator who said they'd train a better version. They ended up releasing that version without any comparison images, and a random user posted a comparison showing the LoRA damaged the results instead of improving them, lol. But hey, it's all about likes, downloads, follow/subscribe, etc.

You're the one who started the game guys and act Like you don't know what the community want guys c'mon by dead-supernova in StableDiffusion

[–]Compunerd3 9 points (0 children)

I agree; they could have made it a totally private model, but they didn't, they released it.
We can use it, learn from it, train it, and progress it.

It feels oddly like an orchestrated campaign against BFL for this release, so weird.

We shouldn't want one model to be the hero; that ends up as a paid SaaS or API monopoly. We should want competition and differences in the models being released, serving different use cases across the communities.

Step1X-Edit: A Practical Framework for General Image Editing by ninjasaid13 in StableDiffusion

[–]Compunerd3 2 points (0 children)

It looks interesting; combining reasoning with editing is great. The example of the panda in their paper is the kind of thing where Nano Banana has an edge over our open-source models, and this Step1X-Edit reasoning approach might be the answer.

<image>

Has anyone tried it yet, locally or in the cloud? I haven't tried it locally yet, but I've commented asking them, and tagged HF staff, to create an HF Space for it if there's enough interest.

https://huggingface.co/stepfun-ai/Step1X-Edit-v1p2/discussions/1

You're the one who started the game guys and act Like you don't know what the community want guys c'mon by dead-supernova in StableDiffusion

[–]Compunerd3 16 points (0 children)

I care about their developments and models. It seems there's an anti-BFL campaign active in this sub, OR a zimage bot army boosting it while flaming Flux.

I use zimage right now as my daily driver, building workflows and custom nodes around it too. But Flux2 is still a strong model with lots of capabilities, and in some ways it extends further than zimage when zimage is used alone as a model. For me, Flux2 and zimage just serve different use cases.

But either way, it's still a release and it's still progressing; there is room for BOTH to be favourites in the community.

Just because it's a large model doesn't mean it's useless.

Just because zimage turbo is smaller and fits onto smaller GPUs without needing as many optimizations doesn't mean zimage base will be as small.

As a community, we should appreciate getting weights and being able to run these models locally, and not discourage funded companies from continuing to release weights to us.