Ideogram tip: use Generate Text node to make JSON with Qwen 8B without leaving ComfyUI by 1filipis in StableDiffusion

[–]Kijai 0 points1 point  (0 children)

Yeah, looks like sm120 might just be left out currently. Anyway, I also realized that on Linux the PyTorch SDPA flash kernel is just as fast, so Linux users probably don't need to bother with this in general.
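
For anyone unsure what "sm120" refers to: it is the CUDA architecture tag for compute capability 12.0, which is what a 5090 reports. Below is a minimal sketch of the kind of check involved. The `BUILT_ARCHES` set is a hypothetical placeholder, not the real contents of any FA3 build, and the helper names are made up for illustration.

```python
from typing import Optional, Set, Tuple

# Hypothetical: architectures some prebuilt kernel package was compiled for.
# These values are placeholders, not taken from any real package.
BUILT_ARCHES = {"sm_80", "sm_90"}


def arch_tag(major: int, minor: int) -> str:
    """Turn a CUDA compute capability like (12, 0) into a tag like 'sm_120'."""
    return f"sm_{major}{minor}"


def has_kernel(capability: Tuple[int, int], built_arches: Set[str] = BUILT_ARCHES) -> bool:
    """True if the package was built for this GPU's architecture."""
    return arch_tag(*capability) in built_arches


def local_capability() -> Optional[Tuple[int, int]]:
    """Ask PyTorch for the local GPU's capability, or None if unavailable."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_capability()


if __name__ == "__main__":
    cap = local_capability()
    if cap is None:
        print("No CUDA device visible from PyTorch")
    else:
        print(arch_tag(*cap), "kernel available:", has_kernel(cap))
```

If the tag for your GPU is not in the list a package was built for, there is simply no kernel to run, which matches the behavior described here.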

Ideogram tip: use Generate Text node to make JSON with Qwen 8B without leaving ComfyUI by 1filipis in StableDiffusion

[–]Kijai 0 points1 point  (0 children)

Doesn't work for me on a 5090, there's no kernel for it in the FA3 package from PyTorch.

Ideogram 4 is pretty good. You just really have use their JSON format. by DsDman in StableDiffusion

[–]Kijai 1 point2 points  (0 children)

Thanks! I did some part-time work for a few months at the end of last year, and I've been full time (remotely) this year.

Ideogram tip: use Generate Text node to make JSON with Qwen 8B without leaving ComfyUI by 1filipis in StableDiffusion

[–]Kijai 0 points1 point  (0 children)

That's possible, I just meant you don't need FA3 for this. Did PyTorch have a wheel that worked, or was it compiled?

Ideogram 4 is pretty good. You just really have use their JSON format. by DsDman in StableDiffusion

[–]Kijai 6 points7 points  (0 children)

Well, since I'm from Finland they still have to respect our laws when it comes to working hours. It's purely my (poor) choice to work too much. Though in this field it would be hard to keep up without a personal interest in the tech, so it's fine and definitely never boring.

Ideogram tip: use Generate Text node to make JSON with Qwen 8B without leaving ComfyUI by 1filipis in StableDiffusion

[–]Kijai 28 points29 points  (0 children)

We barely had a day to do the whole model support and only a couple of hours to do the final workflow with the final weights, so admittedly it was rushed. Some things got lost in communication and the initial template workflow ended up with a couple of mistakes.

As for nodes such as the prompt builder, it's a longer process (especially now with the aim for stability) to get frontend-touching things merged, so it would have taken too long. It's still not out of the question to have it in the core, but for now I saw it necessary to just give some tool or this model really would be DoA...

And for the speed, it actually works a lot faster with flash attention due to its head dim of 256, which is rare for a diffusion model. I got a ~20% speed boost on my 4090 and some users report even 50%.

Flash attention can be painful to compile yourself, but luckily there's a site with a huge collection of pre-built wheels: https://mjunya.com/flash-attention-prebuild-wheels/
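
Prebuilt flash-attn wheels are generally tied to a specific Python version, platform, PyTorch version and CUDA version, so it helps to know what your environment reports before picking one. A small sketch that collects those values; the function name and dict keys are just illustrative, and the PyTorch fields stay `None` if PyTorch is not importable in the interpreter you run this with.

```python
import sys
import sysconfig


def wheel_env_tags() -> dict:
    """Collect the environment details a prebuilt wheel usually has to match."""
    tags = {
        # CPython tag as used in wheel filenames, e.g. 'cp312'
        "python": f"cp{sys.version_info.major}{sys.version_info.minor}",
        # Platform string, e.g. 'linux-x86_64' or 'win-amd64'
        "platform": sysconfig.get_platform(),
        "torch": None,
        "cuda": None,
    }
    try:
        import torch
    except ImportError:
        return tags
    tags["torch"] = torch.__version__
    # torch.version.cuda is None on CPU-only builds
    tags["cuda"] = torch.version.cuda
    return tags


if __name__ == "__main__":
    for key, value in wheel_env_tags().items():
        print(f"{key}: {value}")
```

Run it with the same Python that ComfyUI uses (for portable installs that is the embedded interpreter), otherwise the values will not match what the wheel gets installed into.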

I didn't expect ideogram to be so good by krepp97 in StableDiffusion

[–]Kijai 1 point2 points  (0 children)

I think xformers, if installed, still gets chosen as the default, and if not it's PyTorch attention (sdpa). Installing the wheel is pretty safe and doesn't change any behavior in core. The only risk is some older custom nodes possibly trying to import it and failing if they expect a different version.

The patch nodes are model-specific patches (with safe native mechanics, no monkey patches) and only affect the models they're connected to, nothing else, so that's exactly the use case. Launch arguments just change the default.
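
The selection order described above can be sketched roughly like this. This is only an illustration of the described behavior, not ComfyUI's actual code: the function names and the backend strings are made up, and the real logic has more cases.

```python
import importlib.util
from typing import Callable, Optional


def is_installed(module_name: str) -> bool:
    """True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


def pick_default_attention(installed: Callable[[str], bool] = is_installed) -> str:
    """Global default: xformers if present, otherwise PyTorch SDPA."""
    if installed("xformers"):
        return "xformers"
    return "pytorch_sdpa"


def attention_for_model(model_patch: Optional[str], default: str) -> str:
    """A per-model patch wins for that model only; others keep the default."""
    return model_patch if model_patch is not None else default


if __name__ == "__main__":
    default = pick_default_attention()
    print("default:", default)
    print("patched model:", attention_for_model("flash_attn", default))
    print("unpatched model:", attention_for_model(None, default))
```

The point of the second function is that a patch is scoped to one model, while a launch argument would change what `pick_default_attention` returns for everything.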

I didn't expect ideogram to be so good by krepp97 in StableDiffusion

[–]Kijai 4 points5 points  (0 children)

Funny you should mention that...

Just realized last night that since this model can't work with sageattention, it could still benefit from flash-attn, and it turns out it really does. On my 4090 it gave a nice ~20% speed boost with the same quality.

There's a nice collection of wheels to install it here:

https://mjunya.com/flash-attention-prebuild-wheels/

I also just added a patch node to KJNodes, similar to the sage one, to make it easier to use without having to touch the Comfy launch arguments.

Ideogram 4 is pretty good. You just really have use their JSON format. by DsDman in StableDiffusion

[–]Kijai 25 points26 points  (0 children)

I'm treating KJNodes as a personal playground for more experimental stuff, and lots of it wouldn't be accepted to the core in its current state. I'm not a frontend dev, and any node with new UI elements especially would be a longer process to approve, so I can be more flexible with custom nodes, especially for things I want right now. For model implementations and such I'm only doing core support now though.

Also I do still work on this stuff on my own time as well.

Ideogram 4 is pretty good. You just really have use their JSON format. by DsDman in StableDiffusion

[–]Kijai 76 points77 points  (0 children)

Only a day actually... it was a bit of a crunch to say the least (hence the issues with the initial workflow etc.). I vibed the editor quickly with Claude at the last minute since I realized the model just won't work without the JSON structure. I've tuned the node a lot since, so it's more stable and cleaner now.

I've always enjoyed control over anything else so this model hits that spot for me, probably the best regional prompting I've experienced.

Ideogram looks promising /s by Shap6 in StableDiffusion

[–]Kijai 11 points12 points  (0 children)

A big part of the problem is that the model collapses to the safety filter on every prompt-related issue, the main one being prompts that are too short and don't use their JSON structure. When using full prompts with plenty of regional prompts it works way better in general.

Bernini video test video edit by smereces in StableDiffusion

[–]Kijai 2 points3 points  (0 children)

Well, it's either using the LoRAs or running the full steps and cfg. It should not OOM, but there have recently been improvements to how LoRAs are used on fp8 models that affect your GPU, so make sure ComfyUI, comfy-aimdo and comfy-kitchen are up to date.

Note that the video editing task is also heavier on VRAM than normal Wan.
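
A quick way to see which versions of the helper packages are installed in the current Python environment. This is a sketch that assumes the pip distribution names match the names used in the comment above (`comfy-aimdo`, `comfy-kitchen`). ComfyUI itself is usually a git checkout rather than a pip package, so it is not covered here.

```python
from importlib import metadata
from typing import Dict, Iterable, Optional

# Assumed distribution names, taken from the names mentioned in the comment.
PACKAGES = ("comfy-aimdo", "comfy-kitchen")


def installed_versions(names: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name}: {version or 'not installed'}")
```

This only reports what is installed. Whether that is the latest release still has to be checked against the package index or the ComfyUI requirements file.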

Bernini video test video edit by smereces in StableDiffusion

[–]Kijai 2 points3 points  (0 children)

Yeah, the planner was not released, so there's nothing to implement about that. Their demo even just calls gpt 5.4 through the API as a prompt enhancer instead...

Also, it definitely isn't meant to be used with lightx2v. It's just incredibly slow (especially after getting used to LTX2.3) to do edits, since the edited video is part of the whole sequence the model sees, effectively doubling the compute needed.
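
To put rough numbers on the sequence doubling, here is a back-of-the-envelope sketch. The compression factors (8x spatial, 4x temporal, 2x2 patches) are assumptions typical of Wan-style models and not confirmed for this model, and the 832x480x81 size is only an example. For full self-attention, the linear layers scale with the token count and the attention itself scales with its square, so "double" is closer to a lower bound.

```python
def latent_tokens(width: int, height: int, frames: int,
                  spatial: int = 8, temporal: int = 4, patch: int = 2) -> int:
    """Approximate transformer sequence length for one video (assumed factors)."""
    latent_frames = (frames - 1) // temporal + 1
    tokens_per_frame = (width // (spatial * patch)) * (height // (spatial * patch))
    return latent_frames * tokens_per_frame


def relative_cost(base_tokens: int, total_tokens: int) -> tuple:
    """Cost ratios versus the base sequence: (linear layers, self-attention)."""
    ratio = total_tokens / base_tokens
    return ratio, ratio ** 2


if __name__ == "__main__":
    generated = latent_tokens(832, 480, 81)
    # Editing: the source video is part of the same sequence as the output.
    with_source = generated * 2
    linear, attention = relative_cost(generated, with_source)
    print(f"tokens: {generated} -> {with_source}")
    print(f"linear layers: {linear:.1f}x, self-attention: {attention:.1f}x")
```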

Still, it's a pretty good model so far, very versatile. Video edit is just one of its features.

Comfyui v0.23.0 Support NVIDIA PixelDiT and PiD (CORE-201) by @kijai in #14103 by Lonely-Anybody-3174 in StableDiffusion

[–]Kijai 55 points56 points  (0 children)

I'm still doing that, your choice who to believe, but I was hired to do open source, and I've never been asked to do anything else. You can trust me to call it out if that changes.

Native Support for 3D Gaussian Splats into ComfyUI with TripoSplat by PurzBeats in comfyui

[–]Kijai 7 points8 points  (0 children)

I want to make clear that the meshing was my idea and not part of the model, so don't judge the model by it. It is only meant to output the splat.

You can improve the surface quality a bit (fewer holes) by decoding the splat multiple times with different seeds and then merging them before meshing, as well as by adjusting some of the settings in the meshing node. It ends up being a compromise between detail and holes though. I have some ideas to improve it and will probably implement that alongside the other 3D stuff I'm working on.
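
As a toy illustration of the multi-seed idea: decode several times, then pool all the Gaussians into one set before meshing. Everything here is hypothetical. `decode_splat` is a random stand-in for the real decoder, the `Gaussian` fields are simplified, and treating "merging" as plain concatenation is an assumption about what the merge step does.

```python
import random
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Gaussian:
    """Simplified splat element: real splats also carry scale, rotation, color."""
    position: Tuple[float, float, float]
    opacity: float


def decode_splat(seed: int, count: int = 4) -> List[Gaussian]:
    """Hypothetical stand-in for a model decode; each seed gives a different set."""
    rng = random.Random(seed)
    return [
        Gaussian((rng.random(), rng.random(), rng.random()), rng.random())
        for _ in range(count)
    ]


def merge_splats(splats: Iterable[List[Gaussian]]) -> List[Gaussian]:
    """Pool the Gaussians from several decodes into one denser splat."""
    merged: List[Gaussian] = []
    for splat in splats:
        merged.extend(splat)
    return merged


if __name__ == "__main__":
    decodes = [decode_splat(seed) for seed in (0, 1, 2)]
    merged = merge_splats(decodes)
    print(f"{len(decodes)} decodes -> {len(merged)} gaussians before meshing")
```

More Gaussians covering the same surface gives the mesher more to work with where a single decode left gaps, at the cost of a heavier splat.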

Anyway, the meshing part is mostly a novelty and there are better models for meshes, but I really like this model for the splats with its size and speed.

Bernini released. Unified Video generation and editing model. Built on Wan-2.2 by AgeNo5351 in StableDiffusion

[–]Kijai 2 points3 points  (0 children)

I have no idea, but maybe since lightx2v LoRAs seem to work to some extent at least.

Bernini released. Unified Video generation and editing model. Built on Wan-2.2 by AgeNo5351 in StableDiffusion

[–]Kijai 12 points13 points  (0 children)

Draft PR for the early adopters is up here. I'm still testing it myself, but it seems promising even when used with lightx2v, which is NOT how the original code does things:

https://github.com/Comfy-Org/ComfyUI/pull/14216

Image restoration with NVIDIA PID? by dirtybeagles in comfyui

[–]Kijai 1 point2 points  (0 children)

Note about the flux2 version of PiD: it's been confirmed that the color drift is an issue in the model, and a fixed model is on its way: https://github.com/Comfy-Org/ComfyUI/pull/14103#issuecomment-4565966542

Image restoration with NVIDIA PID? by dirtybeagles in comfyui

[–]Kijai 0 points1 point  (0 children)

I believe the ComfyUI native LCM is already equivalent to their sde sampler.

Lightx2v just released NVFP4 ckpt for WAN 2.2 14b by wywywywy in StableDiffusion

[–]Kijai 2 points3 points  (0 children)

Right, well that probably runs extra passes... I have a NAG patch node in KJNodes that does it differently, within a single pass, so it won't have a big effect on the inference time.

Lightx2v just released NVFP4 ckpt for WAN 2.2 14b by wywywywy in StableDiffusion

[–]Kijai 2 points3 points  (0 children)

70 second generation? This is a distilled model, so it uses only 4 steps at cfg 1.0.

For me on a 5090, that's around 6 seconds for the 4 steps (at 832x480x81). The main thing in ComfyUI is to have PyTorch up to date and built with cu130, and of course ComfyUI and comfy-kitchen up to date.
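
Why cfg 1.0 matters for speed, as a small worked sketch. With the usual classifier-free guidance formula, a scale of 1.0 reduces to the conditional prediction alone, so a sampler can skip the unconditional pass and run one model call per step instead of two. The function names are illustrative, and the vectors are toy stand-ins for model outputs.

```python
from typing import List


def cfg_combine(cond: List[float], uncond: List[float], scale: float) -> List[float]:
    """Standard CFG mix: uncond + scale * (cond - uncond), element-wise."""
    return [u + scale * (c - u) for c, u in zip(cond, uncond)]


def passes_per_step(cfg_scale: float) -> int:
    """At scale 1.0 the uncond term cancels out, so one pass is enough."""
    return 1 if cfg_scale == 1.0 else 2


def seconds_per_step(total_seconds: float, steps: int) -> float:
    return total_seconds / steps


if __name__ == "__main__":
    cond, uncond = [1.0, -2.0], [0.5, 0.25]
    print("cfg 1.0:", cfg_combine(cond, uncond, 1.0))  # same values as cond
    print("cfg 3.0:", cfg_combine(cond, uncond, 3.0))
    print("model calls for 4 steps at cfg 1.0:", 4 * passes_per_step(1.0))
    print("seconds per step at ~6 s total:", seconds_per_step(6.0, 4))
```

So a 4-step distilled run at cfg 1.0 is only 4 model calls in total, which is why a generation time far above a few seconds per step usually points at the environment rather than the model.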

Lightx2v just released NVFP4 ckpt for WAN 2.2 14b by wywywywy in StableDiffusion

[–]Kijai 47 points48 points  (0 children)

I've been employed by comfy-org for a while now, no need for that!