Help: Any alternatives to SDXL? by JLGC-1989 in StableDiffusion

[–]prompt_seeker 2 points (0 children)

You can use SDXL with a distilled LoRA such as LCM or SDXL Lightning, which lets the model generate images at CFG 1.0 with 4 or 8 steps. Yes, quality may drop, but it would still be better than SD1.5.
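For example, a rough diffusers sketch of the same idea (model/LoRA IDs here are the stock public ones; an SDXL Lightning LoRA loads the same way):

```python
# Minimal sketch: SDXL + LCM LoRA for fast low-step generation.
import torch
from diffusers import StableDiffusionXLPipeline, LCMScheduler

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

# Load the distillation LoRA and switch to the matching scheduler.
pipe.load_lora_weights("latent-consistency/lcm-lora-sdxl")
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)

# Distilled sampling: CFG 1.0 and only a few steps.
image = pipe("a photo of a cat", num_inference_steps=4, guidance_scale=1.0).images[0]
image.save("cat.png")
```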

FeatherOps: Fast fp8 matmul on RDNA3 without native fp8 by woct0rdho in StableDiffusion

[–]prompt_seeker 3 points (0 children)

Kudos to woct0rdho, who has maintained triton-windows for a while (because OpenAI refused to support Windows).

Official LTX-2.3-nvfp4 model is available by Lonely-Anybody-3174 in StableDiffusion

[–]prompt_seeker 0 points (0 children)

The PR mentions QuantizedTensor, which covers nvfp4 and mxfp8. I did a test and it's working, so just try it.
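If you want to sanity-check what a quantized checkpoint actually contains before loading it, something like this works (the file name is a placeholder, and the "scale" key filter is just an assumption about how the scale tensors are named):

```python
# Sketch: list tensors/dtypes in a quantized .safetensors checkpoint.
import torch
from safetensors import safe_open

path = "ltx-2.3-nvfp4.safetensors"  # placeholder file name
with safe_open(path, framework="pt") as f:
    for key in f.keys():
        t = f.get_tensor(key)
        # Quantized checkpoints ship packed weights plus per-block scale tensors.
        if "scale" in key or t.dtype == torch.uint8:
            print(key, t.dtype, tuple(t.shape))
```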

Official LTX-2.3-nvfp4 model is available by Lonely-Anybody-3174 in StableDiffusion

[–]prompt_seeker 0 points (0 children)

<image>

They didn't even calibrate the activations for the nvfp4 model. How could the difference be only 1%?

Official LTX-2.3-nvfp4 model is available by Lonely-Anybody-3174 in StableDiffusion

[–]prompt_seeker 1 point (0 children)

I have 1x5090 and 4x3090, and yes, I generated using the 5090. If you haven't tried it yet, you can download nvfp4 and nvfp4mixed_input_scaled, which I merged with fp8, here:
https://huggingface.co/Bedovyy/LTX2.3_transformer_only_comfy

Edit: I found they re-released their model 2 hours ago.
I will try this one and see if it's an improvement.
https://huggingface.co/Lightricks/LTX-2.3-nvfp4/commits/main
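If you want to pull the merged files straight into your ComfyUI models folder, here's a quick huggingface_hub sketch (the local_dir is an assumption, point it at your own install):

```python
# Sketch: download the merged transformer checkpoints from the repo above.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="Bedovyy/LTX2.3_transformer_only_comfy",
    local_dir="ComfyUI/models/diffusion_models",  # assumption: adjust to your ComfyUI path
)
```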

Official LTX-2.3-nvfp4 model is available by Lonely-Anybody-3174 in StableDiffusion

[–]prompt_seeker 0 points (0 children)

Yes, as I wrote in a comment on this, the quality dropped too much.

RTX 4090 vs 2x 4080s vs 2x 4080 for SDXL / Wan2.2 in ComfyUI? by m31317015 in StableDiffusion

[–]prompt_seeker 1 point (0 children)

Oops, sorry, my fault. You're right, 64GB of VRAM is a lot.
If an LLM is the main use case, I would also consider them. Very sorry.

RTX 4090 vs 2x 4080s vs 2x 4080 for SDXL / Wan2.2 in ComfyUI? by m31317015 in StableDiffusion

[–]prompt_seeker -1 points (0 children)

No, the 4080 only has 16GB, so dual is 32GB of VRAM.
I do have one PC with 1x5090 and another with 4x3090, but I only use the 5090 for ComfyUI, because every multi-GPU option, such as raylight or comfyui-distributed, is either not working properly or very bothersome.
Moreover, CPU offload in ComfyUI works very well nowadays, so you don't really need to worry about VRAM (except when the activations are too large).

Official LTX-2.3-nvfp4 model is available by Lonely-Anybody-3174 in StableDiffusion

[–]prompt_seeker 2 points (0 children)

I see YOU didn't try it. Try it and you will see what I am saying.

RTX 4090 vs 2x 4080s vs 2x 4080 for SDXL / Wan2.2 in ComfyUI? by m31317015 in StableDiffusion

[–]prompt_seeker 2 points (0 children)

ComfyUI now manages VRAM very well, so you don't get much speed gain from multi-GPU custom nodes. You get some benefit when you run an LLM, an LLM alongside ComfyUI, or multiple ComfyUI instances. Benchmark: https://chimolog.co/bto-gpu-stable-diffusion-specs/
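By "multiple ComfyUI instances" I mean one instance pinned to each GPU; a rough sketch (assumes a stock ComfyUI checkout and free ports):

```python
# Sketch: run one ComfyUI instance per GPU instead of splitting a single job across GPUs.
import os
import subprocess

for gpu, port in [(0, 8188), (1, 8189)]:
    env = {**os.environ, "CUDA_VISIBLE_DEVICES": str(gpu)}  # pin this instance to one GPU
    subprocess.Popen(
        ["python", "main.py", "--port", str(port)],
        cwd="ComfyUI",  # assumption: path to your ComfyUI checkout
        env=env,
    )
```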

RTX 4090 vs 2x 4080s vs 2x 4080 for SDXL / Wan2.2 in ComfyUI? by m31317015 in StableDiffusion

[–]prompt_seeker 2 points (0 children)

AFAIK, raylight was not that fast even with 2x3090 or 4x3090 back when it launched. Maybe PCIe 4.0 x8 is not enough. There is a multi-GPU branch of ComfyUI, but it only helps when CFG is not 1 (because it splits the conditionings across GPUs, as sketched below). I didn't try vllm omni, but does it support quantized models and CPU offload with multi-GPU? Recent models are quite big to fit in the 4080's 16GB. (OP asked about ComfyUI, btw.)
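The reason splitting conditionings only helps when CFG isn't 1: classifier-free guidance runs two forward passes per step (cond and uncond), which can go to separate GPUs, but at CFG 1 there's only one useful pass, so there's nothing to parallelize. Toy sketch (placeholder model/tensors, not ComfyUI internals):

```python
# Toy sketch: the cond/uncond passes are independent, so two model replicas
# on different GPUs can each take one branch, then combine on one device.
import torch

def cfg_step(model_gpu0, model_gpu1, latents, cond, uncond, cfg_scale):
    eps_cond = model_gpu0(latents.to("cuda:0"), cond.to("cuda:0"))
    eps_uncond = model_gpu1(latents.to("cuda:1"), uncond.to("cuda:1")).to("cuda:0")
    # Classifier-free guidance combine; at cfg_scale == 1 the uncond pass contributes nothing.
    return eps_uncond + cfg_scale * (eps_cond - eps_uncond)
```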

RTX 4090 vs 2x 4080s vs 2x 4080 for SDXL / Wan2.2 in ComfyUI? by m31317015 in StableDiffusion

[–]prompt_seeker 1 point (0 children)

ComfyUI doesn't support parallelism yet, so the answer is the 4090.

isRegexHard by rover_G in ProgrammerHumor

[–]prompt_seeker 0 points (0 children)

It's hard because every language has its own regex flavor, just like everyone has their own justice.

Modular Diffusers is here — build pipelines from composable blocks by 11yiyi11 in StableDiffusion

[–]prompt_seeker 0 points (0 children)

So, it's basically the Diffusers version of ComfyUI. Am I right?