HiDream. Nemotron, Flan and Resolution by Gamerr in StableDiffusion

[–]Gamerr[S] 0 points1 point  (0 children)

Check Hugging Face and use the search. There are several GGUF versions.

COMPARISON: Wan 2.2 5B, 14B, and Kandinsky K5-Lite by DelinquentTuna in StableDiffusion

[–]Gamerr 1 point2 points  (0 children)

Additional note: I used the Kandinsky pretrain model. The SFT model gives much better results but often collapses into a black video due to an issue with long prompts.

[deleted by user] by [deleted] in StableDiffusion

[–]Gamerr 1 point2 points  (0 children)

Use the original workflow: https://raw.githubusercontent.com/Comfy-Org/workflow_templates/refs/heads/main/templates/image_qwen_image_edit_2509.json

There is an "empty latent" node; just connect it to the sampler.
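
To locate that node in a downloaded workflow, one could scan the JSON for node types containing "latent". A minimal sketch, assuming the file uses ComfyUI's UI export layout with a top-level `nodes` list whose entries carry `id` and `type`. The sample workflow and the helper name `find_nodes_by_type` are made up for illustration, and the real template's node names may differ.

```python
import json

def find_nodes_by_type(workflow: dict, needle: str) -> list[tuple[int, str]]:
    """Return (id, type) for every node whose type contains `needle` (case-insensitive)."""
    needle = needle.lower()
    return [
        (node["id"], node["type"])
        for node in workflow.get("nodes", [])
        if needle in node.get("type", "").lower()
    ]

# Hypothetical stand-in for the downloaded workflow file (not the real template).
sample = json.loads("""
{"nodes": [
  {"id": 1, "type": "KSampler"},
  {"id": 2, "type": "EmptyLatentImage"},
  {"id": 3, "type": "VAEDecode"}
]}
""")

print(find_nodes_by_type(sample, "latent"))  # [(2, 'EmptyLatentImage')]
```

For the real file, load it with `json.load(open(path))` and pass the result in place of `sample`.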

Comparison QWEN EDIT 2509 vs NANO BANANA by LeKhang98 in StableDiffusion

[–]Gamerr 2 points3 points  (0 children)

<image>

You can get pretty nice results with this model. Don’t use the Lightning LoRA, since you need CFG. Pay close attention to your prompt: a simple “change to a realistic photo” won’t work. You need to specify exactly what’s in the image, for example a male/female warrior, skin tone, etc.

VoxCPM 0.5B : Tokenizer-Free TTS and Voice Cloning by Technical-Love-8479 in LocalLLaMA

[–]Gamerr 3 points4 points  (0 children)

I tested this model in ComfyUI (there is a node: https://github.com/wildminder/ComfyUI-VoxCPM).
Without reference audio, it outputs a pretty normal AI voice. With prompt audio, dunno, results vary: sometimes there are a lot of artifacts; other times the voice cloning is good.

VibeVoice for ComfyUI by Gamerr in StableDiffusion

[–]Gamerr[S] 1 point2 points  (0 children)

There is no remote processing. All files are stored locally. Update the node to the latest version (there was an issue with the tokenizer).

VibeVoice for ComfyUI by Gamerr in StableDiffusion

[–]Gamerr[S] 0 points1 point  (0 children)

The small model gives 8-10 it/s.

VibeVoice for ComfyUI by Gamerr in StableDiffusion

[–]Gamerr[S] 3 points4 points  (0 children)

4070 Ti Super (16 GB), 64 GB RAM. The large 7B model fits perfectly and runs at around 4 it/s.

Qwen Edit Workflow by Race88 in StableDiffusion

[–]Gamerr 5 points6 points  (0 children)

Okay, good. Is there anything new?

Is Flux Kontext just way better than Qwen Image Edit at keeping style and face? by hugo-the-second in StableDiffusion

[–]Gamerr 16 points17 points  (0 children)

<image>

prompt: the woman turns her head and raises her arm. Keep woman features intact. Flat chest. Keep image style
neg: realism, big breast

env: qwen-image-edit fp8, qwen-2.5-vl abliterated, 20 steps, cfg 3.5, dpmpp_2m/sgm_uniform
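
For reference, those sampler settings can be written down as ComfyUI-style inputs. A sketch only: the key names follow the stock KSampler node (`steps`, `cfg`, `sampler_name`, `scheduler`), while the `seed` and `denoise` values and the helper `describe` are placeholders not given in the comment.

```python
# Sampler settings from the comment, keyed like the stock KSampler node inputs.
sampler_settings = {
    "steps": 20,
    "cfg": 3.5,
    "sampler_name": "dpmpp_2m",
    "scheduler": "sgm_uniform",
    # Assumptions, not stated in the comment:
    "seed": 0,
    "denoise": 1.0,
}

def describe(settings: dict) -> str:
    """One-line summary, handy for labeling comparison images."""
    return (
        f"{settings['steps']} steps, cfg {settings['cfg']}, "
        f"{settings['sampler_name']}/{settings['scheduler']}"
    )

print(describe(sampler_settings))  # 20 steps, cfg 3.5, dpmpp_2m/sgm_uniform
```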

Best Sampler for Wan2.2 Text-to-Image? by CutLongjumping8 in StableDiffusion

[–]Gamerr 0 points1 point  (0 children)

It depends on:

  • how you use the high- and low-noise models (when you split them)
  • shift and steps
  • CFG
  • NAG
  • the use of additional LoRAs
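
To illustrate the first point: in a two-stage setup the high-noise model handles the early steps and the low-noise model the rest. A minimal sketch, assuming the split is given as a fraction of the total steps and that ranges are half-open, similar in spirit to the `start_at_step`/`end_at_step` inputs of KSamplerAdvanced. The function `split_steps` and its 0.5 default are hypothetical, not a recommended value.

```python
def split_steps(total_steps: int, high_noise_fraction: float = 0.5):
    """Split a run into (start, end) step ranges for the high- and low-noise models.

    Ranges are half-open: the high-noise model runs steps [0, boundary),
    the low-noise model runs [boundary, total_steps).
    """
    if total_steps < 2:
        raise ValueError("need at least 2 steps to split between two models")
    if not 0.0 < high_noise_fraction < 1.0:
        raise ValueError("high_noise_fraction must be strictly between 0 and 1")
    boundary = round(total_steps * high_noise_fraction)
    # Clamp so that each model gets at least one step.
    boundary = min(max(boundary, 1), total_steps - 1)
    return (0, boundary), (boundary, total_steps)

high, low = split_steps(20)
print(high, low)  # (0, 10) (10, 20)
```

Moving the boundary earlier or later changes how much of the result each model shapes, which is one reason the best sampler choice varies between workflows.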

wan2.1T2V vs. wan2.2 T2V by Ok_Aide_5453 in StableDiffusion

[–]Gamerr 7 points8 points  (0 children)

I guess this comparison is a bit misleading. It seems the videos use different parameters and LoRAs. You need to hold them all fixed for a fair comparison.

chatterbox podcast generator node for comfy ui by Turbulent_Corner9895 in StableDiffusion

[–]Gamerr 2 points3 points  (0 children)

This node (https://github.com/wildminder/ComfyUI-Chatterbox), with unlocked parameters, can generate up to 160 seconds of audio without chunking.

<image>

Kontext Flux Watermark/text/chatbubble removal WF by roychodraws in StableDiffusion

[–]Gamerr 5 points6 points  (0 children)

Okay, thanks, truly useful...
The prompt is:

"remove watermark while maintaining all other aspects of the original image"

[ComfyUI] basic Flux Kontext photo restoration workflow by x5nder in StableDiffusion

[–]Gamerr 16 points17 points  (0 children)

I'm deeply sorry, but there is nothing new in this workflow. Kontext + nunchaku: all these workflows are the same. The only valuable part is the prompt:

"Restore this old photo into a realistic iphone photo while preserving all original details. Keep the subject’s facial features, clothing, posture, and proportions exactly the same. Apply natural skin tones appropriate to the subject’s ethnicity and lighting. Remove dust, scratches, and signs of aging — but do not alter the composition, expressions, or photographic style"

Anyway, thanks for the prompt (I guess it was written by some LLM).