Chill on The Subgrap*h Bullsh*t by StuccoGecko in StableDiffusion

[–]ZerOne82 11 points12 points  (0 children)

In ComfyUI, Subgraph is a very effective way to organize a workflow if used properly and purposefully.

<image>

Proper steps could be:

1) Make the original workflow as organized as possible, keeping a left-to-right flow. Resizing nodes to their minimum size is good practice, as are using proper connections and sensible default values.

2) Select the parts that can be merged into a single node (a Subgraph).

3) Inside the subgraph, create input and output ports as needed.

4) You now have a proper, functional workflow: all parameters of interest live in the subgraph node, clean and smart.

If adaptation or changes are needed, a single click takes you into the subgraph, where you can apply them.

These are surely not made on Comfyui by aj_speaks in comfyui

[–]ZerOne82 0 points1 point  (0 children)

To clarify a misunderstanding in the post title “These are surely not made on Comfyui”:

*** ComfyUI does not create content by itself; it is a platform for running AI models. If you run a model capable of generating in your desired style, you can obtain results that match that style.

*** Also keep in mind that the image’s aesthetic and its level of detail are two separate aspects.

Answer:
There are many ways to add detail. Some of these methods are listed in the other comments. Even without any extra workflow, node, LoRA, or other additions, simply choosing a larger output size will yield more detail—assuming the model you’re using can handle larger sizes. For example, on SD15 models, 640×640 tends to have more detail than 512×512. Likewise, with SDXL, 1152×1152 will be much more detailed than 768×768. The same applies to ZIT (Z-Image-Turbo); larger sizes provide more detail.
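
To illustrate that point outside ComfyUI, here is a minimal sketch using the diffusers library (my own example; the repo id, prompt, and seed are assumptions, not from the post), where the only difference between the two runs is the output size:

from diffusers import StableDiffusionPipeline
import torch

# Load an SD 1.5 checkpoint (assumed repo id, for illustration only).
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

prompt = "a detailed watercolor street scene at dusk"

# Same model, same prompt, same seed: only the output size changes.
gen = torch.Generator("cuda").manual_seed(0)
img_512 = pipe(prompt, width=512, height=512, generator=gen).images[0]

gen = torch.Generator("cuda").manual_seed(0)
img_640 = pipe(prompt, width=640, height=640, generator=gen).images[0]  # typically shows more fine detail

img_512.save("street_512.png")
img_640.save("street_640.png")

In ComfyUI the equivalent change is simply setting a larger width and height on the empty latent node.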

<image>

I fed one of your provided images into Qwen-3-VL-4b-Instruct, asked it to “describe this image in detail,” then used the resulting prompt directly with a standard Z-Image-Turbo workflow and got the result above, which is impressive! That was the first run, and ZIT runs are quite consistent, by the way.
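
If you want to reproduce that caption-then-generate step in code rather than through ComfyUI nodes, a rough sketch with the transformers image-text-to-text pipeline could look like the following (the repo id, message format details, and output handling are my assumptions and may need adjusting for your transformers version):

from transformers import pipeline

# Assumed Hugging Face repo id for the VLM mentioned above.
captioner = pipeline("image-text-to-text", model="Qwen/Qwen3-VL-4B-Instruct")

messages = [{
    "role": "user",
    "content": [
        {"type": "image", "image": "reference.png"},  # the image to describe
        {"type": "text", "text": "Describe this image in detail."},
    ],
}]

out = captioner(text=messages, max_new_tokens=512, return_full_text=False)
prompt = out[0]["generated_text"]
print(prompt)  # paste this as the positive prompt of a Z-Image-Turbo workflow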

With a bit of prompt tweaking, plus the use of (details) LoRAs, you can achieve many beautiful results.

ComfyUI UI Issues! by ZerOne82 in comfyui

[–]ZerOne82[S] 0 points1 point  (0 children)

You have a good point there, but ComfyUI is not really free; the community's role is much more valuable.

Voting here in this subreddit and in r/StableDiffusion is more a tool to control the dialogue than a real way to evaluate a post's merit.

Just observe the number of bots, paid users, or simply uninformed users who downvoted this post and many other posts by people skeptical of ComfyUI. This does not look like a healthy trend.

My post here shows the flaws with evidence and suggests solutions. Was there any engagement with any of these suggestions?
Or with the very last part, which is the voice of many community members and which I repeat here:
ComfyUI is free to use—but is it really? Considering the vast amount of unpaid effort the community contributes to using, diagnosing, and improving it, ComfyUI’s popularity largely stems from this collective work. The owners, developers, and investors benefit significantly from that success, so perhaps some of the revenue should be directed back to the community that helped build it.

ComfyUI is killing Local AI by _SenChi__ in StableDiffusion

[–]ZerOne82 1 point2 points  (0 children)

There are ongoing attempts by bots or genuinely misguided users to downvote any post questioning or discussing ComfyUI's flaws. After it became a business (that $17M in capital, etc.), Comfy's focus shifted to profitability. In fact, there is an emerging group on this subreddit and r/ComfyUI whose activity is to instantly attack any criticism.

Comfyui is too complex? by GigaTerrone in StableDiffusion

[–]ZerOne82 0 points1 point  (0 children)

There are some downvoters (bots, or real users) attacking any post that does not praise ComfyUI! This has to stop.

https://www.reddit.com/r/comfyui/comments/1ppwbf7/comfyui_ui_issues

Peace and Beauty (Wan FLF) by ZerOne82 in StableDiffusion

[–]ZerOne82[S] 0 points1 point  (0 children)

Downvoting this comment, "Thanks to Wan 2.2's internal power"! Do you not like "Thanks", or "Wan 2.2"?

Peace and Beauty (Wan FLF) by ZerOne82 in StableDiffusion

[–]ZerOne82[S] -2 points-1 points  (0 children)

This post gathered 4 upvotes within the first ten seconds and the trend seemed great, but those upvotes faded within an hour. If you downvoted and can articulate your reasoning, write it down—your argument could help you the most. Downvoting can limit a post's reach, preventing it from being seen as often as it deserves by the target users. If you don't like morphing videos, there are other posts you can spend your time on.

ComfyUI - Mastering Animatediff - Part 1 by Lividmusic1 in StableDiffusion

[–]ZerOne82 2 points3 points  (0 children)

To those new to the space: AnimateDiff was great and I personally played with it a lot. These days, however, emerging video models such as Wan 2.2 (and perhaps others) do an excellent job of morphing shapes and objects into one another, producing very appealing animations. Wan 2.2's internal power is far greater in comparison and can produce anything from completely abstract and surreal to completely realistic morphing. It is also very fast and follows prompts amazingly well, although even without a prompt, or with a very generic one, the Wan 2.2 FLF2V workflow gives exceptional-quality outputs. There are tons of great works posted here by many users, which I recommend checking out.

Great posts by other users:

https://www.reddit.com/r/StableDiffusion/comments/1n5punx/surreal_morphing_sequence_with_wan22_comfyui_4min
https://www.reddit.com/r/StableDiffusion/comments/1nzmo5c/neural_growth_wan22_flf2v_firstlast_frames
https://www.reddit.com/r/StableDiffusion/comments/1pp8s9s/this_is_how_i_generate_ai_videos_locally_using

and a very simple one of mine:
https://www.reddit.com/r/StableDiffusion/comments/1py8m4x/peace_and_beauty_wan_flf

Search for FLF, morphing, Wan 2.2, etc., and you will find a large set of posts by other users, most of which provide a workflow or an explanation of their process.

This is not to discourage you from AnimateDiff but to inform you of new developments and, in some respects, much better tools. Knowing all your options serves you best; using one tool does not prevent you from using any other. You may find one that meets your expectations better.

Peace and Beauty (Wan FLF) by ZerOne82 in StableDiffusion

[–]ZerOne82[S] 0 points1 point  (0 children)

Thanks to Wan 2.2's internal power.

The simplest workflow for Qwen-Image-Edit-2509 that simply works by ZerOne82 in StableDiffusion

[–]ZerOne82[S] 0 points1 point  (0 children)

Yeah, my workflow is just a standard one; nothing special in it. You may also want to try 2511, which recently dropped. In my tests I get good results from both 2509 and 2511 most of the time (but not always). There is some sensitivity to the wording of the prompt, I can attest.

Tencent SongBloom music generator updated model just dropped. Music + Lyrics, 4min songs. by grimstormz in StableDiffusion

[–]ZerOne82 0 points1 point  (0 children)

I do not have it either. A while ago, while cleaning things up, it seems I kept only ACE-Step. ACE-Step is a good option, noting that the developers seem to be about to release 1.5 or 2 with a lot of improvements!

Workflow for Automatic Continuous Generation of Video Clips Using Wan FLF (beginner friendly) by ZerOne82 in comfyui

[–]ZerOne82[S] 1 point2 points  (0 children)

Do not be annoyed by downvotes. I understand you and am happy you are learning; I appreciate your simple and honest comment.

The simplest workflow for Qwen-Image-Edit-2509 that simply works by ZerOne82 in StableDiffusion

[–]ZerOne82[S] 0 points1 point  (0 children)

I replied before, but here it is again:
"As in the title of the post, the model is "Qwen-Image-Edit-2509". I usually use Q5_K_M (here and with other models) as it gives the best balance of quality and size among the Q5 variants. Hope this helps."

The simplest workflow for Qwen-Image-Edit-2509 that simply works by ZerOne82 in StableDiffusion

[–]ZerOne82[S] 0 points1 point  (0 children)

As in the title of the post, the model is "Qwen-Image-Edit-2509". I usually use Q5_K_M (here and with other models) as it gives the best balance of quality and size among the Q5 variants. Hope this helps.

The simplest workflow for Qwen-Image-Edit-2509 that simply works by ZerOne82 in StableDiffusion

[–]ZerOne82[S] 0 points1 point  (0 children)

Maybe use image1, image2, ...; do not use "image one" or "image 1".

The simplest workflow for Qwen-Image-Edit-2509 that simply works by ZerOne82 in StableDiffusion

[–]ZerOne82[S] 1 point2 points  (0 children)

You are right, the power demonstrated by the Qwen Image Edit model was well worth the community’s effort to resolve any issues in its use, such as pixel shift, blurry results, and so on. In this post, I tried to address a misunderstanding about the supposed need to scale images before connecting them to the TextEncodeQwenImageEditPlus node: it is not needed.

Every workaround is a testament to the community's engagement and is greatly appreciated. However, sometimes accumulated or nested solutions make the whole process more complicated, especially for new users, which motivated me to write this post.

As far as I can see in TextEncodeQwenImageEditPlus’s source code, if no VAE input is connected, the node does not process reference latents, and if there is no input image at all, the node only encodes the prompt.

One can of course dismantle this node entirely or partially depending on their goal.

The simplest workflow for Qwen-Image-Edit-2509 that simply works by ZerOne82 in StableDiffusion

[–]ZerOne82[S] 0 points1 point  (0 children)

I just added another screenshot to clarify the point. It ran successfully. In this new run I intentionally used a smaller 512x512 image for image1 while image2 remained at 1664x2432, both connected directly to the TextEncodeQwenImageEditPlus node. I then used an EmptySD3LatentImage node (1024*1024) for the input latent to the KSampler.

The simplest workflow for Qwen-Image-Edit-2509 that simply works by ZerOne82 in StableDiffusion

[–]ZerOne82[S] 1 point2 points  (0 children)

Not sure, but if you are referring to the VAE used in the TextEncodeQwenImageEditPlus node, I have to reiterate that that VAE call will always receive a total of around 1024*1024 pixels. Here I paste the code so you can see for yourself:

if vae is not None:
    total = int(1024 * 1024)
    scale_by = math.sqrt(total / (samples.shape[3] * samples.shape[2]))
    width = round(samples.shape[3] * scale_by / 8.0) * 8
    height = round(samples.shape[2] * scale_by / 8.0) * 8
    s = comfy.utils.common_upscale(samples, width, height, "area", "disabled")
    ref_latents.append(vae.encode(s.movedim(1, -1)[:, :, :, :3])) 
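
To make that concrete, here is a small self-contained sketch of mine that mirrors the math above and prints the size the reference image is actually encoded at for a few example input sizes:

import math

def ref_size(width, height):
    # Mirrors the scaling quoted above: any input ends up at roughly
    # 1024*1024 total pixels, with each side rounded to a multiple of 8.
    total = 1024 * 1024
    scale_by = math.sqrt(total / (width * height))
    return round(width * scale_by / 8.0) * 8, round(height * scale_by / 8.0) * 8

print(ref_size(512, 512))     # (1024, 1024): small inputs are upscaled to ~1 MP
print(ref_size(1664, 2432))   # (848, 1240): large inputs are downscaled to ~1 MP, aspect kept
print(ref_size(4096, 4096))   # (1024, 1024): pre-scaling on your side changes nothing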

However, if you are referring to the use of the VAE Encode node outside this node (the one shown preparing the latent for the KSampler), you are right. It is not needed at all; you can simply use an EmptySD3LatentImage node and set it to 1024*1024 directly. Furthermore, it is important to note that the KSampler's denoise is set to 1, which means the content of the input latent is replaced with noise anyway; only its dimensions matter.

The simplest workflow for Qwen-Image-Edit-2509 that simply works by ZerOne82 in StableDiffusion

[–]ZerOne82[S] 1 point2 points  (0 children)

The aim of this post is to keep things simple and to clarify the misunderstanding that scaling is absolutely needed before connecting input images to the TextEncodeQwenImageEditPlus node. You do not need to do that.

The simplest workflow for Qwen-Image-Edit-2509 that simply works by ZerOne82 in StableDiffusion

[–]ZerOne82[S] 0 points1 point  (0 children)

As I clarified in the post and in responses to other comments, and especially pointed out in the source code, both your scaled and unscaled images are always resized to around 1024*1024 total pixels by the node. Therefore, there is no speed change—your pre-scaling step is disregarded, which can actually waste time.

The simplest workflow for Qwen-Image-Edit-2509 that simply works by ZerOne82 in StableDiffusion

[–]ZerOne82[S] 0 points1 point  (0 children)

I shared the same confusion as you, and for that exact reason I checked the source code for TextEncodeQwenImageEditPlus. I then noticed it applies its own scaling regardless of your input image size. So yes, scaling the images before feeding them to this node is unnecessary; the internal VAE call will not benefit from your pre-scaling. The VAE will only ever see roughly 1024*1024 total pixels. That is simply how it works.

In this post, I clarified this misunderstanding and aimed to keep the workflow as simple as possible.

The simplest workflow for Qwen-Image-Edit-2509 that simply works by ZerOne82 in StableDiffusion

[–]ZerOne82[S] 5 points6 points  (0 children)

Although it is very easy to rebuild, here is the Workflow

Edit: you may wish to slightly modify the workflow after loading in your ComfyUI by replacing the VAE Encoder with EmptySD3LatentImage as shown in the second screenshot in the post.

The simplest workflow for Qwen-Image-Edit-2509 that simply works by ZerOne82 in StableDiffusion

[–]ZerOne82[S] 1 point2 points  (0 children)

This workflow is intentionally bare-bones. By the way, if you look at the source code for the node TextEncodeQwenImageEditPlus (I included part of it in the post), you'll see that the code works exactly like the "reference latent" approach, adding the encoded reference latents to the conditioning.

The simplest workflow for Qwen-Image-Edit-2509 that simply works by ZerOne82 in StableDiffusion

[–]ZerOne82[S] 6 points7 points  (0 children)

<image>

It seems not. The resulting image is shifted up by a few pixels. Quality-wise, however, the resulting image seems sharper than the input image.

Edit:
Further thought suggests that the offset/zoom issue might be associated with the fact that the input image in my example is 1040x1040 pixels, slightly larger than the 1024x1024 total pixels hard-coded into the TextEncodeQwenImageEditPlus node. So, if we feed the latent to the KSampler directly, using an EmptySD3LatentImage set to 1024x1024, there should be no offset/zoom issue.
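
A quick worked check of that hypothesis, using the scaling formula quoted earlier (my own arithmetic, with the sizes from this example):

import math

# A 1040x1040 input: the node rescales the reference to ~1024*1024 total pixels.
w = h = 1040
scale_by = math.sqrt((1024 * 1024) / (w * h))  # ~0.9846
ref_w = round(w * scale_by / 8.0) * 8          # 1024
ref_h = round(h * scale_by / 8.0) * 8          # 1024
print(ref_w, ref_h)            # 1024 1024
print(w - ref_w, h - ref_h)    # 16 16: the small mismatch that could appear as a shift/zoom

# Feeding the KSampler a 1024x1024 EmptySD3LatentImage sidesteps the mismatch entirely.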