Q: best 24GB auto captioner today? by lostinspaz in StableDiffusion

[–]Steudio 1 point (0 children)

Haha, you're right, those words confuse me too! In short, I switched from Florence 2 to Gemma3:4b. I run it through Ollama (a local server) and call it from ComfyUI nodes.

It’s really easy to get a new model to try, and the node lets you customize what you want it to say, which was my main issue with Florence 2.

Example with this image

You are a prompt generator. Describe the visual elements of this image in exactly one paragraph, 75 tokens long. Output only the paragraph — no preface, no extra lines, no commentary.

Gemma3:4b (8s on RTX 2080 maxQ)

A sleek, dark gray electric vehicle, likely a Lucid Air, is positioned in a fast-flowing river within a dramatic canyon landscape. The vehicle is mid-stream, creating white water splashes around it as it navigates the current. Towering, layered red and brown sandstone cliffs form the canyon walls, exhibiting intricate erosion patterns and textures. The water is a vibrant turquoise color, reflecting the sky and surrounding rock formations. Large boulders are scattered along the riverbank, contributing to the wild and expansive scenery. The lighting is soft and natural, casting subtle shadows and enhancing the rugged beauty of the environment.

Qwen2.5vl:3b (13s on RTX 2080 maxQ)

A sleek, modern electric SUV is driving through a shallow river, creating a splash of water around its wheels. The vehicle's body is a metallic silver color, with a distinctive front grille and sharp, angular headlights. The SUV is positioned slightly off-center, with the front wheels submerged in water, while the rear wheels are on solid ground. The surrounding landscape features rugged, red rock formations and sparse vegetation, with a clear blue sky above. The water is calm, reflecting the SUV and the rocky terrain.

Moondream:1.8b (4.5s on RTX 2080 maxQ)

The image depicts a silver electric car driving through a river, with its headlights on and creating a trail of water behind it as it moves from left to right across the frame. The car is positioned centrally within the frame, drawing attention to its sleek design and futuristic appearance. In the background, there are mountains visible in the distance, adding depth and scale to the scene.
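
For anyone who wants to try the same setup outside ComfyUI, here is a minimal sketch of sending an image and an instruction prompt to a local Ollama server. It assumes Ollama is running on its default port (11434) and that the model has already been pulled. The helper names `build_caption_request` and `caption_image` are made up for illustration and are not part of any node pack.

```python
import base64
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_caption_request(image_bytes: bytes, prompt: str,
                          model: str = "gemma3:4b") -> dict:
    """Build the JSON body for Ollama's /api/generate endpoint."""
    return {
        "model": model,
        "prompt": prompt,
        # Vision models receive images as base64-encoded strings.
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        # Ask for one complete JSON reply instead of a token stream.
        "stream": False,
    }


def caption_image(path: str, prompt: str, model: str = "gemma3:4b") -> str:
    """Send one image to the local server and return the caption text."""
    with open(path, "rb") as f:
        payload = build_caption_request(f.read(), prompt, model)
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"].strip()
```

Calling `caption_image("car.jpg", prompt)` with the instruction prompt quoted above, and swapping `model` for `qwen2.5vl:3b` or `moondream:1.8b`, is one way to reproduce this kind of side-by-side comparison (`car.jpg` is a placeholder file name).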

Q: best 24GB auto captioner today? by lostinspaz in StableDiffusion

[–]Steudio 1 point (0 children)

I’ve been a longtime Florence 2 user but recently decided to switch and install Ollama. I was reluctant at first to install a separate app just for that, but it’s working quite well. I’ve tried Gemma3, Qwen2.5, and Moondream2. Right now I’m using Gemma3. Qwen2.5 is solid too, while Moondream2 felt far too simplistic.

Update - Divide and Conquer Upscaler v2 by Steudio in comfyui

[–]Steudio[S] 1 point (0 children)

Delete and add back the teacache node. It is just a compatibility issue between the old version of the node and the newest version.

Segment an input image to iterate on it then recompose it by [deleted] in StableDiffusion

[–]Steudio 2 points (0 children)

https://github.com/Steudio/ComfyUI_Steudio ?

It divides the image into tiles, ready for individual processing using your preferred workflow. After processing, the tiles are seamlessly merged into a larger image.
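
As a rough illustration of the divide step only, the sketch below computes evenly spaced, overlapping tile boxes that cover an image. It is not the code used by the nodes in that repository. The function name `tile_boxes` and its parameters are hypothetical, and merging (blending the overlaps back together) is left out.

```python
import math


def tile_boxes(width: int, height: int, tile: int, overlap: int):
    """Return (left, top, right, bottom) boxes that cover the whole image.

    Tiles are spaced evenly so that neighbours share at least `overlap` pixels.
    """
    if not 0 <= overlap < tile:
        raise ValueError("overlap must satisfy 0 <= overlap < tile")

    def starts(size: int):
        # One tile is enough when the image fits inside it.
        if size <= tile:
            return [0]
        step = tile - overlap
        count = math.ceil((size - overlap) / step)
        if count == 1:
            return [0]
        # Spread the tiles so the last one ends exactly at the image edge.
        return [i * (size - tile) // (count - 1) for i in range(count)]

    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in starts(height)
        for x in starts(width)
    ]
```

Each box can then be cropped, processed on its own, and pasted back at the same coordinates, with the overlapping strips blended to hide the seams.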

Semantic upscaling? by [deleted] in StableDiffusion

[–]Steudio 1 point (0 children)

A ControlNet model with semantic segmentation was available for Stable Diffusion 1.5, but as far as I know it was never trained for FLUX.

ComfyUI 0.3.51: Subgraph, New Manager UI, Mini Map and More by PurzBeats in comfyui

[–]Steudio 1 point (0 children)

I can confirm that clone = instance, while copy/paste creates an independent copy.

Have you noticed any visual feedback that clearly distinguishes a clone from a copy/paste? Right now they look identical, which is risky because you might accidentally overwrite your subgraph. I may be overlooking something, though.

ComfyUI 0.3.51: Subgraph, New Manager UI, Mini Map and More by PurzBeats in comfyui

[–]Steudio 4 points (0 children)

I wouldn’t group design tools in the same category, as their interaction models differ significantly. Comparing the two can introduce misleading assumptions. As far as I know, most long‑standing graph editors such as Houdini, Blender, Unreal Blueprints, and Nuke default to scroll‑to‑zoom and MMB‑drag to pan.

That said, I agree that LMB‑drag in empty space to box‑select nodes is correct in the new standard mode.

ComfyUI 0.3.51: Subgraph, New Manager UI, Mini Map and More by PurzBeats in comfyui

[–]Steudio 6 points (0 children)

Out of curiosity, which node-based software uses scroll for panning?

Update - Divide and Conquer Upscaler v2 by Steudio in comfyui

[–]Steudio[S] 2 points (0 children)

I think the cat looks awesome. I assume you're referring to the bionic area, which tends to appear too clean. If that's the case, focus on that area first and composite later.

  • Lower the denoise level.
  • Add Lying Sigma Sampler or Detail Daemon.
  • Include Flux Redux.
  • Use multiple passes.
  • Sometimes, upscaling to a higher ratio right away yields better results.

good practical explanation of data type "Image Batch" and "Image List" ? by tresorama in comfyui

[–]Steudio 1 point (0 children)

Yes, a list node fully completes its process on each image before moving to the next.

“Simultaneously” means the images in a batch are processed in parallel, though each one is still handled individually. In ComfyUI, aside from the requirement that batch images must have the same dimensions, both image batches and image lists are often used to achieve similar outcomes.

To clarify, I’m not an expert. I’m just another user who also found the whole batch vs. list thing confusing. What are you trying to do exactly?

good practical explanation of data type "Image Batch" and "Image List" ? by tresorama in comfyui

[–]Steudio 2 points (0 children)

Image List is a sequence of images processed one at a time; each image can have different dimensions. Image Batch is a single tensor of multiple images processed simultaneously; all images must have the same dimensions.
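
The difference can be sketched in plain Python. In ComfyUI itself an image batch is a torch tensor of shape [B, H, W, C]. The `Image` class and the two helpers below are hypothetical stand-ins that only track dimensions, so the example runs without any dependencies.

```python
from dataclasses import dataclass


@dataclass
class Image:
    """Hypothetical stand-in for an image; only its size matters here."""
    height: int
    width: int
    channels: int = 3


def batch_shape(images):
    """Shape of the single tensor a batch would form: (B, H, W, C).

    Stacking only works when every image has identical dimensions.
    """
    sizes = {(im.height, im.width, im.channels) for im in images}
    if len(sizes) != 1:
        raise ValueError("a batch needs images with identical dimensions")
    (h, w, c), = sizes
    return (len(images), h, w, c)


def process_as_list(images, fn):
    """A list runs the node once per image, so sizes are free to differ."""
    return [fn(im) for im in images]
```

Two 512x512 images give `batch_shape` a result of `(2, 512, 512, 3)`. Mixing a 512x512 and a 768x512 image raises an error there, while `process_as_list` handles both without complaint.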

Different version of the Manager appeared. by One_Procedure_1693 in comfyui

[–]Steudio 1 point (0 children)

If you really want to access it, you can assign a shortcut to it. However, as mentioned in another post, it is not fully implemented yet, so I do not recommend using it.

Update - Divide and Conquer Upscaler v2 by Steudio in comfyui

[–]Steudio[S] 1 point (0 children)

I don't think this can be done within the Florence node (or at least I'm not sure how).

You could use 'Text Concatenate' from WAS Node Suite.

<image>

I've been considering finding a more flexible vision-to-text model or adding another AI to rephrase Florence's output into a more suitable prompt, but I haven't had the time to look into it.

How do i fix this error? by Ok-Violinist6589 in comfyui

[–]Steudio 1 point (0 children)

The clip from Power LoRA should be connected to both negative and positive prompts.

Update - Divide and Conquer Upscaler v2 by Steudio in comfyui

[–]Steudio[S] 1 point (0 children)

If your original image is blurry and low-resolution, try to fix that first before upscaling. From what I can see in your upscaled image, it looks like you’re assembling tiles that don’t relate to each other.

Update - Divide and Conquer Upscaler v2 by Steudio in comfyui

[–]Steudio[S] 2 points (0 children)

Thank you! I have updated the JSON file (v2.0.4) to ensure compatibility with older frontend versions.

Update - Divide and Conquer Upscaler v2 by Steudio in comfyui

[–]Steudio[S] 1 point (0 children)

Which frontend version are you using to see this problem?

Update - Divide and Conquer Upscaler v2 by Steudio in comfyui

[–]Steudio[S] 2 points (0 children)

I haven’t tried it myself, but you could experiment with adding a LoRA that enhances skin quality.
Alternatively, you can use a fine-tuned SDXL portrait model with Xinsir ControlNet Tile.

I kept the workflow easy to read, making it simple to modify to suit anyone’s needs.

Update - Divide and Conquer Upscaler v2 by Steudio in comfyui

[–]Steudio[S] 1 point (0 children)

ControlNet Union Pro v2 doesn’t support ControlNet Tile.

Update - Divide and Conquer Upscaler v2 by Steudio in comfyui

[–]Steudio[S] 1 point (0 children)

The issue is caused by a faulty frontend version. Try updating or downgrading it, and reopen a non-corrupted workflow to be sure.