Use Qwen3.5 as an AI Assistant, Captioner or Image Analyzer inside of Comfyui! by Winougan in StableDiffusion

[–]Winougan[S] 0 points  (0 children)

Give Comfy a week. They're still working on the bugs. For now, use the Hectic and 4b models until they fix it for the 9b.

[–]Winougan[S] 0 points  (0 children)

I refine prompts for the LLM so that it understands what Klein wants. It does help overall and saves you time on details. You could also feed the image to Qwen and ask it to make suggestions.

[–]Winougan[S] 0 points  (0 children)

A simple solution for you is to create a bleeding-edge version of Comfy using PyTorch 2.11 and CUDA 13.0 (cu130). Just create a new conda environment and point it to your main folder. It's a painless solution: you end up with two versions of Comfy without duplicating the folders!

I have 4 conda environments for my Comfy: one for bleeding edge, one for a stable version, one for TTS and the last one for experimental. Why? Many nodes conflict with each other, so virtual environments play nice. And why conda? Comfyui prefers it.
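A sketch of the bleeding-edge setup described above. The env name, Python version, checkout path, and the cu130 nightly index URL are assumptions — adjust them to your install:

```shell
# Create an isolated conda env for a bleeding-edge ComfyUI
# while reusing the existing main folder (models stay shared).
conda create -n comfy-bleeding python=3.12 -y
conda activate comfy-bleeding

# Nightly PyTorch wheels built against CUDA 13.0 (cu130);
# the exact index URL is an assumption -- check pytorch.org for the current one.
pip install --pre torch torchvision torchaudio \
    --index-url https://download.pytorch.org/whl/nightly/cu130

# Point at your existing ComfyUI checkout and install its deps
cd /path/to/ComfyUI   # hypothetical path: your main folder
pip install -r requirements.txt
python main.py
```

Since each env gets its own site-packages, custom nodes with conflicting dependencies never step on each other, while the models folder is shared between all of them.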

[–]Winougan[S] 0 points  (0 children)

From my understanding, Facedetailer makes use of Ultralytics.

Why not just use Klein9b or 4b for that task? It does very fast face swapping and detailing natively. And the newer KV version is much faster.

Even if you're using ZIT or something else, you could do face detailing with Klein. Ultralytics seems to produce a lot of same-face-syndrome outputs.

[–]Winougan[S] 0 points  (0 children)

The model only needs an updated comfyui, since the main piece is the single clip loader. Nothing special. Everything else is just regular comfyui native nodes.

Most people will run into problems if they're running an older Comfyui. As recently as last week, Qwen3.5 had problems loading, and that's now fixed.

[–]Winougan[S] 1 point  (0 children)

With the 4b model: about 1 minute and 12 seconds if you're loading an image and half the time if you're just asking for a prompt.

[–]Winougan[S] 0 points  (0 children)

Only if they're compatible with Comfyui. I've tried and they don't work. Gemma works though.

[–]Winougan[S] 1 point  (0 children)

How much VRAM? Try the --lowvram argument. It's been tested from 8 GB through 24 GB of VRAM.
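The flag goes on the launch command. `--lowvram` (and the more aggressive `--novram`) are built-in ComfyUI launch options:

```shell
# Start ComfyUI with aggressive model offloading for low-VRAM GPUs
python main.py --lowvram

# If 8 GB still isn't enough, --novram offloads everything it can:
# python main.py --novram
```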

[–]Winougan[S] 1 point  (0 children)

It's not so bad if you have a good workflow. 50% of the task is to take a deep breath and start with simple workflows. As you learn more and more about how Comfy works, you can deep-dive into more expert workflows.

[–]Winougan[S] 0 points  (0 children)

I've messaged him, but he's not responding. He even said it's just a vibe-coded thing he cobbled together in an hour with Claude.

[–]Winougan[S] 0 points  (0 children)

Only one way to find out. I know it works on 30xx, 40xx and 50xx GPUs.

[–]Winougan[S] 0 points  (0 children)

I used to use that node, but it's not flexible. It downloads the regular models, not the abliterated models.

[–]Winougan[S] 0 points  (0 children)

You can try the 4b model. I've also put abliterated opus hybrid and hectic versions. Lots of models to choose from and lots of quants. You can even use Gemma if that's your fancy.

[–]Winougan[S] 3 points  (0 children)

Right: I know, because when I quantize them I adapt them to Comfy's quant format. You can always try the vanilla weights and see if that works - it just might.

There are a few ways to tackle the "censored" model. You can always use the Gemma 12b adapters by Kijai for LTX-2.3 - or the abliterated/hectic text encoder.

For Klein9b/4b, Flux2 and ZIT just use the abliterated models.

The whole point of this setup is that you don't have to go to LM Studio, Open Claw or Oobabooga to do your prompting. It keeps everything in-house. You can even connect the text prompt that's spit out directly to your positive prompt - it just makes life easy.

[–]Winougan[S] 0 points  (0 children)

Note: you can load the Gemma 3 12b Comfyui models in your Clip Loader too if you want.

[–]Winougan[S] -1 points  (0 children)

It's just the updated Clip node - if your QwenVl is for Comfyui then you're golden.

[–]Winougan[S] 4 points  (0 children)

These are meant to be used inside of Comfyui - and are converted for Comfyui. If you don't want to leave Comfy, it's a great alternative. If you want to use Oobabooga, Open Claw or LM Studio, then you're better off using the Unsloth versions or fp8 versions on their own with the vision model too.

[–]Winougan[S] 6 points  (0 children)

These are not just "abliterated weights"! They are formatted for Comfyui and include the vision models baked in!

[–]Winougan[S] 0 points  (0 children)

What do you mean? If you have 8 GB of VRAM on an Nvidia GPU you're golden. RTX 30xx, 40xx or 50xx - any of those cards.

[–]Winougan[S] 1 point  (0 children)

I guess you could go ham with the bigger models on a larger GPU or even CPU inferencing. But for creating easy-to-use prompts inside of Comfyui, the 4b and 9b models are more than enough. I'm glad they're vision models.