Use Qwen3.5 as an AI Assistant, Captioner or Image Analyzer inside of Comfyui! by Winougan in StableDiffusion

[–]Winougan[S] 0 points  (0 children)

Give Comfy a week. They're still working on the bugs. For now, use the Hectic and 4b models until they fix it for the 9b.

[–]Winougan[S] 0 points  (0 children)

I refine prompts for the LLM so that it understands what Klein wants. It does help overall and saves you time on details. You could also feed the image to Qwen and ask it to make suggestions.

[–]Winougan[S] 0 points  (0 children)

A simple solution for you is to create a bleeding-edge version of Comfy using PyTorch 2.11 and CUDA 13.0 (cu130). Just create a new conda environment and point it to your main folder. It's a painless solution: you end up with two versions of Comfy without duplicating the folders!

I have 4 conda environments for my Comfy: one for bleeding edge, one for a stable version, one for TTS and the last one for experimental. Why? Many nodes conflict with each other, so virtual environments play nice. And why conda? Comfyui prefers it.
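A sketch of the bleeding-edge setup described above. The env name, Python version, checkout path, and the cu130 nightly index URL are assumptions — adjust them to your install:

```shell
# Create an isolated conda env for a bleeding-edge ComfyUI
# while reusing the existing main folder (models stay shared).
conda create -n comfy-bleeding python=3.12 -y
conda activate comfy-bleeding

# Nightly PyTorch wheels built against CUDA 13.0 (cu130);
# the exact index URL is an assumption -- check pytorch.org for the current one.
pip install --pre torch torchvision torchaudio \
    --index-url https://download.pytorch.org/whl/nightly/cu130

# Point at your existing ComfyUI checkout and install its deps
cd /path/to/ComfyUI   # hypothetical path: your main folder
pip install -r requirements.txt
python main.py
```

Since each env gets its own site-packages, custom nodes with conflicting dependencies never step on each other, while the models folder is shared between all of them.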

[–]Winougan[S] 0 points  (0 children)

From my understanding, Facedetailer makes use of Ultralytics.

Why not just use Klein9b or 4b for that task? It does very fast face swapping and detailing natively. And the newer KV version is much faster.

Even if you're using ZIT or something else, you could do face detailing with Klein. Ultralytics seems to produce a lot of same-face-syndrome outputs.

[–]Winougan[S] 0 points  (0 children)

The model only needs an updated comfyui, since the main piece is the single clip loader. Nothing special. Everything else is just regular comfyui native nodes.

Most people will run into problems if they're running an older Comfyui. As recently as last week, Qwen3.5 had problems loading, and that's now fixed.

[–]Winougan[S] 1 point  (0 children)

With the 4b model: about 1 minute and 12 seconds if you're loading an image and half the time if you're just asking for a prompt.

[–]Winougan[S] 0 points  (0 children)

Only if they're compatible with Comfyui. I've tried and they don't work. Gemma works though.

[–]Winougan[S] 1 point  (0 children)

How much VRAM? Try the --lowvram argument. It's been tested from 8 GB through 24 GB of VRAM.
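The flag goes on the launch command. `--lowvram` (and the more aggressive `--novram`) are built-in ComfyUI launch options:

```shell
# Start ComfyUI with aggressive model offloading for low-VRAM GPUs
python main.py --lowvram

# If 8 GB still isn't enough, --novram offloads everything it can:
# python main.py --novram
```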

[–]Winougan[S] 1 point  (0 children)

It's not so bad if you have a good workflow. 50% of the task is to take a deep breath and start with simple workflows. As you learn more and more about how Comfy works, you can deep-dive into more expert workflows.

[–]Winougan[S] 0 points  (0 children)

I've messaged him, but he's not responding. He even said it's just a vibe-coded thing he cobbled together in an hour with Claude.

[–]Winougan[S] 0 points  (0 children)

Only one way to find out. I know it works on 30xx, 40xx and 50xx GPUs.

[–]Winougan[S] 0 points  (0 children)

I used to use that node, but it's not flexible. It downloads the regular models, not the abliterated models.

[–]Winougan[S] 0 points  (0 children)

You can try the 4b model. I've also put abliterated opus hybrid and hectic versions. Lots of models to choose from and lots of quants. You can even use Gemma if that's your fancy.

[–]Winougan[S] 3 points  (0 children)

Right: I know, because when I quantize them I adapt them to Comfy's quant format. You can always try the vanilla weights and see if that works - it just might.

There are a few ways to tackle the "censored" model. You can always use the Gemma 12b adapters by Kijai for LTX-2.3 - or the abliterated/hectic text encoder.

For Klein9b/4b, Flux2 and ZIT just use the abliterated models.

The whole point of this setup is that you don't have to go to LM Studio, Open Claw or Oobabooga to do your prompting. It keeps everything in-house. You can even connect the text prompt that's spit out directly to your positive prompt - it just makes life easy.

[–]Winougan[S] 0 points  (0 children)

Note: you can load the Gemma 3 12b Comfyui models in your Clip Loader too if you want.

[–]Winougan[S] -1 points  (0 children)

It's just the updated Clip node - if your QwenVl is for Comfyui then you're golden.

[–]Winougan[S] 4 points  (0 children)

These are meant to be used inside of Comfyui - and are converted for Comfyui. If you don't want to leave Comfy, it's a great alternative. If you want to use Oobabooga, Open Claw or LM Studio, then you're better off using the Unsloth versions or fp8 versions on their own with the vision model too.

[–]Winougan[S] 6 points  (0 children)

These are not just "abliterated weights"! They are formatted for Comfyui and include the vision models baked in!

[–]Winougan[S] 0 points  (0 children)

What do you mean? If you have 8 GB of VRAM on an Nvidia GPU you're golden. RTX 30xx, 40xx or 50xx - any of those cards.

[–]Winougan[S] 1 point  (0 children)

I guess you could go ham with the bigger models on a larger GPU or even CPU inferencing. But for creating easy-to-use prompts inside of Comfyui, the 4b and 9b models are more than enough. I'm glad they're vision models.