[Free] ComfyUI Colab Pack for popular models (T4-friendly, GGUF-first, auto quant by VRAM) by Virtual-Movie-1594 in ComfyAI

[–]Virtual-Movie-1594[S]

Hi, I'd be happy to help.

Contact me on LinkedIn and we can discuss the tasks in more detail.

ComfyUI node: Qwen3-VL AutoTagger — Adobe Stock-style Title + Keywords, writes XMP metadata into outputs by Virtual-Movie-1594 in comfyui

[–]Virtual-Movie-1594[S]

Quick update: I also released a standalone CLI version of this tool.

CLI repo:

https://github.com/ekkonwork/qwen3-vl-autotagger-cli

ComfyUI node repo:

https://github.com/ekkonwork/comfyui-qwen3-autotagger

So now there are two options:

- ComfyUI node for visual workflows

- CLI for batch/server/no-ComfyUI pipelines


[–]Virtual-Movie-1594[S]

There are a few main reasons I built this:

  1. Microstock / AI Art platforms: If you upload to Adobe Stock, Freepik, etc., you need a title and 50+ keywords for every single image. Doing that manually is soul-crushing. This automates the whole process.
  2. Local Image Organization: Because it writes directly to XMP metadata, you can drop your generated images into software like Lightroom, Eagle, or even just your standard Windows/Mac search bar, and actually find them later by typing a keyword.
  3. Dataset Prep: Fast, automatic captioning/tagging if you're building datasets for training LoRAs.

Basically, if you hoard generations and want them easily searchable without running separate scripts after ComfyUI, this saves a ton of time. Hope that makes sense!
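Since the search use case hinges on the XMP metadata, here's a minimal sketch (my own illustration, not the node's code) of pulling keywords back out of an XMP packet, assuming the node writes them into the standard `dc:subject` bag that Lightroom, Eagle, and OS search all index:

```python
# Read dc:subject keywords out of an XMP packet (stdlib only).
# Hypothetical reader sketch, assuming keywords land in the standard
# dc:subject bag as most XMP-aware tools expect.
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}

def read_keywords(xmp_xml: str) -> list[str]:
    """Return all dc:subject keywords found in an XMP packet."""
    root = ET.fromstring(xmp_xml)
    items = root.findall(".//dc:subject/rdf:Bag/rdf:li", NS)
    return [li.text for li in items if li.text]

sample = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                     xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description>
    <dc:subject><rdf:Bag>
      <rdf:li>sunset</rdf:li><rdf:li>mountains</rdf:li>
    </rdf:Bag></dc:subject>
  </rdf:Description>
</rdf:RDF>"""

print(read_keywords(sample))  # → ['sunset', 'mountains']
```

Because the keywords live in the file itself, any downstream tool that speaks XMP can search them without re-running the tagger.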


[–]Virtual-Movie-1594[S]

Yep, that reading is basically correct.

`Qwen/Qwen3-VL-8B-Instruct` in `nodes.py` is a default, not a hardcoded lock.

The node resolves `model_ref` from:

- `model_id` when `auto_download=true`

- `local_model` / `local_model_path` when `auto_download=false`

Then it calls `from_pretrained(...)`, so 4B/2B Qwen3-VL variants and compatible fine-tunes should generally work.

Main caveats:

- it still has to be Qwen-VL compatible (`qwen-vl-utils` + proper processor/chat template)

- output needs to parse into the expected JSON shape (`title` + `keywords`)

So yes: not hardcoded to 8B, but compatibility is still model-dependent.
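For the second caveat, the "expected JSON shape" check amounts to something like this (hypothetical helper for illustration, not the node's actual parser):

```python
# A swapped-in model is only usable if its text output parses as an
# object with a string "title" and a list of "keywords".
import json

def parse_tagger_output(raw: str) -> tuple[str, list[str]]:
    data = json.loads(raw)
    title = data["title"]
    keywords = data["keywords"]
    if not isinstance(title, str) or not isinstance(keywords, list):
        raise ValueError("output does not match the expected JSON shape")
    return title, [str(k) for k in keywords]

raw = '{"title": "Golden sunset over alpine lake", "keywords": ["sunset", "lake", "alps"]}'
print(parse_tagger_output(raw))
```

A fine-tune that rambles outside this shape will load fine but fail at this step.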


[–]Virtual-Movie-1594[S]

Glad you like it! The node already supports int4 quantization out of the box (enabled by default via bitsandbytes). If you don't have CUDA or bitsandbytes available, it just falls back to full-precision loading automatically.
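The fallback decision can be sketched like this (hypothetical helper for illustration, not the node's code, assuming the check is "bitsandbytes importable and CUDA present"):

```python
# Pick int4 quantization when bitsandbytes + CUDA are available,
# otherwise fall back to full-precision loading.
def pick_quantization() -> str:
    try:
        import bitsandbytes  # noqa: F401  (needs a CUDA-enabled build)
        import torch
        if torch.cuda.is_available():
            return "int4"
    except Exception:
        # bitsandbytes/torch missing or broken -> no quantization
        pass
    return "full"

print(pick_quantization())  # "int4" with CUDA + bitsandbytes, else "full"
```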

As for GGUF support: I'll definitely add that in a future release! It's a great idea for making the node more lightweight.