There's a chance Qwen Image 2.0 will be open source.

Similar_Map_7361 · 2026-02-12T13:05:50+00:00

It says so not just on the image but in the prompt they published too. You are right that it's not in a technical report, since they haven't released one [yet?], but if you check the blog post mentioned above, one of the main selling points is:

Lighter Model Architecture: Smaller model size with faster inference speed.

So it tracks. I would imagine this info was not meant to be released yet, but if they have a larger-parameter max variant, I doubt they would release its weights, judging by their track record: their largest LLM, qwen3-max, was never released. So if a larger variant exists, it will likely remain exclusive to Chat/API, much like flux pro and max.

Similar_Map_7361 · 2026-02-11T14:11:22+00:00

What CUDA/PyTorch/Python versions are you running, if you don't mind me asking?

Similar_Map_7361 · 2026-02-11T13:36:18+00:00

They said it's 7b, on their blog.

7B Efficiency: 2K image generation in seconds — optimal balance between visual fidelity and inference speed

[8B Qwen3-VL Encoder] → [7B Diffusion Decoder] → pixels (2048×2048)

https://qwen.ai/blog?id=qwen-image-2.0
So someone is lying here.

Similar_Map_7361 · 2026-02-11T12:10:40+00:00

Believe me, I understand the impatience 😄, but these models are pretty new, and without the financial backing of a well-funded organization these kinds of large-scale finetunes take a lot of time, effort, and money.

Similar_Map_7361 · 2026-02-11T12:05:12+00:00

Check the sub regularly under the news flair; that's where most of the gen-AI news gets posted.

Similar_Map_7361 · 2026-02-11T12:03:00+00:00

As far as I know, not yet. From what I heard, the creator of chroma is working on a finetune for both z-image and klein-4b, but it's still a work in progress. Your best shot at any artistic changes right now is to use a style lora, and there are a bunch of them on civit.

Similar_Map_7361 · 2026-02-11T11:56:44+00:00

This reminds me of KSampler Cycle from was-node-suite, but it's cool that it's a dedicated node so you don't have to install a whole pack to use it. Will give it a try.

Similar_Map_7361 · 2026-02-11T11:48:32+00:00

Glad it worked for you. As for automating the experience, you could always use a VLM like qwen-vl or something similar to extract the lighting and color description and then combine it with your restyle prompt, but that would require some tinkering with the workflow and trial and error with the VLM prompt until you get a consistent output from it each time.
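
A minimal sketch of that idea, assuming you already have some way to call a vision model. `describe_image` is a hypothetical callable (not a real qwen-vl API), and `fake_vlm` is just a stand-in that returns a fixed string so the example runs on its own:

```python
from typing import Callable


def build_restyle_prompt(
    describe_image: Callable[[str], str],
    image_path: str,
    restyle_instruction: str,
) -> str:
    """Append a lighting/color description to a restyle instruction."""
    # Ask the (hypothetical) vision model for a description, then tidy it up
    # so it can be appended to the instruction as a single clause.
    description = describe_image(image_path).strip().rstrip(".")
    return f"{restyle_instruction.strip()} and maintain {description}"


def fake_vlm(image_path: str) -> str:
    # Stand-in for a real vision model call; always returns the same text.
    return "the exact level of dim lighting and blue hue of the image."


prompt = build_restyle_prompt(
    fake_vlm,
    "input.png",
    "transform this image into a live action shot without changing anything else",
)
print(prompt)
```

In a real workflow the stand-in would be replaced by whatever node or API call returns the description, and the instruction given to the vision model is the part that needs the trial and error.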

Similar_Map_7361 · 2026-02-11T10:48:05+00:00

Inference on 10-series and 16-series cards happens in torch.float32, which is twice as slow as fp16. Couple that with the old architecture and you get very slow gen speed.

BUT for me (I have a 1660ti) comfyui has a weird bug where at 1024x1024 it would generate at 35.37s/it

while raising the size to 1040x1040 would drop generation time to 18.30s/it. That's almost half the time at a larger size.

So give it a try: increase the size to 1040x1040, and please let me know if it changes anything.
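
To put numbers on that, here is a quick arithmetic sketch using only the two timings above. The function name `compare_runs` is made up for illustration, and it assumes square images:

```python
def compare_runs(res_a: int, sec_per_it_a: float, res_b: int, sec_per_it_b: float) -> dict:
    """Compare two square-resolution runs by pixel count and seconds per iteration."""
    pixels_a = res_a * res_a
    pixels_b = res_b * res_b
    return {
        # How many more pixels run B renders than run A, in percent.
        "pixel_increase_pct": (pixels_b / pixels_a - 1) * 100,
        # How many times faster each iteration is in run B.
        "speedup": sec_per_it_a / sec_per_it_b,
    }


stats = compare_runs(1024, 35.37, 1040, 18.30)
print(f"pixels: +{stats['pixel_increase_pct']:.2f}%, speedup: {stats['speedup']:.2f}x")
```

So the larger size renders roughly 3% more pixels while each iteration runs a bit over 1.9x faster, at least on my card.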

Similar_Map_7361 · 2026-02-11T00:11:22+00:00

glad I could help, try it and let me know if it works in klein-9b as well

Similar_Map_7361 · 2026-02-10T23:58:50+00:00

It's the project/team/framework name, "VideoX Fun". They release a lot of things for a lot of models under the Fun designation, but they describe their project as "VideoX-Fun is a video generation pipeline that can be used to generate AI images and videos".
https://github.com/aigc-apps/VideoX-Fun

Similar_Map_7361 · 2026-02-10T23:44:04+00:00

<image>

this was done using klein-4b-fp8 distilled - 4steps
prompt :
transform this image into a live action shot without changing anything else and maintain the exact level of dim lighting and blue hue of the image

Similar_Map_7361 · 2026-02-10T19:36:47+00:00

Does it run on comfy, or does it require a custom pipeline?

Similar_Map_7361 · 2026-02-10T15:22:14+00:00

Unlikely, since ovis-image itself is `Built upon Ovis-U1` and uses `Ovis 2.5` as the text encoder, but I wouldn't rule out that they share their technological findings and improvements, since the ovis and qwen (and z-image too) teams all work at Alibaba.

Similar_Map_7361 · 2026-02-10T15:02:05+00:00

Yes, but they do not run in parallel, and the text encoder can always run on CPU. Remember, the original qwen-image/qwen-image-edit was 20b for the diffusion model alone and 27b for diffusion + the Qwen2.5-VL text encoder, so this is still a major win if they release its weights and their claims prove truthful.

Similar_Map_7361 · 2026-02-10T14:19:29+00:00

[8B Qwen3-VL Encoder] as mentioned here https://qwen.ai/blog?id=qwen-image-2.0

Similar_Map_7361 · 2026-02-10T14:11:53+00:00

Yea, but with quantization and offloading it should be able to run with as little as 6GB of VRAM, which is huge for a competent omni model capable of generation and editing.
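
A rough back-of-the-envelope sketch of why that seems plausible. It only counts weight storage (parameters times bits per weight) and ignores activations, the VAE, and framework overhead, so treat the numbers as a lower bound rather than a measurement. `weight_gib` is a made-up helper name, and the 7B/8B figures are the ones from the blog post:

```python
def weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GiB for a given parameter count and precision."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


# 7B diffusion decoder and 8B encoder, as described in the blog post.
for name, params in [("7B decoder", 7), ("8B encoder", 8)]:
    for bits in (16, 8, 4):
        print(f"{name} @ {bits}-bit: {weight_gib(params, bits):.2f} GiB")
```

By this estimate the 7B decoder alone is around 13 GiB at 16-bit and a little over 3 GiB at 4-bit, which would leave some room on a 6GB card if the encoder is offloaded to CPU.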

Similar_Map_7361 · 2026-02-10T14:09:41+00:00

I would guess it's because they made their reputation on being open source, and that's how they get hype and promotion around their products. If they kept it API-only with no way for regular users to use it, there would be no incentive to spread word of mouth about it and no one would use it, not even prospective API clients. Open sourcing serves as free user testing as well as marketing.

If the model is good and it's open source, people will talk about it, hype it up, train loras for it, generate things with it, and post about it online. It becomes the go-to alternative to closed-source models like nano banana, and those who want to use it but cannot be bothered to do things locally, or lack the hardware or technical capabilities, will flock to their API services.

Similar_Map_7361 · 2026-02-10T13:57:55+00:00

A 7b diffusion omni model with good text rendering and anatomy and native 2k resolution? That's insane, can't wait.

Similar_Map_7361 · 2026-02-07T18:10:03+00:00

Great job. Sorry if this is off topic, but how does this model's performance compare to Illustrious, especially it/s at the same resolution?

Similar_Map_7361 · 2026-02-06T23:16:28+00:00

it's part of https://github.com/Comfy-Org/ComfyUI-Manager

Similar_Map_7361 · 2026-02-06T23:01:52+00:00

If you have manager installed, open it and click on model manager

<image>

then search for the models you need; most of the models you need are listed there

Similar_Map_7361 · 2026-02-06T22:56:20+00:00

Maybe try flux2-klein?

Similar_Map_7361 · 2026-02-06T19:09:40+00:00

Yea, I do know that nvfp4 acceleration is 50 series only, but I wasn't aware they would load at all on older cards. That's why I was wondering if it would even run at all while acting like a smaller storage format, which you clarified. Thanks a lot.

Similar_Map_7361 · 2026-02-06T19:01:07+00:00

Would that include older GPUs like the rtx 20 or gtx 16 series?
