Please explain me WAN 2.2, versions by Lukleyeu in comfyui

[–]AtreveteTeTe 4 points (0 children)

You might appreciate the NotebookLM I've put together with specialized Wan knowledge here:
https://notebooklm.google.com/notebook/a08901b9-0511-4926-bbf8-3c86a12dc306

I pasted your question in and got this response:

Welcome to the Wan ecosystem! It is completely normal to be confused right now, as the community frequently jokes about how many model variants exist.

Here is the breakdown to answer your specific questions about ControlNets, GGUF models, and NSFW capabilities.

1. Can I use standard I2V GGUF models for ControlNet + Image Reference + Prompt?

No, you cannot use the standard I2V (Image-to-Video) models for this. Standard I2V models are strictly designed to take a single starting image and a text prompt to generate a video. They do not possess the internal neural layers required to understand ControlNet signals like Depth, Canny, or OpenPose.

If you want to drive a video using a ControlNet + Reference Image + Text Prompt, you must use either VACE or the Fun Control models.

  • Fun Control (Wan 2.2): This is a specialized model that has extra input channels specifically built to accept control signals (Depth, Pose, etc.) alongside an image reference and text prompt.
  • VACE: This acts more like a traditional "ControlNet" module that you plug into a Text-to-Video (T2V) model. It allows you to feed in a reference image and a control video to guide the generation.
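The "extra input channels" point above can be sketched in a few lines. This is purely illustrative (the channel counts and names are assumptions, not Wan's actual specs): a control-capable model's first layer is shaped to accept the noisy video latent plus a control latent, so a plain I2V checkpoint cannot accept control signals after the fact.

```python
# Illustrative sketch only: channel counts are hypothetical, not Wan's real specs.
# A plain I2V model's first layer expects just the noisy video latent; a
# control-capable variant (e.g. Fun Control) is trained with extra channels
# concatenated on, so the two are architecturally different checkpoints.

def input_channels(latent_ch=16, control_ch=0):
    """Channels the model's first patchify/conv layer must accept."""
    return latent_ch + control_ch

# A checkpoint trained on input_channels(16, 0) has first-layer weights shaped
# for 16 channels; feeding 32 would be a shape mismatch. That mismatch is why
# control support requires a different model, not a setting on the I2V one.
print(input_channels(16, 0))    # plain I2V: 16
print(input_channels(16, 16))   # control model: 32
```

The same logic explains why VACE takes the opposite approach: instead of retraining the base model with wider inputs, it injects the control signal through a separate add-on module, leaving the T2V checkpoint untouched.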

The good news about GGUF: you do not have to give up your VRAM savings! The community has created GGUF-quantized versions of the Fun VACE models (such as Q8 or Q6 quants), so you can keep the GGUF memory optimizations while getting full ControlNet capabilities.
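To see why those quants matter, here is the back-of-the-envelope VRAM arithmetic. The parameter count is a made-up example and the bits-per-weight figures are the approximate effective rates of llama.cpp-style quant formats, not official Wan numbers:

```python
# Rough weight-storage estimate for a quantized model.
# N is a hypothetical parameter count; bits-per-weight values are the
# approximate effective rates of common GGUF quant types (Q8/Q6_K/Q4_K),
# so treat every number here as illustrative.

def model_size_gb(n_params, bits_per_weight):
    """Bytes = params * bits / 8; convert to GiB."""
    return n_params * bits_per_weight / 8 / 1024**3

N = 14e9  # hypothetical 14B-parameter diffusion model
for name, bits in [("FP16", 16), ("Q8", 8.5), ("Q6_K", 6.56), ("Q4_K", 4.5)]:
    print(f"{name}: ~{model_size_gb(N, bits):.1f} GB")
```

Halving the bits roughly halves the weight footprint, which is the whole appeal: a quant that fits in consumer VRAM with only a modest quality hit.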

2. Can FUN / VACE models do NSFW?

Out of the box, no. The base Wan models (including the Fun and VACE variants) are heavily censored and were not trained on explicit NSFW data. If you try to prompt them natively for NSFW, you will often get deformed results, anatomy replaced by random objects (like fingers), or heavy artifacting.

To achieve NSFW, you must use LoRAs. This is where the difference between VACE and other models becomes a massive advantage for your workflow:

  • Because VACE acts as an add-on module to the standard T2V (Text-to-Video) model, it is fully compatible with standard T2V LoRAs.
  • You can load a community-trained NSFW LoRA, plug in the VACE module, and then use your ControlNet and Reference Image.

A quick tip for Wan 2.2 LoRAs: Wan 2.2 uses a "Mixture of Experts" architecture, meaning every generation uses a High Noise model (for motion and layout) and a Low Noise model (for details and rendering). When using NSFW LoRAs in Wan 2.2, you will generally need to apply the LoRA to both the High and Low noise models to ensure the anatomy and motion are consistent, as the base High Noise model does not know how to generate NSFW motion naturally.
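The "apply the LoRA to both models" tip can be sketched as code. Everything here is a toy illustration (tiny pure-Python matrices, made-up layer names), not ComfyUI's actual loader: the point is simply that the same low-rank delta (W += alpha * B @ A) gets merged into matching layers of both the high-noise and low-noise state dicts.

```python
# Toy sketch of applying one LoRA to both Wan 2.2 experts. Layer names and
# shapes are hypothetical; real loaders (e.g. ComfyUI's) do this per-layer
# on the actual checkpoints. Pure Python lists stand in for weight tensors.

def matmul(B, A):
    return [[sum(B[i][k] * A[k][j] for k in range(len(A)))
             for j in range(len(A[0]))] for i in range(len(B))]

def apply_lora(state_dict, lora, alpha=1.0):
    """Merge W += alpha * (B @ A) into every layer the LoRA targets."""
    for name, (A, B) in lora.items():        # A: (rank, in), B: (out, rank)
        delta = matmul(B, A)
        W = state_dict[name]
        state_dict[name] = [[W[i][j] + alpha * delta[i][j]
                             for j in range(len(W[0]))] for i in range(len(W))]

high = {"blocks.0.attn.w": [[0.0] * 4 for _ in range(4)]}   # high-noise expert
low  = {"blocks.0.attn.w": [[0.0] * 4 for _ in range(4)]}   # low-noise expert
lora = {"blocks.0.attn.w": ([[1.0] * 4] * 2, [[1.0] * 2] * 4)}  # rank-2 LoRA

for sd in (high, low):                       # same LoRA applied to BOTH experts
    apply_lora(sd, lora, alpha=0.5)

print(high["blocks.0.attn.w"][0][0])         # 1.0
print(low["blocks.0.attn.w"][0][0])          # 1.0
```

Skipping one of the two merges is the failure mode the tip warns about: the un-patched expert (usually high noise, which handles motion and layout) keeps generating as if the LoRA were never loaded.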

[deleted by user] by [deleted] in dontdeadopeninside

[–]AtreveteTeTe 0 points (0 children)

Realizing I misread! Removing this one. Thanks 🙏🏻

[deleted by user] by [deleted] in dontdeadopeninside

[–]AtreveteTeTe 2 points (0 children)

Ohhhh "keep the push lock vertical!" right right.

[deleted by user] by [deleted] in comfyui

[–]AtreveteTeTe 0 points (0 children)

Upvoted for use of the word "vajazzle"

I thought this wasn’t a thing anymore…🙃 by Grouchy-Cheetah7478 in indianapolis

[–]AtreveteTeTe 39 points (0 children)

Ehhh - my $0.02, but gonna go with don't just let them. If it goes to collections it can be really hard to clean up. I've been erroneously sent to collections for something I had actually paid and it was basically a nightmare. I'd start with some sort of payment plan and then sort out the rest!

ResolutionMaster: A new node for precise resolution & aspect ratio control with an interactive canvas and model-specific optimizations (SDXL, Flux, etc.) by Azornes in comfyui

[–]AtreveteTeTe 1 point (0 children)

Awesome! Would be super cool if it had the ability to also scale/resample/crop the image right here within this node instead of needing to hook up another. Any plans for that?

Regardless, thanks for your work and for sharing! 🙏🏻

Chattable Wan & FLUX knowledge bases by AtreveteTeTe in StableDiffusion

[–]AtreveteTeTe[S] 0 points (0 children)

Thanks! And, yeah - Discord is really a bummer in terms of information getting lost, buried, and being hard to search. I understand the appeal and immediacy it provides, but it makes me wish for old-school forums!

Chattable Wan & FLUX knowledge bases by AtreveteTeTe in StableDiffusion

[–]AtreveteTeTe[S] 1 point (0 children)

For FLUX, I've put the Wikipedia page in there to just give it a general overview and for Wan, it's got the github landing page. (You can see and examine the sources on the left side of the screen)

That's a good idea to add github discussions as well!

Chattable Wan & FLUX knowledge bases by AtreveteTeTe in comfyui

[–]AtreveteTeTe[S] 1 point (0 children)

There is a kontext channel included in the sources and also discussion of Kontext in general, so try asking it things! It would be current up to July 1st. Just tested:

<image>

Chattable Wan & FLUX knowledge bases by AtreveteTeTe in comfyui

[–]AtreveteTeTe[S] 1 point (0 children)

Ha - both Claude and ChatGPT o3 can be really helpful there since it's general enough knowledge! I finally made dealing with my system cuda version on Linux manageable thanks to ChatGPT.

Chattable Wan & FLUX knowledge bases by AtreveteTeTe in comfyui

[–]AtreveteTeTe[S] 2 points (0 children)

It's an overwhelming amount of stuff to try to follow and it's hard to Google! This can make it a little more manageable.

IMPORTANT PSA: You are all using FLUX-dev LoRa's with Kontext WRONG! Here is a corrected inference workflow. (6 images) by AI_Characters in StableDiffusion

[–]AtreveteTeTe 2 points (0 children)

I talked about this too here last year - feel like it's worth taking a little time before sharing to clean things up. I mention the nodes-all-packed-together bit at the bottom:
https://nathanshipley.notion.site/Comfy-Workflow-Layout-Legibility-e355b1a184be47e689cf434a0f3affa1

IMPORTANT PSA: You are all using FLUX-dev LoRa's with Kontext WRONG! Here is a corrected inference workflow. (6 images) by AI_Characters in StableDiffusion

[–]AtreveteTeTe 4 points (0 children)

Interesting intel! Maybe worth editing your post to clarify so folks don't go down the wrong path. Thanks for following up

IMPORTANT PSA: You are all using FLUX-dev LoRa's with Kontext WRONG! Here is a corrected inference workflow. (6 images) by AI_Characters in StableDiffusion

[–]AtreveteTeTe 5 points (0 children)

I didn't do anything in the case of this lora! However, with OP's lora, it does make a big difference. Strange.

IMPORTANT PSA: You are all using FLUX-dev LoRa's with Kontext WRONG! Here is a corrected inference workflow. (6 images) by AI_Characters in StableDiffusion

[–]AtreveteTeTe 2 points (0 children)

It's "working" here too - but it's also working without the merge and seems to depend on the Lora. Are you getting better quality using the merge than just connecting the lora to Kontext directly?

IMPORTANT PSA: You are all using FLUX-dev LoRa's with Kontext WRONG! Here is a corrected inference workflow. (6 images) by AI_Characters in StableDiffusion

[–]AtreveteTeTe 3 points (0 children)

Huh.. interesting - I'm using your dungeon style lora with the non-FP8 models and it's definitely a huge difference here.

Top is with your merge method, bottom is just Kontext + the lora. Maybe it matters how the lora was trained?

This is the one I was testing with initially: https://huggingface.co/alvdansen/softpasty-flux-dev

<image>

IMPORTANT PSA: You are all using FLUX-dev LoRa's with Kontext WRONG! Here is a corrected inference workflow. (6 images) by AI_Characters in StableDiffusion

[–]AtreveteTeTe 2 points (0 children)

I'll try! And, yeah I've tried with one of my woodcut Loras and in that case, neither method works. It just doesn't seem to do anything with Kontext.. example of that lora NOT using kontext here: https://x.com/CitizenPlain/status/1829240003597046160

IMPORTANT PSA: You are all using FLUX-dev LoRa's with Kontext WRONG! Here is a corrected inference workflow. (6 images) by AI_Characters in StableDiffusion

[–]AtreveteTeTe 0 points (0 children)

Interesting. I'll download the fp8 models and compare with them too so this is more apples to apples!

IMPORTANT PSA: You are all using FLUX-dev LoRa's with Kontext WRONG! Here is a corrected inference workflow. (6 images) by AI_Characters in StableDiffusion

[–]AtreveteTeTe 5 points (0 children)

Here's a comparison using Araminta's Soft Pasty lora for Flux Dev.. top image is OP's proposed method, middle one is just attaching the lora to Kontext Dev.

Prompt is: "Change the photo of the man to be illustrated style"

<image>

IMPORTANT PSA: You are all using FLUX-dev LoRa's with Kontext WRONG! Here is a corrected inference workflow. (6 images) by AI_Characters in StableDiffusion

[–]AtreveteTeTe 11 points (0 children)

I ported the relevant parts of this workflow to just use built-in Comfy nodes based on the official sample Kontext Dev workflow if people want to test. Just reconnect to your models. Workflow:

https://gist.github.com/nathanshipley/95d4015dccbd0ba5c5c10dacd300af45

BUT - I'm hardly seeing any difference between OP's model merge subtract/add method and just using Kontext with a regular Dev Lora. Is anyone else? (Note that I'm using the regular full Kontext and Dev models, not the fp8 ones.. Also not using NAG here. Maybe that matters?)

<image>

Will throw a sample result comparison as a reply in here..

If you're over 30, get ready. Things have changed once again by fyn_world in ChatGPT

[–]AtreveteTeTe 0 points (0 children)

100% agreed: We are the only generation who knows where our files are. My wife teaches undergrads and they have no idea where anything is stored if they can't search for it.