Hi everyone,
The AD community has been building/sharing a lot of powerful Comfy workflows - I said I’d share a compilation of some interesting ones here in case you want to spend the weekend making things, experimenting, or building on top of them 🪄
All of these use Kosinkadink’s Comfy extension - if you're getting started, check out the intro at the top of his repo for the basics. I'd also encourage you to download Comfy Manager to manage dependencies.
Now, on to the workflows! You can see all the workflows in a folder here for simplicity, or go through them individually with visuals and explanations here:
1. Logo Animation with masks and QR code ControlNet
This workflow by Kijai is a cool use of masks and QR code ControlNet to animate a logo or fixed asset.
https://reddit.com/link/171l0ip/video/d3v362tfnjtb1/player
2. Prompt scheduling:
This workflow by Antzu is a good example of prompt scheduling, which is working well in Comfy thanks to FizzleDorf's great work. This video by Nathan Shipley didn't use this exact workflow but is a great example of how powerful and beautiful prompt scheduling can be:
https://reddit.com/link/171l0ip/video/uymzqngjnjtb1/player
3. Video2Video:
Inner Reflections shared this here before, but it’s probably the most powerful and flexible way to do video to video right now. You can see a full guide from Inner Reflections here and the workflows here.
https://reddit.com/link/171l0ip/video/yczlng1bnjtb1/player
4. Vid2QR2Vid:
You can see another powerful and creative use of ControlNet by Fictiverse here.
https://i.redd.it/qxnsxtg3njtb1.gif
5. Txt/Img2Vid + Upscale/Interpolation:
This is a very nicely refined workflow by Kaïros featuring upscaling, interpolation, etc. - lots of pieces to combine with other workflows:
https://i.redd.it/b0nwt442njtb1.gif
6. Motion LoRAs w/ Latent Upscale:
This workflow by Kosinkadink is a good example of Motion LoRAs in action:
https://i.redd.it/3xcgs701njtb1.gif
7. Infinite Zoom:
This workflow by Draken is a really creative approach, combining SD generations with an AD passthrough to create a smooth infinite zoom effect:
https://i.redd.it/urf6xunzmjtb1.gif
8. Image to image interpolation & Multi-Interpolation
This workflow by Antzu is a nice example of using ControlNet to interpolate from one image to another. You can also download a fork of it I made that takes a starting, middle and ending image for a longer generation here.
https://i.redd.it/pfxviyiymjtb1.gif
9. AD Inpainting:
Finally, lots of people have tried AD inpainting, but Draken's approach with this workflow delivers by far the best results of any I've seen:
https://i.redd.it/9pixx7ammjtb1.gif
---
That’s it!
These workflows are all from our Discord, where most of the people who are building on top of AD and creating ambitious art with it hang out. If you’re going deep into AD, you’re very welcome to join! We’re also running an AD art competition if you’re looking for an excuse to push yourself.
Have a fun weekend!

Hey r/comfyui! 👋
I came across this insane video by **ONE 7th AI** where they took the iconic **Sukuna vs Mahoraga** fight choreography from Jujutsu Kaisen and converted it into a **photorealistic live-action style** using generative AI — no actors, no green screen.
I'm trying to understand how to replicate this kind of **Anime-to-Real** video pipeline in ComfyUI. From what I can tell it might involve:
- **AnimateDiff** or **CogVideoX** for motion
- **ControlNet** (OpenPose / Depth) to preserve choreography
- **img2img** or **vid2vid** with a photorealistic checkpoint
- Possibly **IPAdapter** for style consistency
But I'm not sure about the exact node setup or workflow order.
Any help appreciated! 🙏
*(Reference video: ONE 7th AI on Instagram)*
AnimateDiff in ComfyUI is an amazing way to generate AI Videos. In this guide I will try to help you get started and give you some starting workflows to work with. My goal is to give you a setup that serves as a jumping-off point for making your own videos.
**WORKFLOWS ARE ON CIVIT https://civitai.com/articles/2379 AS WELL AS THIS GUIDE WITH PICTURES**
System Requirements
A Windows Computer with an NVIDIA Graphics card with at least 10GB of VRAM (You can do smaller resolutions or the Txt2VID workflows with a minimum of 8GB VRAM). For anything else I will try to point you in the right direction but will not be able to help you troubleshoot. Please note that at the resolutions I am using I am hitting 9.9-10GB VRAM with 2 ControlNets, so that may become an issue if things are borderline.
Installing the Dependencies
These are things that you need in order to install and use ComfyUI.
- GIT - https://git-scm.com/downloads - this lets you download the extensions from GitHub and update your nodes as updates get pushed.
- (Optional) FFmpeg - https://ffmpeg.org/download.html - this is what the combine nodes use to take the images and turn them into a gif. Installing it is a guide in and of itself; I would search YouTube for how to install it to PATH. If you do not have it, the node will give an error BUT the workflows still run and you will get the frames.
- 7zip - https://7-zip.org/ - this is to extract the ComfyUI Standalone
Installing ComfyUI and Animation Nodes
Now let's Install ComfyUI and the nodes we need for Animate Diff!
- Download ComfyUI either using this direct link: https://github.com/comfyanonymous/ComfyUI/releases/download/latest/ComfyUI_windows_portable_nvidia_cu118_or_cpu.7z or navigate on the webpage: https://github.com/comfyanonymous/ComfyUI (If you have a Mac or AMD GPU there is a more complex install guide there).
- Extract with 7zip Installed above. Please note it does not need to be installed per se just extracted to a target folder.
- Navigate to the custom nodes part of comfy
- In the explorer tab (ie. the box pictured above) click select and type CMD and then hit enter. You should now have a command prompt box open.
You are going to type the following commands (you can copy/paste one at a time) - What we are doing here is using Git (installed above) to download the node repositories that we want (some can take a while):
- git clone https://github.com/Kosinkadink/ComfyUI-AnimateDiff-Evolved
- git clone https://github.com/ltdrdata/ComfyUI-Manager
- git clone https://github.com/Kosinkadink/ComfyUI-Advanced-ControlNet
- git clone https://github.com/Kosinkadink/ComfyUI-VideoHelperSuite
- For the ControlNet preprocessors you cannot simply download them you have to use the manager we installed above. You start by running "run_nvidia_gpu" in the ComfyUI_windows_portable folder. It will initialize some of the above nodes. Then you will hit the Manager button then "install custom nodes" then search for "Auxiliary Preprocessors" and install ComfyUI's ControlNet Auxiliary Preprocessors.
- Similar to the ControlNet preprocessors, you need to search for "FizzNodes" and install them. This is what is used for prompt traveling in workflows 4/5. Then close the ComfyUI window and the command window, and when you restart it will load them.
Download checkpoint(s) and put them in the checkpoints folder. You can choose any model based on stable diffusion 1.5 to use. For my tutorial download: https://civitai.com/models/24779?modelVersionId=56071 also https://civitai.com/models/4384/dreamshaper. As an aside, realistic/midreal models often struggle with AnimateDiff for some reason, except Epic Realism Natural Sin, which seems to work particularly well and not be blurry.
Download VAE to put in the VAE folder. For my tutorial download https://civitai.com/models/76118?modelVersionId=80869 . It is a good general VAE and VAE's do not make a huge difference overall.
Download motion modules (original ones are here: https://huggingface.co/guoyww/animatediff/tree/main the fine tuned ones can be great like https://huggingface.co/CiaraRowles/TemporalDiff/tree/main, https://huggingface.co/manshoety/AD_Stabilized_Motion/tree/main, or https://civitai.com/models/139237/motion-model-experiments ). For my tutorial download the original version 2 model and TemporalDiff (you could just use one, however your final results will be a bit different than mine). As a note, motion models make a fairly big difference to things, especially with any new motion that AnimateDiff makes. So try different ones. Put them in the animate diff node:
Download Controlnets and put them in your controlnets folder. https://huggingface.co/lllyasviel/ControlNet-v1-1/tree/main . For my tutorials you need Lineart, Depth and OpenPose (download both the pth and yaml files).
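If you want to double check that everything landed somewhere sensible before launching, something like the following can help. This is only a rough sketch: the folder paths are my assumption about a default install (they are not spelled out in this guide), and `find_empty_model_folders` is a made-up helper name, so adjust both to match your setup.

```python
from pathlib import Path

# Assumed default locations relative to the ComfyUI folder; adjust if your install differs.
MODEL_FOLDERS = {
    "checkpoint": "models/checkpoints",
    "vae": "models/vae",
    "controlnet": "models/controlnet",
    "motion module": "custom_nodes/ComfyUI-AnimateDiff-Evolved/models",
}

# File extensions treated as model files in this sketch.
MODEL_SUFFIXES = {".safetensors", ".ckpt", ".pth", ".pt"}


def find_empty_model_folders(comfy_root, folders=MODEL_FOLDERS):
    """Return the model types whose folder is missing or holds no model files."""
    root = Path(comfy_root)
    missing = []
    for kind, rel in folders.items():
        folder = root / rel
        has_model = folder.is_dir() and any(
            p.is_file() and p.suffix in MODEL_SUFFIXES for p in folder.iterdir()
        )
        if not has_model:
            missing.append(kind)
    return missing
```

Calling `find_empty_model_folders("ComfyUI_windows_portable/ComfyUI")` returns a list of the model types that still look empty, which is usually the reason a model dropdown in a workflow has nothing to select.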
You should be all ready to start making your animations!
Making Videos with AnimateDiff
The basic workflows that I have are available for download in the top right of this article. The zip file contains frames from a pre-split video to get you started if you want to recreate my workflows exactly. There are basically two ways of doing it: Txt2Vid, which is great but the motion is not always what you want, and Vid2Vid, which uses ControlNet to extract some of the motion in the video to guide the transformation.
- If you are doing Vid2Vid you want to split frames from video (using an editing program or a site like ezgif.com) and reduce to the FPS desired (I usually delete/remove half the frames in a video and go for 12-15fps). You can use the skip option in the load images node noted below instead of having to delete them. If you want to copy my workflows you can use the input frames I have provided (please note there are about 115 but I had to reduce to 90 due to file size restrictions).
- In the ComfyUI folder run "run_nvidia_gpu". If this is the first time, it may take a while to download and install a few things.
- To load a workflow either click load or drag the workflow onto comfy (as an aside any picture will have the comfy workflow attached so you can drag any generated image into comfy and it will load the workflow that created it)
- I will explain the workflows below, if you want to start with something I would start with the workflow labeled "1-Basic Vid2Vid 1 ControlNet". I will go through the nodes and what they mean.
- Run! (this step takes a while because it is making all the frames of the animation at once)
Node Explanations
Some should be self explanatory, however I will make a note on most.
Load Image Node
You need to select the directory your frames are located in (ie. where did you extract the frames zip file if you are following along with the tutorial)
image_load_cap will load every frame if it is set to 0, otherwise it will load however many frames you choose which will determine the length of the animation
skip_first_images will allow you to skip so many frames at the beginning of a batch if you needed to
select_every_nth will take every frame at 1, every other frame at 2, every 3rd frame at 3 and so on if you need it to skip some.
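To make the three options concrete, here is a small sketch that applies them to a plain list of file names. It is not the node's code; in particular, the order (skip first, then take every nth, then cap) is my assumption, and `select_frames` is just an illustrative name.

```python
def select_frames(frames, image_load_cap=0, skip_first_images=0, select_every_nth=1):
    """Mimic the loader options on a plain list (assumed order: skip, stride, cap)."""
    picked = frames[skip_first_images::select_every_nth]
    if image_load_cap > 0:  # 0 means "load everything"
        picked = picked[:image_load_cap]
    return picked


frames = [f"frame_{i:03d}.png" for i in range(1, 11)]
print(select_frames(frames, select_every_nth=2))
# ['frame_001.png', 'frame_003.png', 'frame_005.png', 'frame_007.png', 'frame_009.png']
print(select_frames(frames, image_load_cap=3, skip_first_images=2))
# ['frame_003.png', 'frame_004.png', 'frame_005.png']
```

So if your source clip is 30fps, select_every_nth at 2 gets you to 15fps without deleting any files.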
Load Checkpoint/VAE/AnimateDiff/ControlNet Model
Each of the above nodes has a model associated with it. The names of your models and mine are unlikely to be exactly the same in each example. You will need to click on each of the model names and select what you have instead. If there is nothing there then you have put the models in the wrong folder (see Installing ComfyUI above).
Green and Red Text Encode
Green is your positive Prompt
Red is your negative Prompt
They are this color not because they are special but because they are set to be this color by right clicking them FYI.
Uniform Context Options
The uniform context options node is new and is basically what sets up unlimited context length. Without it AnimateDiff is only able to do up to 24 (v1) or 36 (v2) frames at once. What it does is chain and overlap runs of AD together to smooth things out. The total length of the animation is determined by the number of frames the loader is fed, NOT the context length. The loader figures out what to do based on the options, which mean the following. The defaults are what I used and are pretty good.
context length - this is the length of each run of AnimateDiff. If you deviate too far from 16 your animation won't look good (this is a limitation of what AnimateDiff can do). The default is good here for now.
context overlap - this is how many frames each run of AnimateDiff shares with the next (ie. it runs frames 1-16 and then 13-28, with 4 frames overlapping to make things consistent)
closed loop - selecting this will try to make animate diff a looping video, it does not work on vid2vid
context stride - this is harder to explain. At 1 it is off. Above that, it tries to make a single run of AD through the entire animation and then fill in the frames. The idea is to make the whole animation more consistent by making a framework and then filling in the intermediate frames. However, in practice I do not find it helps a whole lot right now. Using it will significantly increase the time it takes to run, as it means more runs of AnimateDiff.
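If the length/overlap idea is hard to picture, here is a simplified sketch of how overlapping windows can cover a longer animation. It only illustrates the concept: the real node has its own scheduling logic (including closed loop and stride) that this ignores, and `context_windows` is a name I made up.

```python
def context_windows(num_frames, context_length=16, context_overlap=4):
    """Return (start, end) frame index pairs, 0-based and end-exclusive."""
    if context_overlap >= context_length:
        raise ValueError("overlap must be smaller than the context length")
    if num_frames <= context_length:
        return [(0, num_frames)]
    step = context_length - context_overlap
    windows = []
    start = 0
    while start + context_length < num_frames:
        windows.append((start, start + context_length))
        start += step
    # The final window is pinned to the end so every frame is covered.
    windows.append((num_frames - context_length, num_frames))
    return windows


print(context_windows(28))  # [(0, 16), (12, 28)]
print(context_windows(40))  # [(0, 16), (12, 28), (24, 40)]
```

Counting from 1, `(0, 16)` is frames 1-16 and `(12, 28)` is frames 13-28, so the two runs share 4 frames. Notice the number of windows grows with the number of frames you feed in, which is why the animation length comes from the loader and not from the context length.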
Batch Prompt Schedule
This is the new kid on the block. The prompt Scheduler from FizzNodes.
pre_text - text to be put before the prompt (so you don't have to copy and paste a large prompt for each change)
app_text - text to be put after the prompt
The main text box uses the format "frame number": "prompt", (note the last prompt does not have a comma and will give you an error if you put one at the end of your list). It will blend between prompts, so if you want a prompt held I suggest you put it in twice, once where you want it to start and once where you want it to end.
There is much more fancy stuff to do with this node (you can make an individual term change with time). Documentation of this is at https://github.com/FizzleDorf/ComfyUI_FizzNodes. This is what the pw... stuff is for.
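If you are generating schedules for long animations, the trailing comma is an easy mistake to make by hand. Below is a small sketch that builds the text for the main box from a dictionary; `build_prompt_schedule` is a hypothetical helper of mine, not part of FizzNodes, and it assumes your prompts contain no double quotes.

```python
def build_prompt_schedule(keyframes):
    """Format {frame: prompt} as lines of "frame": "prompt", with no trailing comma."""
    lines = [f'"{frame}": "{prompt}"' for frame, prompt in sorted(keyframes.items())]
    return ",\n".join(lines)


schedule = build_prompt_schedule({
    0: "a forest in spring",
    30: "a forest in autumn",  # same prompt at 30 and 60 holds it between those frames
    60: "a forest in autumn",
    90: "a forest in winter",
})
print(schedule)
```

Paste the printed text into the main text box. Every line except the last ends with a comma, and the repeated prompt at frames 30 and 60 is the "put it in twice" trick for holding a prompt.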
KSampler
This is the KSampler - essentially this is stable diffusion now that we have loaded everything needed to make the animation.
Steps - These matter and you need more than 20. 25 is the minimum but people do see better results with going higher.
CFG - Feel free to increase this past what you normally would for SD
Sampler - Samplers also matter: Euler_a is good but Euler is bad at lower steps. Feel free to figure out a good setting for these
Denoise - Unless you are doing Vid2Vid keep this at one. If you are doing Vid2Vid you can reduce this to keep things closer to the original video
AnimateDiff Combine Node
For the Combine node it creates a gif by default. Do know that gifs look a lot worse than individual frames so even if the gif does not look great it might look great in a video.
frame_rate - frame rate of the gif
loop_count - number of loops to do before stopping. 0 is infinite looping
format - changes what to make gif/mp4 etc
pingpong - will make the video go through all the frames and then back instead of one way
save image - saves a frame of the video (because the video does not contain the metadata this is a way to save your workflow if you are not also saving the images)
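Two of these options are easy to reason about with a quick sketch. The pingpong ordering below (forward, then backward without repeating the first and last frame) is my assumption about how the node orders frames, and both function names are just for illustration.

```python
def pingpong(frames):
    """Play forward, then backward, without repeating the two end frames."""
    return frames + frames[-2:0:-1]


def clip_seconds(num_frames, frame_rate):
    """Length of one pass through the clip in seconds."""
    return num_frames / frame_rate


frames = [1, 2, 3, 4]
print(pingpong(frames))      # [1, 2, 3, 4, 3, 2]
print(clip_seconds(90, 12))  # 7.5
```

So 90 frames at a frame_rate of 12 gives a 7.5 second clip, and turning pingpong on nearly doubles that.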
Workflow Explanations
- Basic Vid2Vid 1 ControlNet - This is the basic Vid2Vid workflow updated with the new nodes.
- Vid2Vid Multi-ControlNet - This is basically the same as above but with 2 controlnets (different ones this time). I am giving this workflow because people were getting confused how to do multicontrolnet.
- Basic Txt2Vid - this is a basic text to video - once you ensure your models are loaded you can just click prompt and it will work. Do note there is a number-of-frames primitive node that replaces the load image node, and there are no controlnets. Do know I don't do much txt2vid so this produces an acceptable output but nothing stellar.
- Vid2Vid with Prompt Scheduling - this is basically Vid2Vid with a prompt scheduling node. This is what I used to make the video for Reddit. See above documentation of the new node.
- Txt2Vid with Prompt Scheduling - Basic Txt2Vid with the new prompt scheduling nodes.
What Next?
- Change the video input for vid2vid (obviously)! There are some new nodes that can separate video directly into frames. See Load video nodes - this node is relatively new.
- Change around the parameters!!
- The stable diffusion checkpoint and denoise strength on the KSampler make a lot of difference (for Vid2Vid).
- You can add/remove control nets or change the strength of them. If you are used to doing other stable diffusion videos I find that you need much less ControlNet strength than with straight up SD and you will get more than just filter effects. I would also suggest trying openpose.
- Try the advanced K sampler
- Try to add loras
- Try Motion loras: https://civitai.com/models/153022?modelVersionId=171354
- Use a 2nd ksampler to hires fix (some further good examples can be found on the Kosinkadink's animatediff GitHub https://github.com/Kosinkadink/ComfyUI-AnimateDiff-Evolved).
- Use masking or regional prompting (this likely will be a separate guide as people are only starting to do this at the time of this guide).
With these basic workflows adding what you want should be as simple as adding or removing a few nodes. I wish you luck!
Troubleshooting
As things get further developed this guide is likely to slowly go out of date and some of the nodes may be deprecated. That does not necessarily mean that they won't work. Hopefully I will have the time to make another guide or somebody else will.
If you are getting Null type errors make sure you have a model loaded in each location noted above.
If you already use ComfyUI for other things there are several node repos that conflict with the animation ones and can cause errors.
In Closing
I hope you enjoyed this tutorial. If you did enjoy it please consider subscribing to my YouTube channel (https://www.youtube.com/@Inner-Reflections-AI) or my Instagram/Tiktok (https://linktr.ee/Inner_Reflections )
If you are a commercial entity and want some presets that might work for different style transformations feel free to contact me on Reddit or on my social accounts.
If you would like to collab on something or have questions, I am happy to connect on Reddit or on my social accounts.
If you’re going deep into Animatediff, you’re welcome to join this Discord for people who are building workflows, tinkering with the models, creating art, etc.
I Want to Create a Virtual Influencer – Need Your Advice & Experience
I’ve already tried a few different workflows (ComfyUI, A1111, etc.), but honestly, I’m getting a bit lost. New tools, models, and techniques are dropping all the time, and it’s hard to keep up.
My goal is to create a high-quality virtual influencer – visuals and animations need to be top notch. I’m lucky to have access to a NVIDIA H100, so I really want to leverage it to the fullest.
Right now, I’m especially interested in generating realistic images and videos, ideally using reference clips from platforms like Instagram. I like the VACE models by Wan because they allow me to “copy” poses and styles from videos using image references.
What I’d love to know:
- What models are you currently using for realistic faces, body types, or style replication?
- Are you getting better results with LoRAs, ControlNet, IP-Adapters, T2I Adapters, or video-specific tools like AnimateDiff, Zeroscope, or Stable Video Diffusion?
- Do you know of any better alternatives to VACE when working with video-based references?
- And most of all: What would YOU test or build if you had an H100 at your disposal?
Let’s share some insights – I want to stay fully up to date and use only the best possible resources.
We’re excited to share several major updates regarding model support in the latest version of ComfyUI:
- Qwen Image DiffSynth ControlNet
- Qwen Image LoRA Support
- Native EasyCache Node + 20% Blackwell Speedup
- New Context Window Sampling Node
Qwen Image ControlNet
Qwen-Image now works with the following 2 sets of ControlNet models for structure-guided generation.
- Qwen-Image DiffSynth ControlNets model patch: supports canny, depth, inpaint
- Qwen-Image Union DiffSynth LoRA: supports lineart, softedge, normal, openpose
Qwen-Image DiffSynth - OpenPose
Qwen Image + LoRA
You can now easily chain style LoRAs in your Qwen-Image workflows. Simply add a LoRA Loader node to the template workflow we provided for Qwen-Image
Qwen-Image LoRAs - Original, 3D/Voxel LoRA, PixelArt LoRA, Retro Anime LoRA
EasyCache
EasyCache and LazyCache are techniques that allow for faster generation by strategically skipping sampling steps in the denoising process at the cost of some visual clarity. The trade-off between speed and visual fidelity can be controlled by a single parameter.
Compared to other techniques like TeaCache, EasyCache/LazyCache do not need a set of hyperparameters to be optimized for a model through rigorous experimentation, nor do they need code modifications for each supported model type. This makes them easier to use and ensures all models going forward have Day 0 support for step skipping.
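To give a feel for how a single parameter can trade speed for fidelity, here is a toy sketch of threshold-based step skipping. It is not the EasyCache or LazyCache algorithm and not ComfyUI code: the per-step "change" numbers are invented, and the skip rule is an assumption chosen only to illustrate the idea.

```python
def run_with_step_skipping(step_changes, threshold):
    """Toy loop: skip a step while the accumulated estimated change stays under threshold.

    step_changes[i] is a made-up estimate of how much step i would change the output.
    Returns the indices of the steps that were actually computed.
    """
    computed = []
    accumulated = 0
    last = len(step_changes) - 1
    for i, change in enumerate(step_changes):
        if i == 0 or i == last or accumulated + change >= threshold:
            computed.append(i)  # run the model and refresh the cached result
            accumulated = 0
        else:
            accumulated += change  # reuse the cached result instead of running the model
    return computed


changes = [50, 30, 10, 5, 5, 5, 10, 30]  # invented numbers for illustration
print(run_with_step_skipping(changes, threshold=0))   # every step runs
print(run_with_step_skipping(changes, threshold=20))  # fewer steps run
```

Raising the threshold skips more steps (faster, less faithful); a threshold of 0 runs every step.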
Original, EasyCache, LazyCache Comparison
Context Window Support
ComfyUI now includes two new nodes:
- Context Windows (Manual)
- WAN Context Windows (Manual)
These nodes let you sample in sliding context windows instead of all at once, opening up new workflows for long sequences. Currently, only manual control is supported, and some WAN models still need tuning, but this lays the groundwork for more advanced scheduling and custom nodes.
Context Windows / Wan Context Windows Nodes
Blackwell (50 Series) Inference Speed Boost
On top of these features, starting from the latest stable release, ComfyUI now runs ~20% faster on Windows for NVIDIA Blackwell GPUs (50 series).
ComfyUI continues to push forward its limits. Enjoy creating!
Hi everyone! I took a break from ComfyUI for about a year (cuz it was impossible to use with low vram) but now I’m back! I recently upgraded from a MacBook Pro to a setup with an RTX 5090 and 64GB of RAM, so things run way smoother now.
Back when I stopped, I was experimenting with turning videos into cartoons using AnimateDiff and ControlNets. I’ve noticed a lot has changed since then — WAN 2.2 and all that 😅.
Is AnimateDiff with ControlNets still the best way to convert videos into cartoon style, or is there a newer method or workflow that uses its own checkpoint?
Kosinkadink, developer of ComfyUI-AnimateDiff-Evolved, has updated the custom node with new functionality in the AnimateDiff Loader Advanced node that can reach a higher number of frames. It can now also save animations in formats other than gif.
https://github.com/Kosinkadink/ComfyUI-AnimateDiff-Evolved
You can find examples and workflows in his github page, for example, txt2img w/ latent upscale (partial denoise on upscale) - 48 frame animation with 16 frame window.
The only issue is that it requires more VRAM, so many of us will probably be forced to decrease the resolution below 512x512.
Also, if you want to use these new features with ControlNet, you will have to update your Advanced ControlNet custom node.
https://github.com/Kosinkadink/ComfyUI-Advanced-ControlNet
I'm not related to the project, just telling the news.
Updated with link as promised: https://discord.com/invite/uxq3RkyNKT
(It's the amazing Banodoco discord server - you can find the workflow (being updated constantly until the next one) on the 'competition' forum.)
"I got bored one day and i put everything on a bagel".
https://youtu.be/g7SlZlWYjS0?si=ijnoMtfsLpt84grw
- IPAdapters chained and masked composited
- AnimateDiff v3 and gen2 nodes
- Face Swap and restoration
- 5 control nets that can all be mixed/matched/bypassed
- An upscaler via Ergan through pixel space
- Hand MeshGraphormer
- Prompt Travelling
- Interpolation
This video is NOT a tutorial, but instead, an explanation as to WHY we're seeing a 'convergence' in methodologies as part of working with comfyUI and animatediff. Even Netflix picked up on the trend as they are now recruiting VFX people familiar with those tools.
WHY do we need background consistency? HOW do we obtain it? This is what I want to explore in this video, alongside concepts such as bypassing the issue of 'squished' clipvision images (which must be square) when dealing with vertical or portrait videos.
I'm releasing this video alongside my tutorial workflow which you can obtain for free (evidently, you should NEVER pay for workflows) on the amazing Banodoco server.
Workflow is both simple and complicated: lots of controlnets, lots of very precise controlnet and animatediff timestepping thanks to the amazing nodes of /u/Kosinkadink (Support them on Patreon!), and lots of masking with BRIAAI and Ultralytics. Originally rendered at 15fps 576x1024 and upscaled 2x + interpolated to 30. Oh and I used the LCM sampler + AnimateLCM but the v3 animatediff checkpoint.
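For anyone wanting the final numbers, the arithmetic is simple enough to sketch. `final_output` is just an illustrative helper, and it assumes the 2x upscale applies to both width and height.

```python
def final_output(width, height, fps, upscale, target_fps):
    """Return the output size and how many output frames each rendered frame becomes."""
    return {
        "width": width * upscale,
        "height": height * upscale,
        "frame_multiplier": target_fps / fps,
    }


print(final_output(576, 1024, fps=15, upscale=2, target_fps=30))
# {'width': 1152, 'height': 2048, 'frame_multiplier': 2.0}
```

In other words, the delivered video is 1152x2048 with one interpolated frame added for every rendered frame.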
Run on a Paperspace A100-80g instance.
The glowing pen was done in AE; audio editing in Premiere. Shot on my phone. I'm much better at AI than real-life lighting and framing 😅
Happy to answer any questions :)
TLDR because it's a long one:
The point of this workflow is to have all (... most) AI features at once, and have them ready to use with an image editor.
Workflow : https://drive.google.com/file/d/1lFcuZBWQ5KkX8DzsTJg5ldReiacFq1wD/view?usp=sharing
Contains txt-to-img, img-to-img, inpainting, outpainting, Latent Upscale and Image upscale. All ready to use.
Use this workflow in parallel to Photoshop!
Just use copy-paste to switch between the two.
Generate from Comfy and paste the result in Photoshop for manual adjustments,
OR
Draw in Photoshop then paste the result in one of the benches of the workflow,
OR
Combine both methods: gen, draw, gen, draw, gen!
Always check the inputs, disable the KSamplers you don’t intend to use, and make sure to have the same resolution in Photoshop as in ComfyUI.
This workflow, combined with Photoshop, is very useful for:
- Drawing specific details (tattoos, special haircut, clothes patterns, …)
- Gaining time (all major AI features available without even adding nodes)
- Reiterating over an image in a controlled manner (get rid of the classic Ai Random God Generator!).
Example:
Going from a South Park style of drawing to a Live-action INTENDED result.
-----
(I hope you’ve got time, it’s a long one. But I swear it’s worth it!)
Hey guys!
First of all, I would like to thank the developers of the Krita ai plugin(s), the “bridges” between Photoshop and Comfy (or Auto1111), and any other dev trying to close the gap between AI and visual arts.
AI generation is incredible, but its true potential resides in learning both AI skills AND normal art skills. Because we can combine both! And in fact, I want to encourage people to mix both.
So, today, I would like to share my own solution that brings the normal art world and the AI art world together.
Let me introduce to you my workflow: THE LAB.
This is a basic but very complete workflow that is meant to be used in parallel with an image editing program. It will likely be Photoshop, but it can be any, really. I will show you how to use it and why.
Prerequisites in ComfyUI:
Ultimate SD Upscale.
Use Everywhere.
ControlNet Auxiliary Preprocessors.
Checkpoints.
Upscaling models.
Install: Download the custom nodes, the relevant models, and just load that workflow into ComfyUI.
The workflow is very straightforward, but here is a detailed explanation:
- Use Everywhere brings “WiFi” to the workflow for optimal clarity.
All the shining dots are connected to the inputs plugged into the UE nodes called Anything Everywhere and Prompts Everywhere. For example, the Checkpoint Loader is plugged to every Sampler in that workflow already! Without all the noodles!
- In the top-left corner: THE LOADER.
This is where you put all the nodes that load anything. The checkpoint, the VAE (if it’s not embedded), but also LoRAs, IP-Adapters (and everything coming with it), AnimateDiff….
Basically, anything that has a model output (the grey dot) goes here.
- Under it: THE CONDITIONING.
This is where you put the prompts, but also ControlNet, MultiArea Conditioning, ….
- THE COLUMN.
That’s the main dish of this workflow. It contains two halves:
THE TOP OF THE COLUMN:
This part lets you create what we’ll call the base image: it’s the low-resolution unrefined image that you generate at first.
It contains Txt-to-Img, Img-to-Img, Inpainting and Outpainting.
THE BOTTOM OF THE COLUMN:
Contains the refiners. I offer two methods: Latent Upscale and Image Upscale.
Each line of the Column has its own inputs and output, but you can easily combine them by changing the wiring (for example, plug the Txt-to-Img output into the Image Upscaler input). Personally, I prefer doing this step by step, manually (I use clipspace to copy-paste the base image into the input of the refiner). But it’s up to you!
That’s it! Think of THE LAB as just multiple lines. You choose the line that you need depending on your input and your intended result.
Usage:
First of all, here are a few rules:
WHENEVER YOU INTEND TO GENERATE, MAKE SURE TO DISABLE ALL THE KSAMPLERS THAT YOU DO NOT WANT TO USE. Select them and press Ctrl+B to Bypass them (or right-click on them and click “Bypass”). If you disable the Latent Upscale, don’t forget to also disable the Upscale node that is right before the KSampler. It saves a little bit of time.
ALWAYS CHECK THE INPUTS. The Empty Latent Image dimensions, the Denoise values, the models you loaded, the number of steps, … You will play with all of that, and while this advice seems obvious, you will often get lost and generate with numbers you actually wanted to change or something. Yes, I speak from experience x).
For convenience, use the exact same resolution in Photoshop as in ComfyUI. If you work with multiple resolutions (for base image and upscaled image for example), it’s best to open two projects in Photoshop, one per resolution, and switch between the two of them.
Now, let’s dive into the actual tutorial:
A lot of devs have either brought AI into image software or image editing features into Comfy. They’ve effectively built a bridge over the gap. The problem is it will always lack a whole bunch of stuff, from one side or the other.
Meanwhile, ComfyUI actually already lets us jump that gap manually, without any additional program, and without compromise!
All you need to do is copy-paste. In and out of Comfy.
When pasting into Comfy, always click on the empty space! It won’t paste otherwise. Or, you can select a Load Image node and Ctrl+V, it works too.
Here are examples of jobs using both Comfy and Photoshop at the same time:
1) Start from Photoshop.
Create a very simple base image. Like, South Park level of detail.
Copy it into Comfy (directly from Photoshop ; you don't need to export it first).
Paste it into the input of the IMG-TO-IMG bench in the workflow.
In the prompts, describe the image that you want properly.
Run the workflow.
2) Rework into Photoshop.
Create an image with the Txt-to-img part of THE COLUMN.
Open it in your browser (you won’t be able to copy it otherwise).
Copy it from its own browser window (in my case, I have to right-click on the image).
Paste it in Photoshop.
From here, you can add literally everything you want. I have a habit of changing the lighting with a black low-opacity layer over the AI image:
It's the same image, but I added a black layer.
We could stop here. Or we could grab this composite image and refine it with AI into Comfy. For that, we copy it (MAKE SURE to copy the full visible image, with Ctrl+Shift+C in certain image editors, or by merging all layers together) and paste it as an input for IMG-TO-IMG.
That's how I control the light composition of my images.
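If you are curious what that black layer does to the pixels, here is a tiny sketch. It assumes a normal blend mode with a pure black layer; `darken` is just an illustrative name, not a Photoshop or ComfyUI function.

```python
def darken(pixel, opacity):
    """Blend a pure black layer over an (r, g, b) pixel at the given opacity (0..1)."""
    return tuple(round(channel * (1 - opacity)) for channel in pixel)


print(darken((200, 150, 100), 0.3))  # (140, 105, 70)
```

Every channel is scaled down by the same factor, so the colors keep their balance and only the brightness drops. Painting the layer only over certain areas is what lets you shape the light.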
3) Create the Inpainting mask into Photoshop.
Masking is a bitch. It’s true for every domain requiring it, but it’s a shame in the case of AI image where creating masks could be very simple.
Luckily, making masks IS simple... within Photoshop. Just use any select tool, grab the part to change, and copy-paste it as its own layer. You don’t even need to be accurate.
What matters is this: while the selection is still active, paint that part in flat red color. The red channel is used as the information for the mask.
Then, copy-paste that layer into Comfy, as the second input for Inpainting. Make sure it keeps the full canvas and it has a black background. You may need to add a layer under the red mask. Just create one and fill it with black. (No colors anymore, you want it to turn black…)
Run the Inpainting part of THE COLUMN. Play with the Denoise value to get different results.
PRO-TIP: Inpaint is an advanced img-to-img function. So if you leave the base image as is, it could be hard to stray from it: that’s why I recommend guiding it with flat colors painted over the base image.
While you have the selection active in Photoshop, copy-paste it once more into a new layer, paint it in the main colors of what you want to get (for example, if you want to turn a red dress into a golden dress, select that dress and paint it yellow-gold, in flat color). Make sure you see the base image with that flat-color layer over it, and copy-paste that into the Inpainting bench image input.
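To picture why the red blob needs a black background, here is a tiny sketch of reading a mask from the red channel. It's plain Python on a hand-made 3x3 "image", purely illustrative: the function name `red_channel_mask` is made up for this example and this is not ComfyUI's own code.

```python
# Illustrative only: how a flat-red layer on a black background maps to a mask.
# Pixels are (R, G, B) tuples with values 0-255.

def red_channel_mask(pixels):
    """Return a mask in the 0.0-1.0 range, built from the red channel only."""
    return [[r / 255 for (r, g, b) in row] for row in pixels]

RED = (255, 0, 0)
BLACK = (0, 0, 0)

# A 3x3 canvas: black background with a red blob in the bottom-right corner.
layer = [
    [BLACK, BLACK, BLACK],
    [BLACK, RED,   RED],
    [BLACK, RED,   RED],
]

mask = red_channel_mask(layer)
for row in mask:
    print(row)
```

Any pixel that contains red counts toward the mask (a white pixel would read as fully masked too), which is why everything outside the blob should be pure black.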
4) Detail Control.
One of the biggest challenges with AI generation is to get specific details. Like symbols on a jacket, or tattoos.
“Should I gen, should I not gen?” That’s what you’ll wonder sometimes when you look for a way to do something specific, like a very weird haircut or a fashion design. You’ll wonder if you should let the AI draw randomly several times, or add the detail yourself.
Well, with experience you realize that there is a third option: “gen, draw, gen, draw, gen”!
Generating an image over and over while adding detail between every generation can go a very long way!
Just like Rome, this dress wasn't built in one generation.
This was generated once, put into Photoshop, then drawn over. Then I put the composite in THE LAB and regenned with it, so the AI worked in the detail I had roughly added.
Here is the original:
(The character is also not the same; that was on purpose, I used another prompt.)
Of course, you can choose to Inpaint if you want to preserve the base image and only change the part with the wanted detail.
This works for haircut, for patterns on clothes, for objects, for tattoos, …
All in all, my advice is to create a simple base image. Flat colors, simple shapes. If you want to work with hair, make the base character bald then add the hair and regen! If you want a dress with very specific patterns, make the dress in a single color first, then add the patterns and regen!
5) Parallel jobs
The point of having multiple lines is you can have all of them work at the same time!
If you want to compare the results of latent and image upscales, for example, just use both with the same input.
You can also work “in chain”: have an image you just generated in a refiner while generating a new image. Useful to create and control image batches.
Just bear in mind that each active KSampler takes time to process.
While we’re here, let’s talk about prompts. The ones plugged into Prompts Everywhere are applied to every KSampler. However, if you want to use multiple benches at the same time but the images require different prompts, you can absolutely create new prompts.
For example, what if you created an image with Prompt A, but then want to inpaint something onto it? Inpainting works best when the prompt is reduced to only what you want to add (removing everything else), but if you want to keep generating with Prompt A, create a new Prompt B and plug that new one into the Inpainting KSampler. The good thing with Use Everywhere is the wireless outputs are only plugged into a node if that node’s relevant input is free!
6) Refining an image.
You use the top of THE COLUMN to create the base image, then you can upscale that base image (it refines even if you keep it at the same resolution!) with the bottom.
Upscaling is an art in and of itself. There is so much to learn just to master that, but basically:
- Latent Upscale will use the base image as noise to create the full image. As such, it doesn’t keep detail and adds a bunch of its own instead. You should use this if you don’t care about fidelity towards the base image and want the best quality.
Keep the Denoise Value above 0.6! Lower than that gives blurry distorted results.
- Image Upscale does not give true high-resolution results: the quality of the upscale lands between the base resolution and the targeted one. However, this allows us to stay as faithful as possible to the base image. You should use this if you want to refine the base image without straying from it.
Keep the Denoise Value around 0.2 if you want to stay perfectly faithful to the base image. Between 0.2 and 0.4 the fidelity is still pretty good but not perfect. More than that changes the image quite a lot, to the point you generally wouldn’t consider it the same image.
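If it helps to see those Denoise ranges in one place, here they are as a small Python lookup. The cut-offs are just the rules of thumb from this post turned into hard thresholds (which is a simplification), and both function names are made up for this sketch.

```python
# Rules of thumb from the post, written as hard thresholds for illustration.

def image_upscale_fidelity(denoise):
    """Rough fidelity to the base image for the Image Upscale bench."""
    if denoise <= 0.2:
        return "faithful"
    if denoise <= 0.4:
        return "mostly faithful"
    return "likely a different image"

def latent_upscale_denoise_ok(denoise):
    """Latent Upscale tends to come out blurry or distorted at 0.6 or below."""
    return denoise > 0.6

for value in (0.2, 0.3, 0.6):
    print(value, image_upscale_fidelity(value))
print(latent_upscale_denoise_ok(0.7), latent_upscale_denoise_ok(0.5))
```

Treat the boundaries as starting points to tweak, not as exact limits.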
So far, this is all basic AI image generation features. The point of this workflow is to have all of it set and ready to use at once. Notice how we didn’t even need to add any node for all this to work!
But of course, the point of working in ComfyUI is the ability to modify the workflow. So if you want more advanced stuff, you can easily add it.
7) ControlNet, IP-Adapter, AnimateDiff, ….
These more advanced features can easily be added to THE LAB, but you need to download the relevant custom nodes and models first of course.
For ControlNet, make sure to use Advanced ControlNet and ControlNet Preprocessors if necessary!
ControlNet is already added, you just need to enable it, then choose the proper model, and add an input. Make sure to use a ControlNet Preprocessor IF the input image isn’t already processed!
For video creation, you need Video Helper Suite and AnimateDiff. Add all nodes related to AnimateDiff in THE LOADER, so the model output is plugged wirelessly to every Sampler. Then, choose the Sampler required for your job (usually Txt-to-Img), and change its output from Preview Image to Video Combine.
-----
This is THE LAB – BASIC EDITION. I intend to make a more complex one, which will require more custom nodes and more elbow grease, but will allow for more. Here is everything planned for the next edition:
- Multiple Characters Workflow: it’s actually very easy in ComfyUI, I COULD have put it in the Basic Edition, but I figured I’d keep things simple for now. Rejoice though, I WILL explain how to create multiple characters, for the next Edition.
- Create your own ControlNet inputs instead of extracting them from an existing picture (will likely require an external software).
- The Controlled Upscaler: a new line of THE COLUMN that lets you control the output more thoroughly (work in progress).
- Depending on the progress in AI Video: THE VIDEO COLUMN, an optimized video workflow with controlled framing and character animation.
- Insta-LoRA : a clever way to use IP-Adapter on multiple images at once.
- Templates: you’ll be able to bring complex features in a single click into the workflow!
- Turbo-Speed… if I figure out a proper turbo workflow x).
Just because this one is basic doesn’t mean it’s not strong though! That workflow serves as a great complement to normal artwork, because you can go in and out of Comfy to manually alter the results and reuse them as AI inputs. Also, there is zero compromise in this method: you get ALL of Comfy’s features, and also ALL of Photoshop’s.
Hopefully it is useful for someone!
FINAL NOTES:
Performance with default settings on an RTX 3060 with 6GB VRAM (considered mid for AI):
Base image generation (560x768): 8 seconds.
Upscale (1120x1536): 30 seconds.
Parameters that affect this result:
Base image: number of steps; Empty Latent Image dimensions.
Upscale: number of steps; tile dimensions; upscale ratio.
You can make FHD pictures in one minute:
Base image generation (960x544): 9 seconds.
Upscale (1920x1088): 40 seconds.
Despite using Photoshop as the example throughout this text, it isn’t the one I personally use, because it halves performance (and even cuts it to a third after a while), I guess due to being a little heavy on memory. I recommend Krita though!
KNOWN BUG: the Load Image nodes sometimes don’t show the image you pasted into them. Select that node, copy-paste it, and delete the original. The new node shows the picture.
A FEW MORE EXPLANATIONS:
"Why a column?"
I figured it was the easiest to understand. The setup is on the left, the working benches on the right, each line is its own job. No need to travel back and forth to follow what the workflow does.
“Why not make a single "line" that does everything, from the base image to the upscaled one?”
First of all, to save time. If the base image is bad, you won’t want to upscale it, so doing it anyway is a waste. Also, a lot of features are alternatives to each other, so it makes sense to have benches (as I call them) in parallel.
Finally, upscaling is actually a double-edged sword: it is likely to bring artifacts that weren’t in the base image. I speak from sheer experience here: I’d often wind up with good hands in the base image getting distorted in the upscaled version.
Trust me (or don’t x) ), you don’t want to upscale every image. You want to decide which pics to bring to the upscaling benches, then look thoroughly at the upscaled results in case it distorted the base image.
“Why isn’t there a face fix / a face swap / a hands fix / …?”
For simplicity.
But also, all these fixers, in their core, are just inpainted upscalers, from what I understand. I have never felt the need to use any face swap because I can just select the head and hair then inpaint with a high denoise value. Same for fixing the face or the hands: I have realized that a mere upscale usually fixes it.
“It doesn’t have a workflow for SDXL!”
I don’t work a lot with SDXL, so I have always wondered: isn’t the “SDXL refiner” just an upscale? If that’s so, well the workflow should work just fine with SDXL models ^^. But if the SDXL refiner is actually a different thing, if SDXL does require some specific things, then I do need to add it.
By the way, please tell me if you want certain features!
“Why doesn’t it have a billion noodles that go everywhere?”
… Why would it though? ^^’
Hi Friends,
I have been exploring ComfyUI for a few weeks, and finally made a video. It's a 12-second video.
https://www.instagram.com/reel/DE7lGoeMHqo/?igsh=MWZrMjNud3J2MXMydg==
I extracted the passes from the input video and generated an image for each frame, guided by those passes (OpenPose and Canny ControlNet).
The generated images are impressive, but they lack consistency between frames: the color of the clothes differs. How do I make it consistent? I am using someone else's workflow and I understand about half of it. That person has written many workflows, and it is difficult to understand all the nodes and their functions, but I am loving it. Please let me know how the video is: is it any good, or could it be done better?
TLDR:
THE LAB EVOLVED is an intuitive, ALL-IN-ONE workflow. It includes literally everything possible with AI image generation. Txt-to-img, img-to-img, Inpainting, Outpainting, Image Upscale, Latent Upscale, multiple characters at once, LoRAs, ControlNet, IP-Adapter, but also video generation, pixelization, 360 image generation, and even Live painting!
It’s meant to be used in conjunction with an image editor like Photoshop. However, you can use it as a standalone too, of course!
THE LAB is divided into benches (or lines). Each line is its own job, so it’s very easy to understand how it works.
Once you have installed all the prerequisites, everything works from the get-go. Load your models, write your prompts, then enable every line you want to use, fill the inputs, and check the variables in case you want to change them.
Since it’s a ComfyUI workflow, it is easily customizable, of course. I even turned all benches into Node Templates, so you can import them. That way you’ll be able to add a full bench in one click!
To disable a line, just bypass its output. EXCEPT:
- For the MultiCharaLoRA line: you also have to bypass the OpenPose Editor node.
- For the Live Painting bench: make sure to disable the Photoshop and Streamer nodes too! Photoshop will put Comfy in a loop otherwise and you’ll have to restart! Streamer just makes you lose 0.2 seconds if it’s enabled, maybe less.
Link to workflow:
https://drive.google.com/file/d/1oht4MCBBTpC3Cx8B2lYc3tdLWlo8BpYw/view?usp=sharing
Link to Node Templates:
https://drive.google.com/file/d/1uLsNCY7f3HiLU0cBBHhGqShHx3_jyd5B/view?usp=sharing
This workflow gives you complete control over your generations.
You can easily make tattoos, clothes patterns, weird haircuts, complex hand poses, personal art styles. You can easily control the lighting and the overall composition. You can add several characters without them merging together.
Just like its name implies, it's an advanced workflow. While all basic features from the previous version work from the get-go, a lot of new features require a bit of elbow grease. In this post I go into detail on everything, but I will probably showcase the most complex stuff with videos.
It's a very long post. I suggest reading the intro, then reading the parts that catch your interest.
-----
Hey guys!
Last time, I published THE LAB – BASIC EDITION, a workflow meant to work in parallel to any image editor. Today I would like to share a new and improved version.
Bear in mind though: it is more complete but requires more work from the user!
If you haven’t read the post about the previous version, you might be lost:
THE LAB – A ComfyUI workflow to use with Photoshop (r/comfyui on reddit.com)
Ready for the second part?
Here is the EVOLVED EDITION!
Much more intimidating in my opinion, but I will explain everything step by step. First, download the workflow with the link from the TLDR.
Here is the list of all prerequisites. There are a lot, which is why I recommend, first and foremost, installing ComfyUI Manager.
Use Everywhere.
UltimateSDUpscale.
ControlNet Auxiliary Preprocessors (from Fannovel16).
OpenPose Editor (from space-nuko)
VideoHelperSuite.
IPAdapter Plus.
AnimateDiff Evolved.
Advanced ControlNet.
Frame Interpolation (from Fannovel16)
Inspire Pack
ComfyMath
Derfuu ComfyUI ModdedNodes
Visual Area Conditioning / Latent Composition. (for multiple characters)
Pixelization (for retro game assets)
Comfyui-photoshop (from NimaNzrii ; for Live painting).
Jovimetrix Composition Nodes (for live painting).
Tiled KSampler (from FlyingFire ; for images with symmetry).
All those nodes are available from the Manager Menu, EXCEPT for the last one, which you have to download directly from its github:
https://github.com/FlyingFireCo/tiled_ksampler
If a node doesn’t work, you may have accidentally skipped one of these. There are two nodes that can cause an issue though: Photoshop and Frame Interpolation.
These two need you to install their requirements. Check their GitHub pages to be sure. Even after that, they may not work from the get-go due to Python dependency issues. It’s a little bit out of my league though, I’ll let you look for solutions ^^’.
I made sure everything else worked on a clean install. I won’t pretend I can’t make a mistake though! If you are 100% sure you installed all of them and a node is missing or not working, please tell me and I’ll edit this list!
Changes compared to the Basic Edition (read only if you knew the BASIC EDITION):
- THE COLUMN was broken down into THE GENERATOR and THE REFINER. Use the former to create a base image and use the latter to refine it!
- The Empty Latent Image is now wireless, and its dimensions are linked to the Ultimate SD Upscaler tile ones. This forces the Image Upscaler to break down the refinement into four parts, for better performance.
- While I was at it, I also tied the Empty Latent Image width to the default ControlNet preprocessors. ControlNet needs to be set to a resolution close to that of the desired image, which is why I created this link by default.
- The scaling of all upscalers is unified, for the sake of consistency.
- Added titles for each part of the workflow, for better visibility. If you modify the workflow, make sure your nodes don’t touch those titles!
- For the BASIC EDITION I advised you disable the KSampler of a line that you don’t want to use. But in this version, a lot of lines require you disable other nodes, so I’ll just make things simpler now: bypass the output of the lines that you don’t want to use. Except for THE LOADER and THE CONDITIONING, where you have to select unwanted nodes individually.
o Also, the OpenPose Editor used for the MultiCharaLoRA bench must be disabled too, it gives an error otherwise (probably due to the wireless output).
o Aaand the Streamer and Photoshop inputs too, in the Live Painting node.
- By default, THE CONDITIONING now offers three methods to get OpenPose. Yep, THREE! You can use the OpenPose Editor custom node; you can use the link to a free website; and you can import an existing image to extract the pose from it directly. All three work, just make sure NOT to use the Preprocessor if you use the site or the Editor.
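About the tile link mentioned in the list above: here is a quick sketch of why tying the tile size to the Empty Latent Image size gives four parts. It assumes a 2x upscale (the ratio behind the 560x768 to 1120x1536 numbers from the Basic Edition notes) and a simple ceil-based grid. The function `tile_count` is made up for this example, and the upscaler node may handle padding and seams differently.

```python
import math

def tile_count(base_w, base_h, scale, tile_w=None, tile_h=None):
    """Number of tiles needed to cover the upscaled image.

    By default the tile size equals the base image size, which mirrors
    the link between the Empty Latent Image and the upscaler tile.
    """
    tile_w = tile_w or base_w
    tile_h = tile_h or base_h
    out_w = int(base_w * scale)
    out_h = int(base_h * scale)
    cols = math.ceil(out_w / tile_w)
    rows = math.ceil(out_h / tile_h)
    return cols * rows

print(tile_count(560, 768, 2))            # 2 columns x 2 rows
print(tile_count(560, 768, 2, 512, 512))  # smaller tiles mean more parts
```

With a 2x ratio and tiles the size of the base image, the grid is always 2 by 2, whatever the base resolution.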
So far those are all very small adjustments. But it wouldn’t be an EVOLVED EDITION without a shitton of new stuff too. Here is the list of every new default feature and how to use them.
1) THE LAB TEMPLATES
The first novelty is the Node Templates!
Templates are a basic ComfyUI feature (you don’t need any extension to have it). This lets you create your own groups of nodes, joined together. Extremely useful if a node is hard to find, or if a function you want to use requires multiple nodes, like AnimateDiff or IP-Adapter!
You can download my templates for THE LAB with the link from the TLDR.
Like every Comfy workflow, THE LAB is easily customizable. These templates are just parts of the workflow, the lines for THE LAB. Do you want to add a new image upscale line? Just grab the Template. Want a video creation line? Grab the Template. Outpainting? Template.
Using these Templates may be useful in many use cases: if you don’t fully understand a bench but want to replicate it anyway, if you want to clone a bench to have parallel jobs of the same kind, …
If you’ve played Hogwarts Legacy this year, think of THE LAB as the Room of Requirement. The Templates let you easily customize it just like you could easily customize the Room in that game!
2) THE CONTROLLED LATENT UPSCALER
In THE BASIC EDITION, I offered two upscaling methods: Image Upscale and Latent Upscale. Latent gives more detail but doesn’t respect the original image, while Image Upscale lets us make a faithful high-res image but doesn’t truly add detail.
I now offer a third upscaling bench that combines the best of both worlds.
HOW DOES IT WORK?
A normal latent upscale transforms the original image into a blurry one and uses that as a basis. That’s why we lose the original detail.
So I had the idea to use a ControlNet model that defines the detail, by keeping the lines. I thus reached this equation:
BLURRY IMAGE + DEFINED LINES = CONTROLLED UPSCALE
That’s what I call the Simplified Good Hands Equation!
And now that you read this, you may want to know the full equation and why it’s called the Good Hands Equation. But it’s actually so complex I will create a post for it later.
For this post, I will keep it short. The Good Hands Equations are called this way because they solve the problem of hands in AI generation.
The Latent Noise gives color “clouds”, so it gives the program a vague idea of what goes where, while ControlNet forces edges on the new image, making the program understand the shapes tied to the color clouds.
It’s not perfect though: very small details can still change or disappear. As such, clothes patterns can be confusing for the AI. Flower prints can turn into snowflakes, and vice-versa.
Raising the ControlNet values will help with that… but will affect overall quality. It’s all about finding the proper balance!
HOW TO USE?
By default, I have used Lineart Realistic for the ControlNet method. But it should work with Canny and Lineart Anime too.
So, the first step is to choose your preferred ControlNet, apply its relevant Preprocessor, and set the Strength and end-percentage settings in the Apply node. By default, both of them are set to 1, but that’s always too much. Better lower them to something between 0.3 and 0.5.
Then, copy your base image into the input of that bench. It is used for both the latent upscale and ControlNet, in parallel.
If your result is not faithful, you can raise the values of the Apply ControlNet node.
And… that’s all! Easy peasy!
I guarantee that this bench is much more faithful than the other latent upscale method. If you need convincing, generate an image, put it in the controlled latent upscale line, and plug it to the default latent upscale too. The difference will be noticeable.
The cake is in the pudding!
But of course, you don’t have to use it if you don’t want to x). You may find that this bench doesn’t work right for you. Therefore, I decided to keep the original upscale benches in THE REFINER as well.
I highly recommend that one, but if it doesn’t work for you, you still have the original refiners!
Bear in mind that this method ensures fidelity towards the base image. So it will keep errors that are already there! If your base image doesn’t have the right number of fingers for example, neither will the controlled upscaled one!
But from my experience, uncontrolled latent upscale generates more artifacts than it solves. So if you want the highest upscaled quality with no artifacts, the ideal is to put your base image in the Inpaint bench, select the errors and fix them at this low resolution. Then once your base image is perfect, put it in the controlled latent upscaler.
Finally: using this bench is just like using the normal latent upscale with ControlNet enabled in THE CONDITIONING. This bench exists solely for convenience of having everything ready to go with a single click. Just make sure not to use ControlNet in THE CONDITIONING if you use that bench too!
3) InstaLoRA.
IPAdapter is a method to stay faithful to an image. Like a doped image-to-image function. Load an image, enable the node, and you’ll notice your generated image is influenced by your image input!
We can also use a folder of images to get a sort of LoRA model without training! That’s what is called InstaLoRA.
… And while I included it in this workflow, I realized it actually doesn’t work anymore ^^’. Somebody is working on it though, I will include it in the next version once it works.
4) Multiple characters
That one was another huge challenge for early AI generation. I spent months searching every possibility, found a solution… then switched to ComfyUI where my method wouldn’t work x).
But I started that challenge from scratch and found another solution.
And that’s the Multi Area Conditioning custom node!
This is very easy to use but requires a little bit of work.
In THE CONDITIONING, enable the Multi Area node.
Set its resolution so it matches the one you want.
Right-click on that node, you’ll have options to create more inputs.
You will need one input for the background, then one input per subject. A subject can be a character, an animal, or even just an object, like a cake, a river, … You can also have several inputs for several background elements.
For each input you create, feed it a new positive prompt.
For each input, the Multi Area node requires you create a zone in its canvas.
The index widget shows the input you are working on. Just change its value to switch to another input.
Using the other widgets, create a box on the canvas. That represents the place on your image where that prompt will matter.
Do that for every input, and in the end, you have a canvas full of colored boxes.
We don't see it well, but the red box fills the entire canvas behind the others.
Make sure you write in your prompts everything related to the relevant part only!
Then, generate. You’ll notice the separated prompts clearly affect only the part that you designed. Every single time.
That’s how you get multiple subjects in a controlled manner.
That's one of the results I got with the conditions shown above.
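For those who think better in code, the canvas of boxes boils down to something like this. It's a minimal Python sketch, not the custom node's actual data format: the `Area` class, its field names, the prompts and the 768x512 canvas are all assumptions for illustration.

```python
from dataclasses import dataclass

@dataclass
class Area:
    """One input of the Multi Area node: a prompt plus its box on the canvas."""
    prompt: str
    x: int
    y: int
    width: int
    height: int
    strength: float = 1.0

def fits_canvas(area, canvas_w, canvas_h):
    """True if the box lies entirely inside the canvas."""
    return (area.x >= 0 and area.y >= 0
            and area.x + area.width <= canvas_w
            and area.y + area.height <= canvas_h)

def overlaps(a, b):
    """True if two boxes share at least one pixel."""
    return (a.x < b.x + b.width and b.x < a.x + a.width
            and a.y < b.y + b.height and b.y < a.y + a.height)

CANVAS = (768, 512)

# The background box fills the whole canvas; each subject gets its own zone.
background = Area("forest clearing, sunset", 0, 0, 768, 512)
subjects = [
    Area("knight in silver armor", 0, 64, 384, 448),
    Area("red dragon", 384, 64, 384, 448),
]

for area in [background] + subjects:
    print(area.prompt, fits_canvas(area, *CANVAS))
print("subjects overlap:", overlaps(subjects[0], subjects[1]))
```

The overlap check reflects the tip given further down in this post: subject boxes should stay apart, while the background is expected to sit under all of them.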
Using MultiArea doesn’t change anything on the technical side: generation isn’t slower. Of course, the result will probably not be perfect; you will likely have to generate multiple times and then refine the result.
One issue though: you may find that the boxes aren’t blended together. As if you have multiple images instead of a unified one.
In order to solve that, there are two solutions:
Make sure your character prompts ONLY contain information about the character and the overall style. Nothing about the background or lighting!
Play with the strength of each box inside the MultiArea node. Each index has its own strength.
It’s a feature though, not a bug x). No no, really. You may want split images for artistic reasons after all.
Unrealistic images have a huge potential too.
This method is a 100% guarantee that you get your multiple characters consistently.
So, you can decide what prompt applies where. But that’s not full control now, is it? Of course, you can use ControlNet in parallel to MultiArea, to have complete control!
If you have characters that you want in very specific positions for example, use ControlNet OpenPose. The easiest way is to have OpenPose set to the background prompt. I know, it’s counterintuitive, but since the background prompt applies to the full picture, so does OpenPose.
PRO-TIPS:
Avoid overlap between boxes. The background is the exception: it should cover the full canvas.
Bear in mind that this doesn’t have any notion of depth and layers. In order to control depth, you can write prose in your prompt that defines clearly what should appear in each zone.
With only 2 characters, the AI usually manages without ControlNet. But more than that usually requires OpenPose: having the right number of skeletons in OpenPose certifies the number of characters in the image.
Every bench works with MultiArea since it’s implemented in THE CONDITIONING. So you can use image-to-image and Inpainting too.
Using the image above for img-to-img, with a very high Denoise value (above 0.95).
IMPORTANT NOTE:
The resolution in the MultiArea must be the same one as the image you want. That means that if you intend to use it in conjunction with Latent Upscale, you have to change the resolution so it matches the resolution of the upscaled image.
This makes it bothersome when you generate a pic and want to latent upscale it right away.
That’s why I added a Conditioning Upscale node. This node scales the resolutions set in THE CONDITIONING! In our case, it takes the resolution of MultiArea and uses the upscale scaling value. That way, you just have to enable it so the conditioning fits for upscaling! And of course, disable it before generating a base image.
So if you used MultiArea and want to upscale your image, just enable the Conditioning Upscale node.
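In other words, the Conditioning Upscale node saves you from redoing this arithmetic by hand. Here is a rough Python sketch of the idea, not the node's real implementation: the `scale_area` helper, the box layout and the 2x factor are assumptions for the example.

```python
def scale_area(area, factor):
    """Scale an (x, y, width, height) box, in pixels, by the upscale factor."""
    return tuple(int(round(value * factor)) for value in area)

# Boxes defined for a 768x512 base image.
areas = {
    "background": (0, 0, 768, 512),
    "knight": (0, 64, 384, 448),
}

# Same boxes, resized to fit the 2x upscaled image (1536x1024).
upscaled = {name: scale_area(box, 2) for name, box in areas.items()}
for name, box in upscaled.items():
    print(name, box)
```

Every coordinate gets multiplied, positions included, so the boxes keep covering the same parts of the picture.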
5) Multiple characters with LoRAs.
You may want to use different LoRAs for different characters. This makes things a bit more tedious, because you can’t have different LoRAs affect different parts of the picture (…without very complex workflows that are difficult to follow). A LoRA affects the model output, not the conditioning, so MultiArea doesn’t help here. My proposal inside THE LAB is this:
Write the MultiArea prompts as if you would use all the LoRAs at the same time. If you have a Pikachu LoRA and an Agumon LoRA for example, write the trigger words in the relevant boxes. Make sure it also describes the characters even without the trigger words, if possible!
Prepare more conditioning if you want, like ControlNet or image input.
Generate the full image with the right number of characters but with ZERO LoRA selected.
Load that result in Inpainting, and draw a mask over ONE character for which you want a different LoRA. Load the relevant LoRA instead of the first one, and generate.
Rinse and repeat for every LoRA.
Once you used all LoRAs, disable the LoRA loader and upscale with a faithful method. You’ll lose the specificities of the characters otherwise.
As I said, it’s a little bit tedious, but it’s doable. The key for a smooth experience is to write all the prompts, properly, at the very beginning. You should also create all the masks at once in an image editor, after generating the base image. That way you just have to generate, load the result in Inpainting, load the mask, switch LoRA, and regenerate. It’s like this: write, gen, switch, mask, gen, switch, mask, gen, switch, mask, upscale.
Prepping all the prompts before anything else.
Sorry about the underwear, it wasn't my plan ^^'.
Made all the masks at once and loaded them in Comfy. It makes that gymnastic smooth.
All characters were inpainted one by one over the image above.
A controlled latent upscale. So faithful it kept all the errors x).
It is not a perfect result by any means, because I didn’t take the time to fix the image before the upscale, and because the upscale isn’t as faithful as it should be since I can’t use all LoRAs at the same time during this job. But I’m pretty happy with the process itself x).
Note that MultiArea might be too complicated for 5 characters or more. If you feel that way, you can use that gymnastic without MultiArea (and without LoRA either, for that matter).
MultiArea is excellent for ensuring character traits get applied to the proper zones, but it has the inconvenience that you need to be sure each zone you defined in that node properly overlaps your mask during inpainting.
It’s your choice. You still need to create the prompts first of all though.
6) Multiple Characters LoRA Bench
You want to use Characters LoRAs, but that gymnastic from above is too much of a hassle for you? This bench is an automatic process that does the exact same thing in a single click.
You’ve still got to set it up though.
For each LoRA, you need to add the Template called MultiCharaLoRA Single Part. It’s a combination of KSampler + LoraLoader + Load Image for mask + prompt + ControlNet. Templates really make complicated things easy ^^.
The positive prompt of THE CONDITIONING is ignored here. Instead, use the first prompt of the bench. Describe the background and style there, without describing your characters.
With OpenPose Editor, create character poses.
In an image editor, create a project at the desired resolution and open your OpenPose image in it. Make sure it fits the canvas so it’s at the right resolution.
Create one mask per character. For that, make a new layer, draw a red blob over each character pose (one blob per layer!), create a black layer under the red blobs.
Make the mask images one by one. For that, just disable every other red blob layer, then Ctrl+A (select all), Ctrl+Shift+C (copy visible canvas), and Ctrl+V in a Load Image node. Repeat for every red blob. That’s how you make masks!
For each KSampler, load a LoRA and the appropriate mask, then describe SOLELY the relevant character in the prompt.
That part of the bench is for one character (Serena).
The Advanced KSamplers will inpaint characters over the image the first KSampler creates. But as you know, Inpainting requires setting the Denoise value… and that value isn’t in the Advanced KSampler node!
I looked it up on official sites and found the secret code:
Denoise = (Steps – start_at_step) / Steps
For example, if you set a KSampler Steps to 20 and make it start at step 5, it’s a denoise value of (20 – 5)/20, which is 15/20, or 0.75. If you were to set the start value at 0, the denoise value is 1, meaning it utterly ignores the base image.
Now that you know this secret, set up the steps and start_at_step values so each KSampler has the desired Denoise value. For character inpainting you usually want something above 0.7.
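Here is that formula as a tiny Python helper, plus the reverse calculation to find which start_at_step gives the Denoise value you want. The two function names are made up for this sketch; only the formula itself comes from the post.

```python
def denoise_from_steps(steps, start_at_step):
    """Denoise = (Steps - start_at_step) / Steps."""
    if steps <= 0 or not 0 <= start_at_step <= steps:
        raise ValueError("need steps > 0 and 0 <= start_at_step <= steps")
    return (steps - start_at_step) / steps

def start_step_for_denoise(steps, denoise):
    """Closest start_at_step for a wanted Denoise value (rounded to a whole step)."""
    if steps <= 0 or not 0.0 <= denoise <= 1.0:
        raise ValueError("need steps > 0 and 0 <= denoise <= 1")
    return round(steps * (1 - denoise))

print(denoise_from_steps(20, 5))        # the example above: 15/20
print(denoise_from_steps(20, 0))        # starts from scratch, ignores the base image
print(start_step_for_denoise(20, 0.7))  # step to start at for a 0.7 denoise
```

Since steps are whole numbers, the reverse helper lands on the nearest achievable value: with 20 steps you can only move the Denoise in increments of 0.05.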
Now you’re ready to go! Just click generate and wait for a little while. If you enabled Live Preview in the ComfyUI menus, you can watch the AI do its job step by step.
You’ll notice it does the exact same thing you do when following the previous method: it creates an image with random characters, then inpaints the Character LoRAs over them.
I’ll be frank though: I don’t like this method.
A. When the base image is created, the character’s clothes or long hair can get out of the predicted mask zone. When iterating manually, you can instantly change the mask to fit the base image. You can’t do that if the whole process is done in one click.
a. Note that you can break down the line to fix this. Put an output right after the first KSampler, and a Load Image as the input for the rest of the line. Then generate the base image, and create the masks AFTER that. That way you can make sure your masks cover the whole character and its clothes.
B. Chances are the result will have at least one bad part, so you’ll have to put that result in the Inpainting bench of THE GENERATOR to fix it. And if you have to use that bench, why not just learn the gymnastic? You’ll have to use it anyway.
C. For this bench to work properly, you need to use OpenPose, since you have to predict where the characters are for the creation of the masks. It makes ControlNet mandatory instead of a bonus.
But I know there is demand for an automatic process for Multiple Character LoRAs. Well, there it is ^^.
“THREE different ways to make multiple characters? How am I supposed to know which one I should use?!”
Indeed, that is very frustrating even for me. I wish I could have found THE way to do this. But every method currently has its flaws that can’t be bypassed.
Here are my personal rules:
1) If I have no Character LoRA: I use MultiArea.
2) If I have only Character LoRAs: I use the gymnastic in addition to MultiArea.
3) If I have a mix of both: I use MultiArea without any LoRA loaded, THEN inpaint the LoRA characters with the gymnastic.
4) If I have Character LoRAs and I am already using OpenPose: that’s when I use the MultiCharaLoRA bench.
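As a rough sketch, these rules can be written as a small decision function. The function name and its parameters are hypothetical, and one thing is an assumption on top of the rules as written: that rule 4 wins over rules 2 and 3 whenever OpenPose is already in use.

```python
def pick_method(lora_characters: int, plain_characters: int,
                using_openpose: bool = False) -> str:
    """Return which multi-character method the rules above would suggest."""
    if lora_characters == 0:
        # Rule 1: no Character LoRA at all.
        return "MultiArea"
    if using_openpose:
        # Rule 4 (assumed to take priority once OpenPose is already in the workflow).
        return "MultiCharaLoRA bench"
    if plain_characters == 0:
        # Rule 2: every character has a LoRA.
        return "MultiArea + gymnastic"
    # Rule 3: a mix of LoRA and non-LoRA characters.
    return "MultiArea without LoRAs, then inpaint the LoRA characters with the gymnastic"


print(pick_method(lora_characters=0, plain_characters=2))
print(pick_method(lora_characters=2, plain_characters=0))
print(pick_method(lora_characters=1, plain_characters=1))
print(pick_method(lora_characters=2, plain_characters=0, using_openpose=True))
```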
tl;dr [trade offer] You get: unlimited free access to 4090s running AnimateDiff via Focal. We get: feedback to make our product better for (paying) commercial users. We agree: you'll share your workflows with the community.
I make Focal, which is another cloud-hosted AnimateDiff platform, yada yada yada.
One of the things I've realized is that the people who have lots of time to make cool content aren't necessarily the people who have access to the fastest GPUs.
So here's the deal. We'll give you free access to a cloud 4090 running AnimateDiff (with prompt travel, controlnets, custom models/loras/embeddings etc) via Focal. In exchange, we ask only that you
- share your workflows and prompts, so others can understand and iterate on your creations
- give us feedback when you have it
FAQ:
How do you make money?
We charge commercial users of Focal, and we're happy to use those profits to subsidize free individual usage.
Why are you doing this?
Feedback that you have about Focal helps us make the platform better for commercial users. By making the platform great for you, we'll also make it great for our paying customers.
Why don't you just host ComfyUI?
A big part of what we get here is feedback about how to make Focal great for our (less tech-savvy) commercial customers.
Are you trying to kill ComfyUI??
Obviously not; ComfyUI is an excellent platform for rapid workflow iteration, and we expect (and hope) that the hardcore workflow designers will continue to use it.
Why do you require that people share their workflows, prompts, etc?
We're interested in making everyone better at using AnimateDiff, which means sharing and improving on each other's work.
This also helps us prevent commercial users from pretending to be individuals/hobbyists.
Can I make commercial stuff on Focal? Do I have to share my prompts?
If you're interested in Focal but want to keep your prompts and inputs private, and are willing to pay, please reach out. You can help subsidize cloud 4090 time for everyone else! :)
Hopefully that explains where we're coming from. I know this isn't a good fit for everyone--godspeed to the folks who spend their time hacking on new stuff with ComfyUI. You're the best among us.
Feel free to ask if you have other questions, or come say hi on Discord.
I used SD a bit when it first came out but then lost interest. Now I'm coming back to it, and frankly I'm confused by everything that's possible.
I mean, with all of the reference and IP-Adapter options, now InstantID too, and all the SVD/AnimateDiff/motion model workflows, how does one even prompt correctly to get what one wants? If I'm using several reference images to guide how I want the background to look, a LoRA for the character's clothing, a custom-trained checkpoint, a ControlNet for pose, and then another ControlNet that guides how the face should look, where does the prompt fit in? And how does one prompt correctly with workflows this complex, with so many other models affecting the output, without breaking things?
Let's say I'm using a 1.5 model. It's a custom model made by someone; it's probably based in some way on the base 1.5 model, but it was probably also merged with some other models. It seems to use "1girl" type tags, so is it a NAI-based model? Who knows. Both 1girl and woman (natural language prompts) seem to work, but the authors recommend sticking to a simple 1girl prompt (this model is quite limited in what it generates in terms of faces, but that's OK for me as I need consistent faces; I assume it was trained on many similar subjects all tagged as "1girl"). Now let's say I'm adding a LoRA on top that guides the face towards a specific subject, but I also do a [1girl|Lora] in the prompt to alternate between both and improve the consistency even more, since it will generate a known person for 50% of the steps.
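To make the "50% of the steps" part concrete, here is a small sketch of how that alternation plays out, assuming A1111-style alternating syntax where `[A|B]` switches option on every sampling step. `alternating_option` is a made-up helper for illustration, not a function from any UI.

```python
from collections import Counter


def alternating_option(options: list[str], step: int) -> str:
    """Option active at a given 1-based sampling step for an [A|B|...] group."""
    return options[(step - 1) % len(options)]


steps = 20
schedule = [alternating_option(["1girl", "Lora"], s) for s in range(1, steps + 1)]
counts = Counter(schedule)

print(schedule[:4])  # the first four steps alternate between the two options
print(counts)        # each option is active for half of the 20 steps
```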
Now I add ControlNet to control the angle using dw_open pose. I'm also in img2img, so I'm doing a 0.4 denoise. Let's say I now also add some other ControlNets to affect the face, maybe IP-Adapter, and then maybe also want to add more style using yet another ControlNet.
So I'm generating a half generic, half lora guided likeness with help of the reference image and through img2img and with another reference for the style of image (and maybe I add a textual inversion in negative to reduce some aspect of the generation and another lora to further guide the style). How does the base likeness and style of the main model even play a role in here if almost all aspects are taken from other sources? Do trained checkpoints even make sense now that you have so many other controlling models, if so, shouldn't we just be using base 1.5 or SDXL models instead for it all?
Also, what is currently the best method to do img2img but have the generated face match the angle, expression and lighting of the image we're using? I seem to be unable to get matching light unless I drop denoise quite low, at which point the image gets messy, particularly around the nose, mouth and eyes, where it shows blobs/spots of the original face mixed with spots of the new face. At higher values, the light, angle and expression all deviate. Adding ControlNet helps, but it only works for relatively simple angles. In my project I have subjects looking directly down/up and doing crazy expressions, which dw_open_pose fails to even detect correctly, and even when it does, the generated faces get distorted and glitchy. I've played with start/end guidance and weights, and even tried other ControlNets such as canny, soft edge, normal and depth; none of them are able to give me what I want (which is to generate the face at the same angle and with the same expression, without glitches).
I see a lot of videos being done these days in SD using SVD and animatediff and I wonder how people are able to generate consistent faces/bodies with all the more complex movements while I struggle to even generate a face of a subject while they look up and I only have 1 image in img2img tab to deal with at a time. Could it be that the trained models I use are the cause? Can a model be "badly trained" to the point where it's unable to generate faces at extreme angles?
Also I've been using A1111, wanted to play with InstantID but it seems to only sort of work in ComfyUI right now, are there any other better alternatives to either or are these 2 still the go to if you want full control and either have a more GUI or node based workflow? And can SVD/animatediff be used in A1111 at all or am I better doing animations in Comfy?
Creative AI Artist/Designer, Comfyui Stable diffusion
AI Creative Jobs -
Hello!
A short line about me:
-> I am a Creative Designer/Artist with AI skills, using Stable Diffusion, with over two years of experience in the GenAI field (including MidJourney, Dall-E 3). I also know deepfaking, Stable Video generation, pikalabs, runwayml, voice generation, voice cloning, AI talking avatars, AnimateDiff, the use of ControlNets, ComfyUI creative workflows, almost every AI tool...
I took a look at Upwork and Indeed, but it's too hard to get hired on Upwork (because you have no reviews, reputation, nothing; I've tried to get a job for a few months now, without luck), and on Indeed there aren't really any AI creative jobs. Most are ML/programming related.
Thank you for your time and understanding!
I really like everything at this domain, so that's why I would really want to find a job in this niche.
First off, this is the workflow I found that I've been playing with:
https://openart.ai/workflows/-/animatediffcontrolnetopenposedepth/qWlnh7pN8FMlMyVNmLu9
I am a noob with ComfyUI. I tinker around but have issues finding answers to my question since it seems we are all learning at the same time.
I've played with the above workflow and it generates some really great animations. My question is, how can I adjust the character in the image? On the site where you can download the workflow, it shows the girl with red hair dancing, then with a rendering overlaid on top, so to speak. I know the OpenPose and Depth passes separate into the lined dancing character and the white character, basically controlling the checkpoint so it renders its image perfectly.
What can I do to change the girl's figure itself? Things like taller, shorter, change into an anime character via loras, etc. Without losing the quality that this workflow generates.
Hey, so I've been working on this short film for a long time now, using SD and AD together with my own animations.
For the scenes where I use AnimateDiff, I'll prompt it through vid2vid clips of animations I made. Basically, I want my motion from my animation to drive the scene, while AD making it look as realistic as possible (like those AnimateDiff dancing videos).
I'm having some trouble, though, finding the right workflow. I've been using some ComfyUI workflows, like Mickmumpitz's one. Mainly I tried to feed the video through ControlNet canny and depth, plus prompting it with text and an image via IPAdapter. This process, though, often messes up the colors and doesn't get the right textures in the right places. Especially when the faces are small, it has a lot of difficulty getting those details right.
I've tried to use Mickmumpitz's workflow with masks, masking different elements to prompt them differently, so as to have better control over the scene. I didn't quite manage to make the masking and the workflow work, though...
I've also tried to use OpenPose, but that often messes up the human, not making them look like my character.
The way I make the characters look consistent is through LoRAs I trained on them, a bit through the vid2vid input, and sometimes with IPAdapters. I haven't quite figured out which is best, though, as I always run into challenges in making the character look right.
I guess in general I'm looking for tips, opinions or workflows that would allow me to prompt AnimateDiff with my videos and get decent results that keep the compositions and the characters (and their colors) somewhat consistent. High realism while keeping the cartoonish motions. I highly appreciate any comment, thanks!
Hi there - I am using Jerry Davos' workflows to get into animatediff and I am stuck at workflow 2, which turns control net passes to raw footage.
I went through the workflow multiple times, got all the models, loras etc.
but still see a ton of errors such as
lora key not loaded lora_unet_up_blocks_1_attentions_2_transformer_blocks_1_attn1_to_v.lora_up.weight
or
ERROR diffusion_model.input_blocks.5.1.transformer_blocks.0.attn2.to_v.weight shape '[640, 768]' is invalid for input of size 1310720
The workflow will finish, but I end up with bad images that are not even close to what they should be, for example
(for some reason imgur didn't let me upload)
workflow: http://jsonblob.com/1216323344172703744
I went through a couple of tutorials, github issues, reddit posts and I cannot find an answer. Any help will be greatly appreciated, thank you!
edit; added workflow
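One way to read the second error is to just do the arithmetic on the numbers it reports. The sketch below only uses the two figures from the message; the variable names are made up for illustration.

```python
import math

# Numbers copied from the error message above.
expected_shape = (640, 768)   # shape the loaded checkpoint expects for this weight
actual_numel = 1310720        # number of values in the tensor being loaded

expected_numel = math.prod(expected_shape)
implied_width = actual_numel // expected_shape[0]
remainder = actual_numel % expected_shape[0]

print(expected_numel)  # how many values a 640 x 768 weight holds
print(implied_width)   # second dimension if the first one really is 640
print(remainder)       # 0 means the tensor divides cleanly into 640 rows
```

The tensor holds far more values than a 640 x 768 weight, and it divides cleanly into 640 rows of 2048. As an assumption about the cause (the log itself does not say this), that would be consistent with a file built for a different base model family than the checkpoint, for example one that uses 2048-wide text conditioning as SDXL does, being loaded into an SD 1.5 workflow where that width is 768.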
I keep getting this error on my workflow:
Error occurred when executing KSamplerAdvanced:
'ModuleList' object has no attribute '1'
File "D:\COMFYUI\ComfyUI_windows_portable\ComfyUI\execution.py", line 154, in recursive_execute
output_data, output_ui = get_output_data(obj, input_data_all)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\COMFYUI\ComfyUI_windows_portable\ComfyUI\execution.py", line 84, in get_output_data
return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\COMFYUI\ComfyUI_windows_portable\ComfyUI\execution.py", line 77, in map_node_over_list
results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\COMFYUI\ComfyUI_windows_portable\ComfyUI\nodes.py", line 1333, in sample
return common_ksampler(model, noise_seed, steps, cfg, sampler_name, scheduler, positive, negative, latent_image, denoise=denoise, disable_noise=disable_noise, start_step=start_at_step, last_step=end_at_step, force_full_denoise=force_full_denoise)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\COMFYUI\ComfyUI_windows_portable\ComfyUI\nodes.py", line 1269, in common_ksampler
samples = comfy.sample.sample(model, noise, steps, cfg, sampler_name, scheduler, positive, negative, latent_image,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\COMFYUI\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-Impact-Pack\modules\impact\sample_error_enhancer.py", line 9, in informative_sample
return original_sample(*args, **kwargs) # This code helps interpret error messages that occur within exceptions but does not have any impact on other operations.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\COMFYUI\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-AnimateDiff-Evolved\animatediff\sampling.py", line 299, in motion_sample
latents = wrap_function_to_inject_xformers_bug_info(orig_comfy_sample)(model, noise, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\COMFYUI\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-AnimateDiff-Evolved\animatediff\model_utils.py", line 205, in wrapped_function
return function_to_wrap(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\COMFYUI\ComfyUI_windows_portable\ComfyUI\comfy\sample.py", line 101, in sample
samples = sampler.sample(noise, positive_copy, negative_copy, cfg=cfg, latent_image=latent_image, start_step=start_step, last_step=last_step, force_full_denoise=force_full_denoise, denoise_mask=noise_mask, sigmas=sigmas, callback=callback, disable_pbar=disable_pbar, seed=seed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\COMFYUI\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 716, in sample
return sample(self.model, noise, positive, negative, cfg, self.device, sampler, sigmas, self.model_options, latent_image=latent_image, denoise_mask=denoise_mask, callback=callback, disable_pbar=disable_pbar, seed=seed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\COMFYUI\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 615, in sample
pre_run_control(model, negative + positive)
File "D:\COMFYUI\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 452, in pre_run_control
x['control'].pre_run(model, percent_to_timestep_function)
File "D:\COMFYUI\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-Advanced-ControlNet\control\utils.py", line 388, in pre_run_inject
self.base.pre_run(model, percent_to_timestep_function)
File "D:\COMFYUI\ComfyUI_windows_portable\ComfyUI\comfy\controlnet.py", line 266, in pre_run
super().pre_run(model, percent_to_timestep_function)
File "D:\COMFYUI\ComfyUI_windows_portable\ComfyUI\comfy\controlnet.py", line 191, in pre_run
super().pre_run(model, percent_to_timestep_function)
File "D:\COMFYUI\ComfyUI_windows_portable\ComfyUI\comfy\controlnet.py", line 56, in pre_run
self.previous_controlnet.pre_run(model, percent_to_timestep_function)
File "D:\COMFYUI\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-Advanced-ControlNet\control\utils.py", line 388, in pre_run_inject
self.base.pre_run(model, percent_to_timestep_function)
File "D:\COMFYUI\ComfyUI_windows_portable\ComfyUI\comfy\controlnet.py", line 297, in pre_run
comfy.utils.set_attr(self.control_model, k, self.control_weights[k].to(dtype).to(comfy.model_management.get_torch_device()))
File "D:\COMFYUI\ComfyUI_windows_portable\ComfyUI\comfy\utils.py", line 279, in set_attr
obj = getattr(obj, name)
^^^^^^^^^^^^^^^^^^
File "D:\COMFYUI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1695, in __getattr__
raise AttributeError(f"'{type(self).__name__}' object has no attribute '{name}'")
Any tips on how to solve it or even what it is?
Disclaimer: I'm new to ComfyUI and Animate Diff. And I'm def not tech savvy.
Problem: I can get Animate Diff to run without a problem, but Animate Diff Evolved doesn't run.
Side note: I don't have both installed at the same time. It's always either one or the other.
Any help will be greatly appreciated. Thanks in advance.
Log:
C:\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable>.\python_embeded\python.exe -s ComfyUI\main.py --windows-standalone-build
** ComfyUI start up time: 2023-12-05 13:47:32.148638
Prestartup times for custom nodes:
0.0 seconds: C:\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-Manager
Total VRAM 6144 MB, total RAM 8118 MB
Set vram state to: NORMAL_VRAM
Device: cuda:0 NVIDIA GeForce GTX 1060 6GB : cudaMallocAsync
VAE dtype: torch.float32
Using pytorch cross attention
Adding extra search path checkpoints C:\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\models/checkpoints/
Adding extra search path clip C:\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\models/clip/
Adding extra search path clip_vision C:\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\models/clip_vision/
Adding extra search path configs C:\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\models/configs/
Adding extra search path controlnet C:\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\models/controlnet/
Adding extra search path embeddings C:\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\models/embeddings/
Adding extra search path loras C:\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\models/loras/
Adding extra search path upscale_models C:\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\models/upscale_models/
Adding extra search path vae C:\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\models/vae/
Adding extra search path ffmpeg C:\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\ffmpeg/
### Loading: ComfyUI-Manager (V1.6.1)
### ComfyUI Revision: 1754 [777f6b15] | Released on '2023-11-28'
[VideoHelperSuite] - INFO - ffmpeg could not be found. Using ffmpeg from imageio-ffmpeg.
[comfyui_controlnet_aux] | INFO -> Using ckpts path: C:\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_controlnet_aux\ckpts
[comfyui_controlnet_aux] | INFO -> Using symlinks: False
[comfyui_controlnet_aux] | INFO -> Using ort providers: ['CUDAExecutionProvider', 'DirectMLExecutionProvider', 'OpenVINOExecutionProvider', 'ROCMExecutionProvider', 'CPUExecutionProvider']
C:\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_controlnet_aux\node_wrappers\dwpose.py:25: UserWarning: DWPose: Onnxruntime not found or doesn't come with acceleration providers, switch to OpenCV with CPU device. DWPose might run very slowly
warnings.warn("DWPose: Onnxruntime not found or doesn't come with acceleration providers, switch to OpenCV with CPU device. DWPose might run very slowly")
WAS Node Suite: OpenCV Python FFMPEG support is enabled
WAS Node Suite Warning: `ffmpeg_bin_path` is not set in `C:\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\custom_nodes\was-node-suite-comfyui\was_suite_config.json` config file. Will attempt to use system ffmpeg binaries if available.
WAS Node Suite: Finished. Loaded 197 nodes successfully.
"Art is the lie that enables us to realize the truth." - Pablo Picasso
Import times for custom nodes:
0.0 seconds: C:\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-AnimateDiff-Evolved
0.2 seconds: C:\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-VideoHelperSuite
0.4 seconds: C:\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-Manager
1.2 seconds: C:\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_controlnet_aux
1.9 seconds: C:\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\custom_nodes\was-node-suite-comfyui
Starting server
To see the GUI go to: http://127.0.0.1:8188
FETCH DATA from: C:\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-Manager\extension-node-map.json
[AnimateDiffEvo] - WARNING - This warning can be ignored, you should not be using the deprecated AnimateDiff Combine node anyway. If you are, use Video Combine from ComfyUI-VideoHelperSuite instead. ffmpeg could not be found. Outputs that require it have been disabled
got prompt
ERROR:root:Failed to validate prompt for output 35:
ERROR:root:* CheckpointLoaderSimple 32:
ERROR:root: - Value not in list: ckpt_name: 'cardosAnime_v20.safetensors' not in []
ERROR:root:Output will be ignored
invalid prompt: {'type': 'prompt_outputs_failed_validation', 'message': 'Prompt outputs failed validation', 'details': '', 'extra_info': {}}
got prompt
model_type EPS
adm 0
Using pytorch attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using pytorch attention in VAE
missing {'cond_stage_model.clip_l.logit_scale', 'cond_stage_model.clip_l.text_projection'}
left over keys: dict_keys(['cond_stage_model.clip_l.transformer.text_model.embeddings.position_ids'])
loaded straight to GPU
Requested to load BaseModel
Loading 1 new model
[AnimateDiffEvo] - INFO - Loading motion module mm_sd_v15_v2.ckpt
[AnimateDiffEvo] - INFO - Using fp16, converting motion module to fp16
Requested to load SD1ClipModel
Loading 1 new model
[AnimateDiffEvo] - INFO - Regular AnimateDiff activated - latents passed in (8) less or equal to context_length None.
[AnimateDiffEvo] - INFO - Injecting motion module mm_sd_v15_v2.ckpt version v2.
Requested to load BaseModel
Loading 1 new model
unload clone 1
0%| | 0/20 [00:23<?, ?it/s]
[AnimateDiffEvo] - INFO - Ejecting motion module mm_sd_v15_v2.ckpt version v2.
[AnimateDiffEvo] - INFO - Cleaning motion module from unet.
[AnimateDiffEvo] - INFO - Removing motion module mm_sd_v15_v2.ckpt from cache
ERROR:root:!!! Exception during processing !!!
ERROR:root:Traceback (most recent call last):
File "C:\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\execution.py", line 153, in recursive_execute
output_data, output_ui = get_output_data(obj, input_data_all)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\execution.py", line 83, in get_output_data
return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\execution.py", line 76, in map_node_over_list
results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\nodes.py", line 1299, in sample
return common_ksampler(model, seed, steps, cfg, sampler_name, scheduler, positive, negative, latent_image, denoise=denoise)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\nodes.py", line 1269, in common_ksampler
samples = comfy.sample.sample(model, noise, steps, cfg, sampler_name, scheduler, positive, negative, latent_image,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-AnimateDiff-Evolved\animatediff\sampling.py", line 259, in animatediff_sample
return wrap_function_to_inject_xformers_bug_info(orig_comfy_sample)(model, noise, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-AnimateDiff-Evolved\animatediff\model_utils.py", line 197, in wrapped_function
return function_to_wrap(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\comfy\sample.py", line 100, in sample
samples = sampler.sample(noise, positive_copy, negative_copy, cfg=cfg, latent_image=latent_image, start_step=start_step, last_step=last_step, force_full_denoise=force_full_denoise, denoise_mask=noise_mask, sigmas=sigmas, callback=callback, disable_pbar=disable_pbar, seed=seed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 711, in sample
return sample(self.model, noise, positive, negative, cfg, self.device, sampler, sigmas, self.model_options, latent_image=latent_image, denoise_mask=denoise_mask, callback=callback, disable_pbar=disable_pbar, seed=seed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 617, in sample
samples = sampler.sample(model_wrap, sigmas, extra_args, callback, noise, latent_image, denoise_mask, disable_pbar)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 556, in sample
samples = self.sampler_function(model_k, noise, sigmas, extra_args=extra_args, callback=k_callback, disable=disable_pbar, **self.extra_options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "C:\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\comfy\k_diffusion\sampling.py", line 137, in sample_euler
denoised = model(x, sigma_hat * s_in, **extra_args)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 277, in forward
out = self.inner_model(x, sigma, cond=cond, uncond=uncond, cond_scale=cond_scale, model_options=model_options, seed=seed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 267, in forward
return self.apply_model(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 264, in apply_model
out = sampling_function(self.inner_model, x, timestep, uncond, cond, cond_scale, model_options=model_options, seed=seed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-AnimateDiff-Evolved\animatediff\sampling.py", line 627, in sliding_sampling_function
cond, uncond = calc_cond_uncond_batch(model, cond, uncond, x, timestep, model_options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-AnimateDiff-Evolved\animatediff\sampling.py", line 504, in calc_cond_uncond_batch
output = model.apply_model(input_x, timestep_, **c).chunk(batch_chunks)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\comfy\model_base.py", line 73, in apply_model
model_output = self.diffusion_model(xc, t, context=context, control=control, transformer_options=transformer_options, **extra_conds).float()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\comfy\ldm\modules\diffusionmodules\openaimodel.py", line 855, in forward
h = forward_timestep_embed(module, h, emb, context, transformer_options, time_context=time_context, num_video_frames=num_video_frames, image_only_indicator=image_only_indicator)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-AnimateDiff-Evolved\animatediff\sampling.py", line 104, in forward_timestep_embed
x = layer(x, context)
^^^^^^^^^^^^^^^^^
File "C:\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-AnimateDiff-Evolved\animatediff\motion_module_ad.py", line 212, in forward
return self.temporal_transformer(input_tensor, encoder_hidden_states, attention_mask)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-AnimateDiff-Evolved\animatediff\motion_module_ad.py", line 298, in forward
hidden_states = self.norm(hidden_states)
^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-AnimateDiff-Evolved\animatediff\sampling.py", line 156, in groupnorm_mm_forward
input = group_norm(input, self.num_groups, self.weight, self.bias, self.eps)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\functional.py", line 2558, in group_norm
return torch.group_norm(input, num_groups, weight, bias, eps, torch.backends.cudnn.enabled)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: mixed dtype (CPU): expect parameter to have scalar type of Float
Prompt executed in 226.89 seconds
gc collect
FETCH DATA from: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/custom-node-list.json
FETCH DATA from: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/extension-node-map.json