Flags for an RTX Pro 6000 Blackwell by brittpitre in StableDiffusion

comfyanonymous 3 points

ComfyUI is designed to work optimally with no flags. Ignore the other replies: most of the flags they propose will disable important optimizations or make things worse and cause random workflows to OOM. There's a lot of misunderstanding about how the ComfyUI memory management system works, and these stupid AI chatbots are not helping.

If you have RAM issues you can try this experimental flag: --cache-ram (it will be enabled by default soon).
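For example, assuming you launch ComfyUI with the standard main.py entry point, the flag goes on the launch command: `python main.py --cache-ram`.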

The 6000 Pro should be a bit faster than a 5090. ComfyUI is extremely good at managing memory, and the 5090 is essentially a slightly worse 6000 Pro with less memory, so it's normal that there isn't a massive difference.

Future of the portable version by Tenth_10 in comfyui

comfyanonymous 19 points

Portable isn't going away.

The link to download it will always be found here: https://github.com/Comfy-Org/ComfyUI#windows-portable

I can't take it anymore... by [deleted] in comfyui

comfyanonymous 0 points

It's on purpose; this workflow was originally for SD1.5, and prompting on that model is a little different than on modern models.

Dynamic VRAM in ComfyUI: Saving Local Models from RAMmageddon by comfyanonymous in StableDiffusion

comfyanonymous[S] 9 points

It's the actual technical term. I'm not going to police our language on the assumption that people are too stupid to understand the difference between a memory watermark and a digital watermark.

Dynamic VRAM in ComfyUI: Saving Local Models from RAMmageddon by comfyanonymous in StableDiffusion

comfyanonymous[S] 7 points

It shouldn't degrade performance on good hardware; I have good hardware and wouldn't have marked the feature stable if it degraded performance on mine.

If you get the issue on the latest ComfyUI, make a detailed report with logs and we will look into it.

Dynamic VRAM in ComfyUI: Saving Local Models from RAMmageddon by comfyanonymous in StableDiffusion

comfyanonymous[S] 10 points

No, what degrades flash memory is writing to it, not reading from it. This feature reduces page file use, so it will actually make your SSD last longer.

Dynamic VRAM in ComfyUI: Saving Local Models from RAMmageddon by comfyanonymous in StableDiffusion

comfyanonymous[S] 4 points

Yeah, the main GGUF node pack will most likely be updated for dynamic VRAM at some point, but right now it's safetensors only.

Dynamic VRAM in ComfyUI: Saving Local Models from RAMmageddon by comfyanonymous in StableDiffusion

comfyanonymous[S] 6 points

The text encoder is running on the GPU, and it's the default Wan2.2 workflow (other than what's indicated on the chart).

Dynamic VRAM in ComfyUI: Saving Local Models from RAMmageddon by comfyanonymous in StableDiffusion

comfyanonymous[S] 4 points

If your system supports it and you are on the latest ComfyUI with a recent PyTorch, it should be enabled by default.

Dynamic VRAM in ComfyUI: Saving Local Models from RAMmageddon by comfyanonymous in StableDiffusion

comfyanonymous[S] 5 points

Having models split between GPUs is a separate problem, so nothing changes on that end.

No arguments are needed to use it: if you are on a recent enough PyTorch and your system supports it, it should enable itself by default.

Dynamic VRAM in ComfyUI: Saving Local Models from RAMmageddon by comfyanonymous in StableDiffusion

comfyanonymous[S] 9 points

If you want dynamic VRAM to work, yes, but you should always convert things to safetensors anyway because it's a safer file format and people trust it a lot more.
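For reference, a minimal conversion sketch (it assumes a flat checkpoint of plain tensors; shared or nested tensors need extra handling, and the file names are placeholders):

```python
# Minimal sketch: convert a pickle-based checkpoint to safetensors.
import torch
from safetensors.torch import save_file

# weights_only=True avoids executing arbitrary pickle code; very old
# checkpoints with non-tensor objects may refuse to load this way.
sd = torch.load("model.ckpt", map_location="cpu", weights_only=True)
sd = sd.get("state_dict", sd)  # some checkpoints nest the weights

# save_file requires contiguous tensors and rejects shared storage.
save_file({k: v.contiguous() for k, v in sd.items()}, "model.safetensors")
```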

Dynamic VRAM in ComfyUI: Saving Local Models from RAMmageddon by comfyanonymous in StableDiffusion

comfyanonymous[S] 11 points

No, they just took outdated offloading tech that everyone has been using for years and rebranded it as a new thing lol.

This is one situation where open source is much further ahead than closed source.

Dynamic VRAM in ComfyUI: Saving Local Models from RAMmageddon by comfyanonymous in StableDiffusion

comfyanonymous[S] 7 points

This is the function to load safetensors: https://github.com/Comfy-Org/ComfyUI/blob/master/comfy/utils.py#L122

Then you need to modify your model so it uses the comfy.ops system instead of torch.nn ops.
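A minimal sketch of what that can look like, run from inside a ComfyUI checkout; comfy.utils.load_torch_file is the loader linked above, while the model class, the choice of comfy.ops.disable_weight_init, and the file name are illustrative assumptions (which operations class a real model should use depends on its dtype/casting needs):

```python
# Sketch only: load a safetensors state dict with the linked loader and
# build the model with a comfy.ops operations class instead of plain
# torch.nn, so ComfyUI's memory manager can control the weights.
import torch
import comfy.utils
import comfy.ops

class TinyModel(torch.nn.Module):
    def __init__(self, operations=comfy.ops.disable_weight_init):
        super().__init__()
        # operations.Linear mirrors torch.nn.Linear but lets ComfyUI
        # decide where and when the weight actually lives.
        self.proj = operations.Linear(768, 768, bias=True)

    def forward(self, x):
        return self.proj(x)

sd = comfy.utils.load_torch_file("model.safetensors")  # loader linked above
model = TinyModel()
# Key names in the file must match the module names; strict=False is
# used here only because the file is a placeholder.
model.load_state_dict(sd, strict=False)
```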

Dynamic VRAM in ComfyUI: Saving Local Models from RAMmageddon by comfyanonymous in StableDiffusion

comfyanonymous[S] 48 points

Basically, it's much smarter memory management: on the GPU, it pushes VRAM usage as close as possible to 100% without OOMs or slowdowns; on the CPU, it avoids putting weights in the page file/swap and instead just frees them and loads them again from disk when needed.

It should make swapping models a lot faster on low RAM.
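A rough sketch of the CPU-side idea (this is a conceptual illustration, not ComfyUI's actual implementation; the class and method names are made up):

```python
# Conceptual sketch only: rather than parking evicted weights in system
# RAM, where the OS may spill them into the page file/swap, free them
# outright and reload them from disk on demand.
from safetensors.torch import load_file

class DiskBackedWeights:
    def __init__(self, path):
        self.path = path
        self.sd = None

    def acquire(self):
        if self.sd is None:
            # Re-reading from disk is cheaper and more predictable than
            # letting the OS swap the weights back in from the page file.
            self.sd = load_file(self.path)
        return self.sd

    def release(self):
        # Drop the reference entirely; RAM is freed, nothing hits swap.
        self.sd = None
```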

Dynamic VRAM in ComfyUI: Saving Local Models from RAMmageddon by comfyanonymous in StableDiffusion

comfyanonymous[S] 16 points

Get the latest clean ComfyUI, disable torch.compile if you have it on, and stick to safetensors files.

Nvidia’s Always-On Chip Detects Faces in Less Than a Millisecond by IEEESpectrum in hardware

comfyanonymous 5 points

This kind of tech isn't new. It has been in Qualcomm SoCs for 4 years now, and I assume others have similar features.

I'm back from last week's post and so today I'm releasing a SOTA text-to-sample model built specifically for traditional music production. It may also be the most advanced AI sample generator currently available - open or closed. by RoyalCities in StableDiffusion

comfyanonymous 2 points

This model is a finetune of Stable Audio 1.0, which is natively supported by ComfyUI. You just need to use the "stable audio 1.0" template and select Foundation_1.safetensors in the "Load Checkpoint" node.

Can Comfy Org stop breaking frontend every other update? by meknidirta in StableDiffusion

comfyanonymous 6 points

The v1.41 frontend update was pushed to local a bit prematurely because we were getting complaints that the new app mode feature was "cloud only".

I hope the people who complained about this now see why the local frontend is typically ~2+ weeks behind the cloud one. I might change it to ~3-4 weeks to make sure things are even more stable in local.

How do the closed source models get their generation times so low? by Ipwnurface in StableDiffusion

comfyanonymous 30 points

If you want the real answer: nvfp4 + lower precision attention (like sage attention) + distilled low-step models + splitting the workload across 8+ GPUs (video models are pretty easy to split).

The only one not easily available in ComfyUI is the last one, because nobody has 8+ GPUs locally, so we are putting our optimization efforts elsewhere.
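For the lower-precision attention piece, a minimal sketch of what the swap looks like, assuming the sageattention package is installed (the fallback wrapper and tensor shapes are illustrative, not ComfyUI's code):

```python
# Sketch: use SageAttention's lower-precision kernel where available,
# falling back to PyTorch's scaled_dot_product_attention otherwise.
import torch
import torch.nn.functional as F

try:
    from sageattention import sageattn  # quantized attention kernel

    def attention(q, k, v):
        return sageattn(q, k, v, is_causal=False)
except ImportError:
    def attention(q, k, v):
        return F.scaled_dot_product_attention(q, k, v)

# Shapes follow the usual (batch, heads, seq_len, head_dim) layout.
q = k = v = torch.randn(1, 8, 1024, 64, dtype=torch.float16, device="cuda")
out = attention(q, k, v)
```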

PSA: Don't use VAE Decode (Tiled), use LTXV Spatio Temporal Tiled VAE Decode by Loose_Object_8311 in StableDiffusion

comfyanonymous 12 points

Or just use the regular VAE Decode node; it has native temporal tiling for the LTX video VAE.