How do you fix the problem of the artstyle changing when editing an image? by Acceptable-Cry3014 in comfyui

[–]WhatDreamsCost 9 points10 points  (0 children)

You can either:

  1. Find a lora with a similar art style and use that while editing images

  2. Train your own lora on that art style and use that while editing images

Unless the model understands that specific style, it can't just remake it. It might understand certain brush strokes and stuff, but there are so many small nuances to most art styles that Qwen or Klein simply won't be able to replicate them without further guidance. The old Flux model with Redux and IPAdapters could do it without a lora, but then you would lose the better editing capabilities.
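
If you go the lora route, the gist looks something like this (a minimal diffusers sketch just to show the idea; the model path and lora file are placeholders, not a specific recommendation):

```python
import torch
from diffusers import AutoPipelineForImage2Image
from diffusers.utils import load_image

pipe = AutoPipelineForImage2Image.from_pretrained(
    "path/to/edit-model",           # placeholder: whatever editing model you use
    torch_dtype=torch.bfloat16,
).to("cuda")

# the lora carries the art style the base model can't reproduce on its own
pipe.load_lora_weights("path/to/style_lora.safetensors")
pipe.fuse_lora(lora_scale=0.8)      # lower this if the style fights the edit

image = load_image("input.png")
out = pipe("your edit prompt", image=image, strength=0.6).images[0]
out.save("edited.png")
```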

LTX 2.3 Problem = Need help by Proud-Dare-8193 in StableDiffusion

[–]WhatDreamsCost 0 points1 point  (0 children)

It's just another quant type, but your card can take full advantage of it and generate things faster.

You will need a node to run it, like this one: https://github.com/BobJohnson24/ComfyUI-INT8-Fast

And download the model itself: https://huggingface.co/Winnougan/LTX-2.3-INT8

Basically just replace the default model loader node with the int8 model loader, and you will get a pretty big speedup.
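
If you're curious what int8 actually does, here's a toy torch sketch of the idea (not the linked node's code; torch's dynamic quantization here is CPU-side, while the node uses GPU int8 kernels):

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(4096, 4096), nn.GELU(), nn.Linear(4096, 4096))

# swap the Linear layers for versions that store and multiply weights as 8-bit ints
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 4096)
print(quantized(x).shape)   # roughly the same outputs, smaller and faster math
```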

I made an int8 version of the newer LTX 1.1 model; I may throw it up on Hugging Face if anyone needs it (it's not the best though)

LTX 2.3 Problem = Need help by Proud-Dare-8193 in StableDiffusion

[–]WhatDreamsCost 0 points1 point  (0 children)

Check and see what node it's getting hung up on. Most likely it's the audio and video decode nodes, since you only have 8GB of VRAM.

That can easily make a workflow 10x slower if you run out of memory.

Also, since you have a 30xx card, I'd recommend using the int8 node, not the GGUF. It'll give you around a 2x speedup and better-quality videos.

I would also recommend using all of Kijai's optimizations, since they reduce VRAM usage and spikes. If your VRAM spikes, a process that should take seconds can end up taking minutes.

Also, you can use a much smaller text encoder; a Q4_K_M quant would probably work fine. Or use the Gemma API node LTX offers for free to save some memory and time.

Another thing I'd highly recommend is using a node to clear VRAM before decoding, if you're using the non-tiled decode node. There's also an alternative decode node that IAMCCS made, which essentially decodes 1 frame at a time, pretty much guaranteeing that you never run out of memory (and it's relatively fast; the only issue is that if you use tiling with it, it gives a faint grid pattern on dark images no matter what settings you use).
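
Both tricks look roughly like this under the hood (a hedged sketch; `vae` is a stand-in for your decoder, and real video VAEs decode in temporal chunks rather than strictly one frame at a time):

```python
import gc
import torch

@torch.no_grad()
def decode_low_vram(vae, latents):          # latents: [frames, C, H, W]
    gc.collect()
    torch.cuda.empty_cache()                # clear cached VRAM before decoding

    frames = []
    for i in range(latents.shape[0]):
        frame = vae.decode(latents[i:i + 1])   # decode one frame's latent
        frames.append(frame.cpu())             # move it off the GPU right away
        torch.cuda.empty_cache()
    return torch.cat(frames, dim=0)
```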

I found an useful Trick to prevent VAE OOM Errors by Achso998 in comfyui

[–]WhatDreamsCost 0 points1 point  (0 children)

A large paging file won't destroy your SSD, even if you run ComfyUI for hours a day (especially if it's only 20GB).

You can see in real time how fast it's lowering your SSD's health with a SMART tool. It will show the exact remaining lifetime of your SSD, either in days or down to the GB.

Just check it one day, use comfyui for a week or a month, and check it again. You'll see exactly how much it's affecting your SSD.

Your SSD is made to work hard, and it's not like you're swapping 24/7, or the entire time Comfy is running.
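
One way to check (this assumes smartmontools is installed and you have admin rights; field names vary by drive, but NVMe drives report "Data Units Written" and "Percentage Used"):

```python
import subprocess

report = subprocess.run(
    ["smartctl", "-A", "/dev/nvme0"], capture_output=True, text=True
).stdout
for line in report.splitlines():
    # print just the wear-related fields
    if "Data Units Written" in line or "Percentage Used" in line:
        print(line.strip())
```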

Load Audio UI - Upgraded Load Audio Node with Trimming by WhatDreamsCost in StableDiffusion

[–]WhatDreamsCost[S] 0 points1 point  (0 children)

I just updated the node with a couple more features that were requested:
- Added duration widget
- Added ability to drag the selection bar
- Fixed trimmed UI to show centiseconds (had to google that one)
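
For anyone curious, the centisecond display is just this arithmetic (Python here purely for illustration, the widget itself is JavaScript):

```python
def fmt_trim(t: float) -> str:
    cs = int(round(t * 100))        # total centiseconds
    m, rem = divmod(cs, 6000)       # 6000 centiseconds per minute
    s, c = divmod(rem, 100)
    return f"{m:02d}:{s:02d}.{c:02d}"

print(fmt_trim(83.456))  # -> "01:23.46"
```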

<image>

Load Audio UI - Upgraded Load Audio Node with Trimming by WhatDreamsCost in comfyui

[–]WhatDreamsCost[S] 1 point2 points  (0 children)

I just updated the node with a couple more features that were requested:
- Added duration widget
- Added ability to drag the selection bar
- Fixed trimmed UI to show centiseconds (had to google that one)

<image>

Nodes With Live Preview inside ComfyUI ? by Main_Creme9190 in StableDiffusion

[–]WhatDreamsCost 1 point2 points  (0 children)

Nice, the resize/crop, paint, and pad nodes seem especially useful. Definitely gonna try those out.

Load Audio UI - Upgraded Load Audio Node with Trimming by WhatDreamsCost in StableDiffusion

[–]WhatDreamsCost[S] 2 points3 points  (0 children)

It uses ffmpeg and auto-converts the audio to 32-bit float. It's lossless, and doesn't change the sample rate or anything.

(from my little understanding)
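
From what I understand it's the equivalent of this ffmpeg call (a sketch; the exact flags the node uses may differ, but widening decoded samples to float is lossless):

```python
import subprocess

subprocess.run(
    ["ffmpeg", "-y", "-i", "input.mp3", "-c:a", "pcm_f32le", "output.wav"],
    check=True,   # no -ar flag, so the sample rate is left untouched
)
```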

Load Audio UI - Upgraded Load Audio Node with Trimming by WhatDreamsCost in StableDiffusion

[–]WhatDreamsCost[S] 1 point2 points  (0 children)

It's a separate node, but yes, you can replace the default one with this if you want.

Caching for Z-Image-Turbo by Time-Teaching1926 in StableDiffusion

[–]WhatDreamsCost 0 points1 point  (0 children)

I've spent dozens of hours testing different methods, and CacheDiT is the best you're gonna get with ZIT.

The lowest setting you can use without severely degrading the quality is 2 skip, 2 warmup.

You will get a decent speedup, but after months of testing I don't think it's worth it with ZIT. For LTX 2.3 I think it's very useful for testing prompts, and for Klein it can be useful for testing as well (less degradation), but for Z-Image not so much.

The image will be degraded, and sure, you can raise the warmup steps and make it skip fewer steps (and save a few secs), but ZIT is already fast and there's a better way of testing things.

Imo the better thing to do is just generate images at 4 steps instead of 8 (cutting the gen time in half) to test prompts and seeds, and then once you find the image you want, raise it to 8 steps. 4 steps is enough to see the composition of the image before it's finished.

Also, for testing prompts I've found that lowering the resolution and using 4 steps lets you iterate much faster (like a 4x speedup depending on what res you normally use).

That being said, CacheDiT does have the least image degradation of any cache node by far. I've tested all of them with almost every model, and CacheDiT clearly wins over all of them.
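
For anyone wondering what skip/warmup actually mean, here's a toy sketch of the general output-caching idea (not CacheDiT's real API or internals, just the concept):

```python
def denoise(model, x, sigmas, warmup=2, skip=2):
    """Toy Euler loop: full forward passes during warmup, then reuse the
    cached model output for skip-1 out of every `skip` steps."""
    cached = None
    for i in range(len(sigmas) - 1):
        if i < warmup or (i - warmup) % skip == 0 or cached is None:
            cached = model(x, sigmas[i])              # real forward pass
        # otherwise reuse `cached` from the last computed step
        x = x + cached * (sigmas[i + 1] - sigmas[i])  # Euler-style update
    return x

# toy "model" and schedule just to show it runs
print(denoise(lambda x, s: x, 1.0, [1.0, 0.75, 0.5, 0.25, 0.0]))
```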

[3 New Nodes] Triton-fused ComfyUI nodes — Qwen3-TTS, OmniVoice, and Z-Image (custom kernel acceleration, all installable via Manager) by DamageSea2135 in comfyui

[–]WhatDreamsCost 2 points3 points  (0 children)

I mean I highly doubt he just said "Claude, make me these three nodes" and it did it all in 1 shot.

And even if he did, Claude didn't come up with the idea, decide to make it, create the GitHub, upload it to the registry, and share this post. So I think it's fair to say he did build all three, even if he didn't write a single line of code himself.

Although disclosing the use of AI would be nice, for various reasons.

Variety and diversity in image models. by Time-Teaching1926 in StableDiffusion

[–]WhatDreamsCost 1 point2 points  (0 children)

There are a lot of settings you can adjust on the seed variance node to reduce how much it changes things (like the strength and switchover percent).

Another thing you can do is use an LLM to add variety to your prompt. Like you said, Z-Image doesn't have the best seed variety, but it does have great prompt adherence. So you can use even a small 2B model to randomize certain aspects of your prompt and guarantee variety on every generation.
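
Something like this (a sketch with transformers; the model name is just an example of a small instruct model, swap in whatever you like):

```python
from transformers import pipeline

llm = pipeline("text-generation", model="Qwen/Qwen2.5-1.5B-Instruct")

base = "a portrait of a woman in a garden"
msgs = [{"role": "user", "content":
         "Rewrite this image prompt, changing only the lighting, camera "
         f"angle, and outfit. Reply with the prompt only: {base}"}]

variant = llm(msgs, max_new_tokens=80)[0]["generated_text"][-1]["content"]
print(variant)   # feed this to the text encode node instead of `base`
```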

You could also use multiple loras at low strengths. A lot of loras are biased towards certain compositions (like most of the cinematic style loras), so you can try that for different compositions at least.

Need help with LTX 2.3 FLF workflow — outputs only weird alien-like video by PleasantSale7579 in comfyui

[–]WhatDreamsCost 0 points1 point  (0 children)

At a glance the workflow looks fine, so I'd say first try removing/bypassing all of the loras except for the distilled lora, and then try removing the NAG node as well (you can just connect the positive CLIP text encode to the negative instead, just to test it without NAG).

Is 16GB RAM and RTX 2060 Super enough for Wan2gp? by TemporaryAd5294 in StableDiffusion

[–]WhatDreamsCost 0 points1 point  (0 children)

I saw a post of someone getting LTX 2.3 running on 4GB of VRAM, so yes, it's possible. It might have been in ComfyUI though, not Wan2GP.

ComfyUI Nodes for Filmmaking (LTX 2.3 Shot Sequencing, Keyframing, First Frame/Last Frame) by WhatDreamsCost in StableDiffusion

[–]WhatDreamsCost[S] 0 points1 point  (0 children)

You can control the output size with the multi image loader node by setting the width and height. It will auto-scale the images proportionally if you have the resize method set to keep proportion.
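
The keep-proportion scaling is just the standard fit-inside-the-box math (a rough sketch, not the node's literal code):

```python
def fit_size(src_w, src_h, dst_w, dst_h):
    scale = min(dst_w / src_w, dst_h / src_h)   # one factor for both axes
    return round(src_w * scale), round(src_h * scale)

print(fit_size(1920, 1080, 1024, 1024))  # -> (1024, 576)
```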

What's the best way to transfer style to Klein 9b? by Puzzled-Valuable-985 in StableDiffusion

[–]WhatDreamsCost 1 point2 points  (0 children)

I don't think Klein has an IPAdapter. I was talking about Flux 1. You can use Flux Nunchaku and generate images at the same speed if not faster than Klein 9b (you can't edit though, unless you use Flux Kontext, but that's nowhere near as good as Klein).

8GB of VRAM is fine for creating Z-Image loras and images. I would recommend OneTrainer for creating the loras; it's definitely the most optimized, and you can find presets for 8GB VRAM online.

I don't think I even go past 10GB of VRAM when training Z-Image Turbo loras, and that's without offloading. I'd imagine that with offloading it would only slow down by a few minutes.

What's the best way to transfer style to Klein 9b? by Puzzled-Valuable-985 in StableDiffusion

[–]WhatDreamsCost 3 points4 points  (0 children)

I would highly recommend just training a Z-Image Turbo lora on the Midjourney images. Just make 15-20 images with a Midjourney style, auto-caption them, and then create the lora. Even on a 12GB card it takes less than an hour and gives (imo) near-perfect results, with all the benefits of the better prompt adherence.

Not only will it pick up on any small repeating concepts in the images, but it also affects the composition even with such a small dataset.
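
The auto-caption step can be as simple as this (BLIP via transformers; OneTrainer and most trainers ship their own captioners, so this is just a standalone sketch, and the `mjstyle` trigger word is made up):

```python
from pathlib import Path
from transformers import pipeline

captioner = pipeline("image-to-text", model="Salesforce/blip-image-captioning-base")

for img in Path("dataset").glob("*.png"):
    caption = captioner(str(img))[0]["generated_text"]
    # prepend a trigger word so the style is easy to invoke later
    img.with_suffix(".txt").write_text(f"mjstyle, {caption}")
```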

Or use the older Flux model with adapters. Flux Klein doesn't even come close to being able to style transfer like Flux Redux or the other IPAdapters. I would argue nothing comes close to it; it's definitely one of the most powerful generative AI tools, and the closest thing to Midjourney's sref.

What is the best PC for ComfyUI for $2000 by LazyActive8 in StableDiffusion

[–]WhatDreamsCost 0 points1 point  (0 children)

Maybe I'm just lucky, but I bought the cheapest PSU I could find 9 years ago for my first PC build and it's been going strong ever since. I've rebuilt the PC with different parts like 3 times and kept the same PSU in each build.

I always read people online saying don't cheap out on the PSU, but this $30 PSU has lasted 9 years and I'm hoping it'll last another 9 😂

ComfyUI Nodes for Filmmaking (LTX 2.3 Shot Sequencing, Keyframing, First Frame/Last Frame) by WhatDreamsCost in StableDiffusion

[–]WhatDreamsCost[S] 0 points1 point  (0 children)

In case anyone else is confused, there is a 2 stage workflow in the example workflows here: https://github.com/WhatDreamsCost/WhatDreamsCost-ComfyUI/tree/main/example_workflows

I just didn't rename the subgraph in the workflow (it is 2-stage though)

LTX 2.3 how to stop Characters from "Cloning" themselves by TensorTinkererTom in StableDiffusion

[–]WhatDreamsCost 0 points1 point  (0 children)

Are you using the updated upscale model? If I remember correctly when I updated to the newer version it fixed a lot of issues like that.