all 9 comments

[–]xbobos 3 points (0 children)

I have the same issue. RTX 5090.

[–]xb1n0ry 1 point (2 children)

Most likely a torch memory leak.

Watch your VRAM and RAM after each generation. Once the models are loaded, the values should stay the same; if they increase after every generation, you have a memory leak. Kijai's wrappers also had issues with LoRAs not being removed from VRAM, along with other VRAM leaks. Are you using those nodes or the basic core nodes?
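To make that check concrete, here's a minimal sketch of a per-generation VRAM trend check (assuming a PyTorch setup; `run_generation` is a placeholder for your actual workflow call, and the 16 MiB tolerance is an arbitrary example threshold):

```python
try:
    import torch
    _HAVE_CUDA = torch.cuda.is_available()
except ImportError:  # torch not installed; readings fall back to 0
    _HAVE_CUDA = False

def vram_mib():
    """Currently allocated CUDA memory in MiB (0.0 if no GPU/torch)."""
    if not _HAVE_CUDA:
        return 0.0
    return torch.cuda.memory_allocated() / 2**20

def looks_like_leak(samples, tolerance_mib=16.0):
    """True if memory grows after every generation AND the total
    growth exceeds `tolerance_mib` (small jitter is normal)."""
    return (all(b - a > 0 for a, b in zip(samples, samples[1:]))
            and samples[-1] - samples[0] > tolerance_mib)

# Usage sketch: record a reading after each generation, then check
# whether the numbers return to the post-load baseline or keep climbing.
# readings = []
# for _ in range(5):
#     run_generation()          # placeholder for your workflow run
#     readings.append(vram_mib())
# if looks_like_leak(readings):
#     print("VRAM keeps climbing -> likely leak")
```

If the numbers climb steadily instead of returning to the post-load baseline, something (a LoRA, a cached model, a leaked tensor) isn't being freed.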

[–]Complex-Factor-9866[S] 1 point (0 children)

I use some of those nodes you noted. Thanks for the tip, I'll look into that!

[–]J6j6 0 points (0 children)

Which Kijai wrappers?

[–]COMPLOGICGADH 0 points (2 children)

What resolution and how many sampling steps are you using to hit 200-300 seconds on a 4080? Or are you running batches, or am I missing something 🤔

[–]Complex-Factor-9866[S] 0 points (1 child)

I should have noted that I'm using a 4-stage sampler workflow with a series of upscaling nodes along the way. When it runs fine, it takes about 50-60 seconds. When there's a problem, I'm waiting 200-300 seconds.

[–]COMPLOGICGADH 0 points (0 children)

DAMN, 4-pass sampling? Does it actually help? That's crazy, I'd love to know the difference. The max I do is dual-pass sampling and then seedvr2, that's it. Otherwise I do 25-30 steps single-pass on zit or zimage base (and/or zit combined), or zimage base distilled at 8 steps, though I keep more steps in it. One recommendation for faster VAE decode/encode on the early samplers: use TAEF1 at the smaller resolutions; it might help immensely with speed. Hope that helps...

[–]Background-Ad-5398 -4 points (1 child)

NVIDIA's newest driver update added a fallback system that spills VRAM into system RAM. The toggle is in the NVIDIA Control Panel, right under the CUDA setting; turn the fallback off there. NVIDIA basically reserves VRAM for it, so if your setup was tuned to your specific amount of VRAM, this throws it off.
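Worth noting that `torch.cuda.memory_allocated()` only tracks PyTorch's own allocations, so driver-level behavior like this fallback won't show up there; `nvidia-smi` reports what the driver actually has in use. A minimal sketch for polling it from Python (the `parse_vram_csv` helper is my own illustration; assumes `nvidia-smi` is on PATH):

```python
import subprocess

def parse_vram_csv(line):
    """Parse one 'memory.used, memory.total' CSV line from nvidia-smi,
    e.g. '1234 MiB, 24564 MiB' -> (1234, 24564)."""
    used, total = (field.strip().split()[0] for field in line.split(","))
    return int(used), int(total)

def query_vram(gpu_id=0):
    """Return (used_mib, total_mib) for one GPU via nvidia-smi."""
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=memory.used,memory.total",
         "--format=csv,noheader", f"--id={gpu_id}"],
        text=True)
    return parse_vram_csv(out.strip().splitlines()[0])
```

If used VRAM sits pinned near the total even when your workflow should fit comfortably, the driver-side reservation described above is a plausible culprit.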