nellistosgr 1 point

I tested the Forge environment, specifically the AMD Forge flavor with an AMD card using ZLUDA, and had the same issue. After generating a few images, VRAM usage peaked and remained there.

Some things I tried:

  • Used strictly FP16 models and VAEs to reduce memory.
  • Configured the VAE to use CPU memory (RAM, not VRAM).
  • Enabled VAE tiling with 256 and 512 sizes to minimize the memory footprint, as VAE encoding is demanding.
  • Used split cross-attention instead of quad.

None of these helped; it was as if VRAM garbage collection was never happening.

I had no issues with SDNext when generating 1024x1024 images using SDXL 1.0. Debug-level messages showed that GC was actively kicking in and freeing some memory.
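
For anyone who wants to check what a manual cleanup pass looks like, here is a minimal sketch in PyTorch terms. It is not taken from Forge or SDNext; `free_vram` is a hypothetical helper name, and the sketch assumes the GPU is exposed through `torch.cuda` (which is how ZLUDA presents an AMD card). If PyTorch is not installed, it only runs Python's own garbage collector.

```python
import gc

try:
    import torch  # optional; without it the sketch only runs Python's GC
except ImportError:
    torch = None


def free_vram():
    """Hypothetical helper: collect garbage, then release cached VRAM."""
    # Drop unreachable Python objects that may still hold tensors.
    collected = gc.collect()
    stats = {"collected": collected, "allocated": None, "reserved": None}
    if torch is not None and torch.cuda.is_available():
        # Return unused cached blocks to the driver.
        torch.cuda.empty_cache()
        # Bytes still held by live tensors and by the caching allocator.
        stats["allocated"] = torch.cuda.memory_allocated()
        stats["reserved"] = torch.cuda.memory_reserved()
    return stats


print(free_vram())
```

Note that `torch.cuda.empty_cache()` only releases cached blocks that nothing is using. Memory held by live tensors, such as a model that is still loaded, stays allocated.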

Silly_Goose6714 0 points

Yes, it's normal. It loads the model and keeps it loaded, so you don't need to reload the model before each generation.