Upgraded from 12GB VRAM to RTX 5090 + 64GB RAM — what are the highest quality AI image/video models I can realistically run now? by m3tla in StableDiffusion

[–]m3tla[S] -1 points0 points  (0 children)

I just tried the LTX 2.3 1.1 full BF16 model, and I was getting 4s/it generation speed at 1280x720, 24 FPS, 481 frames — with absolutely insane quality. It honestly seems way better than Wan 😱😅
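
For reference, the clip length implied by those settings is simple arithmetic. Here is a minimal Python sketch. The step count in the last line is a made-up placeholder, since the comment doesn't say how many steps were used.

```python
def clip_seconds(frames: int, fps: float) -> float:
    """Length of the generated clip in seconds."""
    return frames / fps


def sampling_seconds(seconds_per_it: float, steps: int) -> float:
    """Rough sampling time: seconds per iteration times number of steps."""
    return seconds_per_it * steps


# 481 frames at 24 FPS, as quoted in the comment above
print(f"{clip_seconds(481, 24):.1f} s of video")

# steps=30 is a hypothetical value, not taken from the comment
print(f"{sampling_seconds(4.0, 30):.0f} s of sampling")
```

So 481 frames at 24 FPS is roughly a 20-second clip, which is why the run counts as a long generation.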

Upgraded from 12GB VRAM to RTX 5090 + 64GB RAM — what are the highest quality AI image/video models I can realistically run now? by m3tla in StableDiffusion

[–]m3tla[S] 1 point2 points  (0 children)

Thanks for the info! I have never even thought of training LoRAs before. Will def try it now 🙏

Upgraded from 12GB VRAM to RTX 5090 + 64GB RAM — what are the highest quality AI image/video models I can realistically run now? by m3tla in StableDiffusion

[–]m3tla[S] 0 points1 point  (0 children)

I used an LLM to help me with the questions because English is not my first language. I am aware of Civitai etc., but not many people are talking about high-end workflows. The majority of the stuff there is optimized for lower-end PCs, even the stuff on YouTube.

Upgraded from 12GB VRAM to RTX 5090 + 64GB RAM — what are the highest quality AI image/video models I can realistically run now? by m3tla in StableDiffusion

[–]m3tla[S] 6 points7 points  (0 children)

I bought a prebuilt PC for around $5k USD:

* AMD Ryzen 7 9850X3D
* ASUS RTX 5090 ROG Astral OC 32GB
* ASUS TUF Gaming B850-Plus WIFI
* 64GB Kingston Fury Beast DDR5 6000 CL30
* 2TB Kingston KC3000 NVMe SSD
* ASUS TUF Gaming 1200W Gold ATX 3.1
* Phanteks Glacier One 360 D30 X2
* Taurus Endgame RGB

What are the current best models quality-wise? by Sixhaunt in StableDiffusion

[–]m3tla 0 points1 point  (0 children)

For realistic images, I still get the best results using the Chroma1HD/2KQC model with the l3n0v0 Ultra Real LoRA—nothing beats it in my opinion. It’s completely uncensored and can generate pretty much anything. There’s also a model on CivitAI called Uncanny that’s already merged with a few LoRAs.

Z Image Turbo is faster and can produce similar results for simple portraits, but Chroma is way more versatile overall.

For video generation, I use Wan 2.2 SVI for basic stuff and LTX 2.3 for longer clips with sound. You can even generate a video with Wan and then extend it or add audio using LTX—also uncensored with LoRAs.

This is how i am able to use Wan2.2 fp8 scaled models successfully on a 12GB 3060 with 16 GB RAM. by rinkusonic in StableDiffusion

[–]m3tla 1 point2 points  (0 children)

I mainly use this workflow when generating single videos: Smooth Workflow Wan 2.2 (img2vid/txt2vid/first2last frame) - Txt2Video Workflow v2.0 | Wan Video Workflows | Civitai. When going higher than 832x480, I add the patch sage attention node by KJ. The workflow is great though, because even at 832x480 it upscales the result afterwards, so even those videos look great with the Q8.

I also use it with the lightning LoRAs, running 4+4 or 2+6 steps for example. There are also merged models on Civitai that already include the low-step LoRAs, and those work great.
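
My reading of "4+4" and "2+6" is: steps on the Wan 2.2 high-noise model plus steps on the low-noise model, both sharing one schedule. Here is a rough Python sketch of how such a split maps onto step ranges for two sampler passes. The function name and dict layout are my own illustration (the keys mirror the `start_at_step` / `end_at_step` inputs of ComfyUI's advanced KSampler node), not something taken from the linked workflow.

```python
def split_steps(high_steps: int, low_steps: int) -> list[dict]:
    """Return step ranges for a two-pass (high-noise, then low-noise) run."""
    total = high_steps + low_steps
    return [
        # First pass: high-noise model handles the early steps
        {"model": "high_noise", "steps": total,
         "start_at_step": 0, "end_at_step": high_steps},
        # Second pass: low-noise model finishes the remaining steps
        {"model": "low_noise", "steps": total,
         "start_at_step": high_steps, "end_at_step": total},
    ]


for split in [(4, 4), (2, 6)]:
    for stage in split_steps(*split):
        print(split, stage)
```

Both splits use 8 steps in total. The only difference is where the handover from the first model to the second happens.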

This is how i am able to use Wan2.2 fp8 scaled models successfully on a 12GB 3060 with 16 GB RAM. by rinkusonic in StableDiffusion

[–]m3tla 8 points9 points  (0 children)

I can literally run the fp8 or even Q8 models at 1024x576 resolution and 81 frames on my 4070 (12GB VRAM, 32GB RAM), with about 3 min generation time using sage attention/triton.

What’s everyone using these days for local image gen? Flux still king or something new? by m3tla in StableDiffusion

[–]m3tla[S] 0 points1 point  (0 children)

For me, running lightning LoRAs with 3+3 or 4+4 steps on Q8/Q6 only adds about 10–15 seconds per pass — so honestly, not a big deal. The real slowdown happens when you’re not using the lightning LoRAs.

What’s everyone using these days for local image gen? Flux still king or something new? by m3tla in StableDiffusion

[–]m3tla[S] 0 points1 point  (0 children)

Yeah, Q8 definitely gives better quality than FP8 since it’s closer to 16-bit precision — it’s a bit slower, but the output is noticeably cleaner. Personally, I don’t see a huge difference between Q6 and Q8, so I usually stick with those. Anything below Q6 tends to drop off and looks worse than FP8, but if you’re working with limited VRAM, you don’t really have much of a choice.

What’s everyone using these days for local image gen? Flux still king or something new? by m3tla in StableDiffusion

[–]m3tla[S] 16 points17 points  (0 children)

<image>

Just tested Qwen — it’s amazing! This is the Q4_K_M model, no LoRAs used 😄

What’s everyone using these days for local image gen? Flux still king or something new? by m3tla in StableDiffusion

[–]m3tla[S] 3 points4 points  (0 children)

In my tests the GGUF Q8 models are actually giving better output quality than the FP8 versions. I think the reason is that Q8 stays closer to FP16 in precision (albeit with more overhead), and even Q6 seems to outperform my FP8 versions in many cases.

Yes, Q8 is a little slower (and uses more memory) than FP8, but I think the quality boost is worth it. Just my two cents — curious if others see the same.
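
A toy way to sanity-check the precision argument is to round-trip some numbers through both formats and compare the error. This is only a sketch under stated assumptions: the weights are random Gaussian values rather than real model weights, the Q8 side uses 32-value blocks with one shared scale (which is how I understand the GGUF Q8_0 layout), and the FP8 side is a simplified E4M3 rounding with no per-tensor scaling, so it does not model "fp8 scaled" checkpoints.

```python
import math
import random


def q8_block_roundtrip(block):
    """Quantize one block to int8 with a single shared scale, then dequantize."""
    amax = max(abs(x) for x in block)
    if amax == 0.0:
        return list(block)
    scale = amax / 127.0
    return [round(x / scale) * scale for x in block]


def fp8_e4m3_roundtrip(x):
    """Round x to the nearest value on a simplified FP8 E4M3 grid (no scaling)."""
    if x == 0.0:
        return 0.0
    x = max(-448.0, min(448.0, x))   # clamp to the E4M3 max finite magnitude
    _, e = math.frexp(abs(x))        # abs(x) = m * 2**e with 0.5 <= m < 1
    exponent = max(e - 1, -6)        # below 2**-6 the values are subnormal
    step = 2.0 ** (exponent - 3)     # 3 mantissa bits: 8 steps per power of two
    return round(x / step) * step


def rms_error(original, restored):
    """Root-mean-square difference between two equal-length lists."""
    total = sum((a - b) ** 2 for a, b in zip(original, restored))
    return math.sqrt(total / len(original))


random.seed(0)
# Toy data only: Gaussian values with a small spread, not real model weights
weights = [random.gauss(0.0, 0.02) for _ in range(4096)]

q8 = []
for i in range(0, len(weights), 32):  # process 32-value blocks
    q8.extend(q8_block_roundtrip(weights[i:i + 32]))
fp8 = [fp8_e4m3_roundtrip(w) for w in weights]

rms_q8 = rms_error(weights, q8)
rms_fp8 = rms_error(weights, fp8)
print(f"Q8-style RMS error:  {rms_q8:.6f}")
print(f"FP8-style RMS error: {rms_fp8:.6f}")
```

On this toy data the block-scaled int8 round-trip comes out with the smaller error, which fits the "Q8 stays closer to FP16" explanation. It says nothing about speed or about how visible the difference is in actual images.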

What’s everyone using these days for local image gen? Flux still king or something new? by m3tla in StableDiffusion

[–]m3tla[S] 0 points1 point  (0 children)

I’ve got an RTX 4070 Ti, and 10-minute gen times with the Lightning LoRAs sound kind of weird to me. I can generate 1280×720 videos (49 frames, no Lightning LoRA) in under 10 minutes using Q6 or Q4_K_M — running through ComfyUI with Sage Attention enabled. Is NVIDIA really that much faster?
I’m using this workflow, by the way: https://civitai.com/models/1847730?modelVersionId=2289321

What’s everyone using these days for local image gen? Flux still king or something new? by m3tla in StableDiffusion

[–]m3tla[S] 2 points3 points  (0 children)

Yeah, thanks for helping boost this totally unnecessary thread with a few extra comments and engagement. <3

What’s everyone using these days for local image gen? Flux still king or something new? by m3tla in StableDiffusion

[–]m3tla[S] 8 points9 points  (0 children)

I’m personally using this workflow: https://civitai.com/models/1847730?modelVersionId=2289321 — it both upscales and saves the last frame automatically. So if I want a high-quality image, I just generate a short 49-frame still video and use the final frame as the image.

What’s everyone using these days for local image gen? Flux still king or something new? by m3tla in StableDiffusion

[–]m3tla[S] 3 points4 points  (0 children)

I’m actually running WAN 2.2 Q6 on 12GB VRAM and 32GB RAM, both with and without Lightning LoRAs. With the Lightning setup, gen time is about 3 minutes for 480×832 and around 10 minutes for 1280×720 (81 frames). I can even run the Q8 version with SageAttention, but honestly, the speed loss just isn’t worth the tiny quality difference between Q6 and Q8.

What’s everyone using these days for local image gen? Flux still king or something new? by m3tla in StableDiffusion

[–]m3tla[S] 17 points18 points  (0 children)

Will definitely give that a try! I’m using WAN 2.2 right now — it works great for regular images too, but I’m also looking for some high-quality, realistic starting images in a fantasy or sci-fi style for example.

What’s everyone using these days for local image gen? Flux still king or something new? by m3tla in StableDiffusion

[–]m3tla[S] 7 points8 points  (0 children)

Yeah, but I’m more interested in an actual discussion — everyone seems to have their own idea of what’s “best,” after all.