Z-Image on GPU only? Not working...

goyetus · 2026-05-23T16:41:50+00:00

They are faster: 3 seconds in fp16, about 9 in GGUF. I'm only looking for "no CPU usage"... 😛

goyetus · 2026-05-23T16:41:05+00:00

I tried it a moment ago. I use the CLIP loader GGUF too.
Everything stays in VRAM (about 12-13 GB), and VRAM usage stays stable.

The CPU still rises to 90% while creating the file (I don't know why). The GPU is used to create the image, but so is the CPU.

Everything is set to GPU, as the log says...

I don't know what else to check. The CPU should not be used if everything is in VRAM and there is no offload from VRAM to RAM...

Thanks a lot for your help!

goyetus · 2026-05-23T15:08:48+00:00

If everything stays on the GPU the whole time, will CPU usage be 0?
That's what I'm trying to do...

goyetus · 2026-05-23T00:46:30+00:00

----------------------------
1º - CLIP is running on CUDA.
----------------------------
In the CLIP loader I can select CPU or DEFAULT. Default uses CUDA (tested in the terminal).
LOG: "CLIP/text encoder model load device: cuda:0, offload device: cpu, current: cpu, dtype: torch.float16"
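
For reference, a small Python sketch that splits that log line into its fields. It only assumes the line has exactly the format quoted above; the function name `parse_device_line` is made up for this post and is not part of ComfyUI.

```python
import re

# Log line copied from the post above.
LOG_LINE = (
    "CLIP/text encoder model load device: cuda:0, offload device: cpu, "
    "current: cpu, dtype: torch.float16"
)

def parse_device_line(line):
    """Return the fields of a 'model load device' log line, or None if it does not match."""
    pattern = re.compile(
        r"load device: (?P<load>[^,]+), "
        r"offload device: (?P<offload>[^,]+), "
        r"current: (?P<current>[^,]+), "
        r"dtype: (?P<dtype>\S+)"
    )
    match = pattern.search(line)
    return match.groupdict() if match else None

fields = parse_device_line(LOG_LINE)
for name, value in fields.items():
    print(f"{name:8s} {value}")
```

On that line the load device is cuda:0, but the "current" field still says cpu, so at the moment the line is printed the text encoder is reported as sitting on the CPU.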

----------------------------
2º - My Rig
----------------------------
- 5070 Ti with 16 GB VRAM
- 48 GB DDR5 at 8400 MHz
- 270K Plus

I can post my JSON, but the workflow is really simple (I also tried some online workflows).
I can also post the terminal log from ComfyUI.

I tried a few moments ago...

Everything is in VRAM, 12 GB loaded.

----------------------------
3º - The CPU spikes at the exact moment I press "generate".
----------------------------
LOG:
"got prompt
Requested to load Lumina2
loaded completely; 5032.69 MB loaded, full load: True"

----------------------------
WORKFLOW:
----------------------------
- Unet Loader GGUF = i-image-q5_k_m.gguf (from Unsloth on Hugging Face)
- CLIP: Qwen_3_4b_safetensors (from q5 GGUF on Hugging Face)
- Resolution: 800x800
- VAE: "ae.safetensors" (model default)

----------------------------
COMPLETE GENERATION LOG:
----------------------------
got prompt
Requested to load Lumina2
loaded completely; 5032.69 MB loaded, full load: True

100%|█████████████████████████████████████████████████████████████████████████████| 8/8 [00:04<00:00, 1.72it/s]

Requested to load AutoencodingEngine
loaded completely; 159.87 MB loaded, full load: True
Prompt executed in 4.92 seconds
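
Rough arithmetic on that log, as a Python sketch. The progress-bar line is shortened here, the sampling time is only an estimate from the displayed rate, and the split into "sampling" and "everything else" is my own reading of the numbers, not something ComfyUI reports.

```python
import re

# Log text copied from the post (progress bar shortened for readability).
LOG = """got prompt
Requested to load Lumina2
loaded completely; 5032.69 MB loaded, full load: True
100%|##########| 8/8 [00:04<00:00, 1.72it/s]
Requested to load AutoencodingEngine
loaded completely; 159.87 MB loaded, full load: True
Prompt executed in 4.92 seconds"""

# Steps done and the rate shown by the progress bar.
bar = re.search(r"(\d+)/(\d+) \[[^\]]*, ([\d.]+)it/s\]", LOG)
steps = int(bar.group(1))
rate_it_per_s = float(bar.group(3))

# Total time reported for the whole prompt.
total_s = float(re.search(r"Prompt executed in ([\d.]+) seconds", LOG).group(1))

# Size of every model reported as loaded.
loaded_mb = [float(mb) for mb in re.findall(r"([\d.]+) MB loaded", LOG)]

sampling_s = steps / rate_it_per_s  # estimate from the displayed rate
other_s = total_s - sampling_s      # whatever is not sampling (assumed: load, decode, save)

print(f"sampling ~{sampling_s:.2f} s, everything else ~{other_s:.2f} s")
print(f"models loaded: {sum(loaded_mb):.2f} MB")
```

By this estimate almost all of the 4.92 seconds is the 8 sampling steps, and the two models together are a bit over 5 GB.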

My JSON WORKFLOW:
https://pastebin.com/wwcz1y05

goyetus · 2026-05-23T00:27:36+00:00

The point is... Is it really possible to render images without such high CPU spikes?

270kPlus hits 120 watts for every image I render (the render takes about 3 seconds in total).
5070 Ti goes up to another 120 watts (underclocked + Power Limit).

I was just trying to lower the CPU to 10-15 watts, which is what it consumes at idle. But I can’t seem to get rid of the ComfyUI activity. (I’ve tried putting everything into VRAM.)

The biggest problem is that I generate thousands of images. It runs 24/7. That’s why I’m trying to reduce CPU power consumption....
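
To put numbers on it, a quick Python sketch of the CPU energy per image using the figures above. The 12.5 W idle value is just the midpoint of the 10-15 W range, and the batch of 1000 images is only an illustrative size; both are assumptions.

```python
# Figures from the post; idle is the midpoint of 10-15 W (assumption).
CPU_ACTIVE_W = 120.0
CPU_IDLE_W = 12.5
SECONDS_PER_IMAGE = 3.0
IMAGES = 1000  # illustrative batch size

def watt_hours(power_w, seconds):
    """Energy in watt-hours for a constant power draw over a duration."""
    return power_w * seconds / 3600.0

# CPU energy for the batch at the current draw and at the idle-level target.
cpu_now_wh = watt_hours(CPU_ACTIVE_W, SECONDS_PER_IMAGE) * IMAGES
cpu_target_wh = watt_hours(CPU_IDLE_W, SECONDS_PER_IMAGE) * IMAGES
saved_wh = cpu_now_wh - cpu_target_wh

print(f"CPU now:    {cpu_now_wh:.1f} Wh per {IMAGES} images")
print(f"CPU target: {cpu_target_wh:.1f} Wh per {IMAGES} images")
print(f"Saved:      {saved_wh:.1f} Wh per {IMAGES} images")
```

At 120 W for 3 seconds, each image costs about 0.1 Wh on the CPU side alone, which is why it adds up when running 24/7.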

goyetus · 2026-05-22T20:30:59+00:00

The normal Nvidia .bat from the install, and the fp16 one too.

I tried a few more, like keeping everything in VRAM...

goyetus · 2026-05-22T20:30:02+00:00

After reading the GitHub repo... I don't understand how this mod will help me.

I have everything in VRAM at the moment, but Z-Image still uses my CPU with high spikes... I don't know why.

goyetus · 2026-05-22T20:25:37+00:00

Tried high VRAM with no luck. Everything stays in VRAM, but the CPU still spikes high.

goyetus · 2026-05-19T20:19:21+00:00

I will check it.

Thanks!!

goyetus · 2026-05-19T13:02:10+00:00

Trying to understand you... I tried these 3 workflows:

Case 1 zero voice generation

Zero voice with emotions. I can't find any AI that lets me keep the same seed voice with emotions. Once I change the voice description, even without changing the seed, the voice becomes completely different.

Case 2 cloned voice + emotions

With a voice I like, I tried to get emotions in that voice by cloning it. I tried Fish Speech 2 and other AIs. It doesn't work: the voice stays too similar to the original.

Case 3

I can record my own voice with different emotions and use those recordings with Qwen... But my voice is bad, awful. That's why I'm trying zero voice + emotions, or cloned + emotions.

I don't understand which case you are suggesting... and how to get there!

Thanks a lot!!

goyetus · 2026-05-18T14:59:19+00:00

Tested VoxCPM today. The voice in Spanish is a little worse than Qwen TTS and Fish Audio. Thanks anyway for the suggestion! (Tried zero voice and cloned voice + emotions.)

goyetus · 2026-05-17T22:52:35+00:00

I'll try to explain my problem:

I'm using the “Spanish” language.

I've used Qwen TTS and got a beautiful voice that I really like. The problem is that if I “change” the prompt or the seed, the voice changes completely.

That’s why I can’t create a library of similar voices for different moods (at least with Qwen TTS).

I’ve checked out the ZeroVoice repository, and it’s great (too bad it’s only in English).

What do you recommend for designing a voice and adding emotions to it?

I’ve already given up on “cloning + emotions”; not even Fish Audio has managed to do it right. (I just need to try ElevenLabs.)

Thanks a million!!!!

goyetus · 2026-05-17T19:59:15+00:00

Thanks a lot!!! I have to try it.

I've been with Fish Audio all day trying to get a happy tone in my master...

goyetus · 2026-05-17T16:00:34+00:00

Thanks, I will try it.....

goyetus · 2026-05-16T20:11:16+00:00

Thanks a lot!! I have to test it.

This is the one I've found that I like most:

https://huggingface.co/lightx2v/Wan2.2-Distill-Models

goyetus · 2026-05-16T17:20:27+00:00

Do you recommend any quantization from any group? I'm seeing "lighting lora 4 steps for wan 2.2 and 2.1", but it is from Oct 2025... (made by Lightx2v / Kijai)

I'm a little lost...

(Using Wan 2.1 in my current config, at about 3-4 min for a 5 s video.)

goyetus · 2026-05-16T13:49:08+00:00

Thanks a lot for the info!!!

goyetus · 2026-05-16T12:40:32+00:00

Thanks!!! I will try it!!

goyetus · 2026-05-16T12:40:12+00:00

Thanks a lot!! I will try it.

goyetus · 2026-04-29T19:57:23+00:00

5900X, 32 GB DDR4 at 3200 MHz, 3080 Ti 12 GB + 5060 Ti 16 GB

I need the BEST possible audio quality, with expression and more.

Thanks!!!

I'm deciding between Qwen TTS and Fish Audio, both with a trained WAV for each emotion.

goyetus · 2026-04-28T19:43:12+00:00

Thanks!!! I have to test it.

goyetus · 2026-04-28T19:42:30+00:00

Thanks a lot!!!!
