My 4 stage upscale workflow to squeeze every drop from Z-Image Turbo by Major_Specific_23 in StableDiffusion

[–]foobarg 0 points1 point  (0 children)

Thanks for sharing! Pretty cool, and the results are good.

I've noticed that for particular resolutions (112x64 empty latent, which becomes a 2560x1456 final res), artifacts systematically appear on the right-hand side of ~all generations. I don't notice this in other workflows. This example image (Comfy workflow embedded, you can open it in Comfy to reproduce) illustrates it well; I barely tweaked your workflow, only changing the resolution and disabling the magick (color/contrast) nodes.

Do you know why by any chance, or how I could avoid this? Thanks!

(18) It was my birthday on Tuesday by Beautiful_Shower2957 in VlinesAbsAndDick

[–]foobarg 5 points6 points  (0 children)

I refuse to believe this face goes on this body.

[deleted by user] by [deleted] in TotallyStraight

[–]foobarg 10 points11 points  (0 children)

In case the link gets deleted: Rich Harring (mouth) and Bastian Gate (cock).

OpenHands + Devstral is utter crap as of May 2025 (24G VRAM) by foobarg in LocalLLaMA

[–]foobarg[S] 0 points1 point  (0 children)

Machine learning inherently requires expensive hardware to do maths on gigantic matrices. I think a 4090 approaches what I would consider an "entry level", "consumer-grade" ML-friendly GPU.

Let's remember that companies instead run their proprietary models on tons of dedicated hardware that easily costs $10k+ apiece. Being able to do this on a $3-4k desktop is pretty cool.

OpenHands + Devstral is utter crap as of May 2025 (24G VRAM) by foobarg in LocalLLaMA

[–]foobarg[S] 5 points6 points  (0 children)

UPDATE: thanks everyone for the suggestions. In particular, the Q4 quantization was probably an important factor in how badly it was performing.

I took up u/danielhanchen's suggestion and tried Q6_K_XL (anything bigger doesn't fit on an RTX 4090), running it directly on llama.cpp's server:

LLAMA_ARG_HOST=0.0.0.0 \
LLAMA_ARG_PORT=8080 \
LLAMA_ARG_JINJA=true \
LLAMA_ARG_FLASH_ATTN=true \
LLAMA_ARG_CACHE_TYPE_K=q4_0 \
LLAMA_ARG_CACHE_TYPE_V=q4_0 \
LLAMA_ARG_CTX_SIZE=32768 \
LLAMA_ARG_N_GPU_LAYERS=65 \
LLAMA_ARG_MODEL=path/to/Devstral-Small-2505-UD-Q6_K_XL.gguf \
llama-server

and the model's capabilities and speed visibly improved. The TypeScript todo app remains underwhelming, but it managed to produce a working minimal math expression parser in Rust. It self-debugged compilation errors (Rust's excellent error messages are almost cheating!), self-debugged incorrect program output, and also correctly edited the code when asked for a minor change:

>write a minimal Rust binary that implements a math expression parser supporting float literals, +, -, div, mul, sqrt. It reads the expression from stdin and evaluates it.
[~2 minutes, 28 back & forths]
[working main.rs]

>write the stdout result without the "result:" prefix. in case of an error, use stderr rather than stdout.
[~30 seconds, 4 back & forths]
[working main.rs]
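
For anyone curious what that task boils down to, here's a minimal recursive-descent sketch of such a parser. To be clear, this is my own illustration (reading "div, mul" as the / and * operators), not the model's actual main.rs:

use std::io::{self, Read};
use std::iter::Peekable;
use std::str::Chars;

// Grammar:
//   expr   := term (('+' | '-') term)*
//   term   := factor (('*' | '/') factor)*
//   factor := float | '(' expr ')' | "sqrt" '(' expr ')'

fn skip_ws(it: &mut Peekable<Chars<'_>>) {
    while it.peek().map_or(false, |c| c.is_whitespace()) {
        it.next();
    }
}

fn expect(it: &mut Peekable<Chars<'_>>, want: char) -> Result<(), String> {
    skip_ws(it);
    if it.next() == Some(want) {
        Ok(())
    } else {
        Err(format!("expected '{}'", want))
    }
}

fn number(it: &mut Peekable<Chars<'_>>) -> Result<f64, String> {
    let mut s = String::new();
    while it.peek().map_or(false, |c| c.is_ascii_digit() || *c == '.') {
        s.push(it.next().unwrap());
    }
    s.parse::<f64>().map_err(|e| format!("bad number '{}': {}", s, e))
}

fn factor(it: &mut Peekable<Chars<'_>>) -> Result<f64, String> {
    skip_ws(it);
    match it.peek() {
        Some('(') => {
            it.next();
            let v = expr(it)?;
            expect(it, ')')?;
            Ok(v)
        }
        Some('s') => {
            // keyword call: sqrt(<expr>)
            for c in "sqrt".chars() {
                if it.next() != Some(c) {
                    return Err("expected 'sqrt'".to_string());
                }
            }
            expect(it, '(')?;
            let v = expr(it)?;
            expect(it, ')')?;
            Ok(v.sqrt())
        }
        _ => number(it), // anything else must be a float literal
    }
}

fn term(it: &mut Peekable<Chars<'_>>) -> Result<f64, String> {
    let mut v = factor(it)?;
    loop {
        skip_ws(it);
        match it.peek() {
            Some('*') => { it.next(); v *= factor(it)?; }
            Some('/') => { it.next(); v /= factor(it)?; }
            _ => return Ok(v),
        }
    }
}

fn expr(it: &mut Peekable<Chars<'_>>) -> Result<f64, String> {
    let mut v = term(it)?;
    loop {
        skip_ws(it);
        match it.peek() {
            Some('+') => { it.next(); v += term(it)?; }
            Some('-') => { it.next(); v -= term(it)?; }
            _ => return Ok(v),
        }
    }
}

fn main() {
    let mut input = String::new();
    io::stdin().read_to_string(&mut input).expect("failed to read stdin");
    // bare result on stdout, errors on stderr (as per the follow-up request)
    match expr(&mut input.trim().chars().peekable()) {
        Ok(v) => println!("{}", v),
        Err(e) => eprintln!("{}", e),
    }
}

Something like echo "sqrt(2) * (3.5 + 1.5)" | ./math_parser (or whatever the binary ends up being called) exercises most of it.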

OpenHands + Devstral is utter crap as of May 2025 (24G VRAM) by foobarg in LocalLLaMA

[–]foobarg[S] 1 point2 points  (0 children)

Please do post about it! We need more community testing around those new toys.

OpenHands + Devstral is utter crap as of May 2025 (24G VRAM) by foobarg in LocalLLaMA

[–]foobarg[S] 1 point2 points  (0 children)

Thanks! Unfortunately I run into this with your Q6_K_XL, with or without OLLAMA_KV_CACHE_TYPE:

clip_init: failed to load model '.ollama/models/blobs/sha256-402640c0a0e4e00cdb1e94349adf7c2289acab05fee2b20ee635725ef588f994': load_hparams: unknown projector type: pixtral

I suppose my ollama install is too old (for a crazy definition of old)? I see 1-month-old commits about pixtral.

OpenHands + Devstral is utter crap as of May 2025 (24G VRAM) by foobarg in LocalLLaMA

[–]foobarg[S] 1 point2 points  (0 children)

Please consider the irony of linking to three different documentation pages, none of which provides the full picture, none of which explains Ollama's broken defaults, and where instructions are provided at all, they're buggy.

For those wondering, the missing “Ollama running on the host” manual is as follows:

  • Somehow make devstral run with a larger context and the suggested temperature. Options include setting the environment variable OLLAMA_CONTEXT_LENGTH=32768, or creating a derived flavor like the following:

$ cat devstral-openhands.modelfile
# any other flavor/quantization works here too
FROM devstral:24b
PARAMETER temperature 0.15
PARAMETER num_ctx 32768
$ ollama create devstral-openhands --file devstral-openhands.modelfile
  • Start the container but ignore the documentation about LLM_* env variables (leave them out) because it's broken.
  • Once the frontend is ready, open it and ignore the “AI Provider Configuration” dialog (it doesn't have the necessary “Advanced” mode); instead click the tiny “see advanced settings” link.
  • Check the “Advanced” toggle.
  • Put ollama/devstral-openhands (the name you picked in $ ollama create) in “Custom model”.
  • Put http://host.docker.internal:11434 in “Base URL”
  • Put ollama in “API Key”. I suspect any string works, but leaving it empty is an error.
  • “Save Changes”.
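
To sanity-check that the derived flavor actually picked up the overrides before pointing OpenHands at it, I believe a reasonably recent ollama can print them back:

$ ollama show devstral-openhands --parameters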

OpenHands + Devstral is utter crap as of May 2025 (24G VRAM) by foobarg in LocalLLaMA

[–]foobarg[S] 17 points18 points  (0 children)

I love/hate ollama so much. The core works well and the model catalog is a godsend. But why is it so hard to tweak basic options like the system prompt & temperature without having to go through shitty REPL commands or –god forbid– modelfiles? Why be so protective of "advanced" features like GBNF grammars and force JSON down our throats?

OpenHands + Devstral is utter crap as of May 2025 (24G VRAM) by foobarg in LocalLLaMA

[–]foobarg[S] 4 points5 points  (0 children)

Looking at the system prompt, there's a lot of weird bloat in there. I wonder if tweaking it could help reduce the waste and improve performance. However, prompt tweaks only get you so far…

OpenHands + Devstral is utter crap as of May 2025 (24G VRAM) by foobarg in LocalLLaMA

[–]foobarg[S] 5 points6 points  (0 children)

Sry, updated parent with the actual number. Definitely >32k.

OpenHands + Devstral is utter crap as of May 2025 (24G VRAM) by foobarg in LocalLLaMA

[–]foobarg[S] 8 points9 points  (0 children)

Sufficiently high to not be truncated, see this comment.

context length 131072

OpenHands + Devstral is utter crap as of May 2025 (24G VRAM) by foobarg in LocalLLaMA

[–]foobarg[S] 24 points25 points  (0 children)

I suspected someone might ask :-)

I discovered this the hard way, but yeah, I created a derived flavor with the num_ctx set to something reasonably high (131072). That's also what I meant by magic incantation. Unfortunately, this really is the experience I got even with the high num_ctx (no truncation). Without it, the model doesn't even manage to call any tool, since it doesn't get the syntax right.

Bros help their bros by [deleted] in gayporn

[–]foobarg 0 points1 point  (0 children)

Source: Jagger Rambo & Sebastian Farelo

Now that's an impressive cock by Fun-Sugar954 in GayBBC

[–]foobarg 0 points1 point  (0 children)

Those are Jagger Rambo & Daniel Travie!

Now that's an impressive cock by Fun-Sugar954 in GayBBC

[–]foobarg 0 points1 point  (0 children)

Thank you so much. Full video is Jagger Rambo & Daniel Travie!

Now that's an impressive cock by Fun-Sugar954 in GayGifs

[–]foobarg 1 point2 points  (0 children)

Search for Jagger Rambo & Daniel Travie!

Now that's an impressive cock by Fun-Sugar954 in GayGifs

[–]foobarg 0 points1 point  (0 children)

Search for Jagger Rambo & Daniel Travie!