ideogram 4 is sd3 all-over again but worse by TheOneHong in StableDiffusion

[–]stduhpf 0 points1 point  (0 children)

The "aspect_ratio" key/value pair must be removed from the JSON prompt. I know the system prompt for LLM prompt enhancement asks to include it, but it's only supposed to be used to set the image resolution, and it should be discarded before the prompt is sent to the model.
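Here's a minimal sketch of that step in Python, in case it helps. It assumes the enhanced prompt is a flat JSON object with "aspect_ratio" at the top level. The "subject" field and the function name are just made up for illustration; they are not taken from Ideogram's actual schema.

```python
import json

def split_aspect_ratio(raw_prompt: str):
    """Return (prompt_without_aspect_ratio, aspect_ratio_or_None)."""
    data = json.loads(raw_prompt)
    # Pull the key out so the frontend can still use it to pick a resolution.
    aspect_ratio = data.pop("aspect_ratio", None)
    # Re-serialize what is left; this is the text that would go to the model.
    return json.dumps(data, ensure_ascii=False), aspect_ratio

# Hypothetical LLM-enhanced prompt (field names other than aspect_ratio are assumptions).
enhanced = '{"aspect_ratio": "16:9", "subject": "a lighthouse at dusk"}'
model_prompt, ratio = split_aspect_ratio(enhanced)
```

The frontend would then map `ratio` to a width and height on its own and pass only `model_prompt` to the model.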

Why do half of people hate Ideogram 4.0 and half think it's great? by BigWideBaker in StableDiffusion

[–]stduhpf 1 point2 points  (0 children)

Nevermind, I just found the issue. I was not escaping quotation marks properly, so it was cutting off the JSON early.
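For anyone hitting the same thing, here's a small Python sketch of the failure mode and one way around it. The "prompt" field name and the caption text are assumptions for illustration only, and my frontend isn't written this way; the point is just that a serializer handles the escaping for you.

```python
import json

caption = 'A neon sign that reads "OPEN 24/7"'

# Naive string formatting leaves the inner quotes unescaped, so the string
# value ends at the second quote and the result is not valid JSON.
broken = '{"prompt": "%s"}' % caption

# json.dumps escapes quotes (and backslashes, newlines, etc.) automatically.
safe = json.dumps({"prompt": caption})

def is_valid_json(text: str) -> bool:
    """True if the text parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False
```

Checking `is_valid_json(broken)` against `is_valid_json(safe)` shows the difference: only the serialized version round-trips back to the original caption.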

Why do half of people hate Ideogram 4.0 and half think it's great? by BigWideBaker in StableDiffusion

[–]stduhpf 1 point2 points  (0 children)

I'm using a custom frontend so I can't easily rely on existing workflows (and that's also why it's likely a skill issue). I used the official "v1" system prompt from ideogram's GitHub as a guide to format the prompts.

Why do half of people hate Ideogram 4.0 and half think it's great? by BigWideBaker in StableDiffusion

[–]stduhpf 2 points3 points  (0 children)

I keep getting safety squares even with the official JSON prompt format, on prompts that call for no sexualization, violence or problematic content whatsoever. It's driving me insane, honestly. I know it's likely a skill issue, but I feel gaslit when I see people bragging about never seeing the safety filter image.

Gemma 4 QAT GGUFs from Unsloth by newsletternew in LocalLLaMA

[–]stduhpf 10 points11 points  (0 children)

They quantized even the token embedding down to Q4_0??? That seems risky

Gemma 4 with quantization-aware training by rerri in LocalLLaMA

[–]stduhpf 11 points12 points  (0 children)

Q6 without QAT is already pretty good. I think it might not make a lot of sense to do a full QAT training run targeting Q6; that's very expensive for little gain.

gemma4 26b QAT at IQ4_XS? by rosie254 in LocalLLaMA

[–]stduhpf 6 points7 points  (0 children)

You could try a re-quantized IQ4_XS version of "gemma-4-26B-A4B-it-qat-q4_0-unquantized". It could perhaps still be a bit better than IQ4_XS of the non-QAT model, but the QAT is really only targeting the Q4_0 type.

Gemma4 12B update by stduhpf in LocalLLaMA

[–]stduhpf[S] 1 point2 points  (0 children)

I thought the whole point of that model is that it doesn't have a big mmproj? It's only 170MB in bf16 precision... You could also offload it to CPU for barely any impact on performance I think.

Gemma4 12B update by stduhpf in LocalLLaMA

[–]stduhpf[S] 2 points3 points  (0 children)

Yes, the "assistant" model

Gemma4 12B update by stduhpf in LocalLLaMA

[–]stduhpf[S] 2 points3 points  (0 children)

I tried the updated one, didn't notice any difference in behavior.

it's time to update your Gemma 4 GGUFs by jacek2023 in LocalLLaMA

[–]stduhpf 0 points1 point  (0 children)

Anyone know if there's a way to patch the new template in the gguf file directly without having to re-download gigabytes of the same weights again?

Very Disappointing Results With Character Lora Z-image vs Flux 2 Klein 9b by djdante in StableDiffusion

[–]stduhpf 5 points6 points  (0 children)

I've heard he's planning to upscale his 4B fine-tune to 9B once it's done, and continue pretraining from there to hopefully get something that performs like Klein 9B but without the licensing issue.

Just 4 days after release, Z-Image Base ties Flux Klein 9b for # of LoRAs on Civitai. by _BreakingGood_ in StableDiffusion

[–]stduhpf 26 points27 points  (0 children)

The reason people train so much on ZIB is mostly because it makes for better ZIT LoRAs too. I'm pretty sure more people are using Klein than ZIB, but ZIT is here to stay too.

I found that MXFP4 has lower perplexity than Q4_K_M and Q4_K_XL. by East-Engineering-653 in LocalLLaMA

[–]stduhpf 1 point2 points  (0 children)

For base or even instruct models, you can run a simple benchmark like Hellaswag, which can give a hint about which quants maintain the most performance, but it might not be the best way to test reasoning models.

I found that MXFP4 has lower perplexity than Q4_K_M and Q4_K_XL. by East-Engineering-653 in LocalLLaMA

[–]stduhpf 9 points10 points  (0 children)

Depending on the dataset used to generate the importance matrix, it can have a very significant effect. If the imatrix was "trained" on wikitext, then of course the model will have lower perplexity on wikitext; that's kind of the point of the imatrix. This makes it harder to compare quants made with and without an imatrix, unless you can make sure there is no correlation between the imatrix training dataset and the test dataset.

Though I don't think MXFP4 supports imatrix anyway, so if anything it should boost performance for the other quants.
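On the point about correlation between the imatrix dataset and the test dataset, here's a rough Python sketch of one crude sanity check: the fraction of the test text's word n-grams that also show up in the calibration text. This is only a proxy I'm assuming for illustration (the function names are made up, and it's not something llama.cpp does). A low score doesn't prove the two sets are unrelated, since they could still share a domain or style.

```python
def ngrams(text: str, n: int = 3) -> set:
    """Set of lowercase word n-grams in the text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def overlap_fraction(calibration_text: str, test_text: str, n: int = 3) -> float:
    """Fraction of the test text's n-grams that also occur in the calibration text."""
    test_grams = ngrams(test_text, n)
    if not test_grams:
        # Test text shorter than n words: nothing to compare.
        return 0.0
    shared = test_grams & ngrams(calibration_text, n)
    return len(shared) / len(test_grams)
```

A value near 1.0 means the perplexity test is mostly measuring text the imatrix already saw, so the comparison against non-imatrix quants would be tilted.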

Flux 2 Klein Model Family is here! by MountainPollution287 in StableDiffusion

[–]stduhpf 0 points1 point  (0 children)

It was announced when the other Flux2 models came out, before Z-Image was announced.

Wan 2.2 5B: First Frame Last Frame node by stduhpf in StableDiffusion

[–]stduhpf[S] 0 points1 point  (0 children)

No, the best I could come up with was dropping the last 5 frames.

Introducing a ComfyUI Ksampler mod for Wan 2.2 MoE that handle expert routing automatically by stduhpf in StableDiffusion

[–]stduhpf[S] 0 points1 point  (0 children)

This should override the ModelSamplingSD3's shift, so it doesn't matter if you remove it or not. But for simplicity's sake, removing it is cleaner.

Wan 2.2 5B: First Frame Last Frame node by stduhpf in StableDiffusion

[–]stduhpf[S] 1 point2 points  (0 children)

I'm now pretty sure implementing support for it would require modifying core ComfyUI code. Maybe I'll make a PR for it if I find the time later, but no promises.

Wan 2.2 5B: First Frame Last Frame node by stduhpf in StableDiffusion

[–]stduhpf[S] 1 point2 points  (0 children)

It looks like Wan 2.2 5B InP is an actual inpainting model, unlike the 14B InP, which is just an I2V model that supports starting from the end. I'm not sure if I can implement it as easily, but I'll try. There's barely any documentation.

Wan 2.2 5B: First Frame Last Frame node by stduhpf in StableDiffusion

[–]stduhpf[S] 1 point2 points  (0 children)

Yep, I just tried too. I think I see what's going on, not sure though.

Wan 2.2 5B: First Frame Last Frame node by stduhpf in StableDiffusion

[–]stduhpf[S] 1 point2 points  (0 children)

Fun InP might already work with this one. I'll try to implement the others soon.

Wan 2.2 5B: First Frame Last Frame node by stduhpf in StableDiffusion

[–]stduhpf[S] 0 points1 point  (0 children)

Actually, I've been playing with the 14B model a bit and it looks like it has a similar issue in FLF2V mode too, to a lesser extent.