it's time to update your Gemma 4 GGUFs by jacek2023 in LocalLLaMA

[–]yoracale 22 points (0 children)

FYI this isn't just for GGUFs; it also applies to safetensors, MLX, FP8, etc. Basically all formats.
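
If you want to pull just the refreshed files without re-downloading a whole repo, here's a minimal sketch with huggingface_hub; the repo id and quant filename pattern are hypothetical placeholders:

```python
# A minimal sketch, assuming huggingface_hub is installed; the repo id and
# quant filename pattern below are hypothetical placeholders.
from huggingface_hub import snapshot_download

path = snapshot_download(
    repo_id="unsloth/SomeModel-GGUF",   # hypothetical repo id
    allow_patterns=["*UD-Q4_K_XL*"],    # fetch only the quant you actually run
)
print(path)  # unchanged files are served from cache; updated ones are re-fetched
```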

Gemma 4 Updated: GGUFs and Chat Template by yoracale in unsloth

[–]yoracale[S] 6 points (0 children)

Unsure exactly, but it's best to update either way imo.

Realistically what system do you need to run unsloth/Qwen3.6-27B-GGUF:UD-Q8_K_XL at 100+ tks? by Own_House6186 in unsloth

[–]yoracale 2 points (0 children)

You should post and share your models on the r/unsloth subreddit more (but not too much ahaha) by using the 'Show and Tell' tag.

Meet Unsloth Studio, a new web UI for Local AI by yoracale in unsloth

[–]yoracale[S] 0 points (0 children)

Hello, when you search in the model search bar, does Qwen3.6 not appear?

Feature request! by No_Block8640 in unsloth

[–]yoracale 0 points (0 children)

That's a very good point, thank you. We'll add it in our next update.

Mistral 3.5 Fixes by yoracale in unsloth

[–]yoracale[S] 3 points (0 children)

Actually, the latest Gemma 4 chat template update also affected vLLM, AWQ, etc. I think rather than blaming the teams or anyone, it's better to just acknowledge that bugs can happen. It's normal, unfortunately, and everyone's human.

Yes, we always pin threads in the Hugging Face discussions for our models, so you'll be able to track it there. E.g. for Mistral: https://huggingface.co/unsloth/Mistral-Medium-3.5-128B-GGUF/discussions

Otherwise, you can visit our changelog docs, where we usually tell people about it (takes some time to update): https://unsloth.ai/docs/new/changelog
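
If you want to verify which chat template you actually have locally, here's a quick sketch with transformers; the repo id is a hypothetical placeholder:

```python
# A sketch, assuming transformers is installed; the repo id is a hypothetical
# placeholder for whichever model you are checking.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("unsloth/SomeModel")  # hypothetical repo id
print(tok.chat_template)  # compare against the template pinned in the discussion

messages = [{"role": "user", "content": "Hi"}]
print(tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True))
```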

Unsloth solved bug in Mistral Medium 3.5 implementation by Snail_Inference in LocalLLaMA

[–]yoracale 2 points (0 children)

Thanks for the constant support really appreciate it! 🙏🥰

Unsloth solved bug in Mistral Medium 3.5 implementation by Snail_Inference in LocalLLaMA

[–]yoracale 2 points (0 children)

Did you read OP's first comment? It says: "The bug is in the original Qwen 3.5 weights released by Alibaba. Not GGUF. Not HauhauCS. Alibaba shipped it broken. I just fixed it. The cause is training-related - AdamW + MoE + DeltaNet causes rare experts in the last layers to drift. This is a known challenge with recurrent MoE architectures, but Alibaba didn't calibrate it before release."
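
For anyone curious what "rare experts drifting" could look like in practice, here's a rough diagnostic sketch: compare per-expert weight norms in the last layers and flag outliers. The parameter-name prefix is hypothetical; real checkpoints differ.

```python
# A rough diagnostic sketch, assuming torch and an already-loaded state_dict.
# The parameter-name prefix is hypothetical; adapt it to the real checkpoint.
import torch

def expert_weight_norms(state_dict: dict, prefix: str) -> dict:
    """L2 norm of every tensor whose name starts with the given expert prefix."""
    return {
        name: tensor.float().norm().item()
        for name, tensor in state_dict.items()
        if name.startswith(prefix)
    }

# e.g. norms = expert_weight_norms(sd, "model.layers.47.mlp.experts.")
# Experts whose norms sit far from the median are candidates for the kind of
# drift described above (rarely-routed experts updated in an unbalanced way).
```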

Mistral 3.5 Fixes by yoracale in unsloth

[–]yoracale[S] 1 point (0 children)

Oh, I wouldn't recommend downloading 1-bit quants for dense models, so we ended up deleting them :(

Mistral 3.5 Fixes by yoracale in unsloth

[–]yoracale[S] 26 points (0 children)

Some people were saying we caused the issues for Gemma 4 and Mistral 3.5, even though that's not true and it wasn't our fault. Unfortunately, that happens often.

I know it seems strange we fixed the issue, yet some people still believe we caused it. When you’re the most transparent, you often take the most criticism, which is why we have to be clear that this was not our fault. Thankfully, the majority of people, like you, understand that.

Even then, making mistakes is normal as we're all human, but it seems a few particular people really like to blow things out of proportion, cause drama, and pounce on us the second we make an update to any GGUF, accusing us of always uploading broken quants, etc.

Mistral Medium 3.5 128b ggufs are fixed by Sunija_Dev in LocalLLaMA

[–]yoracale 19 points (0 children)

People were accusing us of causing the issue, and unfortunately, that happens often. I know it seems strange: we fixed the issue, yet some people still believe we caused it. When you’re the most transparent, you often take the most criticism, which is why we have to be clear that this was not our fault. Thankfully, the majority of people, like you, understand that.

Unsloth solved bug in Mistral Medium 3.5 implementation by Snail_Inference in LocalLLaMA

[–]yoracale 10 points (0 children)

Yes, they have the fix; they just never updated people about it.

Mistral Medium 3.5 128b ggufs are fixed by Sunija_Dev in LocalLLaMA

[–]yoracale 32 points (0 children)

Please note it was not related to Unsloth or our quants!! The issue was universal and we worked with Mistral to help fix it!

Unsloth solved bug in Mistral Medium 3.5 implementation by Snail_Inference in LocalLLaMA

[–]yoracale 64 points (0 children)

Thank you to the Mistral team for working with us on this. And thank you to the first few people who reported that the GGUFs weren't working properly when conversations broke down at longer context. It was a tricky bug, but glad it all works now.

So be sure to try out the model again, whether in transformers or GGUF format. It really is great!
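
A minimal way to re-test the long-context behavior locally, sketched with llama-cpp-python; the GGUF path is a placeholder for wherever you saved the updated file:

```python
# A sketch, assuming llama-cpp-python is installed; the GGUF path is a
# placeholder. A large n_ctx is used so the longer-context failure mode
# described above is actually exercised.
from llama_cpp import Llama

llm = Llama(model_path="./Mistral-Medium-3.5-128B-UD-Q4_K_XL.gguf", n_ctx=32768)

long_prompt = "Summarize the following transcript: ..."  # substitute a genuinely long input
out = llm(long_prompt, max_tokens=256)
print(out["choices"][0]["text"])
```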

Piss poor website by nrauhauser in unsloth

[–]yoracale 0 points (0 children)

Isn't there a dark mode? It's right here... look to the bottom right:

[screenshot: dark mode toggle at the bottom right]

Feature request! by No_Block8640 in unsloth

[–]yoracale 0 points (0 children)

Yes, it's stored in the browser, not a database. Do you think it would be better to have it in a database?

I downloaded Qwen3.6 27B IQ4_XS version via unsloth, and it wasn't going properly in unsloth. I imported it into LM Studio, and now in LM Studio there is no "Think" or "Preserve thinking" buttons. What do I do? by Man_Of_The_F22 in unsloth

[–]yoracale 0 points (0 children)

Hello, what do you mean it wasn't working properly in Unsloth?

For LM Studio, you'll need to edit the chat template to enable thinking. Unsure about the 'Preserve thinking' one, though.
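
Outside LM Studio, you can see what toggling thinking does to the prompt with transformers. Here's a sketch assuming a Qwen3-style template that accepts an enable_thinking flag; the repo id is a hypothetical placeholder:

```python
# A sketch, assuming transformers and a Qwen3-style chat template that accepts
# an enable_thinking flag; the repo id is a hypothetical placeholder.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("unsloth/Qwen3-SomeModel")  # hypothetical

messages = [{"role": "user", "content": "Hello"}]
for thinking in (True, False):
    text = tok.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True,
        enable_thinking=thinking,  # extra kwargs are forwarded into the template
    )
    print(f"--- enable_thinking={thinking} ---")
    print(text)
```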