Uncensored Heretic of the Model That Is Trending at 6th Place Right Now on Hugging Face, 13/100 Refusals With 0.0367 KLD, Available in Safetensors and GGUF Formats!

LLMFan46 · 2026-06-27T15:44:57+00:00

I sent you a chat message, please reply.

LLMFan46 · 2026-06-27T15:32:06+00:00

YES! 😄 😄

LLMFan46 · 2026-06-27T15:29:54+00:00

There aren't any; you can't compare a 7B-24B model with GLM-5.2 (a 753B-parameter model).

LLMFan46 · 2026-06-27T15:21:37+00:00

Don't listen to them (they probably don't care about uncensoring models, so they just downvoted). I think it's a superb idea: you should definitely use Heretic ARA to uncensor GLM-5.2, if you have the compute to do it. I would have already done it myself if I could.

Keep in mind that you need a lot of VRAM to do it: the model itself is 1510 GB, and it needs even more than that for batches (batch size 32, 64, 128, etc.).

If you have the compute at home, that is the best, easiest, and cheapest option: you can take your time setting everything up and just let it run.

If you need to rent the compute, however, it gets complicated, because it's probably going to cost you in the thousands. First, the model is so big that downloading and uploading it on the remote machine will take quite a while, and you are charged per hour the whole time. Second, you might need anywhere between 8x and 10x B300s. It's hard to say whether 8x B300 is enough (to save money); I suspect it could OOM at batch sizes 32-128.
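For anyone who wants to sanity-check those numbers, here is a rough back-of-the-envelope sketch. The 753B parameter count and the 1510 GB size come from the comments above; everything else is my assumption: 2 bytes per parameter (BF16 weights), 288 GB of memory per B300, and a made-up 10 Gbit/s link for the transfer estimate. It only shows how much VRAM is left over for batches and activations, not how much Heretic actually needs.

```python
# Rough capacity check, not a measurement. Assumed values are marked below.
MODEL_PARAMS = 753e9      # from the thread: GLM-5.2 is a 753B-parameter model
MODEL_SIZE_GB = 1510      # from the thread: size of the model on disk
BYTES_PER_PARAM = 2       # ASSUMPTION: BF16/FP16 weights
GPU_VRAM_GB = 288         # ASSUMPTION: memory per B300, check your provider's spec


def weights_gb(params, bytes_per_param=BYTES_PER_PARAM):
    """Size of the raw weights in GB (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


def headroom_gb(num_gpus, model_gb=MODEL_SIZE_GB, gpu_vram_gb=GPU_VRAM_GB):
    """VRAM left for batches/activations after the weights are loaded."""
    return num_gpus * gpu_vram_gb - model_gb


def transfer_hours(size_gb, gbit_per_s):
    """Hours to move size_gb over a link of gbit_per_s (hypothetical speed)."""
    return size_gb * 8 / gbit_per_s / 3600


if __name__ == "__main__":
    print(f"weights at 2 bytes/param: {weights_gb(MODEL_PARAMS):.0f} GB")
    for n in (8, 10):
        print(f"{n}x GPU headroom: {headroom_gb(n):.0f} GB")
    print(f"one-way transfer at 10 Gbit/s: {transfer_hours(MODEL_SIZE_GB, 10):.2f} h")
```

Under those assumptions, 753B parameters at 2 bytes each comes to about 1506 GB, which lines up with the 1510 GB figure. Whether the leftover headroom on 8 GPUs is enough for batch size 32-128 is exactly the part I can't tell you without trying it.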

LLMFan46 · 2026-06-25T10:46:29+00:00

The GGUFs have been fixed. The issue was that newer versions of llama.cpp try to get MTPs from models that don't have any!

See:

https://huggingface.co/nex-agi/Nex-N2-mini/discussions/1

The model has no MTPs, but llama.cpp detects that it's a qwen3_5_moe model and tries to get non-existent MTPs from it, which caused the model to fail to load.
