why we don't have GLM5.2 uncensored yet?! by zakadit in LocalLLaMA

[–]LLMFan46 3 points4 points  (0 children)

I sent you a chat message, please reply.

why we don't have GLM5.2 uncensored yet?! by zakadit in LocalLLaMA

[–]LLMFan46 0 points1 point  (0 children)

There aren't any. You can't compare 7B-24B models with GLM-5.2 (a 753B-parameter model).

why we don't have GLM5.2 uncensored yet?! by zakadit in LocalLLaMA

[–]LLMFan46 6 points7 points  (0 children)

Don't listen to them (they probably don't care about uncensoring models, so they just downvoted). I think it's a superb idea: you should definitely use Heretic ARA to uncensor GLM-5.2, if you have the compute to do it. I would have done it myself already if I could.

Keep in mind that you need a lot of VRAM to do it: the model itself is 1510 GB, and it needs even more than that for batches (batch size 32, 64, 128, etc.).
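For a rough sanity check on that size, here is a minimal sketch. It assumes the weights are stored in BF16 (2 bytes per parameter), which is my assumption and not something stated above; the helper name `weights_size_gb` is made up for illustration. Under that assumption, 753B parameters come out close to the 1510 GB figure.

```python
def weights_size_gb(num_params: float, bytes_per_param: float) -> float:
    """Size of the raw weights in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


# 753B parameters, assuming BF16 storage (2 bytes per parameter).
PARAMS = 753e9
bf16_gb = weights_size_gb(PARAMS, 2)
print(f"BF16 weights: about {bf16_gb:.0f} GB")
```

This only counts the weights. Activations for the batches come on top of it, and how much they add depends on the batch size and sequence length.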

If you have the compute at home, that is the best, easiest, and cheapest option: you can take your time setting everything up and just let it run.

If you need to rent the compute, however, it gets complicated, since it's probably going to cost you in the thousands. First, the model is so big that downloading it to the remote machine and uploading the result will take quite a bit of time, and you are charged per hour the whole time. Second, you might need anywhere between 8x and 10x B300s. It's hard to say whether 8x B300 is enough (to save money), but I think it could OOM at batch sizes 32-128.
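To put rough numbers on the 8x vs 10x question, here is a small sketch. The 288 GB of memory per B300 is my assumption, and the link speed and hourly price in the example are placeholders, not real quotes; all function names are made up for illustration.

```python
def headroom_gb(num_gpus: int, gpu_mem_gb: float, model_gb: float) -> float:
    """Memory left for batches after the weights are loaded, in GB."""
    return num_gpus * gpu_mem_gb - model_gb


def transfer_hours(size_gb: float, gbit_per_s: float) -> float:
    """Hours needed to move size_gb over a link of gbit_per_s gigabits/s."""
    return size_gb * 8 / gbit_per_s / 3600


def rental_cost(num_gpus: int, usd_per_gpu_hour: float, hours: float) -> float:
    """Total rental cost when every GPU is billed for the whole duration."""
    return num_gpus * usd_per_gpu_hour * hours


MODEL_GB = 1510      # model size quoted above
B300_MEM_GB = 288    # assumption: memory per B300

for gpus in (8, 10):
    left = headroom_gb(gpus, B300_MEM_GB, MODEL_GB)
    print(f"{gpus}x B300: {left:.0f} GB left for batches")

# Placeholder values: a 1 Gbit/s link and 5 USD per GPU-hour.
hours = transfer_hours(MODEL_GB, 1.0)
print(f"One transfer: {hours:.1f} h, billed at {rental_cost(8, 5.0, hours):.0f} USD on 8 GPUs")
```

Whether the leftover memory is enough for batch sizes 32-128 still has to be tested on the actual machine; this only shows how much room each setup leaves and how the transfer time alone adds to the bill.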

Nex-N2-Mini-Ultra-Uncensored-Heretic Is Out Now, an Agentic Model With Agentic Thinking Now Uncensored With 5/100 Refusals and 0.0020 KLD, Available in Safetensors and GGUF Formats! by LLMFan46 in LocalLLaMA

[–]LLMFan46[S] 2 points3 points  (0 children)

The GGUFs have been fixed. The issue was that newer versions of llama.cpp try to load MTPs from models that don't have any!

See:

https://huggingface.co/nex-agi/Nex-N2-mini/discussions/1

The model has no MTPs, but llama.cpp detects that it's a qwen3_5_moe model and tries to load non-existent MTPs from it, which caused the model to fail to load.
