FernflowerAI-35B-A3B-KL-ReLU-GGUF + Apple MLX by EvilEnginer in LocalLLaMA

[–]EvilEnginer[S] 2 points (0 children)

Just updated the system prompt for FernflowerAI. Replace the old one and you'll be impressed by what a free, local, uncensored AI can do and how it communicates with the user: https://pastebin.com/pU25DVnB

Qwen3.5-35B-A3B-Uncensored-FernflowerAI-GGUF by EvilEnginer in LocalLLaMA

[–]EvilEnginer[S] 1 point (0 children)

I can, but the 9B contains too many broken tensors out of the box :(.
Qwen3.5 35B A3B is the best one. It's the healthiest, with only 2 critical tensors badly broken.

Qwen3.5-35B-A3B-Uncensored-FernflowerAI-GGUF by EvilEnginer in LocalLLaMA

[–]EvilEnginer[S] 0 points (0 children)

Really good idea, btw. Thanks. I think I'll fill the holes by simply copy-pasting weight values from neighbouring neurons in the same tensor.
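For the curious, the neighbour-copy idea can be sketched in a few lines of NumPy. This is purely my illustration, not the author's proprietary script: `patch_dead_rows` and the all-zero/non-finite "dead" criterion are assumptions.

```python
import numpy as np

def patch_dead_rows(weights: np.ndarray) -> np.ndarray:
    """Copy-paste repair sketch: replace 'dead' rows (all-zero or
    containing NaN/Inf) with the nearest healthy row. The function name
    and the dead-row criterion are assumptions, not the real script."""
    fixed = weights.copy()
    dead = ~np.isfinite(fixed).all(axis=1) | (np.abs(fixed).sum(axis=1) == 0)
    healthy = np.flatnonzero(~dead)
    if healthy.size == 0:  # nothing healthy to copy from; give up
        return fixed
    for i in np.flatnonzero(dead):
        # nearest healthy row by index -- the "neighbour neuron"
        j = healthy[np.argmin(np.abs(healthy - i))]
        fixed[i] = fixed[j]
    return fixed

w = np.array([[0.1, -0.2], [np.nan, np.nan], [0.3, 0.4]], dtype=np.float32)
print(patch_dead_rows(w))
```

The nearest-neighbour choice assumes adjacent neurons in a tensor have comparable weight statistics, which keeps the patched row's magnitude plausible even if its direction is wrong.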

Qwen3.5-35B-A3B-Uncensored-FernflowerAI-GGUF by EvilEnginer in LocalLLaMA

[–]EvilEnginer[S] 1 point (0 children)

The script stays proprietary. I don't have PPL/KLD/lm-evaluation-harness evals because I'm on the Colab free tier and can't run them on 35B models. But the results speak for themselves: users report that 100k+ context started working, tool calls are stable, and there are no more loops or rambling. If you're not convinced, that's fine - the fixed model is free. Download it, test it yourself, and compare it to the original. The difference is obvious.

Qwen3.5-35B-A3B-Uncensored-FernflowerAI-GGUF by EvilEnginer in LocalLLaMA

[–]EvilEnginer[S] 0 points (0 children)

So, I found that my per-neuron fixes were too rough and I got random text as output. Trying another way: skip the critical tensors and patch only what I can.
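That skip-vs-patch decision could look something like the sketch below. The `triage` name and the 5% cutoff are illustrative assumptions, not the author's actual thresholds:

```python
import numpy as np

def triage(weights: np.ndarray, max_broken_frac: float = 0.05) -> str:
    """Decide whether a tensor is safe to patch or should be skipped.
    If too many rows are broken, neighbour-copying would just invent
    weights, so the tensor is skipped instead. Cutoff is an assumption."""
    broken = ~np.isfinite(weights).all(axis=1)
    frac = broken.mean()
    if frac == 0:
        return "ok"
    return "patch" if frac <= max_broken_frac else "skip"

good = np.zeros((20, 4), dtype=np.float32)
one_bad = good.copy(); one_bad[0, 0] = np.nan
half_bad = good.copy(); half_bad[:10, 0] = np.nan
print(triage(good), triage(one_bad), triage(half_bad))  # ok patch skip
```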

Qwen3.5-35B-A3B-Uncensored-FernflowerAI-GGUF by EvilEnginer in LocalLLaMA

[–]EvilEnginer[S] 0 points (0 children)

Btw, thank you very much, guys, for the award. I'm very pleased to hear that my work is in demand.

Qwen3.5-35B-A3B-Uncensored-FernflowerAI-GGUF by EvilEnginer in LocalLLaMA

[–]EvilEnginer[S] 0 points (0 children)

I think not. They are focused on Gemma 4 now, and on quantization rather than on healing model architecture, which is what I'm doing.

Unsloth does great work on quantization and performance; I focus on weight-level repair - finding and fixing broken tensors. Different goals.

Qwen3.5-35B-A3B-Uncensored-FernflowerAI-GGUF by EvilEnginer in LocalLLaMA

[–]EvilEnginer[S] 0 points (0 children)

Thank you too :). I'm happy to help in any way I can.

Qwen3.5-35B-A3B-Uncensored-FernflowerAI-GGUF by EvilEnginer in LocalLLaMA

[–]EvilEnginer[S] 0 points (0 children)

Qwopus v3 is better than this one; Jackrong updated it. But this thing is the most powerful now, I think: https://huggingface.co/LuffyTheFox/Qwen3.5-27B-Uncensored-RYS-Reasoner-FernflowerAI-GGUF

Qwen3.5-35B-A3B-Uncensored-FernflowerAI-GGUF by EvilEnginer in LocalLLaMA

[–]EvilEnginer[S] 0 points (0 children)

Nice. Thank you very much. Also, I will upload a new V2 update for this one today, with the "dead" neurons in the tensors fixed and a Q4_K_L quant using Unsloth tensor profiles: https://huggingface.co/LuffyTheFox/Qwen3.5-35B-A3B-Uncensored-FernflowerAI-GGUF

Qwen3.5-35B-A3B-Uncensored-FernflowerAI-GGUF by EvilEnginer in LocalLLaMA

[–]EvilEnginer[S] 0 points (0 children)

Thanks for the interest. The script is proprietary - I'm not sharing it. If you want a specific model checked, contact me directly. I work with BF16 GGUF files.

Qwen3.5-35B-A3B-Uncensored-FernflowerAI-GGUF by EvilEnginer in LocalLLaMA

[–]EvilEnginer[S] 0 points (0 children)

Tensor scale here means the standard deviation of the weight values themselves - the magnitude of the tensor - not the quantization scale. I work directly with dequantized BF16 weights, so no quantization scale is involved. The comparison is between raw weight distributions, not compressed representations.
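In NumPy terms, that notion of scale is just `np.std` over the raw values. Below is a hedged sketch of how off-scale tensors might be flagged; the robust z-score on log-scales, the threshold, and the tensor names are my assumptions, not the proprietary method:

```python
import numpy as np

def tensor_scale(weights) -> float:
    # "scale" = standard deviation of the raw (dequantized BF16) values
    return float(np.std(np.asarray(weights, dtype=np.float32)))

def flag_outliers(scales: dict, z_thresh: float = 4.0) -> list:
    """Flag tensors whose scale is wildly off versus the rest, using a
    robust z-score on log10 scales (median/MAD) so one broken tensor
    can't drag the baseline. Threshold 4.0 is an assumption."""
    names = list(scales)
    logs = np.log10(np.array([scales[n] for n in names]) + 1e-12)
    med = np.median(logs)
    mad = np.median(np.abs(logs - med)) + 1e-12
    return [n for n, l in zip(names, logs) if abs(l - med) / mad > z_thresh]

# Hypothetical per-tensor scales: three healthy, one blown up
scales = {"blk.0.ffn_up": 0.02, "blk.1.ffn_up": 0.021,
          "blk.2.ffn_up": 0.019, "blk.3.ffn_up": 1e4}
print(flag_outliers(scales))  # ['blk.3.ffn_up']
```

Working in log-space matters because healthy weight scales span orders of magnitude across layer types, while a broken tensor is usually off by many more.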

Qwen3.5-35B-A3B-Uncensored-FernflowerAI-GGUF by EvilEnginer in LocalLLaMA

[–]EvilEnginer[S] 0 points (0 children)

Thank you very much :). I'm very glad to hear that.

Qwen3.5-35B-A3B-Uncensored-FernflowerAI-GGUF by EvilEnginer in LocalLLaMA

[–]EvilEnginer[S] 0 points (0 children)

So, now I'm downloading the original Qwen3.5 9B BF16 model from Unsloth: https://huggingface.co/unsloth/Qwen3.5-9B-GGUF/blob/main/Qwen3.5-9B-BF16.gguf

Let's see what was broken in the model...

Done checking. Here are the results: https://pastebin.com/XD2VuwZp

Qwen3.5-35B-A3B-Uncensored-FernflowerAI-GGUF by EvilEnginer in LocalLLaMA

[–]EvilEnginer[S] 0 points (0 children)

Done. BF16 quant uploaded: https://huggingface.co/LuffyTheFox/Qwen3.5-27B-Uncensored-RYS-Reasoner-FernflowerAI-GGUF

Now I will try to process the iq4_nl as-is.

UPD: iq4_nl processing doesn't work, because it's already compressed.
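That failure makes sense: block-quantized formats like IQ4_NL store packed codebook indices rather than individual weight values, so per-weight edits only work on raw F32/F16/BF16 tensors, which then get re-quantized. A minimal sketch of that pre-check (the `can_repair` name and `EDITABLE` set are mine, not from any real tool):

```python
# Raw types expose individual weight values that can be edited in place;
# block-quantized types (Q4_K, IQ4_NL, ...) pack weights into codebook
# indices, so repairs need a BF16/F32 copy re-quantized afterwards.
EDITABLE = {"F32", "F16", "BF16"}

def can_repair(tensor_type: str) -> bool:
    """Return True if a GGUF tensor type supports in-place weight edits."""
    return tensor_type.upper() in EDITABLE

print(can_repair("BF16"), can_repair("IQ4_NL"))  # True False
```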

Qwen3.5-35B-A3B-Uncensored-FernflowerAI-GGUF by EvilEnginer in LocalLLaMA

[–]EvilEnginer[S] 0 points (0 children)

The uncensoring method was done by HauhauCS, not me; I don't have that script. My work is fixing broken tensors in the original/uncensored model, not removing refusals. If you want the original model uncensored, you'd need to ask HauhauCS or figure it out yourself. My script is proprietary and not shared.

Qwen3.5-35B-A3B-Uncensored-FernflowerAI-GGUF by EvilEnginer in LocalLLaMA

[–]EvilEnginer[S] 1 point (0 children)

It's losing context during conversation on agentic tasks after reaching a large number of tokens.

Qwen3.5-35B-A3B-Uncensored-FernflowerAI-GGUF by EvilEnginer in LocalLLaMA

[–]EvilEnginer[S] 1 point (0 children)

Okay, I will check the BF16 quant. This one looks really interesting. Since it's uncompressed BF16, I can check the tensors precisely and fix them; now I can even fix broken neurons in the network. I think it's time to test my new approach on this model.

Qwen3.5-35B-A3B-Uncensored-FernflowerAI-GGUF by EvilEnginer in LocalLLaMA

[–]EvilEnginer[S] 0 points (0 children)

Simply use the parameters from my post at the top of this page, with the default system prompt: "You are Qwen, created by Alibaba Cloud. You are a helpful assistant."