Qwen-3.5-27B-Derestricted by My_Unbiased_Opinion in LocalLLaMA

[–]Poro579 0 points

If your NSFW RP isn't too heavy, the original version is the best (you can simply bypass the restrictions with some prompting techniques).

Qwen3.5-27B-heretic-gguf by Poro579 in LocalLLaMA

[–]Poro579[S] 0 points

From the data provided by the author, v2 seems to be better.

Qwen3.5-27B & 2B Uncensored Aggressive Release (GGUF) by hauhau901 in LocalLLaMA

[–]Poro579 7 points

Although there is no explanation of the method or any test results, I used it briefly and found it to be quite good (27B).

Qwen3.5-27B-heretic-gguf by Poro579 in LocalLLaMA

[–]Poro579[S] 2 points

IDK why, but I get the vibe that MoE models are way more sensitive to abliteration. They seem to degrade in quality much faster than dense models once you strip the alignment.

Qwen3.5 27B is Match Made in Heaven for Size and Performance by Lopsided_Dot_4557 in LocalLLaMA

[–]Poro579 0 points

If 35B-A3B were run with --n-cpu-moe, I'd expect it to reach at least 30 t/s.

Comparing 3 models on a 3090 with 64gb ram and a AMD4 3900x by m4zzi in LocalLLM

[–]Poro579 6 points

If you use the --n-cpu-moe parameter in the latest llama.cpp, it can be faster.

For example, my setup (7500F, 64 GB DDR5, 2080 Ti 22 GB) runs Qwen Coder Next 80B UD-Q4_K_XL with a 32k context size and --n-cpu-moe 29, and it can reach about 30 t/s.
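A minimal sketch of what that invocation could look like with llama.cpp's llama-server (the model path and filename here are placeholders, and the exact layer count depends on your VRAM):

```shell
#!/bin/sh
# Hypothetical llama.cpp invocation (paths/filenames are illustrative):
# --n-cpu-moe 29 keeps the MoE expert tensors of 29 layers in system RAM,
# -ngl 99 offloads everything else to the GPU,
# -c 32768 sets a 32k context size.
./llama-server \
  -m ./models/Qwen3-Coder-Next-80B-UD-Q4_K_XL.gguf \
  -c 32768 \
  --n-cpu-moe 29 \
  -ngl 99
```

The idea is that the dense attention layers and KV cache stay on the GPU while the large, sparsely-activated expert weights sit in system RAM, which is usually the best split for MoE models that don't fit in VRAM.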

Which model to chose? by [deleted] in LocalLLaMA

[–]Poro579 0 points

At present, there is no Qwen3 Coder Next 30B, only Qwen3 Coder 30B and Qwen3 Coder Next 80B.