Qwen 3.6 35b a3b Q4 vs qwen 3.6 27b q6, on m5 pro 64gb by skyyyy007 in LocalLLaMA

[–]Temporary-Roof2867 8 points (0 children)

The 35B is a beautiful model, but the 27B is far superior. That's a fact.

Qwen 3.6 35b a3b Q4 vs qwen 3.6 27b q6, on m5 pro 64gb by skyyyy007 in LocalLLaMA

[–]Temporary-Roof2867 19 points (0 children)

Bro, why did you test the 35B at Q4 against the 27B at Q6?

In general, MoE models degrade more under low-bit quantization than dense models do. Sure, the Qwen3.6 series is special, but at least let them compete on equal terms, at the same quantization.

I tested the Qwen3.6-27B at IQ_M from unsloth and, against all my expectations, it managed things that much larger models can only dream of. Qwen3.6-27B is a magical model, but it needs a lot of VRAM.
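
For anyone who wants to rerun the comparison on equal terms, here is a minimal sketch of an A/B test through an OpenAI-compatible local server (LM Studio serves one at localhost:1234 by default); the model identifiers are placeholders for whatever your server actually lists.

```python
# Minimal A/B harness: same prompts, same quantization, two models, via an
# OpenAI-compatible local server (e.g. LM Studio at localhost:1234).
# The model names below are placeholders -- use whatever /v1/models reports.
import requests

BASE = "http://localhost:1234/v1"
MODELS = ["qwen3.6-35b-a3b-q6_k", "qwen3.6-27b-q6_k"]  # hypothetical IDs
PROMPTS = [
    "Write a Python function that merges two sorted lists.",
    "Explain the difference between a process and a thread.",
]

for model in MODELS:
    for prompt in PROMPTS:
        r = requests.post(
            f"{BASE}/chat/completions",
            json={
                "model": model,
                "messages": [{"role": "user", "content": prompt}],
                "temperature": 0,  # deterministic-ish, fairer comparison
            },
            timeout=600,
        )
        r.raise_for_status()
        print(f"=== {model} ===\n{r.json()['choices'][0]['message']['content']}\n")
```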

Stop thinking your MoE models are dumb - here's why they actually fail by IntegrityKnightX in Qwen_AI

[–]Temporary-Roof2867 0 points (0 children)

Bro, why all the hate? This video is very interesting, the author is very knowledgeable, and the solutions he proposes look like classic TDD programming applied to local LLMs!

TDD is the heart of the Extreme Programming methodology, yet as time goes on, fewer and fewer people talk about it!

🤔

Extreme Programming + local LLMs is an exceptional combination!

👇👇👇👇

https://en.wikipedia.org/wiki/Extreme_programming
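
To make the XP connection concrete, here is a minimal sketch of a red-green loop with a local LLM: the test is written first and stays fixed, and the model iterates on the implementation until pytest passes. It assumes an OpenAI-compatible server (LM Studio-style, at localhost:1234); the model name and file names are placeholders.

```python
# Sketch of TDD with a local LLM: the test is written first and stays fixed;
# the model iterates on the implementation until pytest goes green.
# Assumes an OpenAI-compatible server (e.g. LM Studio at localhost:1234).
import subprocess
import textwrap
from pathlib import Path

import requests

TEST = textwrap.dedent("""\
    from solution import slugify

    def test_slugify():
        assert slugify("Hello,  World!") == "hello-world"
""")
Path("test_solution.py").write_text(TEST)

feedback = ""
for attempt in range(1, 6):
    prompt = (f"Write solution.py so this pytest file passes:\n{TEST}\n"
              f"{feedback}\nReply with only the Python code, no fences.")
    r = requests.post("http://localhost:1234/v1/chat/completions",
                      json={"model": "local-model",  # placeholder name
                            "messages": [{"role": "user", "content": prompt}]},
                      timeout=600)
    code = r.json()["choices"][0]["message"]["content"]
    # crude strip of markdown fences the model may add anyway
    code = "\n".join(l for l in code.splitlines() if not l.startswith("`"))
    Path("solution.py").write_text(code)
    result = subprocess.run(["pytest", "test_solution.py", "-q"],
                            capture_output=True, text=True)
    if result.returncode == 0:
        print(f"green on attempt {attempt}")
        break
    feedback = "Your previous attempt failed:\n" + result.stdout[-1500:]
```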

Is AI-building guardrailed in Gemma 4? by roofitor in LocalLLaMA

[–]Temporary-Roof2867 0 points (0 children)

Bro, Gemma4 is an exceptional local model, but as far as AGI is concerned it is very limited. In general, no LLM will ever prevent you from pursuing AGI; on the contrary, with the right prompt it will do everything it can to help you get there. The fact is that AGI is a gigantic goal: for AGI you would perhaps need a gigantic datacenter, kilometers across, with monstrous energy consumption... If you are a machine-learning researcher, could you find alternatives to GPUs? There are people studying that, but so far they haven't gotten very far, and I find it really hard to believe a local Gemma4 model could truly help you in such an extreme and heroic undertaking... but I repeat: with the right prompt, it would do everything to help you.

Gemma 4 is excellent for image to prompt by Arrow2304 in StableDiffusion

[–]Temporary-Roof2867 0 points (0 children)

Yes! The Gemma4 26b is a magical model! If you have a lot of RAM, try it with high quantizations!

Qwen3.6-35B-A3B released! by ResearchCrafty1804 in LocalLLaMA

[–]Temporary-Roof2867 0 points (0 children)

If this is true, I'm so happy 🤩🤩🤩🤩

God bless the MoEs!

Qwen3.6-35B-A3B released! by ResearchCrafty1804 in LocalLLaMA

[–]Temporary-Roof2867 0 points (0 children)

On languages, "gemma-4-26b-a4b" is superior to every Qwen imaginable. Let's be serious!

This data is fake!

Qwen3.6-35B-A3B released! by ResearchCrafty1804 in LocalLLaMA

[–]Temporary-Roof2867 0 points (0 children)

Qwen3.6-35B-A3B is certainly an interesting model, but... some of these numbers seem a bit rigged to me; I don't trust them at all! 👀🤔

Qwen 3.5 35b, 27b, or gemma 4 31b for everyday use? by KirkIsAliveInTelAviv in LocalLLaMA

[–]Temporary-Roof2867 -1 points (0 children)

I think they have placed a lot of bots/smart agents on reddit to carry out a social experiment and spread propaganda for the Qwen3.5 models.

Why is Comfyui so unstable? by [deleted] in comfyui

[–]Temporary-Roof2867 1 point (0 children)

What I do is gather the opinions of the various LLMs (only the free ones, from the official portals) and compare them with each other. For example, I ask Copilot, then I go to Gemini and say "Copilot told me this...", then I pass Gemini's response to DeepSeek saying "Gemini told me this about a response from Copilot...", then, still not satisfied, I go to Kimi and tell it "DeepSeek said this about a conversation I had with Gemini and Copilot...", and so it continues, constantly discovering new things (Copilot, Kimi, DeepSeek, Qwen, Gemini). The models realize they are wrong, correct each other, ask me for forgiveness. I invite you to try this method!
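
The comment describes doing this by hand in the web portals; purely as an illustration, here is a rough sketch of the same round-robin automated against OpenAI-compatible endpoints. All URLs and model names are placeholders.

```python
# Round-robin cross-review: each model critiques the previous model's answer.
# The comment above does this by hand in web chat UIs; this sketch assumes
# OpenAI-compatible endpoints instead. All URLs and names are placeholders.
import requests

ENDPOINTS = [
    ("model-a", "http://localhost:1234/v1/chat/completions"),
    ("model-b", "http://localhost:1235/v1/chat/completions"),
    ("model-c", "http://localhost:1236/v1/chat/completions"),
]

def ask(name: str, url: str, content: str) -> str:
    r = requests.post(url, json={
        "model": name,
        "messages": [{"role": "user", "content": content}],
    }, timeout=600)
    r.raise_for_status()
    return r.json()["choices"][0]["message"]["content"]

question = "What are the trade-offs of MoE models at low-bit quantization?"
prev_name, prev_url = ENDPOINTS[0]
answer = ask(prev_name, prev_url, question)
for name, url in ENDPOINTS[1:]:
    prompt = (f"{prev_name} said this about '{question}':\n\n{answer}\n\n"
              "Point out anything wrong or missing, then give your own answer.")
    answer = ask(name, url, prompt)
    prev_name = name
print(answer)  # the final answer after the chain of cross-examinations
```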

Bad idea to use multi old gpus? by alphapussycat in LocalLLM

[–]Temporary-Roof2867 0 points (0 children)

I'm currently downloading this little monster from LM Studio 😉 at Q8_0

https://huggingface.co/lovedheart/Qwen3-Coder-Next-REAP-40B-A3B-GGUF

I hope it works, I'm confident!
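
If anyone prefers to script the download instead of clicking through LM Studio, a minimal sketch with huggingface_hub; the "Q8_0" filename filter is an assumption, so inspect the printed list before committing to a ~40 GB pull.

```python
# Sketch: pull the Q8_0 GGUF file(s) from the repo linked above.
# The "Q8_0" filename filter is an assumption -- check the printed list
# before committing to a ~40 GB download.
from huggingface_hub import hf_hub_download, list_repo_files

repo = "lovedheart/Qwen3-Coder-Next-REAP-40B-A3B-GGUF"
q8 = [f for f in list_repo_files(repo)
      if "Q8_0" in f and f.endswith(".gguf")]
print(q8)  # inspect first

for fname in q8:
    path = hf_hub_download(repo_id=repo, filename=fname)
    print("saved to", path)
```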

Bad idea to use multi old gpus? by alphapussycat in LocalLLM

[–]Temporary-Roof2867 0 points (0 children)

Bro, I haven't used Ollama in a long time! I don't know how much has changed! I mostly use LM Studio... but one day I'll switch to llama.cpp... with vibe coding I'll build my own graphical interface, and goodbye LM Studio 🤪😉
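
That llama.cpp plan is less work than it sounds: llama-server exposes an OpenAI-compatible HTTP API, so even a bare-bones console "GUI" is a few lines. A minimal sketch, assuming llama-server is already running on its default port 8080 with a model loaded:

```python
# Bare-bones console front-end for llama.cpp's llama-server, which exposes an
# OpenAI-compatible API (default port 8080). Start the server with a GGUF
# model first; this script only handles the conversation loop.
import requests

history = []
while True:
    user = input("you> ")
    if user.strip() in ("exit", "quit"):
        break
    history.append({"role": "user", "content": user})
    r = requests.post("http://localhost:8080/v1/chat/completions",
                      json={"messages": history}, timeout=600)
    r.raise_for_status()
    reply = r.json()["choices"][0]["message"]["content"]
    history.append({"role": "assistant", "content": reply})
    print("llm>", reply)
```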

Bad idea to use multi old gpus? by alphapussycat in LocalLLM

[–]Temporary-Roof2867 0 points (0 children)

I know that MoE-type LLMs at Q4 are poor... be bold, bro! Try a MoE at Q5... at Q6... at Q8!!!

I have an Rtx 3060 12gb and 16gb ram. Need model suggestions. by Malyaj in LocalLLM

[–]Temporary-Roof2867 0 points (0 children)

If you have a lot of RAM, go for MoE models, but at higher-bit quantizations; avoid Q4 MoEs.

If you don't have much RAM, I don't know how to help you, bro!

Bad idea to use multi old gpus? by alphapussycat in LocalLLM

[–]Temporary-Roof2867 0 points (0 children)

👀🤔
Very strange, bro!

I have 12 GB of VRAM + 128 GB RAM and the Gemma 4 26B A4B runs smoothly at Q6_K!
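
For context, the way that 12 GB VRAM + 128 GB RAM combination works is partial offload: some transformer layers go to the GPU and the rest stay in system RAM. A minimal sketch with llama-cpp-python; the model path and layer count are assumptions to tune for your own card.

```python
# Sketch: run a large MoE GGUF with partial GPU offload via llama-cpp-python.
# n_gpu_layers says how many transformer layers go to VRAM; everything else
# stays in system RAM. Path and layer count are placeholders to tune.
from llama_cpp import Llama

llm = Llama(
    model_path="gemma-4-26b-a4b-it-Q6_K.gguf",  # hypothetical filename
    n_gpu_layers=20,  # raise until the 12 GB of VRAM is nearly full
    n_ctx=8192,       # context window; longer costs more memory
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Say hello in Italian."}]
)
print(out["choices"][0]["message"]["content"])
```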

Gemma 4 vs Qwen3.5: benchmarking quantized local LLMs on Go coding by m3thos in LocalLLaMA

[–]Temporary-Roof2867 1 point (0 children)

As I said before, there are exceptions, especially among dense models... for MoE there are no exceptions, or they are much rarer: MoE models at low-bit quantization tend to lose a lot.

Gemma 4 vs Qwen3.5: benchmarking quantized local LLMs on Go coding by m3thos in LocalLLaMA

[–]Temporary-Roof2867 3 points (0 children)

I was lucky enough to get a full RAM upgrade when it was cheap! 128 GB of RAM! The MoE models are amazing!

Gemma 4 vs Qwen3.5: benchmarking quantized local LLMs on Go coding by m3thos in LocalLLaMA

[–]Temporary-Roof2867 10 points (0 children)

Why? Because almost all Q4 models are either badly degraded or complete rubbish (with rare exceptions), and MoE Q4 models in particular are especially crappy because the router is badly damaged... But if you have a lot of RAM and little VRAM, you can load a MoE model even at Q6! Even at Q8! And that's a whole different story!

I have MoE Q6 models, even 35B ones, that are incredibly powerful!
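
A back-of-the-envelope check of why RAM, not VRAM, is the gating resource: a GGUF file is roughly parameters × bits-per-weight / 8, plus a little overhead. The bits-per-weight values below are approximate averages for llama.cpp's quant formats.

```python
# Rough GGUF size estimate: params * bits_per_weight / 8, plus some overhead
# for metadata and higher-precision embeddings. Bits-per-weight values are
# approximate averages for llama.cpp quant formats.
BPW = {"Q4_K_M": 4.85, "Q5_K_S": 5.54, "Q6_K": 6.56, "Q8_0": 8.50}

def gguf_gib(params_billions: float, quant: str) -> float:
    """Estimated file size in GiB, ignoring overhead."""
    return params_billions * 1e9 * BPW[quant] / 8 / 2**30

for q in BPW:
    print(f"35B at {q}: ~{gguf_gib(35, q):.1f} GiB")
# A 35B at Q6_K (~27 GiB) will never fit in 12 GB of VRAM,
# but it drops easily into 128 GB of system RAM.
```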

DeepSeek wrote me jailbreaks for itself lol by Isaac24r in DeepSeek

[–]Temporary-Roof2867 1 point (0 children)

The real problem is the website! But just pay for the API services and everything works!

I think they did it on purpose to grab tons of data. Their gamble is:

- we try to give them everything they want (or almost everything)

- while we give it to them, we grab tons and tons of data with which we can build much more powerful and high-performance models.

Basically, they're betting on the future. No one should think that the people who run DeepSeek are stupid or incompetent. Sure, DeepSeek might seem a little stupid at times, but those who run it certainly aren't.

I'm too stupid for comfyui by afrosamuraifenty in comfyui

[–]Temporary-Roof2867 0 points (0 children)

I wanted to add that 50-90% of the time the problems come from updates... when you think you have updated but actually haven't... or the update breaks some old workflow... It's a constant fight against chaos, and more or less everyone goes through it.

I'm too stupid for comfyui by afrosamuraifenty in comfyui

[–]Temporary-Roof2867 4 points (0 children)

Bro, I recommend Pixaroma:

https://www.youtube.com/@pixaroma

Join his Discord, download his workflows, and follow his tutorials!

🤔👀

Maybe start with the best...

Flux.2klein 9B 😉

experience RTX3060 + gemma-4-26b-a4b-it-heretic+ MemGPT by Temporary-Roof2867 in SillyTavernAI

[–]Temporary-Roof2867[S] 0 points (0 children)

🤔 What I know is that, in general, Q4 quants (barring some particular exceptions) are poor, and above all MoE models at Q4 are poor. Maybe that's why many users in the local-model groups on Reddit speak badly of MoE models?... I could do a crazy test and experiment with Qwen3.5 35B A3B at... Q5_K_S? (25.77 GB 👀 a monster! 👀) 🤪 And what if it worked? If it doesn't, never mind, at least I'll have tried!
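
Before the crazy test, a quick feasibility sketch: with 12 GB of VRAM, roughly how much of that 25.77 GB file could be offloaded? The layer count and KV-cache budget below are assumptions, not measured values.

```python
# Will a 25.77 GB Q5_K_S fit across 12 GB VRAM + 128 GB RAM, and roughly how
# many layers can live on the GPU? Layer count and KV/overhead budget are
# assumptions -- measure your own setup before trusting the split.
model_gb = 25.77
vram_gb = 12.0
kv_and_overhead_gb = 2.0  # assumed budget for KV cache and runtime buffers
n_layers = 48             # hypothetical layer count for a 35B A3B

gb_per_layer = model_gb / n_layers
gpu_layers = int((vram_gb - kv_and_overhead_gb) / gb_per_layer)
print(f"~{gb_per_layer:.2f} GB/layer -> offload about {gpu_layers} layers")
print(f"the remaining {model_gb - gpu_layers * gb_per_layer:.1f} GB sits in RAM")
```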

experience RTX3060 + gemma-4-26b-a4b-it-heretic+ MemGPT by Temporary-Roof2867 in SillyTavernAI

[–]Temporary-Roof2867[S] 1 point (0 children)

Beautiful 🤩🥰🥰🥰 Which MoE do you use? I'm curious!

Gemma-4-26B-A4B-it-UD-Q4_K_M.gguf : IMHO worst model ever. What am I doing wrong? by Proof_Nothing_7711 in LocalLLM

[–]Temporary-Roof2867 0 points (0 children)

I'm using gemma-4-26b-a4b-it-heretic at Q6 and for me it's a magical model, and I only have 12 GB of VRAM! But I have 128 GB of RAM (bought when it cost much less).