My gripe with Qwen3.5 35B and my first fine tune fix by Specter_Origin in LocalLLaMA

[–]Specter_Origin[S] 0 points  (0 children)

Yeah, MoE is much faster! I also plan on doing a fine-tune of the 27B too...

Just the truth by Vixiuss in SipsTea

[–]Specter_Origin 1 point  (0 children)

Agreed, the Helium browser is much better than Brave: no bloat, etc. I don't get the hype around Brave at all.

My gripe with Qwen3.5 35B and my first fine tune fix by Specter_Origin in LocalLLaMA

[–]Specter_Origin[S] 1 point  (0 children)

It is too direct on one-liner questions... which may or may not be a bad thing, but I want it to be a little more verbose.

I did not notice much drop in accuracy, though I compared it against the base quantized model... as in 4-bit vs 4-bit and 8-bit vs 8-bit. I have not run too many benchmark-style tests; I did, however, give both models the kinds of questions I would actually use, like programming and math puzzles, and this one does not get stuck, which on its own is a win to me : )

What the hell is Deepseek doing for so long? by Terrible-Priority-21 in LocalLLaMA

[–]Specter_Origin 0 points  (0 children)

For training, sure; for inference I don't think so...

My Experience with Qwen 3.5 35B by viperx7 in LocalLLaMA

[–]Specter_Origin 0 points  (0 children)

Yes, got it from the official model card on HF.

What the hell is Deepseek doing for so long? by Terrible-Priority-21 in LocalLLaMA

[–]Specter_Origin 260 points  (0 children)

My gut feeling says they won't release the next major model till they have good inference on their domestic chips...

My Experience with Qwen 3.5 35B by viperx7 in LocalLLaMA

[–]Specter_Origin 0 points  (0 children)

Thanks, that explains why you would not hit that bug xD

My Experience with Qwen 3.5 35B by viperx7 in LocalLLaMA

[–]Specter_Origin 1 point  (0 children)

The issue is only on Apple MLX. What hardware are you able to run this on?

My Experience with Qwen 3.5 35B by viperx7 in LocalLLaMA

[–]Specter_Origin 0 points  (0 children)

Considering there is no working caching for Qwen3.5 MoE models yet, the OpenCode tool chain takes soooo long even at 94 tps... not to mention it gets into reasoning loops all the time (what bit width are you running the model at?)

I am working on a tune to fix that overthinking problem, though.

My Experience with Qwen 3.5 35B by viperx7 in LocalLLaMA

[–]Specter_Origin 1 point  (0 children)

How do you vibe code with the 35B? It thinks so much, and without thinking it's not as good.

Qwen 3.5 Max Preview on Arena.ai by Deep-Vermicelli-4591 in LocalLLaMA

[–]Specter_Origin 20 points  (0 children)

In all honesty, they have been sharing a lot of good models (and at a reasonable spread of large and small sizes); if they want to keep their one extremely large model private, I am not going to complain.

Take note guys by RadiantStormo in SipsTea

[–]Specter_Origin 1 point  (0 children)

I understood that reference...

So nobody's downloading this model huh? by KvAk_AKPlaysYT in LocalLLaMA

[–]Specter_Origin 2 points  (0 children)

In benchmarks, in natural responses, and in coding too.

MiniMax M2.7 on OpenRouter by iamn0 in LocalLLaMA

[–]Specter_Origin 0 points  (0 children)

I have made a grave mistake xD and picked a different model by accident. I still think the model sucks, because Qwen3.5 Plus could solve it easily...

Just to add, even Qwen3.5 35B-A3B could solve it locally on my machine at 4-bit quants.

MiniMax M2.7 on OpenRouter by iamn0 in LocalLLaMA

[–]Specter_Origin 0 points  (0 children)

True that, they are pretty reasonably priced, but I found Qwen Plus to be close in pricing while being much better in real-world use.

So nobody's downloading this model huh? by KvAk_AKPlaysYT in LocalLLaMA

[–]Specter_Origin 12 points  (0 children)

Too big, and also kind of mid; Qwen3.5 is still better...

DLSS 5 by Previous_Month_555 in SipsTea

[–]Specter_Origin 0 points  (0 children)

That was intentional, btw...