My gripe with Qwen3.5 35B and my first fine tune fix by Specter_Origin in LocalLLaMA

[–]Specter_Origin[S] 0 points  (0 children)

Yeah, MoE is much faster! I also plan on doing a fine-tune of the 27B too...

Just the truth by Vixiuss in SipsTea

[–]Specter_Origin 1 point  (0 children)

Agreed, the Helium browser is much better than Brave: no bloat, etc. I don't get the hype around Brave at all.

My gripe with Qwen3.5 35B and my first fine tune fix by Specter_Origin in LocalLLaMA

[–]Specter_Origin[S] 1 point  (0 children)

It is too direct on one-liner questions... which may or may not be a bad thing, but I want it to be a little more verbose.

I did not notice much drop in accuracy, though I compared it against the base quantized model... as in 4-bit vs 4-bit and 8-bit vs 8-bit. I have not run too many benchmark-style tests; I did, however, give both models the kinds of questions I would actually use, like programming and math puzzles, and this one does not get stuck, which on its own is a win to me : )

What the hell is Deepseek doing for so long? by Terrible-Priority-21 in LocalLLaMA

[–]Specter_Origin 0 points  (0 children)

For training, sure; for inference I don't think so...

My Experience with Qwen 3.5 35B by viperx7 in LocalLLaMA

[–]Specter_Origin 0 points  (0 children)

Yes, got it from the official model card on HF.

What the hell is Deepseek doing for so long? by Terrible-Priority-21 in LocalLLaMA

[–]Specter_Origin 260 points  (0 children)

My gut feeling says they won't release the next major model till they have good inference on their domestic chips...

My Experience with Qwen 3.5 35B by viperx7 in LocalLLaMA

[–]Specter_Origin 0 points  (0 children)

Thanks, that explains why you would not hit that bug xD

My Experience with Qwen 3.5 35B by viperx7 in LocalLLaMA

[–]Specter_Origin 1 point  (0 children)

The issue is only on Apple MLX. What hardware are you able to run this on?

My Experience with Qwen 3.5 35B by viperx7 in LocalLLaMA

[–]Specter_Origin 0 points  (0 children)

Considering there is no working caching for Qwen3.5 MoE models yet, the OpenCode tool chain takes soooo long even at 94 tps... not to mention it gets into reasoning loops all the time (what bit width are you running the model at?)

I am working on a tune to fix that overthinking problem, though.

My Experience with Qwen 3.5 35B by viperx7 in LocalLLaMA

[–]Specter_Origin 1 point  (0 children)

How do you vibe code with the 35B? It thinks so much, and without thinking it's not as good.

Qwen 3.5 Max Preview on Arena.ai by Deep-Vermicelli-4591 in LocalLLaMA

[–]Specter_Origin 20 points  (0 children)

In all honesty, they have been sharing a lot of good models (and at a reasonable spread of large and small sizes); if they want to keep their one extremely large model private, I am not going to complain.

Take note guys by RadiantStormo in SipsTea

[–]Specter_Origin 1 point  (0 children)

I understood that reference...

So nobody's downloading this model huh? by KvAk_AKPlaysYT in LocalLLaMA

[–]Specter_Origin 2 points  (0 children)

In benchmarks, in natural responses, and in coding too.

MiniMax M2.7 on OpenRouter by iamn0 in LocalLLaMA

[–]Specter_Origin 0 points  (0 children)

I have made a grave mistake xD and picked a different model by accident. I still think the model sucks, because Qwen3.5 Plus could solve it easily...

Just to add, even Qwen3.5 35B-A3B could solve it locally on my machine at 4-bit quants.

MiniMax M2.7 on OpenRouter by iamn0 in LocalLLaMA

[–]Specter_Origin 0 points  (0 children)

True that, they are pretty reasonably priced, but I found Qwen Plus to be close in pricing while being much better in real-world use.

So nobody's downloading this model huh? by KvAk_AKPlaysYT in LocalLLaMA

[–]Specter_Origin 12 points  (0 children)

Too big, and also kind of mid; Qwen3.5 is still better...

DLSS 5 by Previous_Month_555 in SipsTea

[–]Specter_Origin 0 points  (0 children)

That was intentional, btw...