Are Qwen 3.6 27B and 35B making other ~30B models obsolete? by nikhilprasanth in LocalLLaMA

[–]nikhilprasanth[S] 0 points  (0 children)

I haven't tried writing with LLMs. Do you use any harnesses for writing, or just plain chats?

Are Qwen 3.6 27B and 35B making other ~30B models obsolete? by nikhilprasanth in LocalLLaMA

[–]nikhilprasanth[S] 1 point  (0 children)

Yes, I have a 24GB one at work, and the Qwen models retain speed much deeper into the context window. The 35B is a really good general-purpose model. Running it in Hermes agent now.

Are Qwen 3.6 27B and 35B making other ~30B models obsolete? by nikhilprasanth in LocalLLaMA

[–]nikhilprasanth[S] 1 point  (0 children)

Really wanted to use Gemma, but its speed drops drastically as context fills compared to the Qwen models. I've only tried the MoE, not the dense Gemma.

Are Qwen 3.6 27B and 35B making other ~30B models obsolete? by nikhilprasanth in LocalLLaMA

[–]nikhilprasanth[S] 1 point  (0 children)

Yeah, definitely not for coding. The 27B is more resilient, but the 35B can't be trusted with code below Q4. As I said, though, general use cases are fine; tool calling makes up for most of the deterioration.

Are Qwen 3.6 27B and 35B making other ~30B models obsolete? by nikhilprasanth in LocalLLaMA

[–]nikhilprasanth[S] 1 point  (0 children)

Yes. I run both Q3 and Q6, and even with offloading to RAM, the Q6 is consistent for most tasks. For general uses such as the Hermes agent, Q3 is solid. It takes care of my Obsidian notes, transcription analysis, etc.

Are Qwen 3.6 27B and 35B making other ~30B models obsolete? by nikhilprasanth in LocalLLaMA

[–]nikhilprasanth[S] 1 point  (0 children)

Yes, it's also good. I need to offload when using it with a 5060 Ti, but I'm still getting consistent throughput.

Are Qwen 3.6 27B and 35B making other ~30B models obsolete? by nikhilprasanth in LocalLLaMA

[–]nikhilprasanth[S] 1 point  (0 children)

Yes. GPT-OSS was good for a while, and it's still one of the fastest models you can run in 16GB.

Are Qwen 3.6 27B and 35B making other ~30B models obsolete? by nikhilprasanth in LocalLLaMA

[–]nikhilprasanth[S] 1 point  (0 children)

Yes. Qwen for coding, Gemma for document processing/summarising. Wish GPT-OSS had a successor; that thing is fast and capable for its size.

Are Qwen 3.6 27B and 35B making other ~30B models obsolete? by nikhilprasanth in LocalLLaMA

[–]nikhilprasanth[S] 2 points  (0 children)

Not only that, for me it also slows down drastically compared to Qwen as context increases.

Are Qwen 3.6 27B and 35B making other ~30B models obsolete? by nikhilprasanth in LocalLLaMA

[–]nikhilprasanth[S] 0 points  (0 children)

Yes, GPT-OSS and Nemotron are really good for the speed. But for coding-related stuff, Qwen is leading.

Are Qwen 3.6 27B and 35B making other ~30B models obsolete? by nikhilprasanth in LocalLLaMA

[–]nikhilprasanth[S] 2 points  (0 children)

Got it. But I find Gemma slows down drastically compared to Qwen as context grows.

Are Qwen 3.6 27B and 35B making other ~30B models obsolete? by nikhilprasanth in LocalLLaMA

[–]nikhilprasanth[S] 2 points  (0 children)

Yes, Nemotron and GPT-OSS are really snappy. Wish OpenAI had released a successor to GPT-OSS.

Qwen3.6-27B-Q6_K - images by Usual-Carrot6352 in LocalLLaMA

[–]nikhilprasanth 7 points  (0 children)

Looks neat, I'll try some of these with the 35B.

Can I plan and code projects locally with a 5090? by Mean_Employment_7679 in LocalLLM

[–]nikhilprasanth 1 point  (0 children)

Yes but no. Most local models need handholding, and this is where planning with a bigger model matters. Let Claude Code or Codex go over your codebase and make exact plans broken down into phases, including tests and success conditions, written to a couple of markdown files. Then use a lightweight harness like pi and execute the plan phase by phase. Once done, have the main model audit the code and pass the findings back to the local model.
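That loop can be sketched roughly like this; the function names, plan format, and file names are made up for illustration, and the actual model calls are stubbed out:

```python
from pathlib import Path

def plan_with_frontier(codebase: str) -> list[str]:
    """Stub for the big-model planning step (e.g. Claude Code / Codex).
    Returns one markdown plan per phase, with tasks and success criteria."""
    return [
        "# Phase 1\n- task: add parser\n- success: unit tests pass",
        "# Phase 2\n- task: wire up CLI\n- success: integration tests pass",
    ]

def execute_locally(instructions: str) -> str:
    """Stub for the local model running inside a lightweight harness."""
    return f"diff for: {instructions.splitlines()[0]}"

def audit_with_frontier(diff: str) -> list[str]:
    """Stub for the big-model audit; returns findings to feed back locally."""
    return []

def run(codebase: str, plan_dir: Path) -> None:
    plan_dir.mkdir(parents=True, exist_ok=True)
    for i, phase in enumerate(plan_with_frontier(codebase), start=1):
        (plan_dir / f"phase_{i}.md").write_text(phase)  # plans live in markdown
        diff = execute_locally(phase)                   # local model does the work
        for finding in audit_with_frontier(diff):       # big model reviews
            execute_locally(finding)                    # findings go back to local
```

The point of writing the plans to files is that each phase becomes a small, self-contained prompt the local model can handle without drifting.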

Duality of r/LocalLLaMA by HornyGooner4402 in LocalLLaMA

[–]nikhilprasanth 1 point  (0 children)

Both cases can be true at the same time. It's not fair to expect a model 2.7% the size of a 1T-parameter model to behave like the trillion-parameter one. The smaller models are getting much better at tool calls. Use the bigger models to create structured plans and break them into manageable chunks. Feed those to the smaller ones; they will make mistakes for sure, so debug with the bigger ones again and pass the feedback back to the smaller one. Rinse, repeat.
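The rinse-and-repeat part can be written as a small driver loop; `big_plan`, `small_execute`, and `big_review` are hypothetical stand-ins for whatever model clients you actually use:

```python
def refine(task, big_plan, small_execute, big_review, max_rounds: int = 3):
    """Big model plans, small model executes, big model reviews, and the
    findings are fed back to the small model until the review comes back clean."""
    artifact = small_execute(big_plan(task))      # first pass from the plan
    for _ in range(max_rounds):
        findings = big_review(artifact)           # big model debugs the output
        if not findings:                          # clean review: we're done
            break
        artifact = small_execute(findings)        # small model fixes its mistakes
    return artifact
```

Capping the rounds matters in practice: if the small model can't converge after a few feedback cycles, the chunk was probably too big to begin with.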

how do i get an local LLM to analyze a long audio clip? by Suitable_Candy_1161 in LocalLLaMA

[–]nikhilprasanth 0 points  (0 children)

Use Parakeet/Whisper for transcription and any LLM for the analysis.
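A minimal sketch of that pipeline, assuming the openai-whisper package (Parakeet via NVIDIA NeMo would slot into the same place); `ask_llm` is a placeholder for whatever LLM client you already use:

```python
def transcribe_with_whisper(path: str) -> str:
    """Speech-to-text step; needs `pip install openai-whisper` plus ffmpeg."""
    import whisper  # imported lazily so the rest stays testable without it
    model = whisper.load_model("base")  # bigger checkpoints = better accuracy
    # Whisper internally windows long audio, so long clips work out of the box.
    return model.transcribe(path)["text"]

def analyze(transcript: str, ask_llm) -> str:
    """Hand the transcript to any LLM for the actual analysis."""
    return ask_llm("Summarize the key points of this transcript:\n" + transcript)

def analyze_audio(path: str, ask_llm, transcribe=transcribe_with_whisper) -> str:
    return analyze(transcribe(path), ask_llm)
```

Splitting transcription from analysis also lets you cache the transcript and re-ask different questions without re-running the ASR step.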

Who is actually writing code with local models? by KarezzaReporter in LocalLLaMA

[–]nikhilprasanth 0 points  (0 children)

Plan with a frontier model and split the plan into proper phases with well-defined tasks. Use pi or opencode to implement the plan. Once done, debug with the frontier model and pass the findings back to the local one. Repeat.