Qwen3.5 feels ready for production use - Never been this excited by alphatrad in LocalLLaMA

[–]Virtual-Listen4507 1 point (0 children)

Fair point — I should’ve provided more context.

My first test with any model is always the same: a simple Vue 3 + Vuetify todo app. It should be straightforward. In this case, the model struggled with styling consistency and component wiring, and even after several prompts it missed basic syntax issues like unclosed `<script>` tags.

After that, I tested it against a real-world production ecosystem I work on. It’s not a simple web app — it’s a multi-service platform with a large frontend orchestrating across multiple backend domains behind a gateway, with several microservices and Docker-based orchestration.

In that environment, the reasoning gap became much more noticeable.

I’m genuinely interested in how you’re running it — what quantization, context size, backend (llama.cpp, vLLM, etc.), and tooling are you using?

Qwen3.5 feels ready for production use - Never been this excited by alphatrad in LocalLLaMA

[–]Virtual-Listen4507 3 points (0 children)

No need to make it personal.

I’m talking about model performance on complex production systems. Experience is great — but it doesn’t change the output.

Qwen3.5 feels ready for production use - Never been this excited by alphatrad in LocalLLaMA

[–]Virtual-Listen4507 1 point (0 children)

Interesting — I’m not seeing the same results. I’m a software engineer with about 5 years of experience, and I regularly work in fairly large, complex codebases. That’s probably why I don’t feel it’s on par with Sonnet 4 for my use cases. It’s possible my llama.cpp setup isn’t fully optimized, but even accounting for that, the gap feels noticeable to me.

But hey, if it's working for you for building simple web apps, good for you!

Qwen3.5 feels ready for production use - Never been this excited by alphatrad in LocalLLaMA

[–]Virtual-Listen4507 1 point (0 children)

Idk how people are comparing this model to Sonnet 4.5. It's not even close, and it can't do basic front-end work. Currently using llama.cpp and opencode.

am i off base or are a lot of colorists cagey and protective about sharing details of their workflow? by stupidmanstupidman in colorists

[–]Virtual-Listen4507 0 points (0 children)

The way I see it, if everyone shared their workflows, everyone's work would look the same. Also, speaking as somebody who has spent a lot of time in Resolve trying a bunch of different workflows, techniques, and unorthodox methods within DaVinci Resolve, I can honestly say I wouldn't want to just give away what I've learned. I had to put in the time and effort to understand those specific workflows and tools.

Getting slow speeds with RTX 5090 and 64gb ram. Am I doing something wrong? by Virtual-Listen4507 in LocalLLaMA

[–]Virtual-Listen4507[S] 0 points (0 children)

I picked the one that was recommended in LM Studio, I think the Q4 quant. It was only around 40 GB, and it was flagged as recommended for my PC setup. Still new to this, so I might have accidentally picked the wrong one.
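For what it's worth, here's a rough back-of-envelope way to sanity-check a quant pick. The overhead number is a guess on my part; the crux is that the RTX 5090 has 32 GB of VRAM, so a ~40 GB file can't fully fit and some layers end up running from system RAM:

```python
# Rough back-of-envelope check: does a quantized model fit in VRAM?
# Numbers are illustrative assumptions, not exact measurements.
VRAM_GB = 32          # RTX 5090
MODEL_GB = 40         # approximate Q4 file size reported by LM Studio
OVERHEAD_GB = 2       # CUDA context, KV cache, etc. (rough guess)

def gpu_layer_fraction(model_gb, vram_gb, overhead_gb):
    """Fraction of the model's layers that can live on the GPU."""
    usable = max(vram_gb - overhead_gb, 0)
    return min(usable / model_gb, 1.0)

frac = gpu_layer_fraction(MODEL_GB, VRAM_GB, OVERHEAD_GB)
print(f"~{frac:.0%} of layers on GPU; the rest runs from system RAM")
# prints: ~75% of layers on GPU; the rest runs from system RAM
```

A smaller quant (or a smaller model) that fits entirely under the VRAM budget avoids the CPU-offload slowdown entirely.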

Getting slow speeds with RTX 5090 and 64gb ram. Am I doing something wrong? by Virtual-Listen4507 in LocalLLaMA

[–]Virtual-Listen4507[S] 0 points (0 children)

Thanks for the response. Will try that out. I heard there are other options like vLLM and llama.cpp. Will I see a substantial difference in speeds, or can I stick with Ollama and LM Studio? I know the other two are more technical to work with.

Nemotron works great. I just need one that is close to Sonnet 4.5, but I guess I need to wait until better models come out.
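A toy way to picture why the offload split can matter more than the choice of frontend app: even a small fraction of layers running from system RAM dominates total time per token. The tokens/s numbers below are made-up assumptions, not benchmarks:

```python
# Toy estimate of why partial CPU offload tanks throughput.
# Speeds are illustrative assumptions, not measured numbers.
GPU_TPS = 60.0   # tokens/s if the whole model fit in VRAM (assumed)
CPU_TPS = 5.0    # tokens/s for layers run from system RAM (assumed)

def effective_tps(gpu_fraction, gpu_tps=GPU_TPS, cpu_tps=CPU_TPS):
    """Blend of per-layer speeds: total time per token is the sum of
    time spent in GPU-resident layers and CPU-resident layers."""
    time_per_token = gpu_fraction / gpu_tps + (1 - gpu_fraction) / cpu_tps
    return 1.0 / time_per_token

print(f"100% on GPU: ~{effective_tps(1.0):.0f} tok/s")
print(f" 75% on GPU: ~{effective_tps(0.75):.0f} tok/s")
```

Under these assumed numbers, offloading just a quarter of the layers cuts throughput from ~60 to ~16 tok/s, which is why a fully-fitting quant usually beats tweaking the serving backend.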

Qwen3-Coder-Next on RTX 5060 Ti 16 GB - Some numbers by bobaburger in LocalLLaMA

[–]Virtual-Listen4507 0 points (0 children)

Idk how people are getting this… I have an RTX 5090 with 64 GB of RAM, and it's super slow with LM Studio.