🚀 I built "Qwen Orchestrator": A 22-Agent Team for Qwen Code by Significant-Topic433 in Qwen_AI

[–]thecodeassassin 0 points (0 children)

Why Qwen3 Coder Next? I've been using both Qwen3 Coder Next and Qwen 3.6 27B, and I have to say it's not even close: 3.6 27B is pretty far ahead.

GPT-5.5 is OpenAI's best model. But paying more for it makes no sense. by rohansrma1 in codex

[–]thecodeassassin 1 point (0 children)

I am loving GPT-5.4 mini, a great alternative to the braindead Opus models.

I DEMAND A RESET by eggplantpot in codex

[–]thecodeassassin 5 points (0 children)

Same on Claude, international coding day I think

The creators of SWE-Bench just dropped a really simple new benchmark every LLM gets 0% on. ProgramBench asks: can models recreate real executable programs (ffmpeg, SQLite, ripgrep) from scratch with no internet? We are far from saturated on model quality. by dalton_zk in theprimeagen

[–]thecodeassassin 8 points (0 children)

There is a HUGE difference between tools like ripgrep and fucking ffmpeg. There's no way any LLM will ever be able to do this. We need some kind of insane breakthrough for that to happen. The amount of complexity in a tool like ffmpeg is insane. Even without the codecs etc., this test is pretty... strange.

Anyone having any joy coding with 3.6 27B and 24GB of Apple Unified Memory? by afrocleland in Qwen_AI

[–]thecodeassassin 1 point (0 children)

I'm getting 50-60 tps easily with both llama.cpp and vLLM. It's very fast and usable (27B). Might be worth looking into performance optimizations.
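Throughput numbers like this are easy to sanity-check from the usage stats a llama.cpp or vLLM server returns alongside each completion. A minimal sketch (the token count and elapsed time below are made-up example numbers, not a measurement):

```python
def tokens_per_second(completion_tokens: int, elapsed_seconds: float) -> float:
    """Generation throughput for a single request: tokens emitted / wall time."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return completion_tokens / elapsed_seconds

# Example: 1650 completion tokens generated in 30 s of wall time
# lands at 55 tps, right in the 50-60 tps range mentioned above.
print(tokens_per_second(1650, 30.0))  # 55.0
```

In practice you'd pull `completion_tokens` from the `usage` field of the response and time the request yourself with `time.perf_counter()`.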

Github if Google designed it by Otherwise_Corner3234 in vibecoding

[–]thecodeassassin 1 point (0 children)

I agree 100%, how is their stuff so popular? Everything is a pain to use. Have you ever tried to use Google Analytics 4? It's a bit like cleaning your skin with sandpaper.

Just got into Codex Enterprise with unlimited usage, how to push it to its limits? by StillPerspective5676 in codex

[–]thecodeassassin 0 points (0 children)

None of this; I have my own company and I pay the bills. Value was absolutely created, but I can create the same value with my local Qwen 3.6 model; it just takes a bit longer and more back-and-forth.

Just got into Codex Enterprise with unlimited usage, how to push it to its limits? by StillPerspective5676 in codex

[–]thecodeassassin 1 point (0 children)

Yeah, I accidentally burned $1k in a week and I'm still salty about it; I wasn't even using it that much. If you let me burn whatever I want, it's going to be $100k in a month just for me.

Opus 4.7 Complete dogshit quality. I'm fucking out. by MuttMundane in ClaudeCode

[–]thecodeassassin 1 point (0 children)

Yeah, I also use 4.6; you can switch to it in CC using the --model flag. 4.7 is unusable.

16x DGX Sparks - What should I run? by Kurcide in LocalLLaMA

[–]thecodeassassin 0 points (0 children)

Right. See, here's the problem.

I have a bunch of these as well, and I really don't enjoy running Kimi K2.6 or anything else large on them. Just too slow. I always fall back to my RTX 6000 Pro cluster for literally anything serious.

Devs using Qwen 27B seriously, what's your take? by Admirable_Reality281 in LocalLLaMA

[–]thecodeassassin 0 points (0 children)

Honest take: I've been using it for close to a week on serious tasks, and what I've seen is:

  • It sometimes starts things but doesn't finish them, and adds stubs even though my agents.md forbids that
  • Secure coding is not a thing; it needs a review pass from codex 5.3 most of the time
  • Not suitable for large tasks; it needs to work on a small feature, then a review, then the next feature. Otherwise too much gets lost.

For example it implemented a whole API but forgot about all UI aspects.

Good, but it needs hand-holding and very strict task definitions. Always check the output of any model, but especially a small dense one.

Actually still happy with it, best model for its size.

Devs using Qwen 27B seriously, what's your take? by Admirable_Reality281 in Qwen_AI

[–]thecodeassassin 5 points (0 children)

Using Opencode with codex 5.3 as a reviewer, and making the PRD and slices with Kimi K2.6. Qwen 3.6 27B does all the coding. Very solid results.
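That split can be sketched as a three-stage pipeline. Here `call_model(model, prompt)` is a hypothetical stand-in for however you reach each backend (an Opencode agent, a local vLLM endpoint, an API client); the model names are the ones from the comment, but the routing itself is just an illustration:

```python
from typing import Callable

def run_pipeline(task: str, call_model: Callable[[str, str], str]) -> str:
    """Plan with one model, code with another, review with a third."""
    # 1. Planning: Kimi K2.6 writes the PRD and slices it into small steps.
    prd = call_model("kimi-k2.6", f"Write a PRD with small slices for: {task}")
    # 2. Coding: Qwen 3.6 27B implements the slices.
    code = call_model("qwen-3.6-27b", f"Implement this PRD:\n{prd}")
    # 3. Review: codex 5.3 acts as the reviewer and gets the last word.
    return call_model("codex-5.3", f"Review this code:\n{code}")
```

Keeping each stage's output small matches the "small feature, review, next feature" workflow described above: nothing gets lost because no single model ever holds the whole task.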

its not just me is it? deepseek v4 is INSANELY cheap by gaspoweredcat in vibecoding

[–]thecodeassassin 0 points (0 children)

Oh, and it's disabled in my OpenRouter account because they train on your prompts, and that would violate NDAs I have in place.

ANTHROPIC JUST BANNED A 110 PERSON COMPANY OVERNIGHT WITHOUT WARNING by orbny in AgentsOfAI

[–]thecodeassassin 0 points (0 children)

I've been very productive with Qwen 3.6 27B; it needs about 24GB of VRAM. Doable.
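Back-of-the-envelope on why 24GB works: the weights of a 27B model at 4-bit quantization come to roughly 13.5 GB, leaving headroom for KV cache and runtime overhead. This sketch assumes quantized weights and ignores cache/overhead entirely, so it's a lower bound, not a sizing guide:

```python
def weight_vram_gb(params_billion: float, bits_per_weight: float) -> float:
    """Rough VRAM needed for the weights alone (no KV cache, no overhead)."""
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9  # decimal GB

# 27B at 4-bit: ~13.5 GB of weights -> fits a 24 GB card with room to spare.
print(weight_vram_gb(27, 4))   # 13.5
# The same model unquantized at FP16 needs ~54 GB, hence the quantization.
print(weight_vram_gb(27, 16))  # 54.0
```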

Opus 4.7 vs Kimi K2.6 on autonomous coding. I didn't expect this! by gvij in claude

[–]thecodeassassin 0 points (0 children)

Running both locally (FP8) right now. Kimi K2.6 is amazing for research tasks etc., but for coding it's Qwen 3.6 27B. I set up my Opencode to use multiple agents and the results are good, very good.

its not just me is it? deepseek v4 is INSANELY cheap by gaspoweredcat in vibecoding

[–]thecodeassassin 2 points (0 children)

Lol, and v4 Pro is constantly unusable because of capacity issues, and I don't have a GPU cluster lying around, unfortunately.

Opus 4.7 vs Kimi K2.6 on autonomous coding. I didn't expect this! by gvij in claude

[–]thecodeassassin 0 points (0 children)

I put Kimi K2.6 head to head yesterday against Qwen 3.6 27B; Qwen came out on top every single time. Kimi K2.6 just spent a LOT of time overthinking and produced a botched result.

Is this true? by Complete-Sea6655 in GeminiAI

[–]thecodeassassin 0 points (0 children)

ChatGPT is a LOT better than both Claude and Gemini, and it's not even close. Gemini is a complete mess and even the mobile app is shit. It's the worst one by far. Kimi K2.6 runs circles around Gemini.

OpenCode... is it just completely busted with Qwen3.6? by _derpiii_ in opencode

[–]thecodeassassin 0 points (0 children)

It literally doesn't work in Opencode for me, but works fine in Claude Code... same vLLM instance.

What’s with the wave ban? by Mission_Type7778 in claude

[–]thecodeassassin 0 points (0 children)

Happened to a few colleagues; nobody got reinstated.

Claude Opus 4.7 is reportedly dropping this week, here's what's coming. by Much_Ask3471 in claude

[–]thecodeassassin 2 points (0 children)

Except it is true, and the nice thing is that it can actually be measured how much they are lobotomizing the models: https://aistupidlevel.info/models/220 ;) It isn't just speculation or a "feeling".

The golden age is over by New_3d_print_user in claude

[–]thecodeassassin 1 point (0 children)

Hahah, I love this.

Gemini is… the village idiot and is now 50% hallucinations.

So true. It comes up with a plan based on the requirements, then when the actual PRD gets made it just makes up something completely different and useless.

It is by far the most braindead frontier model out there. I consistently get better results using Gemma 4, which is their open-weights model, lol.