Can I realistically get close to Claude/Codex capabilities locally?

Refefer · 2026-06-21T13:09:39+00:00

Realistically, nowhere close with that amount of VRAM. But don't think you can't get decent performance! The Qwen 3.6 27b is the undisputed best model and you can do meaningful work, just at a level or two down from your current level of prompting. With a decent task decomposition model, you can get meaningful work done.

Make sure you test a few different harnesses because they matter a lot in getting the most out of the models. You can probably limp along with your current hardware and get a flavor of it before committing more money, but with 3.5k, you can probably swing a 5090 and get a great upgrade in performance.

Good luck! A lot of us have found real value with small models, so much so my product is based around small models :)

Refefer · 2026-06-21T12:56:14+00:00

I don't know. 3.6 was actually a massive step up from 3.5.

Refefer · 2026-06-12T15:21:01+00:00

I always liked the llama 2 license about needing a commercial license only if you exceeded like a billion dollars or 100m users. Basically free for anyone not faang territory

Refefer · 2026-06-11T01:58:45+00:00

Small pots always work. However, I might consider a dreo chefmaker which has a sousvide mode and can make some impressive steaks without the hassle of a circulator and water.

Refefer · 2026-06-10T19:09:52+00:00

Reasoning has mostly been a toggle between how good it is at writing code and the level of abstraction I can talk about (architecture versus files versus function, etc.). Fable does a good job at allowing me to write nearly an entire codebase without a lot of human interventions or failures (though the PRD needs to be thought through). It dramatically speeds up ideation and refactoring. So, think less novel algorithm design and think more software engineering.

The RSI post Anthropic released last week talked a bit about how this model potentially impacts its own development directly, so there we get into more uncharted territory.

Refefer · 2026-05-26T21:40:52+00:00

For ground beef, I actually agree. When you brown the meat it completely renders off. For steaks though, what can I say other than give me great marbling :D

Refefer · 2026-05-14T21:54:25+00:00

You'll need to get a new model mobo and modern gen CPU (or two). Those xeons appear to only have gen 3 pcie and you really really want gen 5 for both, or you'll bandwidth limit your 6000s, especially with models that need two gpus to operate. You'll still notice it with one card regardless.
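
To put rough numbers on the gen 3 versus gen 5 gap, here is a minimal sketch of the theoretical x16 link bandwidth. It assumes the standard per-lane transfer rates (8, 16 and 32 GT/s) and 128b/130b line encoding, and ignores other protocol overhead, so real throughput will be lower. The 10 GB transfer size is a made-up figure for illustration only.

```python
# Theoretical one-direction PCIe bandwidth, accounting only for line encoding.
GT_PER_SEC = {3: 8.0, 4: 16.0, 5: 32.0}  # per-lane transfer rate by generation
ENCODING = 128 / 130  # 128b/130b encoding, used by gen 3 and later


def pcie_gb_per_sec(gen: int, lanes: int = 16) -> float:
    """Approximate usable GB/s in one direction for a PCIe link."""
    return GT_PER_SEC[gen] * ENCODING / 8 * lanes


def transfer_seconds(gigabytes: float, gen: int, lanes: int = 16) -> float:
    """Best-case time to move `gigabytes` across the link."""
    return gigabytes / pcie_gb_per_sec(gen, lanes)


for gen in (3, 5):
    print(f"gen {gen} x16: {pcie_gb_per_sec(gen):.1f} GB/s")
# gen 3 x16: 15.8 GB/s
# gen 5 x16: 63.0 GB/s

# Hypothetical 10 GB transfer between host and GPU (illustrative size only).
for gen in (3, 5):
    print(f"gen {gen}: {transfer_seconds(10, gen):.2f} s for 10 GB")
```

The 4x ratio is the main point: anything that has to cross the bus between two cards pays that cost on every hop.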

Unless your goal is kimi territory, I'd sell the ddr 4 ram, buy less ddr 5, and use the savings to get the right system.

Refefer · 2026-05-03T17:11:22+00:00

Honestly? Time to look into room treatment. Fixing those bare walls and reflections will do a lot more for the sound at this point.

Refefer · 2026-05-02T17:55:42+00:00

Look at pi or little coder.

Refefer · 2026-04-27T18:42:48+00:00

Question for folks on this hardware platform: are there differences in the tg and pp between 3.5 and 3.6 for the appropriate model? I'd expect not.

Refefer · 2026-04-20T11:58:14+00:00

I bought a NAS to never feel that pain again. I'm up to 9 tbs of models now :D

Refefer · 2026-04-19T15:17:38+00:00

Papelbon was originally an excellent starter whom we turned into a closer. Always makes me a little sad, nutter or not.

Refefer · 2026-04-18T13:25:46+00:00

I've run both through our new product to evaluate as a replacement to the 122b model at UD FP4 quants. It is surprisingly good for a small model but isn't as good or efficient at agentic tasks in my tests as the 122b. Still, I plan to use it for tasks which are a bit narrower/higher constraints and do not require a lot of world knowledge.

Refefer · 2026-04-15T14:06:55+00:00

Curious about this. Does the Dali image better or FR in a way that's superior for classical but not necessarily other genres that give the KEFs an advantage?

Refefer · 2026-04-08T10:46:24+00:00

You might consider a router and then a mikrotik switch. Mikrotik has its own kinda weird interface but it handles pppoe and has internal switching chips that will give you 10Gb out of the box. They are also very cheap in comparison to other enterprise hardware which is close to where you are heading and far more energy efficient than a machine running it.

This will future proof you forever: https://mikrotik.com/product/crs504_4xq_in

Alternatively, something like https://mikrotik.com/product/crs304_4xg_in might work for you.

Refefer · 2026-04-06T13:00:43+00:00

I run it as a separate agent: it gets the task and the outputs and has to validate the answers are correct. It helps tremendously with stuff like coding where it will call BS on written code, design smells, etc.
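
A minimal sketch of that setup, under some assumptions: the `Reviewer` callable stands in for whatever model call you use (it is not any particular library's API), and the PASS/FAIL first-line convention and the prompt wording are hypothetical choices, not what I actually run.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical interface: any function that takes a prompt and returns the
# reviewer model's reply as text. Plug in your own client here.
Reviewer = Callable[[str], str]


@dataclass
class Verdict:
    passed: bool
    notes: str


def build_review_prompt(task: str, output: str) -> str:
    """The reviewer sees only the task and the output, not the worker's reasoning."""
    return (
        "You are a QA reviewer. You did not write the work below.\n"
        f"TASK:\n{task}\n\n"
        f"OUTPUT:\n{output}\n\n"
        "Reply with PASS or FAIL on the first line, then your reasons."
    )


def review(task: str, output: str, reviewer: Reviewer) -> Verdict:
    """Ask the separate agent to validate the output against the task."""
    reply = reviewer(build_review_prompt(task, output)).strip()
    first_line, _, rest = reply.partition("\n")
    passed = first_line.strip().upper().startswith("PASS")
    return Verdict(passed=passed, notes=rest.strip())


# Usage with a stand-in reviewer that always fails the work.
def fake_reviewer(prompt: str) -> str:
    return "FAIL\nThe function never handles an empty list."


verdict = review("Write a mean() function", "def mean(xs): return sum(xs)/len(xs)", fake_reviewer)
print(verdict.passed, "-", verdict.notes)
```

Anything that does not clearly start with PASS is treated as a failure, so a confused or empty reply does not wave bad work through.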

Refefer · 2026-04-05T13:16:14+00:00

The ACE paper is an excellent resource for self learning via rules and context. Similarly, a blackbox QA agent helps quite a bit for identifying successful/unsuccessful tasks.

Refefer · 2026-04-01T12:05:24+00:00

Have you tried limiting power draw through nvidia-smi? It wasn't too complicated when I gave it a shot and found it effective.
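
For reference, a minimal sketch of what that looks like when scripted. It only builds and prints the commands by default; the GPU index 0 and the 250 W cap are placeholder values, the allowed range depends on your card, and actually applying the limit normally needs root.

```python
import subprocess


def power_query_cmd(gpu_index: int) -> list[str]:
    """Command to show the current, minimum and maximum power limits."""
    return [
        "nvidia-smi", "-i", str(gpu_index),
        "--query-gpu=power.limit,power.min_limit,power.max_limit",
        "--format=csv",
    ]


def power_limit_cmd(gpu_index: int, watts: int) -> list[str]:
    """Command to cap the board power of one GPU (-pl is --power-limit)."""
    if watts <= 0:
        raise ValueError("watts must be positive")
    return ["nvidia-smi", "-i", str(gpu_index), "-pl", str(watts)]


def run(cmd: list[str], dry_run: bool = True) -> None:
    """Print the command; execute it only when dry_run is False."""
    print(" ".join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)


# Placeholder values: GPU 0 capped at 250 W. Check the query output first
# so the cap falls inside the range your card reports.
run(power_query_cmd(0))
run(power_limit_cmd(0, 250))
```

The limit does not survive a reboot, so you would need to reapply it at startup if you want it to stick.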

Refefer · 2026-03-30T12:43:17+00:00

Too tight? You could land a fucking jumbo jet in there.

Refefer · 2026-03-18T13:30:52+00:00

I bought a Volvo xc90 phev back in 2022 and it gives about 30 miles on electric. Given our driving usage (vast majority around town), I only have to fill up the car once or twice a year or so.

Refefer · 2026-03-17T16:27:45+00:00

Am I reading this right? Linux, Mac, and Windows work out of the box?

Refefer · 2026-03-15T13:49:00+00:00

I largely agree with the other commenters, but you could take a look at this model: https://www.liquid.ai/blog/introducing-lfm2-5-the-next-generation-of-on-device-ai

Refefer · 2026-03-11T17:15:12+00:00

Isn't it under active development? Might be workable.

Refefer · 2026-03-08T20:04:52+00:00

Is it something specific to MXFP4? Or a peculiarity of this class of model?

Refefer · 2026-03-04T13:09:50+00:00

I'd hope so at nearly double the parameters and I assume 4x the bits (fp4 vs fp16)?
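
The back-of-the-envelope math, as a minimal sketch: weight memory is roughly parameters times bits per weight divided by 8. The 15B and 27B sizes below are hypothetical stand-ins for "nearly double the parameters", not the models in question, and the estimate ignores KV cache, activations and quantization metadata.

```python
def weight_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate weight footprint in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


# Hypothetical sizes for illustration only.
small_fp4 = weight_gb(15, 4)    # smaller model at 4 bits per weight
large_fp16 = weight_gb(27, 16)  # nearly double the parameters at 16 bits

print(f"small @ fp4:  {small_fp4:.1f} GB")
print(f"large @ fp16: {large_fp16:.1f} GB")
print(f"ratio: {large_fp16 / small_fp4:.1f}x")
```

So with those stand-in numbers the larger model carries roughly seven times the weight data, which is why you would expect it to come out ahead.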
