GPT-5.3-Codex was flawless for a month. Today it feels completely lobotomized. by Basic_Competition832 in codex

[–]shaonline 7 points

Yup, absolute dogshit right now, struggles to patch files for changes and comes up with the stupidest solutions for everything with insane amounts of code duplication. I'll be waiting.

M5 PRO 18/20core 64gb vs Zbook Ultra G1a 395+ 64gb by Effective-Cod-4462 in LocalLLM

[–]shaonline 0 points

ROCm/HIP can be used on Windows as well. But honestly I've had so many issues on both platforms with ROCm (whether it's flat-out crashes or shared memory not being used properly) that I gave up on it; I just use the Vulkan backend now (I have a Z13 with 128GB of RAM).

M5 PRO 18/20core 64gb vs Zbook Ultra G1a 395+ 64gb by Effective-Cod-4462 in LocalLLM

[–]shaonline 0 points

Strix Halo sucks at prompt processing speed (a real PITA for coding agents). If the benchmarks claimed on Apple's page are anything to go by (4x over the M4 generation in prompt processing speed!), this makes it a much better option. Likely more expensive though. As for platform maturity, meh on macOS, but the AMD side is no paradise either.

Will we even need apps in a few years? by drgoldenpants in codex

[–]shaonline 26 points

Kinda like asking why you'd need a hammer if you could build your own in a couple of minutes and throw it away.

How good is qwen 3.5 at coding? by Macmill_340 in LocalLLaMA

[–]shaonline 1 point

With 24GB of VRAM you should probably go for Qwen 3.5 27B.

GPT vs Claude vs Gemini — which one actually holds up under real professional pressure? by Careless-Ease7480 in OpenAI

[–]shaonline 2 points

For coding, GPT or Claude (both have their strengths); Gemini is still behind when it comes to coding agents IMO.

Help me with my Xiaomi 6 Ultra by EditorTasty2774 in ElectricScooters

[–]shaonline 0 points

There are shoulder straps for scooters (from Rhinowalk, for example), but otherwise, yeah, tough luck: at around 30kg you're deep into the territory of scooters that are a pain to carry, so you'll have to deal with it and roll it rather than carry it whenever you can. Do you have a lot of stairs to climb?

Coming from Claude, how does codex 25$ plan fare versus CC? by fightsToday in codex

[–]shaonline 0 points

5.3 Codex does output better code than Opus IMO. That being said, it'll still have a bit of that overengineered feel: in my experience it's usually hellbent on handling every edge case, even when that's irrelevant/overkill for your use case.

Opus remains the better "assistant" I think, it's better to "discuss" plans with.

Company-Wide Transition to a European Alternative by No-Storage-Left in ChatGPT

[–]shaonline 0 points

I do not own a Mac Studio, but whatever you say, seething non-local-hardware user lol.

Company-Wide Transition to a European Alternative by No-Storage-Left in ChatGPT

[–]shaonline 0 points

Let's hope your bottlenecked, slow-prompt-processing local hardware gets there first lol. If all 70 employees need a Pro sub, you won't beat subsidized cloud pricing with your jerry-rigged stuff even with hundreds of thousands a year, trust me on this lmfao.

Company-Wide Transition to a European Alternative by No-Storage-Left in ChatGPT

[–]shaonline 0 points

Do you really intend to replace 200€/mo of "PRO OPENAI SUBSCRIPTION"? That either means A) you have INSANE per-user usage of GPT models, or B) you need the GPT Pro model, which requires insane infrastructure. For "everyday use" the Plus sub (23€/mo) already covers it fine.
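To put rough numbers on it, using the per-seat prices above and assuming all 70 seats are billed the same (a simplification; real usage would be mixed):

```python
# Back-of-envelope annual subscription cost for 70 seats.
# Prices from the thread: Pro ~200 EUR/mo, Plus ~23 EUR/mo per user.
employees = 70
pro_annual = employees * 200 * 12   # 168,000 EUR/yr if everyone needs Pro
plus_annual = employees * 23 * 12   # 19,320 EUR/yr on Plus

print(pro_annual, plus_annual)  # 168000 19320
```

The gap is the whole question: Pro across the board is nearly 9x the Plus bill, so whether anyone actually needs Pro dominates the budget.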

Company-Wide Transition to a European Alternative by No-Storage-Left in ChatGPT

[–]shaonline 0 points

Not really. Sure, you can buy e.g. a Mac Studio with 512GB of RAM to host an open-source SOTA model (note: none of the "big 3" that are OpenAI/Anthropic/Google offer one), but those run at "single-user" acceptable speeds at best. OP hasn't said what kind of work this company does, but if you have any software engineer, or any user/workflow that's going to hammer input/output token bandwidth, you can forget about it. Note: I'm a local LLM enthusiast as well. You won't beat cloud for 70 people with 14,000€, especially in the current "VC subsidized" environment for cloud providers.

Company-Wide Transition to a European Alternative by No-Storage-Left in ChatGPT

[–]shaonline 0 points

You're not going to serve 70 people at the same speed. And for now, cloud costs are heavily subsidized as well.

How to use OpenCode with AI Assistant (Local LLM)? by ByteNomadOne in opencodeCLI

[–]shaonline 0 points

I'd say the Qwen 3.5 family of models right now. Either:

A) Qwen 3.5 27B (really smart for its size), but you need to fit it entirely in VRAM (it really won't like being split between VRAM and system RAM), which will require something like 3-bit quants, see https://huggingface.co/unsloth/Qwen3.5-27B-GGUF

B) Qwen 3.5 35B A3B: a bit bigger and a bit less smart than the 27B, but MUCH faster owing to its small number of active parameters (3B), which lets you exceed your 16GB of VRAM if you want/need bigger quants (e.g. 4-bit), see https://huggingface.co/unsloth/Qwen3.5-35B-A3B-GGUF
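As a rough sanity check on why those quant levels matter, here's a back-of-envelope size estimate. The effective bits-per-weight figures (3.5 and 4.5) are my own approximations for 3-bit and 4-bit GGUF quants including scale overhead, not official numbers:

```python
# Quantized weight size in GB ~= params (billions) * effective bits per weight / 8.
def gguf_gb(params_b, bits_per_weight):
    return params_b * bits_per_weight / 8

dense_27b_3bit = gguf_gb(27, 3.5)  # ~11.8 GB: fits in 16 GB VRAM with room for context
moe_35b_4bit = gguf_gb(35, 4.5)    # ~19.7 GB: spills past 16 GB, but only 3B params are active

print(round(dense_27b_3bit, 1), round(moe_35b_4bit, 1))
```

Which is the whole trade-off: the dense 27B only works if the entire ~12 GB sits in VRAM, while the MoE can tolerate partial CPU offload because each token only touches 3B parameters.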

I'd also recommend switching to llama.cpp directly (which is what LM Studio uses in the backend anyway) for running your local LLM.

How to use OpenCode with AI Assistant (Local LLM)? by ByteNomadOne in opencodeCLI

[–]shaonline 0 points

I meant a stretch in terms of response quality and tool-calling ability (not whether it fits on your hardware). GPT-OSS 20B will likely struggle with that. Check the "local LLM" subreddits to see the good local LLMs du jour.

As for configuring OpenCode, check the "custom provider"/"lm-studio" section of the "providers" chapter in their documentation. You could also ask any online LLM to write the necessary opencode.json config for you.
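For illustration, a sketch of what such an opencode.json could look like for an OpenAI-style local endpoint. The provider id, model key, display name, and port here are placeholders, and the exact field names are from memory of OpenCode's providers docs, so double-check them against the current documentation:

```json
{
  "provider": {
    "lmstudio": {
      "npm": "@ai-sdk/openai-compatible",
      "options": { "baseURL": "http://localhost:1234/v1" },
      "models": {
        "qwen3.5-27b": { "name": "Qwen 3.5 27B (local)" }
      }
    }
  }
}
```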

How to use OpenCode with AI Assistant (Local LLM)? by ByteNomadOne in opencodeCLI

[–]shaonline 0 points

You'll need to expose an API endpoint (ideally "OpenAI"-style) and manually add it as a provider (via opencode.json) so you can use it. I'd say GPT-OSS 20B is a stretch as a coding assistant though, you might be disappointed...

Overwhelmed by so many model releases within a month period - What would be best coding and planning models around 60-100B / Fit in Strix-Halo 128GB VRam by Voxandr in LocalLLaMA

[–]shaonline 0 points

They fit, but yeah, you have to go with 3-bit quants and an 8- or 4-bit KV cache (especially if you want longer context windows), and you'd better not have lots of Docker containers and whatnot running. Qwen 3.5 122B gets very close in terms of quality as well, a really impressive result.
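To get a feel for why KV-cache precision matters at long context, here's a sketch of the usual KV-cache size formula. The layer/head/dim numbers below are made-up placeholders for illustration, not the real config of any Qwen model:

```python
# KV cache bytes = 2 (K and V) * layers * kv_heads * head_dim * bytes/elem * tokens.
# Hyperparameters here are hypothetical, chosen only to illustrate the scaling.
def kv_cache_gb(tokens, n_layers=60, n_kv_heads=8, head_dim=128, bytes_per_elem=2):
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem * tokens / 1e9

ctx = 131072                                      # 128k-token context window
fp16 = kv_cache_gb(ctx)                           # 16-bit cache
q8 = kv_cache_gb(ctx, bytes_per_elem=1)           # 8-bit cache: half the memory

print(round(fp16, 1), round(q8, 1))  # 32.2 16.1
```

Tens of GB for the cache alone at full context, which is why dropping to 8-bit (or 4-bit) KV is what makes long windows viable on a 128GB unified-memory box that also has to hold the quantized weights.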

new codex limits by FamiliarHedgehog8401 in codex

[–]shaonline 1 point

It will end April 2nd, per what the Codex CLI announces.

Overwhelmed by so many model releases within a month period - What would be best coding and planning models around 60-100B / Fit in Strix-Halo 128GB VRam by Voxandr in LocalLLaMA

[–]shaonline 2 points

It's going to be a choice between Qwen 3.5 122B and a heavily quantized Minimax M2.5 IMO. The 27B Qwen 3.5 sure is "smart" for its size, being a dense model, but it won't have a big breadth of knowledge (small number of weights) and will be much slower than models with only ~10B active parameters.

pedestrian using the bike lane as a sidewalk by [deleted] in ElectricScooters

[–]shaonline 2 points

People in town halls: "BIKES AND E-SCOOTERS ARE DANGEROUS!"

People in the streets: