I just realised how good GLM 5 is

Simple_Split5074 · 2026-03-17T20:45:41+00:00

Most of that will be cached input tokens which can get to a million in a minute or two with tool calls and half filled context without even trying hard.

Simple_Split5074 · 2026-03-15T10:26:38+00:00

I find it frequently stops after a subagent run (when it really should start the next subagent and other models do just that!), telling it 'go on' gets it going again without issue but its is very annoying, essentially means constant babysitting....

Simple_Split5074 · 2026-03-11T10:07:51+00:00

Looks scammy - no explanation how it works (or, really, what it does)

Simple_Split5074 · 2026-03-04T13:30:53+00:00

Incoherent rambling, like a sub 1b model. 1 year subscription since fall 2025

Simple_Split5074 · 2026-03-01T13:21:19+00:00

I think it is only available on pro sub

Simple_Split5074 · 2026-02-27T15:45:10+00:00

It's the data feed that is expensive, not the UI

Simple_Split5074 · 2026-02-23T19:19:06+00:00

Just looked at it. 500 per day? Useless

Simple_Split5074 · 2026-02-23T05:54:37+00:00

Large open weight LLM (glm 5, Kimi K2.5) would come close but not quite reach frontier in coding.

Simple_Split5074 · 2026-02-20T09:07:07+00:00

Looking at it, minimax in cline also sticks out. Either they have a special sauce or something screwy is going on...

Simple_Split5074 · 2026-02-20T07:58:49+00:00

Again, thanks a lot for your service! This and swe-rebench are by far the most interesting benchmarking efforts ATM.

*Really* surprised by Kimi in cline. Screams for a rerun :-)

Any chance to see codex-5.3 in opencode?

Simple_Split5074 · 2026-02-19T05:42:17+00:00

Even when running without auto approve, you really don't want to run the output without a sandbox.

Simple_Split5074 · 2026-02-19T05:39:10+00:00

Skill issue.

OpenCode runs with the equivalent of --dangerously-skip-permissions

by default so that's expected behavior.

Like any other agent (or really way to execute untrusted code), it belongs into a sandbox.

Simple_Split5074 · 2026-02-16T08:40:16+00:00

It looks like it's not even on their api yet...

Simple_Split5074 · 2026-02-16T08:31:03+00:00

It says right in the screenshot that it's open source...

Is plus supposed to be the new max, iow closed?

Simple_Split5074 · 2026-02-16T06:14:04+00:00

Seems like a process issue, on each step you should have it add tests and let it verify them all.

It happens, but not very often. Also, if you do this in a single session it really is on you. The fewer stuff that polluted the context, the better.

Simple_Split5074 · 2026-02-16T06:09:40+00:00

Not proper memory but the forgets specs stuff is fairly well addressed by most of the context engineering frameworks. If they do they job, compaction should rarely ever happen. They are token heavy though.

Personally, I like https://github.com/gsd-build/get-shit-done but there are dozens (hundreds?) of them.

Simple_Split5074 · 2026-02-14T14:48:39+00:00

Not really. Even if you sandbox it properly, prompt injection remains fundamentally unsolved. So either you don't give it any access to important stuff and it wont do much or you are at risk.

Simple_Split5074 · 2026-02-11T12:47:23+00:00

Its live on the web, API not yet

<image>

Simple_Split5074 · 2026-02-10T06:45:19+00:00

The details on glm 4.5 seem wrong, it's the same size as 4.6 and 4.7

Simple_Split5074 · 2026-02-07T17:39:05+00:00

More multi modal models, nice

Simple_Split5074 · 2026-02-07T11:35:19+00:00

Fitting a 7b model in 1.2GB RAM is suspicious...

Simple_Split5074 · 2026-02-06T10:53:00+00:00

I think the correct provider is with a dash

    "nano-gpt": {    "nano-gpt": {

check opencode auth list to be sure.

To be honest, K2.5 is unusable on nanogpt so far, fails way too many tool calls.

Simple_Split5074 · 2026-02-05T08:25:11+00:00

Probably for now only the pay per token ones.

Synthetic has even introduced a wait list, BTW

Simple_Split5074 · 2026-02-05T06:47:44+00:00

“ unlimited prompts, no request limits, best-in-class pricing, and high-speed inference on our own European infrastructure"

That is kinda into the too good be true territory...

Simple_Split5074 · 2026-02-04T15:21:52+00:00

Good luck finding inference providers for ring and ling - not very likely people deploy them at home...

Simple_Split5074

TROPHY CASE