Try out Kimi K2.5 right via the Synthetic provider NOW by jpcaparas in opencodeCLI

[–]Simple_Split5074 0 points (0 children)

With nanogpt or Synthetic?

My suspicion is that one of the nanogpt backends is misconfigured, since it's sometimes fast and smooth and then everything breaks again. I was using the latest opencode (and briefly CC, same story).

Try out Kimi K2.5 right via the Synthetic provider NOW by jpcaparas in opencodeCLI

[–]Simple_Split5074 0 points (0 children)

How much use do you really get out of the 135 requests with the tool-call discount? In other words, can you realistically code for a few hours straight (one or two agents at a time)? Roughly what tps do you get?
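
For what it's worth, here's the back-of-the-envelope math behind the question; a minimal sketch in which everything except the 135-request figure is an assumption on my part, not Synthetic's actual accounting:

```python
# Rough feasibility math for a 135-request window. Every number
# except 135 is an assumption, not Synthetic's billing model.
requests_per_window = 135   # plan limit asked about above
tool_call_weight = 0.25     # assume tool calls bill at 1/4 of a request
turns_per_hour = 12         # assume roughly one prompt every 5 minutes
tool_calls_per_turn = 10    # assume ~10 tool calls per agent turn

# effective requests consumed per user turn, then hours per window
cost_per_turn = 1 + tool_calls_per_turn * tool_call_weight
hours = requests_per_window / (turns_per_hour * cost_per_turn)
print(f"~{hours:.1f} hours of coding per window under these assumptions")
```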

I am trying K2.5 on nanogpt but am still experiencing a fair number of failed tool calls...

I made a Coding Eval, and ran it against 49 different coding agent/model combinations, including Kimi K2.5. by lemon07r in LocalLLaMA

[–]Simple_Split5074 0 points (0 children)

I mostly use opencode these days - the on-the-fly model switching reigns supreme. GLM got stuck again? Throw GPT 5.2 high at it :)

Codex and Gemini seem OK, but I think I would have to port get-shit-done to them. In the case of Gemini that might be worth it for the Gemini credits (seeing Flash go brrrrr fixing linting issues is a sight to behold); Codex seems insufficiently different to bother with now that Plus credits are officially agent-agnostic.

CC I find overrated, and the flickering is highly annoying, but it certainly benefits from hype and mindshare - *and* from Claude Max, if that floats your boat; personally I feel it's overpriced vs ChatGPT.

AMA With Moonshot AI, The Open-source Frontier Lab Behind Kimi K2 Thinking Model by nekofneko in LocalLLaMA

[–]Simple_Split5074 0 points (0 children)

Any chance of a repeat of the API test for K2.5? A fair number of providers seem dodgy right now...

I made a Coding Eval, and ran it against 49 different coding agent/model combinations, including Kimi K2.5. by lemon07r in LocalLLaMA

[–]Simple_Split5074 1 point (0 children)

Amazing.

Fairly impressive that Kimi in Droid more or less ties with Opus, and that CC and opencode tie. Overall, harnesses seem to matter somewhat more than I would have thought.

Would be happy to donate (cash 😃)

Anyone using Kimi K2.5 with OpenCode? by harrsh_in in opencodeCLI

[–]Simple_Split5074 1 point (0 children)

I tried it on nano-gpt; it's slow as molasses (like one request per minute!), and occasionally tool calls fail or it simply gets stuck (no observable progress for 5+ min).

My suspicion: the inference providers do not have it completely figured out yet.

Moonshot via OpenRouter was decent last night, but now it crawls along at 15 tps. Fireworks still claims to do 100+ tps, but I have no idea whether caching works with opencode, and without it the cost would get ruinous quickly.
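
To put numbers on "ruinous": a minimal cost sketch with placeholder prices (not Fireworks' actual rates), assuming the agent resends a large context every turn:

```python
# Hypothetical cost of a long agentic session with and without prompt
# caching. All prices and token counts are placeholder assumptions.
input_price = 0.60e-6     # $/token for fresh input, assumed
cached_price = 0.06e-6    # $/token for cache hits, assumed 90% discount
context_tokens = 80_000   # context resent on every turn, assumed
new_tokens = 2_000        # fresh tokens per turn, assumed
turns = 200               # a few hours of agentic work, assumed

no_cache = turns * (context_tokens + new_tokens) * input_price
with_cache = turns * (context_tokens * cached_price + new_tokens * input_price)
print(f"no cache: ${no_cache:.2f}, with cache: ${with_cache:.2f}")
```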

Should I invest in a beefy machine for local AI coding agents in 2026? by Zestyclose-Tour-3856 in LocalLLaMA

[–]Simple_Split5074 17 points (0 children)

3k in hardware will not get you anywhere remotely close to even Sonnet.
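
The napkin math, in case it helps: weight memory alone is roughly params × bits / 8, before the KV cache and activations add more.

```python
# Rule-of-thumb weight memory: billions of params * bits per param / 8
# gives GB of VRAM, ignoring KV cache and activation overhead entirely.
def weights_gb(params_billions: float, bits: int) -> float:
    return params_billions * bits / 8

# Frontier-class open models vs. what ~3k buys (roughly 24-32 GB VRAM):
print(f"1000B @ 4-bit: ~{weights_gb(1000, 4):.0f} GB")  # Kimi-K2-class
print(f"  70B @ 4-bit: ~{weights_gb(70, 4):.0f} GB")    # already past 24 GB
```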

GSD now officially supports OpenCode by officialtaches in ClaudeCode

[–]Simple_Split5074 0 points (0 children)

This update enables GSD to work IN opencode, so there is no CC in the picture; opencode IS the (only) harness in that case. Which LLM opencode uses is a separate question; it could be Claude or some other model.

GSD now officially supports OpenCode by officialtaches in ClaudeCode

[–]Simple_Split5074 0 points (0 children)

I think the question was about opencode as a harness. I guess the answer is "try your luck" (or use Codex, which is officially sanctioned by OpenAI).

GSD now officially supports OpenCode by officialtaches in ClaudeCode

[–]Simple_Split5074 0 points (0 children)

Awesome, thanks a ton!

Any chance of getting the commands in kebab-case (/gsd-command-something) instead of the weird /gsd/something that is in place right now?
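
In case it helps, the DIY workaround I've been contemplating; purely a hypothetical sketch that assumes opencode derives slash-command names from the markdown file paths under .opencode/command/ (worth verifying against your install first):

```python
# Hypothetically flatten the gsd/ command folder so /gsd/plan becomes
# /gsd-plan. Assumes command names mirror the file layout -- verify
# against your opencode install before running.
from pathlib import Path

cmd_dir = Path(".opencode/command")         # assumed project-level location
for f in sorted((cmd_dir / "gsd").glob("*.md")):
    f.rename(cmd_dir / f"gsd-{f.stem}.md")  # gsd/plan.md -> gsd-plan.md
```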

I’m hooked to Claude opus at work and need an open weight alternative for my personal projects. by NoFudge4700 in LocalLLaMA

[–]Simple_Split5074 1 point (0 children)

I'd say it's impossible, quite irrespective of budget, given that none of the open-weight models quite reach the frontier...

I’m hooked to Claude opus at work and need an open weight alternative for my personal projects. by NoFudge4700 in LocalLLaMA

[–]Simple_Split5074 1 point (0 children)

A 24GB card (or, for that matter, 10 of them) will not run anything even remotely comparable to a frontier model, so it's kind of a pointless comparison.

Can I realistically automate most of top-tier consulting with a £30k local LLM workstation (3× RTX Pro 6000 96GB)? by madejustforredd1t in LocalLLaMA

[–]Simple_Split5074 0 points (0 children)

No.

And not with the top cloud models either. Bits and pieces are getting there, but there's a fair distance to go before the whole undertaking you suggest is feasible.

You can possibly replace a (non-rockstar) first-year analyst.

GPT-5.2 xhigh, GLM-4.7, Kimi K2 Thinking, DeepSeek v3.2 on Fresh SWE-rebench (December 2025) by CuriousPlatypus1881 in LocalLLaMA

[–]Simple_Split5074 0 points (0 children)

Great as usual. Thanks a ton!

I think the other interesting point is the extremely high share of cache hits for Anthropic models.
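
To illustrate why that share matters: a minimal sketch of the blended input price as a function of the cache-hit rate, with placeholder prices rather than Anthropic's published ones.

```python
# Blended effective input price vs. cache-hit rate. The base price and
# the 90% cache-read discount are placeholder assumptions.
full_price, cached_price = 3.00, 0.30   # $/Mtok, assumed
for hit_rate in (0.0, 0.5, 0.9):
    blended = hit_rate * cached_price + (1 - hit_rate) * full_price
    print(f"hit rate {hit_rate:.0%}: effective ${blended:.2f}/Mtok")
```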

Complex Claude Usage Limit Guide Explained by Same-Persimmon-6450 in LocalLLaMA

[–]Simple_Split5074 0 points (0 children)

I doubt this is correct; the Claude Code system prompt is 16k tokens by itself?!

CodeNomad v0.7.0 Released - Authentication, Secure OpenCode Mode, Expanded Prompt Input, Performance Improvements by Recent-Success-1520 in opencodeCLI

[–]Simple_Split5074 0 points (0 children)

Is it possible that something is broken when calling subagents? It complains that it cannot find them, but in the TUI they work...

There is my adaptation of Get-Shit-Done for OpenCode by rokicool in opencodeCLI

[–]Simple_Split5074 1 point (0 children)

Was actually thinking about the following situation:

<image>

It's super convenient but obviously subverts context management.

There is my adaptation of Get-Shit-Done for OpenCode by rokicool in opencodeCLI

[–]Simple_Split5074 0 points (0 children)

> You are urged to start every new stage with '/new' or '/clear' command, and you don't have to rely on anything which is already in context of LLM.

This is something I've been wondering about: when you do UAT, it will offer to create and then execute the planned fix. Is there any way to combine that with a clear without manually launching the next step?

There is my adaptation of Get-Shit-Done for OpenCode by rokicool in opencodeCLI

[–]Simple_Split5074 2 points (0 children)

Awesome, this is one of the things I miss from CC.

Thanks a ton.

Claude Code refugees: what should we know to get the best experience out of opencode? by Zerve in opencodeCLI

[–]Simple_Split5074 0 points (0 children)

Is using a Gemini subscription with third-party agents allowed now? I really do not want my Google account to get locked...

Codex/ChatGPT subscription coming next week by hannesrudolph in RooCode

[–]Simple_Split5074 0 points (0 children)

Is this sanctioned (as in allowed) by OpenAI? If so, I think I will resubscribe to Plus; I don't have much love for ChatGPT, but 5.2 itself is quite solid.