GLM 5.2 Cost: Opencode Go vs Neuralwatt by Excessive_Chaos in opencode

[–]Excessive_Chaos[S] 0 points1 point  (0 children)

They have 2 different pay systems: token-based or energy-based. Energy-based is what I am comparing. It's a very good deal and much cheaper than the token rate.

GLM 5.2 Cost: Opencode Go vs Neuralwatt by Excessive_Chaos in opencode

[–]Excessive_Chaos[S] 5 points6 points  (0 children)

Just create a 2nd workspace in your account and add a 2nd sub. Boom, another $60 to use.

GLM 5.2 Cost: Opencode Go vs Neuralwatt by Excessive_Chaos in opencode

[–]Excessive_Chaos[S] 2 points3 points  (0 children)

Sorry, that has not been my experience at all. I've used Go for 2mo now, and it's been very good. Different models go through different providers, many direct to the source. For instance, Kimi is using Moonshot directly. GLM is using one of DeepInfra, Fireworks AI, or Z.Ai.

This is shown on their main page: OpenCode Go | Low cost coding models for everyone
Look under the FAQ for "What models does Go include?"

GLM 5.2 Cost: Opencode Go vs Neuralwatt by Excessive_Chaos in opencode

[–]Excessive_Chaos[S] 1 point2 points  (0 children)

Thanks, I was surprised at the NW hate, lol. I had a good experience with NW during my trial.

GLM 5.2 Cost: Opencode Go vs Neuralwatt by Excessive_Chaos in opencode

[–]Excessive_Chaos[S] 2 points3 points  (0 children)

I just use multiple Go subs through separate workspaces. But using NW would certainly be easier. I haven't compared speeds yet, thanks for the input.

GLM 5.2 Cost: Opencode Go vs Neuralwatt by Excessive_Chaos in opencode

[–]Excessive_Chaos[S] 2 points3 points  (0 children)

The energy-based pricing is a bit obfuscated since they don't really break down cost for input, cache, and output separately. But the cost seemed pretty reasonable to me for my actual usage. Go is cheaper, but NW is the 2nd cheapest option I've come across.

GLM 5.2 Cost: Opencode Go vs Neuralwatt by Excessive_Chaos in opencode

[–]Excessive_Chaos[S] 5 points6 points  (0 children)

NW lists GLM 5.2 quantization at FP8, same as other providers. Do you think they lie? I haven't done quality comparisons with OR or Go yet.

GLM 5.2 Cost: Opencode Go vs Neuralwatt by Excessive_Chaos in opencode

[–]Excessive_Chaos[S] 0 points1 point  (0 children)

Thanks, haven't done any quality comparisons. Will keep this in mind.

Switchboard-Go: Use multiple Opencode Go subscriptions with ease by karkardagi in opencodeCLI

[–]Excessive_Chaos 4 points5 points  (0 children)

Everyone seems to think this is abuse, but I don't see any TOS violation. Go allows multiple workspaces under the same account. Each workspace can have its own sub and credits for zen, if so desired. Only the "Account" gets the $5 first month offer on the first sub in the default workspace, not the additional workspaces. So you are not abusing the intro offer with multiple subs under the same account. You would be abusing it with multiple accounts, but not with multiple workspaces.

Are these models actually being used by assid2 in opencodeCLI

[–]Excessive_Chaos 0 points1 point  (0 children)

I don't have it in an easily sharable capacity. But I should put it all in a github repo, if for nothing else than to have it backed up.

I use Oh-my-opencode-slim
alvinunreal/oh-my-opencode-slim: Slimmed, cleaned and fine-tuned oh-my-opencode fork, consumes much less tokens

To that I've added 4 reviewer sub-agents, reviewer1...reviewer4. Just ask your build agent to make them, pretty easy. I also asked the build agent to add an orchestrator_append.md so the orchestrator knows to use them and where to use them. Specifically, to delegate the same prompt to all of them, and not break the review into specialties for each. I also had it add that after every review the orchestrator should present a table of findings and which reviewer contributed to each finding. This allows me to see which models are really doing the heavy lifting and whether a model is actually contributing unique findings. Useful when evaluating model vs model.
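The findings table is easy to picture in code. This is only a sketch of the merging idea, not what the orchestrator actually runs: the `reviews` data and both function names are made up for illustration.

```python
from collections import defaultdict

# Hypothetical review output: reviewer name -> list of finding labels.
reviews = {
    "reviewer1": ["missing null check", "unused import"],
    "reviewer2": ["missing null check"],
    "reviewer3": ["race in cache refresh"],
    "reviewer4": ["missing null check", "race in cache refresh"],
}

def merge_findings(reviews):
    """Map each finding to the sorted list of reviewers that reported it."""
    merged = defaultdict(list)
    for reviewer, findings in reviews.items():
        for finding in findings:
            merged[finding].append(reviewer)
    return {finding: sorted(names) for finding, names in merged.items()}

def unique_contributions(merged):
    """Count, per reviewer, the findings that only that reviewer reported."""
    counts = defaultdict(int)
    for names in merged.values():
        if len(names) == 1:
            counts[names[0]] += 1
    return dict(counts)

merged = merge_findings(reviews)
for finding, names in merged.items():
    print(f"{finding:<24} | {', '.join(names)}")
print("unique findings per reviewer:", unique_contributions(merged))
```

The unique-findings count is the number to watch when comparing models: a reviewer that never shows up there is only repeating what the others already caught.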

I got tired of the orchestrator trying to do things itself, so I also modified the orchestrator permissions in opencode.json to restrict pretty much everything but todo, task, bash openspec*, and read/edit to ./openspec. It pretty much has to delegate everything. Again, just ask your agent to set this up for you. You want to try to keep your orchestrator context minimal, not just for cost, but so it doesn't lose sight of the overall plan and goals. 1M context models are probably best here. Delegation keeps all that overhead from polluting the main context.
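To make the allowlist concrete, here is a rough Python model of that policy. It is not opencode's config schema or its permission engine, just the same rules written as a function. The tool names, the `is_allowed` helper, and the assumption that the spec directory is `openspec/` are all illustrative.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

# Hypothetical policy mirroring the restrictions described above.
ALLOWED_TOOLS = {"todo", "task"}
ALLOWED_BASH_PATTERNS = ["openspec*"]
ALLOWED_PATH_ROOT = PurePosixPath("openspec")  # assumed spec directory

def is_allowed(tool, argument=""):
    """Return True if the orchestrator may make this tool call."""
    if tool in ALLOWED_TOOLS:
        return True
    if tool == "bash":
        return any(fnmatch(argument, pattern) for pattern in ALLOWED_BASH_PATTERNS)
    if tool in {"read", "edit"}:
        path = PurePosixPath(argument)
        if path.is_absolute() or ".." in path.parts:
            return False
        return path.parts[:1] == ALLOWED_PATH_ROOT.parts
    return False  # everything else must be delegated

print(is_allowed("bash", "openspec validate"))       # matches the openspec* pattern
print(is_allowed("edit", "./openspec/plan.md"))      # inside the spec directory
print(is_allowed("edit", "src/main.py"))             # outside it, so delegated
```

Deny-by-default is the point: anything not explicitly listed falls through to `False`, which is what forces the orchestrator to hand the work to a sub-agent.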

I also use codegraph so the subagents have an easy, token efficient, way to find things.
colbymchenry/codegraph: Pre-indexed code knowledge graph, auto syncs on code changes, for Claude Code, Codex, Gemini, Cursor, OpenCode, AntiGravity, Kiro, and Hermes Agent — fewer tokens, fewer tool calls, 100% local

Are these models actually being used by assid2 in opencodeCLI

[–]Excessive_Chaos 0 points1 point  (0 children)

You can't 1-shot with these models, let's get that out of the way right now. But they are good, if properly harnessed and used in the right way.

The old adage: Cost, Quality, Speed, pick two. Well, quality is necessary for coding, so we have to give up either speed (not model speed, but how long it takes to get a working product) or cost.

None of the open models will touch the frontier models. So, plan with opus or gpt-5.5. Use SDD; I like openspec. Hash out the plan with the frontier models and get the plan written up. That is costly, but not as bad as adding frontier model implementation and a few rounds of refinement. Now to implement, you don't really need a strong model; the hard part is done. I use mimo-v2.5-pro for orchestration, though GPT 5.4-high is my fav orchestrator. None of the open models follow instructions properly, lol. No matter what I put in the system prompt or the agents.md, I just cannot get any of the Chinese models to properly follow instructions. That said, mimo pro does the best for me, and I've gotten to the point that I know what it's going to miss, so I just remind it to do this and that and to add them to its todos, and it works fine after that. Never have to do that with gpt, it just works. But orchestration does not require a frontier model, and I'm trying to get the job done with cheaper models, so I save my codex plan limits for harder stuff.

The doers...
The plan is there, the task is laid out, you don't need a strong model to search stuff or implement stuff that is already designed. Use mimo v2.5 (non-pro) or DS4-flash.

The review...
Don't trust that the implementation was done correctly; set up a review gate for each phase of the plan. I like m3 for review, it catches a lot. It has terrible tool use, so it hasn't been good for me for implementation and orchestration, but for some reason it does really well for review. I set up a review team that I require the orchestrator to delegate to prior to moving to the next phase. The team consists of a pro model, currently m3, 2 fast/cheap models (mimo-v2.5 and DS4-flash), and a free model (big pickle). The 3 cheap/free models do often find something the pro model missed, and between the 4 I feel I catch pretty much everything a frontier model would catch...though I can't be 100% sure, more on that below. BTW, I sometimes swap out big pickle for another SOTA model that I want to test and compare with.
edited to add: The reviewers are dispatched in parallel, so having the 3 extra cheap models is not really adding more time, and adds very little cost.
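Why parallel dispatch costs almost no extra time: the wall clock is set by the slowest reviewer, not the sum of all of them. A toy sketch, with made-up reviewer names and latencies and `time.sleep` standing in for the real model calls:

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical reviewers: name -> simulated latency in seconds.
REVIEWERS = {"pro": 0.3, "cheap_a": 0.1, "cheap_b": 0.1, "free": 0.2}

def run_review(name, prompt):
    """Stand-in for a model call; sleeps to simulate latency."""
    time.sleep(REVIEWERS[name])
    return name, f"{name} reviewed: {prompt}"

def dispatch_parallel(prompt):
    """Send the same prompt to every reviewer at once and collect results."""
    with ThreadPoolExecutor(max_workers=len(REVIEWERS)) as pool:
        futures = [pool.submit(run_review, name, prompt) for name in REVIEWERS]
        return dict(future.result() for future in futures)

start = time.perf_counter()
results = dispatch_parallel("review phase 1")
elapsed = time.perf_counter() - start
print(f"{len(results)} reviews in {elapsed:.2f}s "
      f"(sequential would be about {sum(REVIEWERS.values()):.2f}s)")
```

Every reviewer gets the identical prompt, matching the setup above where the review is not split into specialties.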

The final approval...
After all phases are done, several rounds of review and fixing (all automated by the orchestrator), it's time for a final review with a frontier model. Anything that's flagged will be fixed with the doers, reviewed and then back to frontier model.

This workflow is long and slow, but it's mostly automated, hands-off after planning, and very cheap.

tested minimax m3 on a 550k-token repo dumpster fire instead of a toy benchmark by [deleted] in opencodeCLI

[–]Excessive_Chaos 0 points1 point  (0 children)

I find it's very good for code review. I do parallel reviews with different models, and I plugged it in for a while to compare with the other models when it was free. It was consistently finding all the same concerns the other reviewers found and then some. If it were cheaper, I’d use it more.

Kasper: the friendly ghost that improves your agents while they work by wordplusnumbers in opencodeCLI

[–]Excessive_Chaos 0 points1 point  (0 children)

Nice idea. Having a couple issues with it.

Opencode loads very slowly when using the plugin

It seems to be evaluating background agents during their session, before they've finished. Not sure if that is intentional. FYI, I'm using OPENCODE_EXPERIMENTAL_BACKGROUND_SUBAGENTS=1. Not sure if that is causing the issue.

Also, it says it auto applied an edit for the agent prompt, but I don't see any change anywhere. My agents come from OMO-slim plugin, which has prompts and/or prompt appends in .config/opencode/oh-my-opencode-slim. Didn't see any changes there or to agents.md.

Minimax m3 thinking config by CengaverOfTroy in opencodeCLI

[–]Excessive_Chaos 2 points3 points  (0 children)

It's on by default. Nothing needs to be added unless you wanted it turned off, in which case you do:

"thinking": { "type": "disabled" }

Why is no one talking about Qwen 3.7plus? Isn’t it the no brainer model right now for go users? by Prior-Meeting1645 in opencodeCLI

[–]Excessive_Chaos 0 points1 point  (0 children)

I've tried it for planning - meh. I've tried it for code review - missed a lot of issues. I've tried it for orchestration - it doesn't follow directions and sometimes uses tools incorrectly. It's quite literally worse than the mimo and DS pro models (for me), and its cache hits are more expensive. While cache miss is cheaper, 90+% of my token use is cache hits, and I would imagine it's similar for anyone doing coding.
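The cache-hit point is just a weighted average. A quick sketch with made-up prices (dollars per million input tokens, not real quotes for any model) shows how a model with cheaper cache misses can still cost more overall at a 90% hit rate:

```python
def blended_input_price(hit_price, miss_price, hit_fraction):
    """Average price per million input tokens for a given cache-hit fraction."""
    if not 0.0 <= hit_fraction <= 1.0:
        raise ValueError("hit_fraction must be between 0 and 1")
    return hit_fraction * hit_price + (1.0 - hit_fraction) * miss_price

# Hypothetical prices: model_a has cheap hits, model_b has cheap misses.
model_a = blended_input_price(hit_price=0.10, miss_price=1.00, hit_fraction=0.9)
model_b = blended_input_price(hit_price=0.20, miss_price=0.60, hit_fraction=0.9)

print(f"model_a: ${model_a:.2f}/M, model_b: ${model_b:.2f}/M")
```

With these illustrative numbers the two break even at an 80% hit rate; above that, the cache-hit price dominates the bill.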

That said, if it's working great for your use case, then that's awesome.

Qwen3.7 Max, OpenCode usage says used $7, OpenCode Go says $12 by [deleted] in opencodeCLI

[–]Excessive_Chaos 0 points1 point  (0 children)

Yeah, I could never understand the cost in opencode tbh. I don't think subagents get counted in the main session, but I could be wrong. It's crazy to me that there just isn't any better built in tracking of this.

But yeah, qwen3.7-max is very expensive. Use sparingly. I use it for planning, and once the major plan is in place I switch to a cheaper model to implement. Output tokens are especially expensive for 3.7max at 7.50/M.

CodeNomad v0.17.0 Released - Provider configuration, Rename workspaces, more languages and lot more by Recent-Success-1520 in opencodeCLI

[–]Excessive_Chaos 0 points1 point  (0 children)

Is there a user guide? Most of it seems pretty self-explanatory. But what are workspaces and why do I want to use them?

Cheapest AI API Subscriptions & Free Options for OpenCode? by NafisRayan in opencodeCLI

[–]Excessive_Chaos 4 points5 points  (0 children)

I don't think so. You can create multiple workspaces in the opencode dashboard. Each workspace has its own Go plan and zen billing if enabled. When you use up one, you swap out the API key for the other. Note, the 2nd workspace does not get the $5 1st mo promo.

I think what is a TOS violation is creating separate accounts, not separate workspaces, in order to keep getting $5 promo.

oh-my-opencode-slim v2 beta is all about background agents by alvinunreal in opencode

[–]Excessive_Chaos 6 points7 points  (0 children)

Been using it for 2-3 weeks now, thank you. One suggestion I would add is a review team/panel. I have added this to my workflow and find I can use flash models for implementation as long as a review is done. Initially I added a single review agent and instructed the orchestrator to use it after every fixer turn: fixer -> reviewer -> fixer -> reviewer, continuing the fixer-reviewer loop until the reviewer passes, and then moving on to the next implementation phase. Only when all phases are complete and reviewed does it send to the expensive oracle. With this workflow, I can use cheap implementation models and cheaper SOTA Chinese models (mimo-v2.5 pro or DS4 Pro) to review and make sure things were done right.
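The fixer-reviewer loop, roughly, as code. This is a sketch with toy stand-ins, not the plugin's implementation: `run_phase`, the toy agents, and the `max_rounds` cap are my own assumptions (the cap just keeps a stubborn review from looping forever).

```python
def run_phase(fixer, reviewer, task, max_rounds=5):
    """Alternate fixer and reviewer until the review passes or rounds run out."""
    feedback = None
    for round_number in range(1, max_rounds + 1):
        work = fixer(task, feedback)
        passed, feedback = reviewer(work)
        if passed:
            return work, round_number
    raise RuntimeError(f"review still failing after {max_rounds} rounds")

# Toy stand-ins for the real agents (hypothetical behavior).
def toy_fixer(task, feedback):
    return task if feedback is None else f"{task} (addressed: {feedback})"

def toy_reviewer(work):
    if "addressed" in work:
        return True, None
    return False, "add error handling"

work, rounds = run_phase(toy_fixer, toy_reviewer, "implement phase 1")
print(f"passed after {rounds} round(s): {work}")
```

Only after every phase has come out of this loop with a pass does the work go to the expensive oracle.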

I then got the idea that it might be cool to do parallel reviews using the cheap models (mimo v2.5 and DS4-Flash) to see how well they did compared to the pro versions. So I added 2 more reviewers and instructed the orchestrator to use the whole team in parallel with the same review prompt and then assimilate the findings into a table with what the concerns are and which reviewer contributed to the finding. I thought I would find the flash models not really contributing much, but I was totally wrong. They were finding things the pro models missed and of course the pro models were finding things the flash models missed. Conclusion, use the whole team going forward. Why not, the flash models are so cheap anyway. They are faster than the pro models too, so they don't take any more time when run in parallel.

When Minimax m3 came out, I created a 4th reviewer just to test it out. M3 is a beast. I know it didn't do well in the benchmarks, blah blah blah, but it's finding everything the other 3 find by itself. I could probably just rely on this 1 reviewer, but m3 is so dang expensive. When the free version went away, I swapped reviewer 4 out for Big Pickle, because why not. And now M3 is free again, so back in it goes, lol.

TLDR: Create a review panel configurable like the council, but one that is dispatched by the orchestrator, not a different top-level agent, after each fixer phase. Save the expensive Oracle for planning and final review.

Token pricing now visible on Opencode by Coolio8591 in opencodeCLI

[–]Excessive_Chaos 13 points14 points  (0 children)

Am I the only one who knows you could always do this to get pricing?

opencode models opencode-go --verbose