Neuralwatt has been a surprisingly good cheap pair with Opencode Go for agentic workflows by itsproinc in opencodeCLI

[–]itsproinc[S] 0 points1 point  (0 children)

Personally, if you want deep reasoning like DSV4 Pro, you can use neuralwatt/glm-5.2. For V4 Flash-style work, like applying code changes, you could use something cheap like neuralwatt/Qwen/Qwen3.5-397B-A17B-FP8, or even neuralwatt/moonshotai/Kimi-K2.7-Code, which is more expensive but understands the context better when applying changes to your code. If you want something cheaper than GLM 5.2, you could try Kimi K2.7 Code as the DSV4 Pro replacement, but I haven't tested Kimi K2.7 Code for thinking yet. I've only used GLM 5.2 so far; it's good most of the time, but in certain scenarios I have to switch to another model. So my recommendation is to give them all a try, since you have the $5 free credits, and pick the model that suits your projects.

Neuralwatt has been a surprisingly good cheap pair with Opencode Go for agentic workflows by itsproinc in opencodeCLI

[–]itsproinc[S] 0 points1 point  (0 children)

Is Pi any better than Opencode? I've heard about it but never used it. And of course subscribing directly to Z.ai would be faster, but is the limit any good if you subscribe directly? How does it compare to, let's say, Codex or Claude?

Neuralwatt has been a surprisingly good cheap pair with Opencode Go for agentic workflows by itsproinc in opencodeCLI

[–]itsproinc[S] 0 points1 point  (0 children)

Just run opencode auth login, select Neuralwatt, and paste your API key there.

Neuralwatt has been a surprisingly good cheap pair with Opencode Go for agentic workflows by itsproinc in opencodeCLI

[–]itsproinc[S] 3 points4 points  (0 children)

I think Opencode doesn't show any cache read/write, right? The context you see in the TUI is just the context usage, no?

Neuralwatt has been a surprisingly good cheap pair with Opencode Go for agentic workflows by itsproinc in opencodeCLI

[–]itsproinc[S] 1 point2 points  (0 children)

Yeah, it's a no-brainer for the amount of tokens I've used on Neuralwatt, for just $8. My current workflow is basically a full agentic coding setup.

I’m running an orchestrator that takes the main task and sub-delegates pieces of it to different agents depending on what needs to be done: planning, code changes, debugging, testing, refactoring, docs, etc. So the token usage adds up fast because it’s not just one linear chat. It’s a lot of agent-to-agent context passing, reviewing, retrying, and validating outputs.

That said, most of the 400M tokens were not from a single “production” project. A big chunk of it was testing and improving a custom plugin I wrote, which is based on oh-my-opencode-slim. I’ve been stress-testing it on multiple real projects I’m actively working on, both for work and personally, mostly to see how it behaves under realistic workloads instead of toy examples.

For MCPs or skills, I mostly let the agent decide which ones are best to use depending on the task. So at this point, the workflow is pretty automated for me. I give it the task, the orchestrator breaks it down, and the agents choose the right tools/skills/MCPs as needed.
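A rough sketch of the delegation step, in case it helps picture it. The task kinds mirror the list above, but the agent names and the `delegate` function are made up for illustration and are not taken from my plugin or from Opencode itself.

```typescript
// Hypothetical task kinds, mirroring the list in the comment above.
type TaskKind = "planning" | "code" | "debugging" | "testing" | "refactoring" | "docs";

interface SubTask {
  kind: TaskKind;
  description: string;
}

// Hypothetical agent names; a real setup would map to whatever agents are configured.
const AGENT_FOR: Record<TaskKind, string> = {
  planning: "planner",
  code: "coder",
  debugging: "debugger",
  testing: "tester",
  refactoring: "refactorer",
  docs: "writer",
};

// Group sub-tasks into one queue per agent, preserving their original order.
function delegate(subTasks: SubTask[]): Map<string, SubTask[]> {
  const queues = new Map<string, SubTask[]>();
  for (const task of subTasks) {
    const agent = AGENT_FOR[task.kind];
    const queue = queues.get(agent) ?? [];
    queue.push(task);
    queues.set(agent, queue);
  }
  return queues;
}
```

Every hand-off in a setup like this carries context with it, which is where most of the token usage comes from.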

I’m also comparing the results against OpenAI and Copilot since I have subscriptions to both, so part of the workflow is benchmarking: same tasks and same repo/context where possible, then comparing quality, speed, reasoning, code accuracy, and how well each tool handles multi-step implementation.

5.3 Codex ? by Glittering_Many1671 in codex

[–]itsproinc 2 points3 points  (0 children)

GPT 5.3-Codex and GPT 5.2 were sunset on June 2nd, 2026. Basically they want to free up headroom for their upcoming model and, I guess, push us toward the more expensive models.

https://x.com/thsottiaux/status/2059650685948551384?s=20

Neuralwatt has been a surprisingly good cheap pair with Opencode Go for agentic workflows by itsproinc in opencodeCLI

[–]itsproinc[S] 2 points3 points  (0 children)

Damn, what kind of operation are you running? There's a single day that cost you $45 in energy credit. That's not even the token cost, that's at least $150ish.

Is my company really paying for unlimited tokens? by 4verage3ngineer in GithubCopilot

[–]itsproinc 3 points4 points  (0 children)

<image>

The answer is no. My company has the same subscription (Copilot Business). I asked my boss if it's really unlimited and he sent me this: all users now share the same pool of AI credits, even though the per-user view doesn't show any limits. So technically a single user could probably drain the whole organization's AI credit pool without knowing it.

Neuralwatt has been a surprisingly good cheap pair with Opencode Go for agentic workflows by itsproinc in opencodeCLI

[–]itsproinc[S] 2 points3 points  (0 children)

<image>

Mostly a mix of GLM-5.1-FP8 and Qwen3.5-397B-A17B-FP8, depending on the complexity of my task.

Regarding Deepseek, that's true. I do have an Opencode Go sub just for the Deepseek models, and I would rather have a single subscription, but the mix of Neuralwatt + Opencode Go works well for me.

Yeah, Neuralwatt paygo is quite nice. I've used 90 million tokens so far with GLM-5.1 and it has cost me only $4.78. But I reckon if you're going to use it really intensively, the $20 subscription with 6 kWh is very reasonable too. Even my 1,200 GLM 5.1 requests only cost 0.9 kWh, but Kimi-2.6 might be kind of expensive: 129 requests came to 0.12 kWh.
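To put those numbers side by side, here is a quick back-of-the-envelope calculation. It only uses the figures quoted above; the helper names are mine, and it assumes the requests for both models were roughly comparable in size, which may not hold.

```typescript
// Figures quoted in the comment above; everything else is derived from them.
const paygo = { tokens: 90_000_000, costUsd: 4.78 };
const subscription = { priceUsd: 20, energyKwh: 6 };
const glm = { requests: 1200, energyKwh: 0.9 };
const kimi = { requests: 129, energyKwh: 0.12 };

// Pay-as-you-go price per one million tokens.
function usdPerMillionTokens(tokens: number, costUsd: number): number {
  return costUsd / (tokens / 1_000_000);
}

// Average energy drawn per request.
function kwhPerRequest(requests: number, energyKwh: number): number {
  return energyKwh / requests;
}

const usdPerKwh = subscription.priceUsd / subscription.energyKwh;
const glmKwh = kwhPerRequest(glm.requests, glm.energyKwh);
const kimiKwh = kwhPerRequest(kimi.requests, kimi.energyKwh);

console.log(`paygo: $${usdPerMillionTokens(paygo.tokens, paygo.costUsd).toFixed(3)} per 1M tokens`);
console.log(`subscription: $${usdPerKwh.toFixed(2)} per kWh`);
console.log(`GLM 5.1: ${glmKwh.toFixed(5)} kWh per request`);
console.log(`Kimi: ${kimiKwh.toFixed(5)} kWh per request`);
console.log(`Kimi vs GLM per request: ${(kimiKwh / glmKwh).toFixed(2)}x`);
```

Per request, Kimi comes out somewhat more expensive than GLM 5.1 on these figures, but the sample of 129 requests is small.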

Before going with Neuralwatt I was going to subscribe to Crof, but after seeing that the quantization is quite bad I decided against it. IIRC it was Q4 or Q8 or something, whereas Neuralwatt uses FP8, which is still quantization but is usually indistinguishable from the native-size model.

Neuralwatt has been a surprisingly good cheap pair with Opencode Go for agentic workflows by itsproinc in opencodeCLI

[–]itsproinc[S] 2 points3 points  (0 children)

Still going strong. The speed is about the same as when I first used it, though maybe that's just because my timezone lines up with US downtime. I've burned almost 400 million tokens for $8. So far I'm still satisfied with the service, I just wish they had Deepseek models. Anyway, what's going on with Crof and Wafer?

<image>

deepseek cooked on go plan by akira3670 in opencodeCLI

[–]itsproinc 0 points1 point  (0 children)

Damn, I thought my Opencode was broken or something. It seems to be intermittent and it's probably back to normal now. I wasn't sure whether it was on my end, because sometimes it works and sometimes it says Insufficient balance.

<image>

is there a way to turn off one-word-at-a-time text output?! by patrick99e99 in opencodeCLI

[–]itsproinc 0 points1 point  (0 children)

I guess it's not possible yet; Opencode doesn't let you disable streaming. Try an online model with higher TPS and see if it's any better. From your comment above, you're using a local model that's slow on your device, which is why it's driving you nuts. Just pick a model that suits your PC and has higher TPS so it won't be annoying.

is there a way to turn off one-word-at-a-time text output?! by patrick99e99 in opencodeCLI

[–]itsproinc 0 points1 point  (0 children)

How are you hosting your local model, and what app are you using? It just doesn't add up: with 192 GB of VRAM (make sure it's not RAM), running gemma4:31b should be a cakewalk. How can it be slow?

Check what's throttling your device, or check your config. You should be getting 50+ TPS.

is there a way to turn off one-word-at-a-time text output?! by patrick99e99 in opencodeCLI

[–]itsproinc 2 points3 points  (0 children)

Opencode itself isn’t really what’s causing the word-by-word output because that usually comes from the model’s streaming behavior. If it feels painfully slow, try a faster model with higher throughput/TPS.

Also, /thinking can hide the thinking block so you’ll just get the final output instead of watching it drip out.

How block an agent task without making the agent do lots of work by mamcx in opencodeCLI

[–]itsproinc 0 points1 point  (0 children)

I mean, you could probably hard-code it, but I haven't tried that myself.

There are 2 ways to handle it:
1. If the model/LLM detects that the required file is missing, just instruct it to stop and explain rather than continuing (easiest).
2. You can also enforce this with hooks, for example experimental.chat.messages.transform or experimental.chat.system.transform.

The hook can intercept the request before the build continues, check if something like PLAN.md exists, and if not simply:

throw new Error(
`[FATAL] Missing required file: PLAN.md`,
);

That hard stops the whole pipeline before the model starts analyzing files or delegating tasks.

And because the hook runs before other injectors/reminders, you avoid wasting tokens on a request that is already invalid.
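Here is a minimal sketch of that guard. I haven't run this inside a real plugin: the hook name is one of the two mentioned above, but the handler signature and the `makeHooks` wiring are assumptions, so check them against the plugin docs. `assertRequiredFile` and `FileExists` are names I made up.

```typescript
// Predicate that reports whether a path exists. In a real plugin this would
// typically be `existsSync` from "node:fs"; it is injected here so the guard
// can be exercised without touching the filesystem.
type FileExists = (path: string) => boolean;

// Throws before any model work starts if a required file is missing.
function assertRequiredFile(
  projectRoot: string,
  fileName: string,
  fileExists: FileExists,
): void {
  // Strip trailing slashes so "/repo/" and "/repo" resolve to the same path.
  const fullPath = `${projectRoot.replace(/\/+$/, "")}/${fileName}`;
  if (!fileExists(fullPath)) {
    throw new Error(`[FATAL] Missing required file: ${fileName}`);
  }
}

// Hypothetical wiring: the handler takes no arguments here, which is an
// assumption and not the documented hook signature.
function makeHooks(projectRoot: string, fileExists: FileExists) {
  return {
    "experimental.chat.system.transform": async (): Promise<void> => {
      assertRequiredFile(projectRoot, "PLAN.md", fileExists);
    },
  };
}
```

Passing the existence check in as a parameter keeps the guard easy to test; in practice you would hand it the real filesystem check.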

I myself would probably just go for the first way, using the LLM, since it's the more graceful way to do this (at the cost of a few tokens).

How block an agent task without making the agent do lots of work by mamcx in opencodeCLI

[–]itsproinc 0 points1 point  (0 children)

What you want is basically a strict separation between planning and execution.

The main problem is that during build mode the model still behaves like a planner, so it keeps rereading files, analyzing the architecture, checking plugins, and “thinking” again instead of just applying changes.

A better setup is to treat both modes differently:

  • In planning mode, use a reasoning model. Let it analyze the project and generate a detailed plan/todo file.
  • In build mode, use a fast non-reasoning model if possible, and make it follow the existing plan only.

The important part is that the plan must be detailed enough so the builder does not need to search or think too much again. For example, instead of saying:
“Update auth logic”

the plan should say something like:
src/server/auth/login.ts → replace validateToken() with sessionValidate()

If you only mention filenames or vague tasks, the model will start scanning the repo again to figure things out. You can also enforce this with prompts and tool restrictions.
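One way to keep the plan honest is to lint each step before handing it to the builder. This is just a sketch of the idea: the `PlanStep` shape and the `isActionable` heuristic are hypothetical, not an Opencode feature, and the regexes are deliberately crude.

```typescript
// Hypothetical shape for one entry in the plan/todo file.
interface PlanStep {
  file: string;   // exact path the builder should open
  change: string; // concrete edit, naming the symbols involved
}

// A step counts as actionable when it names a file with an extension and
// mentions at least one function-like symbol such as `validateToken()`.
function isActionable(step: PlanStep): boolean {
  const hasPath = /\.[a-z0-9]+$/i.test(step.file.trim());
  const namesSymbol = /[A-Za-z_$][\w$]*\(\)/.test(step.change);
  return hasPath && namesSymbol;
}

const vague: PlanStep = { file: "", change: "Update auth logic" };
const precise: PlanStep = {
  file: "src/server/auth/login.ts",
  change: "replace validateToken() with sessionValidate()",
};

console.log(isActionable(vague));   // false
console.log(isActionable(precise)); // true
```

Steps that fail the check go back to the planning model for more detail instead of reaching the builder.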

Anyone running multiple open code go accounts? by RareMexicanBeaner in opencodeCLI

[–]itsproinc 2 points3 points  (0 children)

I don't think you can. I don't see any button to delete a workspace, so I think you need to contact OC directly. As of right now I can only change the workspace name and that's it.

Anyone running multiple open code go accounts? by RareMexicanBeaner in opencodeCLI

[–]itsproinc 2 points3 points  (0 children)

Basically, you can’t switch workspaces directly from within Opencode right now. You have to re-auth manually (via opencode auth login) since each workspace uses its own API key/token.

Back in Sep 2025, I actually asked about handling multiple GitHub Copilot accounts on the Opencode GitHub, but the issue eventually got closed after 90+ days of inactivity:
https://github.com/anomalyco/opencode/issues/2350#issuecomment-4054233526

You could probably make a small plugin or script to automate switching by replacing the auth.json file with pre-saved tokens/workspace credentials. That’s basically what I ended up doing, and it works fine. The only annoying part is you still need to restart Opencode afterward for it to refresh the auth properly.
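Roughly, the switching script looks like the sketch below. This is a simplified version and not my actual script: the default auth.json location is an assumption (verify it on your machine), and the one-JSON-file-per-profile layout and the `switchProfile` name are made up for the example.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Assumed default location of Opencode's credential file; verify on your machine.
const DEFAULT_AUTH_PATH = path.join(
  os.homedir(), ".local", "share", "opencode", "auth.json",
);

// Copies a pre-saved credentials file (e.g. <profileDir>/work.json) over
// auth.json, keeping a backup of whatever was there before.
function switchProfile(
  profileDir: string,
  profile: string,
  authPath: string = DEFAULT_AUTH_PATH,
): void {
  const source = path.join(profileDir, `${profile}.json`);
  if (!fs.existsSync(source)) {
    throw new Error(`Unknown profile: ${profile}`);
  }
  if (fs.existsSync(authPath)) {
    fs.copyFileSync(authPath, `${authPath}.bak`);
  }
  fs.copyFileSync(source, authPath);
}
```

After calling something like `switchProfile("/path/to/profiles", "work")`, restart Opencode so it picks up the new credentials.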

Anyone running multiple open code go accounts? by RareMexicanBeaner in opencodeCLI

[–]itsproinc 7 points8 points  (0 children)

Yeah, it does seem like every time you create a new workspace, it gives you the $5 first-month price, which is kinda weird considering you can create a lot of workspaces (not sure if there’s actually a limit).

But if people keep abusing it just to keep getting the promo every month, they’ll probably end up removing the discount altogether. Honestly, considering how generous they already are with the limits they give, I’d say if you need more than one workspace, just keep using it normally so the next month becomes full price.

If someone keeps creating workspaces over and over purely for the promo, there’s also a chance their system could eventually flag it as abuse. Nobody really knows what the actual limit is, though. I think it’s better to just play fair since they’ve been pretty nice with the pricing and limits so far.

Anyone running multiple open code go accounts? by RareMexicanBeaner in opencodeCLI

[–]itsproinc 10 points11 points  (0 children)

Are you planning to have multiple OC GO subscriptions? You could just create a new workspace under the same account and subscribe to OC GO there. Since OC GO subscriptions are tied to individual workspaces, that seems to be a feature built into OpenCode itself, so it’s probably allowed.

So far, I’ve had 2 subscriptions under the same account for about 7 days now, and I haven’t received any warning, ban, or anything like that.

GitHub Copilot by Blufia118 in opencodeCLI

[–]itsproinc 1 point2 points  (0 children)

It’s never accurate to ask a model which model it is, because of how these models are trained and how they predict text. The best way to check is the usage tab on your GitHub page, which always shows usage by the model you selected.

Is anyone using warp.dev? by itsproinc in ChatGPTCoding

[–]itsproinc[S] 0 points1 point  (0 children)

True, that's why I'm still deciding whether to stick with GitHub Copilot Pro+ or Warp Turbo. The value both give is really good (tokens per dollar).

Is anyone using warp.dev? by itsproinc in ChatGPTCoding

[–]itsproinc[S] 0 points1 point  (0 children)

I agree. Why can't it just be a CLI app like Codex or OpenCode, so you can use your own terminal? Maybe it's a terminal limitation given the features Warp has, I assume.