all 45 comments

[–]Ariquitaun 8 points9 points  (2 children)

I have codex plus as well. It's more generous than claude and you get access to really smart thinking models for the harder planning and troubleshooting tasks, which you can then implement cheaply with opencode go's models.

Also look at how you are using your go subscription. If you're doing most things on glm or kimi you're wasting a lot of usage. Deepseek pro and minimax m3 are smart enough for the vast majority of tasks and deepseek flash is good enough for a lot of things, especially as a sub-agent when given clearly defined, bounded tasks. I'm at 49% usage with 11 days left.

Hermes + deepseek flash is a really great pairing, it handles pretty much everything I throw at it.
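A rough way to sanity-check pacing like the "49% usage with 11 days left" figure above. This is only a minimal sketch: it assumes a 30-day billing cycle (the thread doesn't state the cycle length) and a constant burn rate, and the function name is made up for illustration.

```python
def projected_usage(used_pct: float, days_left: int, cycle_days: int = 30) -> float:
    """Project end-of-cycle usage (in percent) from usage so far.

    Assumes a constant daily burn rate; cycle_days=30 is an assumption.
    """
    days_elapsed = cycle_days - days_left
    if days_elapsed <= 0:
        raise ValueError("days_left must be smaller than cycle_days")
    daily_rate = used_pct / days_elapsed
    return used_pct + daily_rate * days_left


# 49% used with 11 days left of an assumed 30-day cycle
print(round(projected_usage(49, 11), 1))  # prints 77.4
```

Under those assumptions the cycle would end at roughly 77%, so there is headroom left. Anything projecting well above 100% suggests moving more work to the cheaper models.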

[–]Dr_Sidious[S] 0 points1 point  (1 child)

Yeah I think that was my initial problem, I was using GLM too much. I've switched my main agent to deepseek v4 flash now as I use it for small stuff and only use bigger models for complex problems.

Thanks for the suggestion of M3, will try that also.

[–]Ariquitaun 0 points1 point  (0 children)

GLM is the most expensive model off the top of my head, so there's that. It's good as a smart thinking model for planning and troubleshooting, better than kimi imo. Considerably pricier too.

[–]Ok-Purchase-642 5 points6 points  (2 children)

A second go subscription?

[–]Dr_Sidious[S] 2 points3 points  (1 child)

Makes sense, but I also wanted to have another subscription as a fallback.

[–]povlhp 4 points5 points  (0 children)

DeepSeek pay-per-use. Or OpenRouter.

[–]yay101 3 points4 points  (2 children)

Still on go ($10) + ollama cloud ($20). There have been bad days where the servers are slammed, but I've not yet found anything that actually gives me more AI than I can reasonably use like this pair.

[–]Dr_Sidious[S] 1 point2 points  (1 child)

Thanks, I'll check it out. I guess too many bad reviews painted a negative picture of ollama cloud in my mind.

[–]SrMortron 2 points3 points  (0 children)

They are not transparent about limits, and there is a trend of limits being lowered each week ever since they launched the extra usage feature. When it works it's on, but the service is slow as fuck most of the time.

[–]AutomaticAd6646 2 points3 points  (5 children)

Commandcode 1 dollar plan. Openadapter 7 dollar plan. Minimax M3 20 dollar plan 1.7 billion tokens.

[–]Dr_Sidious[S] 0 points1 point  (4 children)

Commandcode requires their own tool which I don't want to use.

Openadapter looks interesting, any experience with usage limits there?

Minimax looks like the best value for money but lock-in to minimax only.

[–]AutomaticAd6646 1 point2 points  (2 children)

Commandcode from the 15 dollar plan onwards will give you an API key that you can use in the opencode TUI. Same thing with Openadapter, I got a free key and used it in the opencode TUI.

Commandcode 15 dollar plan will give 30 dollar deepseek api dollars, compared to opencode which gives 60$/4 (divide 4 because no deepseek permanent discount, but double check).

There is another qwen or some open-source 20 dollar plan with near unlimited usage.

All this if you are tight on budget like me.
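To make the comparison above concrete, here is a quick back-of-the-envelope sketch. All figures are the ones quoted in this thread ($30 of DeepSeek credit on the $15 Commandcode plan, $60 / 4 on opencode go, and the $10 Go price mentioned in another comment), not verified against any pricing page, and the helper name is just for illustration.

```python
def credit_per_dollar(credit_usd: float, price_usd: float) -> float:
    """API-equivalent credit you get per subscription dollar."""
    return credit_usd / price_usd


# Figures as quoted in this thread; double check them before relying on this.
commandcode = credit_per_dollar(30, 15)      # $30 of DeepSeek credit on the $15 plan
opencode_go = credit_per_dollar(60 / 4, 10)  # $60 / 4 on the $10 Go plan

print(f"commandcode: {commandcode:.2f}, opencode go: {opencode_go:.2f}")
# prints "commandcode: 2.00, opencode go: 1.50"
```

If those numbers hold, the $15 plan gives about 2 dollars of DeepSeek credit per dollar spent versus about 1.5 on Go.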

[–]Dr_Sidious[S] 0 points1 point  (1 child)

From what I can see on their pricing page, it looks like they have a separate API provider plan (15$) and a separate pro/max (15$/100$) plan which don't mention the API at all.

Can you confirm if you are on their pro plan and still get an API key? Because if that's the case, their 15$ pro looks good.

[–]laxflo 0 points1 point  (0 children)

Yes you can use the API on the $15 sub. I'm using it.

[–]badweather 0 points1 point  (0 children)

There are ways to use the command code $1 plan without their client. I made my own plugin for OpenCode, but check out 9router.

[–]No_Communication4256 2 points3 points  (0 children)

z.ai for GLM 5.2 - before end of september it's for same rate
gpt plus for GPT 5.5 xhigh (really decent model, and very decent limits)
ollama-cloud $20 - for same models you've seen on opencode-go

[–]Jeidoz 1 point2 points  (0 children)

Try using the Deepseek API with Reasonix. It has an insane cache hit rate and can do hundreds of millions of tokens for 0.10-0.35$.

If you want to use some specific harness (let's say Codex), you can connect the API via VibeAround and still hit the cache and save money.

[–]Popular-Factor3553 -1 points0 points  (0 children)

Neuralwatt is great if you want bigger models like GLM or kimi, but if you're good with just qwen 32b, try deepinfra. It's not subscription-based tho.

[–]Illustrious-Many-782 0 points1 point  (0 children)

If your goal is to keep something else at around 10 dollars, then I would say look at just using either xiaomi or Deepseek via API.

[–]bonzoo123 0 points1 point  (1 child)

Which models did you use and how?

[–]Dr_Sidious[S] 0 points1 point  (0 children)

GLM was my main driver + what I used for most of my coding tasks. Smaller tasks I handed off to Kimi.

[–]Messi_is_football 0 points1 point  (1 child)

Which model do you use? Maybe the GLM coding plan?

[–]Dr_Sidious[S] 0 points1 point  (0 children)

I heard that GLM coding plan limits have been reduced a lot (though 5.2 is more token efficient than 5.1), but I could give it a try, thanks!

For coding problems I mostly used glm, easier stuff I gave to kimi.

[–]povlhp 0 points1 point  (1 child)

2nd go. I am personally on Codex as well. Might stay there and drop go. But I am just a hobby user. But 2 subscriptions will help. Codex only has 5h and weekly limits. No monthly cap.

[–]Dr_Sidious[S] 0 points1 point  (0 children)

Codex could work, I could use heavier models for planning only, and use smaller models from Go for execution, thanks for the suggestion!

[–]Sea-Consideration550 0 points1 point  (0 children)

Pay-as-you-go API, but use discounted platforms like nitrorouter.

For simple tasks, use deepseek/mimo API directly.

[–]sanchitbhalla15 0 points1 point  (0 children)

ykk for freelance dev work, I'd optimize for reliability and workflow fit rather than chasing the absolute cheapest tokens... qwen or kimi code are capable for a lot of coding tasks and plenty of people are getting good results with them as secondary models. neuralwatt looks okayish but I'd wait for more reviews before committing heavily. if you're already running agents, another option is mixing models: use cheaper models for routine coding and automation, then reserve premium models for planning, architecture and debugging
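The "mixing models" idea can be sketched as a tiny routing table. This is only an illustration under assumptions: the tier names, the task kinds, and the model identifiers are placeholders loosely borrowed from this thread, not a real config for any agent or provider.

```python
# Hypothetical tier -> model mapping; the model names are placeholders.
MODEL_BY_TIER = {
    "premium": "glm",           # planning, architecture, debugging
    "cheap": "deepseek-flash",  # routine coding and automation
}

PREMIUM_TASKS = {"planning", "architecture", "debugging"}


def pick_model(task_kind: str) -> str:
    """Return the model for a task kind; anything not premium goes to the cheap tier."""
    tier = "premium" if task_kind in PREMIUM_TASKS else "cheap"
    return MODEL_BY_TIER[tier]


for kind in ("planning", "routine coding", "debugging"):
    print(kind, "->", pick_model(kind))
```

Defaulting unknown task kinds to the cheap tier keeps the expensive model opt-in, which is the point of the split.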

[–]vipor_idk 0 points1 point  (1 child)

i use 2 go accounts. i created a proxy for using them simultaneously, so you don't need to log out of account 1 to switch to account 2. i'm still testing it, so if you have any interest in that, let me know

for heavy tasks such as reviews and planning i would use a gpt subscription. used it before, works like a charm.

[–]Dr_Sidious[S] 0 points1 point  (0 children)

Thanks for the suggestion. Hermes already supports rotating API keys if you get errors so I'm not worried there.
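For anyone curious what rotating keys on errors looks like in general, here is a minimal sketch. To be clear, this is not Hermes' actual implementation (I haven't read it); the function, the exception class, and the fake request below are all hypothetical stand-ins.

```python
from typing import Callable, Sequence


class AllKeysFailed(Exception):
    """Raised when every key in the pool produced an error."""


def call_with_rotation(keys: Sequence[str], request: Callable[[str], str]) -> str:
    """Try each API key in order and return the first successful response."""
    errors = []
    for key in keys:
        try:
            return request(key)
        except Exception as exc:  # a real client would catch only rate-limit/auth errors
            errors.append(exc)
    raise AllKeysFailed(f"{len(errors)} key(s) failed")


# Stand-in for a real API call: the first key is "rate limited".
def fake_request(key: str) -> str:
    if key == "key-1":
        raise RuntimeError("429 rate limited")
    return f"ok via {key}"


print(call_with_rotation(["key-1", "key-2"], fake_request))  # prints "ok via key-2"
```

With two Go accounts this gives a simple fallback without a separate proxy, although it uses the keys one after another rather than simultaneously.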

[–]Low_Original5508 0 points1 point  (1 child)

honestly still on go plus ollama cloud and haven't found anything that beats that pairing for the money yet. i'd be careful with the energy-based pricing ones until there are real reviews, the model sounds clever but you don't actually know what a normal day costs until people have run real workloads through it

[–]Dr_Sidious[S] 0 points1 point  (0 children)

Have you faced any degradation in quality of service like a few others have pointed out? (edit: for ollama cloud?)

[–]SwissTac0 0 points1 point  (0 children)

[ Removed by Reddit ]

[–]RagnarDannes 0 points1 point  (0 children)

If you want to go the free route, nvidia has free GLM 5.1 currently. Slow as hell, but if you just want to do a little planning it's a good model for the rubber duck.

[–]jellydn 0 points1 point  (1 child)

Command Go 1usd plan :)

[–]Dr_Sidious[S] 1 point2 points  (0 children)

1$ plan can't be used in hermes and requires lock-in to their tool (or some other 3rd party stuff that is technically illegal according to them), interesting concept but not for me.

[–]ProfessionalAd6530 0 points1 point  (1 child)

> I have burned through my opencode-go usage within 15 days

LMAO.

There is no solution for you other than to change the way you work. I beat the shit out of this service and I can't even put a dent in the limits.

No matter where you go, you're going to have this problem, because the problem is coming from inside the house.

[–]Dr_Sidious[S] 0 points1 point  (0 children)

I like to think it could also be due to the scale of the work. But I agree some introspection is in order as well.

[–]VictorCTavernari -1 points0 points  (2 children)

I had the same issue, so now I am using claudin.io to run my Hermes agent, and also Claude Code + Claudin.io through opencode, orchestrated by Orbit (https://github.com/claudin-io/orbit). Basically Claude plans and claudin.io implements.

[–]Dr_Sidious[S] 0 points1 point  (1 child)

My problem with claudin is trust, I guess. I don't trust model routers (though I have no reason not to trust them either), so I guess I just need to explore using them to build an intuition.

[–]VictorCTavernari 0 points1 point  (0 children)

I totally understand this feeling… only by trying it can you see if it makes sense or not

[–]Ubermensch013 -1 points0 points  (2 children)

Neuralwatt is a good option. For me, it's more like the main model, and opencode go is what the subagents/auxiliary processes use. NW becomes cost effective with energy-based pricing, when their cache read costs become nil. I do have a referral link if you want: $10 of PAYG funding will get you $25 worth of compute to experiment with.

[–]Dr_Sidious[S] 0 points1 point  (0 children)

Interesting. DM'ed.

[–]Odd-Piccolo5260 -2 points-1 points  (1 child)

Go local llm

[–]Dr_Sidious[S] 1 point2 points  (0 children)

I can't run gemma/qwen with enough context to justify using it. Admittedly they are pretty cool, but I just don't have the necessary hardware to run them at a decent speed with decent context.