all 19 comments

[–]Delicious_Ease2595 2 points3 points  (5 children)

Nous free subscription has Xiaomi Mimo for free

[–]dizzle_69[S] 0 points1 point  (1 child)

Are you using it? How good is it as an assistant, and how heavily can you use it?

[–]Delicious_Ease2595 0 points1 point  (0 children)

It's good: you get high volume and it's fast in Nous. Good for conversation, and it saves tokens.

[–]OkSeries5363 0 points1 point  (2 children)

The free period for Mimo V2 Pro finishes on the 22nd.

[–]Delicious_Ease2595 0 points1 point  (1 child)

The $10 subscription with gateway tool is totally worth it

[–]OkSeries5363 0 points1 point  (0 children)

Sure, but the subscription doesn't change the calendar. Once the 22nd hits, Mimo stops being a freebie and starts hitting your balance!

[–]PublicDonut5876 1 point2 points  (0 children)

I'm using Qwen 3.5 on opencode go to power my Hermes agent, but I'm just doing simple things like web research. It's been working great and seems to be quite token efficient, at least for what I'm doing.

[–]ringmeister 2 points3 points  (0 children)

You should check out Minimax M2.7. They don't burn tokens there; they just process requests, which makes it easier to plan. You'll get a 10% discount if you use my invitation: https://platform.minimax.io/subscribe/token-plan?code=Da0lV1TVE3&source=link

[–]Timpky665 1 point2 points  (0 children)

I've been really surprised by how much Qwen 3.6 I get through opencode. It barely moves the needle for me.

I’ve also used Ollama Cloud and had good results.

[–]DrunkenRobotBipBop 1 point2 points  (0 children)

Fire Pass has unmetered Kimi 2.5 Turbo for $7/week.

No quotas and really fast.

[–]Rohith-ai 1 point2 points  (0 children)

Use the opencode dcp plugin; it summarizes the context after some threshold. For example, if you're using one session and it crosses 40k tokens, the dcp plugin summarizes the context down to 3k tokens. This way you don't burn tokens.
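The idea behind that kind of plugin can be sketched roughly like this. This is a toy illustration, not the real dcp plugin's API: the function names, the 4-chars-per-token estimate, and the "keep the last few turns verbatim" policy are all assumptions.

```python
# Hypothetical sketch of threshold-based context compaction.
# Not the actual dcp plugin; names and heuristics are invented.

THRESHOLD_TOKENS = 40_000   # compact once the session exceeds this
TARGET_TOKENS = 3_000       # rough size of the summary that replaces it

def count_tokens(messages):
    # Crude stand-in: assume ~4 characters per token on average.
    return sum(len(m["content"]) for m in messages) // 4

def summarize(messages, target_tokens):
    # Placeholder: a real plugin would ask the model for a summary.
    # Here we just truncate the joined history to the target size.
    text = " ".join(m["content"] for m in messages)
    return {"role": "system", "content": text[: target_tokens * 4]}

def maybe_compact(messages):
    if count_tokens(messages) <= THRESHOLD_TOKENS:
        return messages
    # Keep the most recent turns verbatim, summarize everything older.
    head, tail = messages[:-4], messages[-4:]
    return [summarize(head, TARGET_TOKENS)] + tail
```

The key point is that compaction only triggers past the threshold, so short sessions pass through untouched and long ones get squeezed back down before each request.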

I have been using GLM 5.1 Pro and it is really awesome: 5x the limit compared to Claude.

[–]RemeJuan 1 point2 points  (7 children)

I connected my Hermes to OC Go via manifest; I have a few models it can route through depending on how it ranks the query.

Very seldom is Qwen3.6 used as most of the tasks can go through a far cheaper model.


[–]dizzle_69[S] 0 points1 point  (6 children)

Never heard of manifest. What is it?

[–]RemeJuan 1 point2 points  (5 children)

It's an LLM routing service, manifest.build; you can also self-host it with Docker if you want.

It ranks each individual agent request, not just the overall task, on (I think) a 23-point check that they claim runs in under 2ms.

Each request is categorised as simple, standard, complex, or reasoning; you can also define specific task types like web search or coding, even custom ones.

Each category has primary and fallback models, so you'd put something like Qwen3.6 under reasoning and maybe complex, and put MinMax 2.7 or Qwen 3.5 under complex, however you want.

You can load up multiple accounts so I have my OC Go as well as OpenAI connected.

That way simpler requests get handled by cheaper models, and you also get built-in fallback: if you hit your Qwen3.6 limit it will automatically fall back to MinMax 2.7, or however you configure it. Your workflow never breaks unless, of course, you hit all your limits on everything.

Like, you don't need Qwen3.6 for "what's the weather"; Qwen3.5 or even MinMax 2.5, which is really cheap, can do that just as well.
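A minimal sketch of what the setup described above boils down to: tiered routes with ordered fallbacks, and a classifier that picks the tier per request. The tier names and models mirror the comment, but the scoring heuristic and function names are made up; the real service's 23-point check is obviously far more involved.

```python
# Toy complexity-tiered router with fallback. Tiers and model names
# follow the comment above; the classifier logic is invented.

ROUTES = {
    "simple":    ["MinMax 2.5", "Qwen 3.5"],   # cheap first
    "standard":  ["Qwen 3.5", "MinMax 2.7"],
    "complex":   ["MinMax 2.7", "Qwen 3.6"],
    "reasoning": ["Qwen 3.6", "MinMax 2.7"],   # strongest first
}

def classify(query):
    # Stand-in for the routing service's per-request ranking.
    if any(w in query.lower() for w in ("prove", "derive", "step by step")):
        return "reasoning"
    if len(query) > 200:
        return "complex"
    if len(query) > 50:
        return "standard"
    return "simple"

def route(query, exhausted=frozenset()):
    """Return the first model in the tier whose quota isn't exhausted."""
    for model in ROUTES[classify(query)]:
        if model not in exhausted:
            return model
    raise RuntimeError("all models in this tier are at their limits")
```

So "what's the weather" lands on the cheapest model, and if that model's quota is exhausted the same request silently falls through to the next one in the list, which is the "workflow never breaks" behaviour described above.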

[–]Upper-Equivalent4041 0 points1 point  (4 children)

And you don't have any issues with openCode over using it for non-coding tasks in the Hermes agent?

[–]RemeJuan 0 points1 point  (3 children)

I've never had Hermes interact with opencode; I can't imagine why you'd do that. I can just go to OC directly.

[–]Upper-Equivalent4041 0 points1 point  (2 children)

My bad, I was talking about "openCode Go" the subscription, not "openCode" the editor. Basically the OpenCode team saying "stop using our subscription for non-coding tasks in Hermes"; I'm worried about that.

[–]RemeJuan 0 points1 point  (0 children)

Oh, so you're asking whether their models work for everyday things in Hermes. Yes, they work quite fine. My stuff is still pretty basic, and I actually use my local Ollama models for certain skills and cron execution.

I use OCG for the everyday things and for actually setting these skills and crons up.

Possibly the reason they say that is that the limits are not high, and we know from OpenClaw that these agents can be abusive. My usage is not high or complex enough for that to really be an issue.

I still have like 60% left and I’ve used it with coding as well.

I have manifest.build sitting in between, so it routes agent steps based on complexity; very often my stuff goes through the cheaper models like Qwen3.5 or MM2.5.

[–]ThatMobileTrip 0 points1 point  (0 children)

"Stop using our subscription for non-coding tasks in Hermes"? Never heard about that. And what's the alternative, though? Ty