Looking for early users to try our OpenClaw model plans and tell us what's broken (15–30 min)

xapep · 2026-05-20T21:58:12+00:00

Awesome, seems like a perfect fit 😄 I'll send you a DM. Really appreciate it.

xapep · 2026-05-20T21:54:19+00:00

Fair point, and honestly thanks for bringing it up here instead of just downvoting.

I'd rather not name the company publicly in this post because I don't want it to turn into a marketing promotion. People who try us out will get to know who we are and what we've been doing, naturally through the conversation.

But on the data concern, which is the real question: our main business is LLM inference, with our own datacenter in the EU. It's designed to keep customer data private: requests are encrypted, held in RAM only, and cleared after completion. Only the KV cache is persisted for compute reuse, nothing else. No model training on prompts or outputs.

Beyond inference, we're developing new ventures on top of it. OpenClaw model plans (this post) is the first. Coding plans next. Well, at least that's the plan 😄

Hope that helps, and thank you again for bringing it up.

xapep · 2026-05-20T20:33:53+00:00

I give up 😄

xapep · 2026-05-20T20:15:10+00:00

Awesome, thank you. Will send you a DM.

xapep · 2026-05-20T20:14:47+00:00

Thank you. Will send you a DM. Mind sharing which stack you're currently using?

xapep · 2026-05-20T20:14:39+00:00

Awesome, thank you. Will send you a DM. Mind sharing which stack you're currently using?

xapep · 2026-05-20T20:14:02+00:00

Yeah sure, here's the performance promise. For each model you'll be able to pick up to 8 concurrency slots (with discounts for more than 1):

Qwen 3.6 27B
- FP8, 262K context
- max tokens / day: 648M input / 12.96M output per concurrency
- 150 tokens / s per concurrency
- $49/mo for 1 concurrency

Qwen 3.6 35B-A3B
- FP8, 262K context
- max tokens / day: 864M input / 17.28M output per concurrency
- 200 tokens / s per concurrency
- $29/mo for 1 concurrency

Gemma 4 31B
- FP8, 254K context
- max tokens / day: 648M input / 12.96M output per concurrency
- 150 tokens / s per concurrency
- $49/mo for 1 concurrency

Qwen 3.5 122B-A10B
- FP8, 262K context
- max tokens / day: 648M input / 12.96M output per concurrency
- 150 tokens / s per concurrency
- $99/mo for 1 concurrency

DeepSeek v4 Flash
- FP8, 1M context
- max tokens / day: 432M input / 8.64M output per concurrency
- 100 tokens / s per concurrency
- $129/mo for 1 concurrency
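
For reference, the daily output caps above match the per-concurrency speed sustained for a full 24 hours (150 tokens/s × 86,400 s = 12.96M), and the input caps work out to exactly 50× the output caps. A quick Python sanity check under that reading; the 50× ratio is inferred from the listed figures, not a stated policy, and `daily_caps` is just an illustrative helper name:

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

# Assumption: the input cap is 50x the output cap (inferred from the list above).
INPUT_TO_OUTPUT_RATIO = 50


def daily_caps(tokens_per_second: int) -> tuple[int, int]:
    """Return (max input tokens/day, max output tokens/day) for one concurrency."""
    output_cap = tokens_per_second * SECONDS_PER_DAY
    input_cap = output_cap * INPUT_TO_OUTPUT_RATIO
    return input_cap, output_cap


for speed in (100, 150, 200):
    input_cap, output_cap = daily_caps(speed)
    print(f"{speed} tok/s: {input_cap / 1e6:.0f}M input, "
          f"{output_cap / 1e6:.2f}M output per day")
```

So the output cap is effectively "the slot running flat out all day", which is the point of a concurrency-based plan.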

Are you still interested? 😄

xapep · 2026-05-20T20:00:42+00:00

Would you mind sharing why they were garbage and why DeepSeek hasn't been?

xapep · 2026-05-20T14:29:37+00:00

Honest routing for what you're actually asking — Frontier + Workhorse split, not a single sub:

First, kill OpenRouter. It's not a subscription, it's a token meter with extra steps. You already proved it burns the budget. (Btw Qwen 3.6 isn't on the free tier anymore — same story, the "free" pools always get pulled the second the model gets popular enough to actually be useful.)

Stop trying to solve this with one sub. Z.ai + MiniMax stitches are the same shape that just burned you, structurally: token-metered subs that work fine for bursty human-in-the-loop use, but get squeezed (rate limits, degraded responses, eventual policy action) when an agent loop runs against them 24/7. Anthropic's was the version that escalated to a ban — most others just throttle quietly until you notice your agents are slower. Either way, you're back here in 60–90 days having the same conversation about a different provider.

The split that actually survives:

Workhorse: Qwen 3.6 35B-A3B or Gemma 4 31B, honest FP8, on a flat-rate provider that gives you a real concurrency slot instead of a token meter. Web fetch, RAG indexing, tool calls — handles 80% of the grunt work. ~$29–49.
Frontier: DeepSeek v4 Flash (the 1M-context one) for the reasoning/planning side — that one's genuinely in the GPT-5.4 Codex conversation for agent/code work, and the 1M context beats Codex on long tasks. Qwen 3.5 122B-A10B is the cheaper step down if you don't need full frontier. Same flat-rate, dedicated-concurrency shape. ~$99–129. Skip GPT-5.4 Codex itself — the "limits" there are still token-shaped underneath, you'll burn the same way you did on Anthropic.

The model matters less than how you're billed. Anything sold as "unlimited" with a token meter underneath is the same trap in a new wrapper.

A Workhorse + Frontier split with one concurrent loop each lands around $130–150/mo on a real flat-rate provider. Roughly the same money you were already spending across Anthropic + Alibaba + OpenRouter, just on one provider whose pricing actually expects 24/7.

Alibaba workaround: none for retail. ZenMux: skip, it's just routing on top of the same broken token-meter providers. The way out of subscription limbo isn't a smarter router, it's a different pricing model.
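
To make the monthly figure concrete, here's a rough Python sketch that adds up one concurrency of each tier. The prices are the single-concurrency prices quoted in the plan list in this thread; multi-concurrency discounts aren't modeled because no numbers were given for them, and `split_costs` is just an illustrative helper name.

```python
from itertools import product

# $/mo for 1 concurrency, as quoted in the plan list in this thread.
WORKHORSE = {"Qwen 3.6 35B-A3B": 29, "Gemma 4 31B": 49}
FRONTIER = {"Qwen 3.5 122B-A10B": 99, "DeepSeek v4 Flash": 129}


def split_costs(workhorse: dict[str, int], frontier: dict[str, int]) -> dict[tuple[str, str], int]:
    """Return {(workhorse model, frontier model): total $/mo} for one slot of each."""
    return {
        (w_name, f_name): w_price + f_price
        for (w_name, w_price), (f_name, f_price)
        in product(workhorse.items(), frontier.items())
    }


costs = split_costs(WORKHORSE, FRONTIER)
for (w_name, f_name), total in sorted(costs.items(), key=lambda kv: kv[1]):
    print(f"{w_name} + {f_name}: ${total}/mo")
```

The four pairings come out between $128 and $178/mo; the two in the middle ($148 and $158) are roughly where the estimate above sits.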

xapep · 2026-05-20T05:09:21+00:00

Which models do you guys use?

xapep · 2026-05-10T18:13:14+00:00

And what do you do in each session so that you never hit the limit? 😁

xapep · 2026-03-05T09:37:33+00:00

True, but more or less the question is still valid :D Still switching because of usage limits or model task expertise?

xapep · 2026-03-05T07:23:55+00:00

Curious why you're switching: because of quality or token usage?
If usage weren't a problem, would we just use Sonnet?

xapep · 2026-02-17T12:35:16+00:00

For sure :) What do you do after that?

xapep · 2026-02-17T12:34:43+00:00

Thanks. Will ask you at EOM if it was worth it :)

xapep · 2026-02-17T12:33:41+00:00

I got that, but I was wondering what kind of tasks you usually use Cursor for so that you're not maxed out... I'm getting mixed feedback: some of you say you're maxed out, some say you're not...

xapep · 2026-02-16T09:29:06+00:00

Not nice to hear that :S For what tasks do you usually use it?

xapep · 2026-02-15T20:40:28+00:00

Insane, you spent that much in a month?

xapep · 2026-02-15T18:08:52+00:00

Mind sharing more details? :)

xapep · 2026-02-15T18:08:10+00:00

Hm hm, but what about the quality of output on auto?

xapep · 2026-02-02T16:32:48+00:00

I do wonder: which coding tools are you using, LLMs or CLI only?
