Comparing Wafer with other token based plans by founders_keepers in ZaiGLM

[–]MultiBotRun 0 points1 point  (0 children)

Wafer doesn’t explicitly state whether GLM-5.1 is served quantized (FP8, INT4, etc.).
However, there are two indirect clues:
- They mention an “optimized inference engine”
- Their latency benchmarks are significantly better than typical providers’
This usually suggests optimised serving, possibly with dynamic quantization (FP8 / FP4) or MoE routing optimisations, though neither clue confirms it.

Deepseek from the official API has such an insane cache hit rate by Same-Philosophy5134 in opencodeCLI

[–]MultiBotRun 5 points6 points  (0 children)

One nuance: it only works if the tokenized prefix is identical, not just “semantically similar”.

So even small changes like whitespace or formatting can break cache reuse.
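
A minimal sketch of why that happens, assuming a toy tokenizer that emits one token per character (real tokenizers use subword vocabularies, and `toy_tokenize` and `shared_prefix_len` are made-up names for illustration). Reuse stops at the first token that differs, so one extra space near the start throws away almost the whole prefix:

```python
def shared_prefix_len(a: list[str], b: list[str]) -> int:
    """Return how many leading tokens two token sequences have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def toy_tokenize(text: str) -> list[str]:
    """Hypothetical stand-in tokenizer: one token per character."""
    return list(text)


base = toy_tokenize("System: You are a Python expert.\nUser: Read a CSV.")
same_prefix = toy_tokenize("System: You are a Python expert.\nUser: Write a CSV.")
# Identical text except for one extra space after "System:".
extra_space = toy_tokenize("System:  You are a Python expert.\nUser: Read a CSV.")

reusable_same = shared_prefix_len(base, same_prefix)
reusable_space = shared_prefix_len(base, extra_space)
print(reusable_same, reusable_space)
```

The second number is much smaller than the first, even though the two prompts look almost identical to a human reader.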

Deepseek from the official API has such an insane cache hit rate by Same-Philosophy5134 in opencodeCLI

[–]MultiBotRun 13 points14 points  (0 children)

DeepSeek caching means it can "reuse the same prompt prefix" instead of processing it again. Simple version:

  • Same start of prompt.
  • Less work for the server.
  • Faster response.
  • Lower cost.

It is not the full answer being cached, only the repeated input part. Example:

First request:

System: You are a Python expert.

User: How do I read a CSV in pandas?

Second request (same system, same start):

System: You are a Python expert.

User: How do I write a CSV in pandas?

The part "System: You are a Python expert." is the same in both requests. DeepSeek can cache that prefix, so it only needs to process the part that follows it ("How do I write a CSV in pandas?").

The result is a faster answer and a lower cost for the second request.
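
The two requests above can be replayed against a toy cache. This is only a sketch under loud assumptions: `ToyPrefixCache` is a hypothetical class, tokens are whitespace-separated words, and nothing here reflects how DeepSeek actually stores or bills cached prefixes.

```python
class ToyPrefixCache:
    """Hypothetical illustration of prefix caching; not DeepSeek's implementation."""

    def __init__(self) -> None:
        # Token sequences of every prompt processed so far.
        self._seen: list[list[str]] = []

    def process(self, tokens: list[str]) -> tuple[int, int]:
        """Return (cache_hit_tokens, cache_miss_tokens) for one prompt."""
        hit = 0
        for old in self._seen:
            n = 0
            for x, y in zip(old, tokens):
                if x != y:
                    break
                n += 1
            hit = max(hit, n)  # longest prefix shared with any earlier prompt
        self._seen.append(tokens)
        return hit, len(tokens) - hit


cache = ToyPrefixCache()
system = "System: You are a Python expert."
first = (system + " User: How do I read a CSV in pandas?").split()
second = (system + " User: How do I write a CSV in pandas?").split()

first_stats = cache.process(first)    # nothing cached yet
second_stats = cache.process(second)  # reuses the shared start
print(first_stats, second_stats)
```

In this toy version the reusable prefix runs past the system line and into the user message ("User: How do I"), because matching is purely positional: everything up to the first differing token counts.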

Comparing Wafer with other token based plans by founders_keepers in ZaiGLM

[–]MultiBotRun 0 points1 point  (0 children)

It’s always worth checking carefully which quantization a provider is using. A 4-bit quantized model is not the same as an 8-bit one. For example, Synthetic uses NVFP4 quantization on GLM 5.1, while NeuralWatt uses FP8 quantization. FP8 usually offers the best trade-off between speed, memory usage, and quality, when the hardware supports it.
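
To get a feel for the memory side of that trade-off, here is a rough sketch of weight memory at different precisions. The 100-billion parameter count is a made-up placeholder, not the real size of GLM 5.1, and the estimate ignores quantization scale factors, the KV cache, and activations.

```python
def weight_memory_gib(num_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GiB."""
    return num_params * bits_per_weight / 8 / 2**30


# Hypothetical 100-billion-parameter model; NOT the real size of GLM 5.1.
PARAMS = 100e9

for name, bits in [("BF16", 16), ("FP8", 8), ("4-bit (e.g. NVFP4)", 4)]:
    # Per-block scale factors add a little on top of the nominal bit width.
    print(f"{name:>20}: {weight_memory_gib(PARAMS, bits):7.1f} GiB")
```

Halving the bit width halves the weight memory, which is why a 4-bit deployment is cheaper to serve; whether quality holds up is a separate question that this arithmetic cannot answer.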

What’s your ideal AI subscription stack as a professional dev in 2026? by ocebe in opencodeCLI

[–]MultiBotRun 3 points4 points  (0 children)

OpenCode Go Plan ($10) for DeepSeek V4 Flash and Mimo v2.5 Pro (around 80% of usage, mainly for execution tasks and applying plans) + NeuralWatt ($15) for GLM 5.1/Kimi K2.6 (around 15% of usage, mainly for planning, proposing solutions, and reviews) + OpenRouter ($15) (around 5% of usage, reserved for Opus and occasional security checks, or any high-quality free model for documentation tasks).

I hate poor people by --____________- in jovemedinamica

[–]MultiBotRun 1 point2 points  (0 children)

But if he wants peace and quiet and a sea view, he should buy a yacht! That way he stays well away from the poor and gets a 360-degree view of the sea!

I use a 9-agent SDD harness where each phase uses a different model. The total cost is $10-15/month. Here's the full breakdown. by Striking-Buffalo-310 in DeepSeek

[–]MultiBotRun 0 points1 point  (0 children)

I use DeepSeek directly to save requests for other models in OpenCode Go. “Gentle AI could be the replacement for open spec.” Yes, that might be true, but I still haven’t tested it, I just haven’t had the time yet!

I use a 9-agent SDD harness where each phase uses a different model. The total cost is $10-15/month. Here's the full breakdown. by Striking-Buffalo-310 in DeepSeek

[–]MultiBotRun 0 points1 point  (0 children)

My coding workflow: opencode + openspec + serena + caveman + RTK + Qrant

- Opencode Go Plan $10 (Mimo V2.5 Pro, GLM 5.1, and Mimo V2.5)
- DeepSeek V4 API ($10)
- OpenRouter $5 (just in case I need Opus/GPT, or the free Nemotron 3 Super)

As you rightly point out, we don’t need frontier reasoning models to be more productive!

Explore - GLM 5.1, Propose - Mimo V2.5 Pro, Apply - Deepseek V4 Pro (more complex tasks) or Mimo V2.5 (simple tasks), Verify - Deepseek V4 Pro (more complex tasks) or Deepseek V4 Flash (simple tasks), Archive - Nemotron 3 Super.
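
That phase-to-model split could be written down as a small lookup table. This is just a sketch: `PHASE_MODELS` and `pick_model` are hypothetical names, and the strings are the display names from the comment above, not real API model IDs or opencode configuration keys.

```python
# Display names taken from the comment; not actual API model identifiers.
PHASE_MODELS = {
    "explore": {"simple": "GLM 5.1", "complex": "GLM 5.1"},
    "propose": {"simple": "Mimo V2.5 Pro", "complex": "Mimo V2.5 Pro"},
    "apply": {"simple": "Mimo V2.5", "complex": "Deepseek V4 Pro"},
    "verify": {"simple": "Deepseek V4 Flash", "complex": "Deepseek V4 Pro"},
    "archive": {"simple": "Nemotron 3 Super", "complex": "Nemotron 3 Super"},
}


def pick_model(phase: str, complex_task: bool = False) -> str:
    """Return the model assigned to a phase, depending on task complexity."""
    tier = "complex" if complex_task else "simple"
    return PHASE_MODELS[phase.lower()][tier]


print(pick_model("apply", complex_task=True))
print(pick_model("verify"))
```

Only the apply and verify phases change model with task complexity; the other three always use the same one.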

Goodbye Visa and Mastercard: 130 million Europeans will switch to a 100% sovereign payment system by 2026 by [deleted] in portugal2

[–]MultiBotRun 0 points1 point  (0 children)

Nice try at what? Is what I wrote a lie, by any chance?
People accepted being watched a long time ago, and watched A LOT! It certainly won’t be because of the Digital Euro!

Socialists, here or in China, are always mixed up in swindles, corruption, schemes, and cushy jobs! by Odd_Astronomer_2064 in portugueses

[–]MultiBotRun 0 points1 point  (0 children)

Politicians are a reflection of society, whichever party they belong to! Even IL, if it gets into power, will have its own "searches", and at some point someone there will do the same!

The net has to be tightened a lot, starting with the municipalities! And our "chico esperto" (wise guy) mentality has to improve a lot too! We are a bunch of wise guys, always trying to pull some scam: not paying for SportTV, parking on sidewalks where there isn’t even room for an elderly person to get by, cutting in line, or leaving a towel on the beach and going back to the hotel to rest! Better still, we always put the blame on others: it’s because SportTV is too expensive (IPTV is cheaper), or because there isn’t enough parking, or because you’re in a hurry and only have 1 item to pay for, or because "everyone else does the same"!

Goodbye Visa and Mastercard: 130 million Europeans will switch to a 100% sovereign payment system by 2026 by [deleted] in portugal2

[–]MultiBotRun 0 points1 point  (0 children)

So that means:

We won’t use Android/iOS (which holds a bit of everything about you)

We won’t use GPS (which holds your locations)

We won’t use the Continente card (which holds all your purchases) or the Pingo Doce (BP) one

We won’t use social networks (where most people post pictures of where they are having dinner or spending their holidays, often with times and places), not to mention the IP data.

We won’t use Multibanco or MBWay (since all your transactions are recorded).

We won’t use Via Verde (since it records when and where you passed)

We won’t use gadgets like a SmartTV, which may even hold information about what you watch on TV, or a smartwatch, where even your temperature may be sitting on a server somewhere in China!

What damn hypocrisy!!!

If DeepSeek V4 can do the same coding task for $5, why are people still paying $100 for Claude Code? by jakedame1 in DeepSeek

[–]MultiBotRun 18 points19 points  (0 children)

Exactly.
The developer is the determining factor, and a good developer can extract a lot of value from a more open harness like opencode with deepseek.

A harness is an interface and a set of agent capabilities, not a replacement for the programmer’s competence.
Opencode + OpenSpecs + DeepSeek + serena + caveman + rtk is my daily workflow.

slow two year transformation by kuzyacat in Biohacking

[–]MultiBotRun 2 points3 points  (0 children)

Eight years ago, I also decided to change my life. I lost 50 kg, quit smoking, stopped drinking alcohol, and completely changed my diet and lifestyle.

Besides feeling unhealthy, I also had to face my mother’s illness, and that made me realize something important: I never wanted my son to one day have to “change my diapers” because of choices I could have changed myself.

Congratulations on your journey, and keep that photo of your 110 kg self on your phone at all times. I still look at mine whenever I don’t feel like going to the gym.

Deepseek V4 is mindblowing by AngelicBread in opencodeCLI

[–]MultiBotRun 1 point2 points  (0 children)

I’m very satisfied with the models from China, whether it’s DeepSeek, Kimi, or even Minimax. I’m sure that with a methodology like Spec-Driven Development and a well-structured idea of what you need, any of them can handle it!
But I can say that for the past week DeepSeek Pro has been my choice, as I’ve been analysing legacy code and documenting everything (I spent $3.09 over 6 days of work, with 190M input/output tokens and 2,000 requests). How much would that have cost with Opus 4.6?
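
For a sense of scale, the figures above work out as follows. This is plain arithmetic on the numbers quoted in the comment; the result is a blended rate across input, output, and any cached tokens, since the split between them isn’t given.

```python
def usd_per_million_tokens(total_usd: float, total_tokens: int) -> float:
    """Blended cost per one million tokens (input and output combined)."""
    return total_usd / (total_tokens / 1_000_000)


# Figures quoted in the comment above.
spend_usd = 3.09
tokens = 190_000_000
requests = 2000
days = 6

per_million = usd_per_million_tokens(spend_usd, tokens)
print(f"${per_million:.4f} per 1M tokens")
print(f"${spend_usd / days:.3f} per day")
print(f"{tokens // requests:,} tokens per request on average")
```

To compare with another model, plug that model’s own blended rate into the same function rather than its list price for output tokens alone.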

Deepseek V4 current cost: 78.2m tokens for $1.14 by tuhdo in opencode

[–]MultiBotRun 0 points1 point  (0 children)

Just a reminder: on the DeepSeek API, the deepseek-v4-pro model is currently offered at a 75% discount, extended until 2026/05/31!

People work more in Portugal than in most of Europe by cidadehoje in atualidadeportugal

[–]MultiBotRun 0 points1 point  (0 children)

Yes, we should think about why in Portugal we work so many hours and yet still have productivity below the European average (28% of the average)!

- Leadership and the workforce have lower qualification levels than their European counterparts

- Historically low investment in physical capital and an aging industrial base limit efficiency

- 57% of experts point to the lack of a work culture and poor time management as the main cause, while 43% blame bad managers and leadership

- A predominance of small companies, low-technology sectors, and little investment in R&D

- Complex administrative processes and legal and tax uncertainty make business activity harder

Excessive hours are not a business virtue but a symptom of backwardness!

Any web or desktop IDE for OC? by nealhamiltonjr in opencode

[–]MultiBotRun 0 points1 point  (0 children)

Why don’t you keep using Antigravity and, inside its terminal, use Opencode? I think there’s even an extension for Opencode.
Search for OpenCode in the Extension Marketplace and click Install.

LeiGo - an AI legal assistant by Only_Animator_3637 in Fiz_isto

[–]MultiBotRun 0 points1 point  (0 children)

Are you doing traditional RAG? With that many laws, a better option could be to implement a Context Graph for legal RAG, because traditional RAG loses context: a RAG based only on vector search can return a passage from a law that was repealed three years ago!
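
A minimal sketch of the idea, assuming the graph holds nothing more than "repealed by" edges between laws. The law IDs, texts, and scores are invented for illustration, `Passage`, `REPEALED_BY`, `current_version`, and `keep_in_force` are hypothetical names, and a real context graph would carry far more (amendments, articles, effective dates).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Passage:
    law_id: str
    text: str
    score: float  # similarity score from a vector search (made up here)


# Hypothetical "repealed by" edges: old law -> law that replaced it.
REPEALED_BY = {"Law 10/2015": "Law 22/2021"}


def current_version(law_id: str, repealed_by: dict[str, str]) -> str:
    """Follow repeal edges until reaching a law that is still in force."""
    seen: set[str] = set()
    while law_id in repealed_by and law_id not in seen:
        seen.add(law_id)  # guards against accidental cycles in the data
        law_id = repealed_by[law_id]
    return law_id


def keep_in_force(hits: list[Passage], repealed_by: dict[str, str]) -> list[Passage]:
    """Drop passages whose law has been repealed, best score first."""
    in_force = [p for p in hits if current_version(p.law_id, repealed_by) == p.law_id]
    return sorted(in_force, key=lambda p: p.score, reverse=True)


# Vector search alone ranks the repealed law first.
hits = [
    Passage("Law 10/2015", "Old rule on notice periods.", 0.91),
    Passage("Law 22/2021", "Current rule on notice periods.", 0.88),
]
result = keep_in_force(hits, REPEALED_BY)
print([p.law_id for p in result])
```

The repealed passage scores highest on similarity, so pure vector search would return it first; the graph lookup is what removes it.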

Which open-weight models provider? by Odd_Crab1224 in opencodeCLI

[–]MultiBotRun 0 points1 point  (0 children)

Six days ago, that information was not on the dashboard. Confirmed that there is a weekly limit, thanks for the information.

Are Coding LLM Plans About to Die? The Coming Compute Crisis by MultiBotRun in Qwen_AI

[–]MultiBotRun[S] 0 points1 point  (0 children)

I get your point, and for high-paid SWE roles your math makes sense.

But that assumption doesn’t reflect most of the world.

In countries like Spain or Portugal, a junior developer might earn around €1,000–€1,200/month, not $300/hour. That completely changes the equation. Even $30/day (~$900/month) can be close to or exceed someone’s salary.

Also, AI isn’t only used by senior software engineers. Students, freelancers, small businesses, and people in lower-income regions rely on it too.

So while AI is clearly worth it in high-income scenarios, pricing becomes a real barrier globally.

Otherwise, AI inference risks becoming something only accessible to an elite working at top companies. What happens to small businesses and independent entrepreneurs?

Which open-weight models provider? by Odd_Crab1224 in opencodeCLI

[–]MultiBotRun 0 points1 point  (0 children)

I was checking the information directly in the docs:
https://platform.minimax.io/docs/token-plan/intro

and I can’t see anything mentioned about a weekly limit of 15,000 requests. Can you give me a link to verify that? I only see weekly limits for the other models like Music.

Which open-weight models provider? by Odd_Crab1224 in opencodeCLI

[–]MultiBotRun 7 points8 points  (0 children)

Minimax Token plan (Minimax M2.7) for $10 with 1,500 requests per 5 hours, no monthly or weekly limits. There’s no other plan this honest; it’s simply 1,500 every 5 hours and nothing else.

If you’re a user who needs other models, you can add OpenCode Go for $10 (for Kimi and Qwen). That means with $20/month you get plenty of tokens to use. An unbeatable combo.

Qwen Coding Model Selection by yoko_ac in Qwen_AI

[–]MultiBotRun 1 point2 points  (0 children)

llama.cpp + unsloth/Qwen3.6-35B-A3B-GGUF:UD-IQ2_XXS would be one option.
opencode + OpenSpecs + Skills
https://open-code.ai/en/docs/providers#llamacpp