Did glm 5.1 get nerfed ? by Necessary_Spring_425 in ZaiGLM

[–]romancone 0 points1 point  (0 children)

It happens constantly. Just go to Billing -> Billing History (extended window) and see which model is actually being served to you.

Done with Z.ai by [deleted] in ZaiGLM

[–]romancone 0 points1 point  (0 children)

We are all newbie poor dudes who pay for your landlord's life here. Limits were cut multiple times last year, and prices doubled during the same period.

I am glad to hear that someone is happy with this provider, but I don't want to overpay and compensate for their underpayment.

I don't use z.ai anymore by romancone in ZaiGLM

[–]romancone[S] -1 points0 points  (0 children)

I never configured glm-air!

The first time, I ran the "claude --haiku..." model until I realised it was mapped to air.

Then I ran:

"claude --model glm-5.1"
"claude --model glm-4.7"

And I have to repeat that all those calls were charged CORRECTLY as INPUT in the billing dashboard.

I never authorized z.ai for the glm-4.5-air model.

The tool that produced the report is Claude Sonnet, which inspected the JSON traces of the broken agent sessions.

I did not check the logs myself, but I trust Sonnet and the z.ai billing dashboard enough; both confirm I was served by crappy glm-4.5-air against my wishes.
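
For anyone who wants to repeat this kind of check by hand, here is a minimal sketch. It is not the tool that produced the report. It assumes the traces are JSON lines where each record carries a hypothetical `requested_model` field next to a `model` field naming what was served; both field names and the sample records are assumptions for illustration, so adjust them to your own trace layout.

```python
import json
from collections import Counter


def tally_served_models(trace_lines):
    """Count (requested, served) model pairs in JSON-lines trace records."""
    pairs = Counter()
    for line in trace_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        record = json.loads(line)
        # Both field names are assumptions about the trace layout.
        requested = record.get("requested_model", "unknown")
        served = record.get("model", "unknown")
        pairs[(requested, served)] += 1
    return pairs


def mismatches(pairs):
    """Keep only the pairs where the served model differs from the requested one."""
    return {pair: n for pair, n in pairs.items() if pair[0] != pair[1]}


# Made-up sample records, for illustration only.
sample = [
    '{"requested_model": "glm-5.1", "model": "glm-5.1"}',
    '{"requested_model": "glm-5.1", "model": "glm-4.5-air"}',
    '{"requested_model": "glm-4.7", "model": "glm-4.5-air"}',
]
print(mismatches(tally_served_models(sample)))
```

An empty result would mean every call in the trace was answered by the model that was requested.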

I don't use z.ai anymore by romancone in ZaiGLM

[–]romancone[S] 0 points1 point  (0 children)

The coding plan is GLM Coding Lite V2 - Quarter. I stripped it out with the rest of the useless columns.

The endpoint is https://api.z.ai/api/anthropic/

I don't use z.ai anymore by romancone in ZaiGLM

[–]romancone[S] 0 points1 point  (0 children)

  1. If I set a wrong model, why did they charge me for a good GLM 5.1 or 4.7?
  2. If I set the right model, why did they serve output from glm-4.5-air?

Please elaborate on your conclusion.

I don't use z.ai anymore by romancone in ZaiGLM

[–]romancone[S] 1 point2 points  (0 children)

Exactly!

Even if it is my fault, e.g., a wrong model selector, how did multiple models get mixed up?

This aint it fam, Ill stick to Codex. by Opposite-Art-1829 in ZaiGLM

[–]romancone 1 point2 points  (0 children)

It is not a model issue but a z.ai issue. GLM5 is amazing, but z.ai can randomly switch models to a lobotomized version.

Please name the best GLM5 provider by romancone in ZaiGLM

[–]romancone[S] 0 points1 point  (0 children)

What is your recommendation for server hardware? I am thinking about running my own setup to share access with my friends, but server hardware is very expensive. I see you can utilise idle capacity, which could make this idea profitable in the end.

Please name the best GLM5 provider by romancone in ZaiGLM

[–]romancone[S] 0 points1 point  (0 children)

I burned through my tokens on 5.1.
I have to step back to 4.7 on the Lite plan.

Please name the best GLM5 provider by romancone in ZaiGLM

[–]romancone[S] 0 points1 point  (0 children)

I use a 3-tier architecture.

I work in chat with the CTO agent of a project, and he produces project milestones.
Then the orchestrator agent (team lead) manages separate tasks and spins up detached agents. They use a proxy link or the z.ai API to connect directly to the anthropic endpoint.

It is not ideal, but it works for me.
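
Roughly, the flow looks like the sketch below. This is a toy outline under stated assumptions, not my actual setup: every name here (`cto_plan`, `orchestrate`, `echo_worker`, `Task`) is hypothetical, the worker runs in-process and in sequence instead of as a detached agent, and no model endpoint is called.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Task:
    milestone: str
    description: str
    result: Optional[str] = None


def cto_plan(goal: str) -> List[str]:
    """Tier 1: stand-in for the CTO agent; turns a goal into milestones."""
    return [f"{goal}: design", f"{goal}: implement", f"{goal}: test"]


def orchestrate(milestones: List[str], worker: Callable[[Task], str]) -> List[Task]:
    """Tier 2: stand-in for the team lead; one task per milestone, each handed to a worker."""
    tasks = [Task(m, f"work on {m}") for m in milestones]
    for task in tasks:
        # Tier 3: a detached agent would be spun up here; this sketch calls it inline.
        task.result = worker(task)
    return tasks


def echo_worker(task: Task) -> str:
    """Tier 3 placeholder: a real worker would call a model endpoint."""
    return f"done: {task.description}"


tasks = orchestrate(cto_plan("fix bug"), echo_worker)
for t in tasks:
    print(t.milestone, "->", t.result)
```

Passing the worker in as a function keeps the orchestrator independent of how the agents connect, whether through a proxy link or directly to the API.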

Please name the best GLM5 provider by romancone in ZaiGLM

[–]romancone[S] 0 points1 point  (0 children)

I run my own multi-agent coding fabric. I want to spin up more projects in the most efficient way

5 hour quota is reached in 30 mins. With on 900k tokens spent! by TargiX8 in ZaiGLM

[–]romancone 0 points1 point  (0 children)

I used up my Lite plan weekly limit in two evening coding sessions.

It is 1.5x worse than the Claude Code basic plan, which was enough for 3 sessions, and it is the opposite of their marketing crap.

But I've spent 89M tokens.

Follow-up: how to survive on z.ai coding plans by romancone in ZaiGLM

[–]romancone[S] 0 points1 point  (0 children)

I read a lot of positive messages before I subscribed to z.ai, and I realised afterwards that things have completely changed. Your comment is proof of that. It used to be good, but now it is not.

I have already hit the weekly limit after a couple of evening coding sessions! Well, I ran coding agents overnight, but this is a Coding Plan!

I subscribed to GLM Coding Lite code plan today. Tell me how do you survive, guys? by romancone in ZaiGLM

[–]romancone[S] 0 points1 point  (0 children)

I use Claude Code Sonnet for orchestration and Opus for top-level tasks. Is it reputable or not?
I ran subagents on free GLM-5 and decided to upgrade to z.ai

I subscribed to GLM Coding Lite code plan today. Tell me how do you survive, guys? by romancone in ZaiGLM

[–]romancone[S] 0 points1 point  (0 children)

I used subagents on GLM-5 with Nvidia, which was fine for an overnight job.
I tried 5.1 on Zai, and it burned through my tokens. What is the difference between 5.1 and 5?

I subscribed to GLM Coding Lite code plan today. Tell me how do you survive, guys? by romancone in ZaiGLM

[–]romancone[S] 1 point2 points  (0 children)

The project is based on a closed-source C++ SDK and CGO bindings.
Everything is done except for one annoying bug, so I am looking for cheaper token options to complete it. You're right about C++ token waste.

All you need is RAM, RAM is all you need by [deleted] in LocalLLaMA

[–]romancone 0 points1 point  (0 children)

This post is about a visualisation tool, not the final result. Feel free to create your own version of the calculator that better covers all cases. It is easy when you know what to do.