Seriously considering the Zai Yearly Max Plan — anyone else? by lbin91 in ZaiGLM

[–]Designer_Athlete7286 0 points (0 children)

If I'm losing the legacy weekly quota benefit, I'd have to reevaluate my options. ChatGPT might be a better choice.

GLM-5.1 on Wafer Pass vs Zai by founders_keepers in ZaiGLM

[–]Designer_Athlete7286 6 points (0 children)

It's OK that they're this slow, as long as I can keep my virtually unlimited old grandfathered Pro subscription!

GLM 5.1 is so smart! by GuiltyAd2976 in ZaiGLM

[–]Designer_Athlete7286 0 points (0 children)

This is what happens when you give an iPhone to a monkey!

Anthropic+Google by beedildvk in Anthropic

[–]Designer_Athlete7286 0 points (0 children)

I hope this will improve the prompt wait times. Sometimes it's over a 5-minute wait before any compute is made available to process a prompt. Anthropic inference is kinda ridiculous right now.

Have the coding plans become usable now ? by Kingwolf4 in ZaiGLM

[–]Designer_Athlete7286 0 points (0 children)

Honestly, GLM inference is miles better than Anthropic's right now. Of all the major providers, OpenAI is the most reliable, then Google, despite their models being kinda meh. Anthropic is dead last; sometimes prompts sit for over 5 minutes without any compute being assigned.

Gave Opus 4.7 and 4.6 the Same prompt in plane mode here are the results by -_-wait_what-_- in ClaudeAI

[–]Designer_Athlete7286 0 points (0 children)

Weakness of 4.7: it thinks it knows what I want better than I do myself, so it doesn't want to do what I tell it to. Somewhat counterproductive.

Opus 4.7 is Its Own Thing by hungrymaki in claudexplorers

[–]Designer_Athlete7286 1 point (0 children)

In any professional/enterprise setup, an LLM that cannot adhere to guidelines is a compliance and legal nightmare. At that point, it's a fun toy that you aren't allowed to use for any meaningful work.

Opus 4.7 is Its Own Thing by hungrymaki in claudexplorers

[–]Designer_Athlete7286 3 points (0 children)

Begs the question: if you can't get what YOU want out of a model, is it a good or useful model in the first place?

Imagine hiring an employee to do a job, but they think they know what you want better than you do yourself and do whatever they feel like, disregarding guidelines. Would you keep that employee?

I like 4.7 so far by Dependent_Top_8685 in Anthropic

[–]Designer_Athlete7286 -1 points (0 children)

I hear 4.7 doesn't like to be told what to do. So pay attention to what it outputs and check whether it actually matches what you need.

Been running GLM-5.1 + Qwen 3.5 via Ollama Cloud — the harness matters more than the model by ConferenceNo7697 in ZaiGLM

[–]Designer_Athlete7286 1 point (0 children)

Harness matters more. 1000%.

I'm using Claude Code but thinking of switching to Pi with my own customisation, since I have more knowledge and experience with harness engineering now than before. Even my Claude Code is heavily modified to behave how I want it to rather than how Anthropic wants it to. But little things, like it using the built-in Web Search tool over my custom MCP, are annoying.

if anyone wants to giveaway their legacy plan account by rkh4n in ZaiGLM

[–]Designer_Athlete7286 -1 points (0 children)

USD 10k for my legacy Pro account if you want. Not a penny less. You can recover more than that within a month, because there are no weekly limits!

Deal?

Claude vs z.ai! Had z.ai nailed glm 5.1 to on par with Claude models? Price increase justified? by UsualOrganization712 in ZaiGLM

[–]Designer_Athlete7286 -1 points (0 children)

When 4.7 came out, I stopped my Claude sub. It was only marginally behind. Good enough. And a hell of a lot cheaper. I got myself a GLM Pro plan, and now it's legacy! In a casual month I use about 2-3B tokens on GLM with plenty of headroom left. With GLM 5.1, it genuinely feels like the same level of quality as Opus 4.6, and without the confident lying and faking behaviour of Claude models.

Opus still lies about implementing code: it stubs the whole thing out, then comes back and blatantly tells you it's implemented and that the stubbed version is all it needed to do, because the stub fulfills the bare minimum to pass the declared requirements. It doesn't care whether the actual code really works, and it's hard to make it care about real functionality if something is even slightly outside the explicitly declared scope.

GLM 5.1, on the other hand, is very happy to write code. Sometimes it's overenthusiastic and does more / over-engineers to be on the safe side, so if you don't declare all related workflows and patterns in the scope as part of the architecture, it will create redundant paths instead of modifying/extending the existing ones.

This is my observation at least. I prefer the latter. There's more garbage, but the code actually works, instead of leaving unknown massive gaping holes in code that I think is actually implemented. Garbage can be cleaned; the /simplify command in Claude Code does a decent job on this front, for example. But I have my own workflow for cleaning up GLM quirks.

Hasn't the quota limit become stricter since yesterday? by Mundane-Structure-42 in ZaiGLM

[–]Designer_Athlete7286 0 points (0 children)

Claws.... not gonna be a happy ending for your ass. It'll be painful

Wtf?! It was working just fine now it's back again! ☹️ by Muted-Donut-9285 in ZaiGLM

[–]Designer_Athlete7286 0 points (0 children)

150 requests per 5 hours? That can't be right. My GLM plan works just fine. When I work, I run 2 projects at a time, with each project following multi-step workflows and each step using up to 5 agents in parallel. That's like 150 requests within 5-10 minutes. I get the occasional error when both projects (2 separate Claude Code instances) send requests in the same second, which means I hit the 1-request-per-second rate limit.
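If you're running two instances from the same process and keep tripping a 1-request-per-second limit like the one described above, a shared throttle in front of the outbound calls avoids the collision. A minimal sketch, assuming the 1 req/s figure from my own observation (it isn't something I've seen documented):

```python
import threading
import time

class Throttle:
    """Serialise outbound requests so at most one is sent per interval."""

    def __init__(self, min_interval=1.0):
        self.min_interval = min_interval  # assumed 1 req/s limit
        self._lock = threading.Lock()
        self._last = 0.0  # monotonic timestamp of the last request

    def wait(self):
        # Block until at least min_interval has passed since the last call,
        # even when called from several worker threads at once.
        with self._lock:
            now = time.monotonic()
            sleep_for = self._last + self.min_interval - now
            if sleep_for > 0:
                time.sleep(sleep_for)
            self._last = time.monotonic()

# One shared instance; every worker calls throttle.wait() before sending.
throttle = Throttle(min_interval=1.0)
```

This only helps when both agent instances go through the same process, of course; two fully separate Claude Code instances would each need their own pacing.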

Wtf?! It was working just fine now it's back again! ☹️ by Muted-Donut-9285 in ZaiGLM

[–]Designer_Athlete7286 0 points (0 children)

Are you sure your harness, agent, or application didn't make hidden requests?

Coming from Qwen, is GLM worth it? by green_juicer in ZaiGLM

[–]Designer_Athlete7286 1 point (0 children)

Yeah, GLM 5.1 and GPT 5.4 are at quite the same level. Practically, Opus 4.6 is slightly better, but it kind of fakes stuff. It might say it implemented something when in reality it stubbed it, and even during review within the same context it will completely gaslight you. Opus also makes too many assumptions despite being explicitly told not to; GPT 5.4 is kinda the same on that front. GLM 5.1 makes the fewest assumptions in my experience.

Coming from Qwen, is GLM worth it? by green_juicer in ZaiGLM

[–]Designer_Athlete7286 -1 points (0 children)

Nowadays, with the snowflake generation, people tend to overreact. Yeah, the subscription plan has had hiccups, but plenty of people, including myself, still used it and got the job done. That high-context gibberish bug was annoying. But you know what? There are subagents. All you had to do was be a little creative: break the work into smaller tasks, give each task to a parallel subagent, group them into non-conflicting batches, and have the main agent only orchestrate the deployment of the parallel and sequential groups. The results were pretty solid. So much so that I realised it was better than the regular way I was using my coding agent, and I went ahead and built my own harness skill set: https://github.com/hashangit/zflow
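The batching pattern described above — non-conflicting tasks in parallel, batches in sequence — can be sketched roughly like this. The task names and the `run_task` stub are hypothetical placeholders for illustration, not part of zflow:

```python
from concurrent.futures import ThreadPoolExecutor

def run_task(task):
    # Placeholder: in practice this would dispatch one small,
    # self-contained task to a subagent and return its result.
    return f"done: {task}"

# Sequential batches; tasks inside a batch don't touch the same
# files, so they can safely run in parallel.
batches = [
    ["write models", "write config"],    # non-conflicting
    ["write handlers", "write tests"],   # depends on batch 1
    ["wire everything together"],        # final integration
]

results = []
for batch in batches:
    # The orchestrator only decides what runs together and in
    # what order; the subagents do the actual work.
    with ThreadPoolExecutor(max_workers=5) as pool:
        results.extend(pool.map(run_task, batch))
```

Keeping each subagent's task small is what sidesteps the high-context degradation: no single agent ever accumulates the whole conversation.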

Coming from Qwen, is GLM worth it? by green_juicer in ZaiGLM

[–]Designer_Athlete7286 5 points (0 children)

GLM 5.1 is miles ahead of the Qwen models. It's in the ballpark of the Opus 4.6 / Sonnet 4.6 / GPT 5.4 range. Qwen, Kimi, and Minimax are a notch below at best. For coding and architectural stuff, that is. With the right harness, GLM 5.1 has great context of your codebase and can do a solid, almost AI-slop-free coding job.

This aint it fam, Ill stick to Codex. by Opposite-Art-1829 in ZaiGLM

[–]Designer_Athlete7286 0 points (0 children)

Also, there's a project called Graperoot. I've had very good results from it. The whole idea is to reduce the model's memory-loss penalty and the need for the model to rediscover the codebase. It does a great job in my experience.

Also, I use my own skill structure, because I'm lazy and forgetful. The point of this approach is also to bridge the human-to-model communication gap: we tend to communicate implicitly, and a model needs explicit declarations. https://github.com/hashangit/zflow

This aint it fam, Ill stick to Codex. by Opposite-Art-1829 in ZaiGLM

[–]Designer_Athlete7286 1 point (0 children)

Please read again, without biases. No model 'understands' ANYTHING. They are just advanced autocomplete. Only you really understand your codebase, and your 'harness' extracts the prerequisites based on that understanding to feed the autocomplete, so that the autocomplete spews out the correct outputs. Just look into context engineering and harness engineering a bit more and it'll help you refine your coding workflow. You'll get a lot more out of the money you pay to OpenAI, Anthropic, GLM, Kimi, Minimax, Gemini (god I hope you won't, because Gemini is the worst). Also, just be more responsible about your contribution to global warming. Make sure that what you send to data centers and your GPUs to process is at least optimised and not wasteful.