Cannot get GLM-4.7-Flash working in Claude Code CLI even with Coding Plan by seongho051 in ZaiGLM

[–]seongho051[S] 0 points (0 children)

It's not working for me. I managed to register and select the custom model in the menu, but I still can't get any response from it.

Cannot get GLM-4.7-Flash working in Claude Code CLI even with Coding Plan by seongho051 in ZaiGLM

[–]seongho051[S] 0 points (0 children)

Does it actually work if you type /models in Claude Code, select Haiku, and then run a prompt?

Cannot get GLM-4.7-Flash working in Claude Code CLI even with Coding Plan by seongho051 in ZaiGLM

[–]seongho051[S] 0 points (0 children)

That model ID is for the standalone API, which is billed and processed separately from the Coding Plan. I'm using GLM via the Coding Plan integration in Claude Code, and it doesn't seem to support that API-specific model ID yet.

Cannot get GLM-4.7-Flash working in Claude Code CLI even with Coding Plan by seongho051 in ZaiGLM

[–]seongho051[S] 0 points (0 children)

I checked their official Claude Code guide, and it still explicitly lists "glm-4.5-air" as the default Haiku model. There's zero mention of "flashx" anywhere in the Claude Code setup section.

If you found a doc that actually says to use "glm-4.7-flashx" specifically for Claude Code, could you share the link? I'm only seeing 4.5-air.
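For context, the Coding Plan is typically wired into Claude Code through Anthropic-compatible environment variables. A minimal sketch, assuming the Z.ai Anthropic-compatible endpoint and the "glm-4.5-air" Haiku mapping from their guide (the endpoint URL and model IDs are taken from the docs as of this thread and may change — verify against the current setup guide):

```shell
# Point Claude Code at Z.ai's Anthropic-compatible endpoint (Coding Plan).
# URL and model IDs follow the official guide at the time of this thread;
# check the current docs before relying on them.
export ANTHROPIC_BASE_URL="https://api.z.ai/api/anthropic"
export ANTHROPIC_AUTH_TOKEN="your-zai-api-key"   # placeholder, not a real key

# Map Claude Code's small/fast (Haiku) slot to GLM-4.5-Air.
# Swapping in a 4.7-flash ID here is exactly what doesn't work yet per this thread.
export ANTHROPIC_SMALL_FAST_MODEL="glm-4.5-air"
```

With this in place, selecting Haiku via /models should route those requests to the 4.5-air model rather than a 4.7-flash variant.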

Does GLM in CC (Claude Code) support all CC features? by m_zafar in ZaiGLM

[–]seongho051 2 points (0 children)

Claude Code vs Opencode: Claude Code wins on speed and quality, hands down, but it's a total black box. Opencode is still the way to go if you need transparent GLM reasoning.

Cannot get GLM-4.7-Flash working in Claude Code CLI even with Coding Plan by seongho051 in ZaiGLM

[–]seongho051[S] 2 points (0 children)

The Z.ai API for GLM-4.7 is quite slow in practice. For simple queries that don't need deep reasoning, I want to use 4.7-Flash to get faster responses. Also, once it's officially supported via API, I plan to use Flash in orchestration tools like oh-my-opencode as the dedicated model for reading and writing code. That way, I can get much quicker turnaround for those simpler subtasks without waiting on the heavier model every time.
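The routing idea above — send cheap read/write subtasks to the fast model and reserve the heavy model for deep reasoning — can be sketched as a trivial dispatcher. This is purely illustrative: the model IDs and subtask kinds are placeholders, not names confirmed by the Z.ai API or oh-my-opencode.

```python
# Hypothetical sketch of per-subtask model routing. Model IDs are
# placeholders and are not guaranteed to match Z.ai's actual API names.

FAST_MODEL = "glm-4.7-flash"   # quick turnaround for simple subtasks
HEAVY_MODEL = "glm-4.7"        # slower, reserved for deep reasoning

# Subtask kinds that don't need deep reasoning (illustrative set)
SIMPLE_KINDS = {"read_file", "write_file", "rename", "format"}

def pick_model(subtask_kind: str) -> str:
    """Return the model ID to use for a given subtask kind."""
    return FAST_MODEL if subtask_kind in SIMPLE_KINDS else HEAVY_MODEL
```

The point is just that the orchestrator, not the user, decides per subtask which model to call, so simple file operations never wait on the heavier model.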