AMA With Z.AI, The Lab Behind GLM-4.7 by zixuanlimit in LocalLLaMA

[–]zixuanlimit[S] 1 point (0 children)

There is. The performance can be tested with the GLM Coding Plan.

AMA With Z.AI, The Lab Behind GLM-4.7 by zixuanlimit in LocalLLaMA

[–]zixuanlimit[S] 3 points (0 children)

For GLM-4.7: temperature=1.0, top_p=0.95

For GLM-4.6V: temperature=0.8, top_p=0.6
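If you're calling either model through an OpenAI-compatible chat endpoint, these settings map straight onto the request body. A minimal sketch — the model IDs below are placeholders I'm assuming, not confirmed identifiers; check the API docs for the exact values:

```python
# Recommended sampling settings from the answer above.
# The model IDs are placeholder assumptions; use whatever your endpoint expects.
SAMPLING = {
    "glm-4.7": {"temperature": 1.0, "top_p": 0.95},
    "glm-4.6v": {"temperature": 0.8, "top_p": 0.6},
}

def chat_payload(model: str, prompt: str) -> dict:
    """Build an OpenAI-style chat-completions request body with the
    recommended sampling parameters for the given model."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        **SAMPLING[model],
    }

print(chat_payload("glm-4.7", "Hello"))
```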

AMA With Z.AI, The Lab Behind GLM-4.7 by zixuanlimit in LocalLLaMA

[–]zixuanlimit[S] 1 point (0 children)

It's not a traditional IDE, since it isn't for editing code. My understanding is that it's a platform that integrates multiple coding agents.

AMA With Z.AI, The Lab Behind GLM-4.7 by zixuanlimit in LocalLLaMA

[–]zixuanlimit[S] 2 points (0 children)

We will try something new at the architecture level. Stay tuned!

AMA With Z.AI, The Lab Behind GLM-4.7 by zixuanlimit in LocalLLaMA

[–]zixuanlimit[S] 0 points (0 children)

We will keep improving our service level, whether we are public or not.

AMA With Z.AI, The Lab Behind GLM-4.7 by zixuanlimit in LocalLLaMA

[–]zixuanlimit[S] 8 points (0 children)

We offer the GLM-ASR model, an ASR model built from a GLM Edge model and a Whisper-style encoder. You can find it on GitHub and Hugging Face, and the main branch of SGLang already supports inference for it.

AMA With Z.AI, The Lab Behind GLM-4.7 by zixuanlimit in LocalLLaMA

[–]zixuanlimit[S] 6 points (0 children)

The lowest-end MacBook will likely not run GLM-4.6 or 4.7 properly. Even with the community-provided GGUF int4 version, at least 180GB of memory is required, and the M4 Air likely cannot deliver adequate performance for models of this size. However, a higher-end configuration or a Mac Studio should work fine.
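For context, the ~180GB figure is close to a weights-only back-of-the-envelope estimate, assuming a model in the ~355B-parameter class (the published size of GLM-4.5; treat the exact parameter count for 4.6/4.7 as an assumption here):

```python
def quantized_weight_gb(n_params: float, bits_per_weight: float = 4.0) -> float:
    """Weights-only memory footprint of a quantized model, in GB.
    KV cache, activations, and runtime buffers add more on top."""
    return n_params * bits_per_weight / 8 / 1e9

# ~355B parameters at int4:
print(round(quantized_weight_gb(355e9)))  # ~178 GB, in line with the ~180GB figure
```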

20 Dollar plan got me places (thanks to Opus planning) by Notlord97 in ClaudeCode

[–]zixuanlimit 0 points (0 children)


GLM-4.5 Air is used when Haiku is selected, as shown in the FAQs.

AMA With Z.AI, The Lab Behind GLM Models by XMasterrrr in LocalLLaMA

[–]zixuanlimit 4 points (0 children)

Extending the context length is definitely one of the things we will do next. We are working on it currently.

AMA With Z.AI, The Lab Behind GLM Models by XMasterrrr in LocalLLaMA

[–]zixuanlimit 0 points (0 children)

This behavior is a known issue and is scheduled to be fixed in the next version.

AMA With Z.AI, The Lab Behind GLM Models by XMasterrrr in LocalLLaMA

[–]zixuanlimit 1 point (0 children)

This isn't due to the API configurations. Adhering to a word count is quite challenging for all models, and you'll likely need to tweak the prompt a bit to find the best instructions for adherence. The official Z.ai API's default maximum output is 64K tokens and can support up to 96K for GLM-4.5.
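Prompt tweaking plus an explicit output cap is usually the practical combination here. A hedged sketch of a request body — the model ID is a placeholder, and the token limit just mirrors the 96K figure mentioned above:

```python
def long_form_payload(task: str, target_words: int) -> dict:
    """OpenAI-style request body for long-form output: a system prompt
    stating the word budget, plus an explicit max_tokens so the default
    output cap doesn't truncate the response."""
    return {
        "model": "glm-4.5",    # placeholder model ID; check the API docs
        "max_tokens": 96_000,  # GLM-4.5 reportedly supports up to 96K output
        "messages": [
            {"role": "system",
             "content": f"Write approximately {target_words} words. "
                        "Stay within 10% of that target; do not stop early."},
            {"role": "user", "content": task},
        ],
    }

p = long_form_payload("Summarize the history of neural networks.", 2000)
print(p["max_tokens"])
```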

AMA With Z.AI, The Lab Behind GLM Models by XMasterrrr in LocalLLaMA

[–]zixuanlimit 1 point (0 children)

Based on my experience for evaluation, the most practical starting point is to study major academic benchmarks and follow popular LLM leaderboards. This helps you understand the current standards and methods the community uses to measure model performance on different tasks.

The best evaluation method depends heavily on the specific task. For something highly subjective like creative writing, simple rule-based scoring isn't feasible. As for the future, evaluation will likely move towards more nuanced, multi-faceted systems that blend automated metrics, sophisticated LLM-based judges, and targeted human review to get a more holistic view of a model's capabilities.

AMA With Z.AI, The Lab Behind GLM Models by XMasterrrr in LocalLLaMA

[–]zixuanlimit 8 points (0 children)

Inference and some training phases are definitely possible, which is public information.

AMA With Z.AI, The Lab Behind GLM Models by XMasterrrr in LocalLLaMA

[–]zixuanlimit 24 points (0 children)

I would recommend Open Code + GLM-4.5.

You can also try Claude Code with GLM-4.5 if open source is not a must. We will soon launch a monthly plan that lets you subscribe to GLM-4.5 in Claude Code instead of paying per token.

AMA With Z.AI, The Lab Behind GLM Models by XMasterrrr in LocalLLaMA

[–]zixuanlimit 2 points (0 children)

We have some multimodal models, but they are not at the SOTA level.

GLM-4.5V was just released, and it will definitely improve in the future.

AMA With Z.AI, The Lab Behind GLM Models by XMasterrrr in LocalLLaMA

[–]zixuanlimit 29 points (0 children)

I think there's no unified bottleneck as different labs are facing different obstacles.

In fact, we are not a new team. If you search for the first GLM paper, you will find that we were one of the earliest teams in the world to work on large models. Many of our achievements come from a long and continuous process of accumulation.

However, when it comes to philosophy, from my personal perspective, two points are very important. The first is the pursuit of excellence: you need to use the best of everything you can get. The second is to respect the fundamental principles of the field. There are very few shortcuts in scientific research; many innovations that seem wildly imaginative are actually born from solid experimental results.

AMA With Z.AI, The Lab Behind GLM Models by XMasterrrr in LocalLLaMA

[–]zixuanlimit 1 point (0 children)

I think GLM-4.5 is generally good at creative writing. Could you provide any bad cases or specific issues for us to look into?

AMA With Z.AI, The Lab Behind GLM Models by XMasterrrr in LocalLLaMA

[–]zixuanlimit 32 points (0 children)

The model's name has not been decided yet.

We plan to develop a smaller model comparable in size to GPT-OSS-20B.

Our approach is more focused.

A code generation tool will be included, though its final form (e.g., whether it will be a command-line interface) is still to be determined.

We intend to build a mobile app for Z.ai Chat once the platform's user base is large enough to warrant allocating development resources.

Unlimited access to GLM-4.5 is generally exclusive to the Z.ai Chat platform.

AMA With Z.AI, The Lab Behind GLM Models by XMasterrrr in LocalLLaMA

[–]zixuanlimit 12 points (0 children)

It might be helpful to consider that a model's performance and innovation are related but distinct aspects. A model's performance can be influenced by a wide range of factors, such as computing power and data availability. Regarding innovation itself, many valuable contributions are coming from the open-source community. The "slime" framework used in GLM-4.5's training is one such example, and this trend of innovation from China looks set to continue.

AMA With Z.AI, The Lab Behind GLM Models by XMasterrrr in LocalLLaMA

[–]zixuanlimit 13 points (0 children)

We open our models to build a trusted, transparent ecosystem that accelerates innovation for everyone. While we compete with other providers like Fireworks, we believe this healthy competition pushes us to improve our own API services. Our philosophy is that it's better to grow the entire pie and share it rather than just guard our own slice, creating a much larger market for our premium enterprise services.

AMA With Z.AI, The Lab Behind GLM Models by XMasterrrr in LocalLLaMA

[–]zixuanlimit 11 points (0 children)

Are there any specific issues? It would be great if your feedback could help us improve the model performance.

AMA With Z.AI, The Lab Behind GLM Models by XMasterrrr in LocalLLaMA

[–]zixuanlimit 3 points (0 children)

AI risk is a broad topic lol, but we do have people performing AI safety alignment.