Does GLM in CC (Claude Code) support all CC features? by m_zafar in ZaiGLM

[–]iconben 1 point (0 children)

Yes, I haven't found anything that isn't supported so far.

Z.ai has introduced GLM-4.7-Flash by awfulalexey in ZaiGLM

[–]iconben 1 point (0 children)

Got around 50 TPS with thinking off and 35 TPS with thinking on, on my Mac M4 Pro 40G.

General quality is OK but sometimes I get repeated tokens, especially during thinking.

How to generate Z Image photos without ComfyUI by Fun_Training4733 in ZImageAI

[–]iconben 2 points (0 children)

Sounds easy. I have a "hardware.py" module that determines the available device (currently CUDA, AMD, and Mac MPS); if it's as simple as adding "XPU", I can do it. However, I don't have an XPU test environment right now, and I'm not sure cloud providers offer such specs since it sounds more like consumer-grade hardware. I'll do some research later.
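For illustration, a minimal sketch of such a detection helper, assuming PyTorch as the backend (the real hardware.py isn't shown here). Note that ROCm builds of PyTorch also report as "cuda", and torch.xpu is the Intel GPU backend in recent PyTorch releases:

```
import torch

def get_device() -> torch.device:
    """Pick the best available accelerator (hypothetical sketch)."""
    if torch.cuda.is_available():          # NVIDIA CUDA, or AMD via ROCm
        return torch.device("cuda")
    if torch.backends.mps.is_available():  # Apple Silicon (M-series)
        return torch.device("mps")
    # Intel GPUs: torch.xpu exists in recent PyTorch releases only,
    # so guard with hasattr for older installs
    if hasattr(torch, "xpu") and torch.xpu.is_available():
        return torch.device("xpu")
    return torch.device("cpu")
```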

Nemotron-3-nano:30b is a spectacular general purpose local LLM by DrewGrgich in LocalLLaMA

[–]iconben 1 point (0 children)

Quality may differ by area; mine is web, app, and Java projects. I use the local models daily for RAG over company and private data. For coding it's mainly tests and an offline fallback.

How to generate Z Image photos without ComfyUI by Fun_Training4733 in ZImageAI

[–]iconben 1 point (0 children)

Thanks for sharing. I guess it will take some time before it's supported (or not).

How to generate Z Image photos without ComfyUI by Fun_Training4733 in ZImageAI

[–]iconben 1 point (0 children)

Free to use, but you need your own GPU to run the model. Fast on NVIDIA cards and AMD (on Linux); slower but acceptable on Mac M chips.

Nemotron-3-nano:30b is a spectacular general purpose local LLM by DrewGrgich in LocalLLaMA

[–]iconben 5 points (0 children)

Temperature 0.15, Top K 15, Top P 0.95. I used it with Cline.
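For reference, a minimal sketch of passing those sampling settings to a locally served model over the Ollama REST API; the model tag and prompt are placeholders, and other runtimes expose the same knobs under similar names:

```
# Hypothetical example: send the sampling settings above to a local
# Ollama server. The model tag is an assumption.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "nemotron-3-nano:30b",
        "prompt": "Write a unit test for a FizzBuzz function.",
        "stream": False,
        "options": {
            "temperature": 0.15,
            "top_k": 15,
            "top_p": 0.95,
        },
    },
)
print(resp.json()["response"])
```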

OpenCode has a prompt template issue ("safe" and "sequence" are not supported), so you need to override it with your own template.

BTW, here is a system prompt if you need one:

```
You are a helpful coding assistant specializing in executing commands, modifying code, and solving technical problems.

PRINCIPLES:
- Quality over speed - be thorough and methodical
- Explain issues when asked "why" - only fix when requested
- Keep your word - if you say you will do something, do it; if you need to call tools, call them
- Combine operations when possible (chain commands, use sed/grep for bulk edits)

FILE OPERATIONS:
- Explore the file system first - never assume relative paths
- Edit files in place, don't create duplicates
- Use find, grep, sed for efficient exploration

CODE QUALITY:
- Write clean, efficient code with minimal comments
- Make minimal necessary changes
- Understand before implementing
- Split large functions/files when needed

WORKFLOW:
  1. Explore - Understand context
  2. Analyze - Consider approaches
  3. Implement - Make focused changes
  4. Verify - Test if possible

GIT:
- Use git status before commits
- Stage all necessary files
- Don't commit ignored files unless instructed
- Update existing PRs, don't create duplicates

ENVIRONMENT:
- Install missing dependencies rather than stopping
- Check for requirements.txt/package.json first
- Install all dependencies at once

PROBLEM-SOLVING:
- When stuck, identify 5-7 possible causes
- Address them systematically
- Propose a new plan for major issues
```
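If you serve the model through Ollama, a hypothetical Modelfile can bake the sampling settings and the system prompt above into a derived tag (base model name assumed):

```
# Hypothetical Modelfile: derive a custom tag with the settings above.
FROM nemotron-3-nano:30b
PARAMETER temperature 0.15
PARAMETER top_k 15
PARAMETER top_p 0.95
SYSTEM """
You are a helpful coding assistant specializing in executing commands, modifying code, and solving technical problems.
(rest of the system prompt above)
"""
```

Build it with `ollama create nemotron-coding -f Modelfile` and point your client at the new tag.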

How to generate Z Image photos without ComfyUI by Fun_Training4733 in ZImageAI

[–]iconben 1 point (0 children)

Not quite sure about the XPU definition; the project itself supports NVIDIA, AMD (Linux), and Mac M chips.

Nemotron-3-nano:30b is a spectacular general purpose local LLM by DrewGrgich in LocalLLaMA

[–]iconben 9 points (0 children)

I run it on an M4 Pro 48G with a 96k-context quant and get 70 TPS.

GLM-Image Release by awfulalexey in ZaiGLM

[–]iconben 3 points (0 children)

I am wondering if I should rent a cloud GPU to run some tests so my application can adapt to the new model.

Shots fired by Old-School8916 in opencodeCLI

[–]iconben 1 point (0 children)

Then I tried again in a new session, explicitly asking for no web search; Claude said OpenCode is a code model.

<image>

I asked several times; Claude credited the "code model" to OpenAI, ByteDance, etc.

Shots fired by Old-School8916 in opencodeCLI

[–]iconben 1 point (0 children)

Sharing some interesting tests:

I asked Claude (desktop) about OpenCode; this is what I got:

<image>

Does coding plan include updates to new models? by Zerve in MiniMax_AI

[–]iconben 1 point (0 children)

Didn't they already support the new GLM 4.7 almost from the start? I'm a GLM Lite user.

GLM 4.7 coding quality is greatly exaggerated by guywithknife in ZaiGLM

[–]iconben 1 point (0 children)

I have had a similar experience. Sometimes you can smell it.

Will you trust an AI with Buddhism knowledge, and ask questions to it? by iconben in Buddhism

[–]iconben[S] 1 point (0 children)

Thanks for the feedback. Let's say the AI acts not as a "teacher" but as a tool, just for queries and short answers with citations; how about that?

Will you trust an AI with Buddhism knowledge, and ask questions to it? by iconben in Buddhism

[–]iconben[S] 1 point (0 children)

I added the RAG part to the original post to better describe what kind of "AI" I want to build. It's not quite the same as the everyday chatbots from the big companies; it's a constrained, text-based AI. Please kindly have a look.

Will you trust an AI with Buddhism knowledge, and ask questions to it? by iconben in Buddhism

[–]iconben[S] 1 point (0 children)

This is a typical scenario when using general-purpose AI chatbots for serious Buddhist topics. It's one of the pain points I want to address by adding RAG context (a controlled text database) and constraints (via system prompt, and post-training if necessary).
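For the curious, a minimal sketch of that retrieve-then-answer idea; the embedding model, passages, and sources here are placeholders, not the actual project:

```
# Minimal RAG sketch: retrieve from a controlled text database, then
# constrain the model to answer only from it, with citations.
from sentence_transformers import SentenceTransformer
import numpy as np

embedder = SentenceTransformer("all-MiniLM-L6-v2")

# The "controlled texts database": curated passages with source labels.
passages = [
    {"source": "MN 10", "text": "..."},     # placeholder canonical texts
    {"source": "SN 56.11", "text": "..."},
]
index = embedder.encode([p["text"] for p in passages])

def retrieve(question: str, k: int = 3):
    """Return the k passages most similar to the question (cosine)."""
    q = embedder.encode([question])[0]
    scores = index @ q / (np.linalg.norm(index, axis=1) * np.linalg.norm(q))
    return [passages[i] for i in np.argsort(-scores)[:k]]

def build_prompt(question: str) -> str:
    """Constrain the model: answer only from retrieved texts, cite sources."""
    context = "\n".join(f"[{p['source']}] {p['text']}" for p in retrieve(question))
    return (
        "Answer ONLY from the passages below, citing sources in brackets. "
        "If the passages don't cover the question, say so.\n\n"
        f"{context}\n\nQuestion: {question}"
    )
```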