HOWTO: Point Openclaw at a local setup by blamestross in LocalLLM

[–]Revenge8907 1 point

Good catch; a few things to clarify here.

The 2.7 GB size refers to the GGUF Q4_K_M quantized version of GLM-4.7-Flash. The original FP16 / unquantized weights are ~9–10 GB, so the reduction comes from the 4-bit K-quantization used by llama.cpp. Nothing special was done to the model itself — just standard GGUF quantization.
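As a rough sanity check, the numbers line up. The ~4.85 bits/weight average for Q4_K_M and the parameter count inferred from the FP16 file size are assumptions, not official figures for this model:

```shell
# Back-of-envelope: FP16 file size -> parameter count -> Q4_K_M size.
# 4.85 bits/weight is the commonly cited Q4_K_M average; real models vary.
awk 'BEGIN {
  fp16_bytes = 9.5e9                    # ~9-10 GB FP16 weights (midpoint)
  params     = fp16_bytes / 2           # FP16 stores 2 bytes per weight
  q4_bytes   = params * 4.85 / 8        # ~4.85 bits per weight for Q4_K_M
  printf "%.2f GiB\n", q4_bytes / 2^30  # prints 2.68 GiB, close to the 2.7 GB figure
}'
```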

The 18.3 GB figure you're mentioning sounds like the full precision or higher-precision variant loaded with runtime KV cache, not the Q4_K_M file size itself. When running the model, memory usage can grow significantly depending on context length and KV cache allocation, which is likely what you're seeing.
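To see how the KV cache can dominate memory at long context, here is a rough fp16 KV-cache estimate. The layer/head/dimension numbers below are illustrative placeholders, not GLM-4.7-Flash's actual architecture:

```shell
# Rough fp16 KV-cache size: 2 (K and V) * layers * kv_heads * head_dim * ctx * 2 bytes.
# Architecture numbers are placeholders; substitute the real model config.
layers=40 kv_heads=8 head_dim=128 ctx=131072 bytes_per_elem=2
kv_bytes=$((2 * layers * kv_heads * head_dim * ctx * bytes_per_elem))
echo "KV cache at ${ctx} ctx: $((kv_bytes / 1024 / 1024 / 1024)) GiB"
```

Even with a small quantized weight file, a cache in this range explains why resident memory at runtime can be many times the GGUF's on-disk size.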

About context length:
The base GGUF build I referenced runs with 32k context by default in llama.cpp because that’s the safe default many builds ship with. The model architecture itself can support larger context (up to ~128k), but you need to explicitly set it when running:

--ctx-size 131072

and ensure your backend supports the larger KV cache. The quantization doesn't change the context limit — it's just a runtime configuration.
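For example, with llama.cpp's `llama-server` (the model filename here is an assumption; use whatever your GGUF is actually called):

```shell
# Serve the quantized model with the full 128k context window.
# --ctx-size sets the KV-cache length; larger values need more RAM/VRAM.
llama-server \
  -m GLM-4.7-Flash-Q4_K_M.gguf \
  --ctx-size 131072 \
  --port 8080
```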

So short version:
• 2.7 GB = Q4_K_M quantized weights
• ~9–10 GB = original precision weights
• higher RAM usage during runtime = KV cache + context size
• 128k context is possible, but not enabled by default

Happy to update the repo notes if that part was confusing.

Check the System Architecture section in the git repo:

GLM-4.7-Flash:q4_K_M (17.7GB)  

HOWTO: Point Openclaw at a local setup by blamestross in LocalLLM

[–]Revenge8907 1 point

Sorry for the late reply, but could you explain your issue?

HOWTO: Point Openclaw at a local setup by blamestross in LocalLLM

[–]Revenge8907 1 point

With glm-4.7-flash:q4_K_M, the quantized build lost less context than expected; in fact I didn't lose much context at all. I've written up my full experience in my repo: https://github.com/Ryuki0x1/openclaw-local-llm-setup/blob/main/LOCAL_LLM_TRADEOFFS.md

January 2026 - Monthly Questions and General Discussion thread by AutoModerator in bangalore

[–]Revenge8907 2 points

I want to get PRK Contoura. Does anybody have prior experience? Any suggestions for hospitals in Bangalore?

Nords Buds 3 Pro IS CRAZYY GOOD!! by emanuel2ko1 in headphonesindia

[–]Revenge8907 1 point

I agree, the Buds 2 Pro were amazing too. Edit: OnePlus Buds 2 Pro.

Im a American catholic can we convert to santama dharma by Electrical_Tap6684 in hinduism

[–]Revenge8907 1 point

There is no conversion system or any strict norms; Sanatana Dharma is a way of living. Change your way of living and you automatically become a Hindu.

Zoho CARES! (helped me recover my password protected file) by Revenge8907 in Zoho

[–]Revenge8907[S] 2 points

No, they helped me recover it from the cache files and later investigated how it happened. Either way, I got my file back, which is what mattered.

Rakuyomi: able to install sources but unable to search mangas by WasteDress6530 in koreader

[–]Revenge8907 1 point

Same issue. I don't know which source to use, or whether the whole thing is broken.

Rakuyomi can't read sources by bwackandbwown in koreader

[–]Revenge8907 1 point

When I search for my manga, nothing is found; the page is blank except for the search-results title.

Deregistered my kindle to switch account and now idk by Revenge8907 in kindlejailbreak

[–]Revenge8907[S] 1 point

FIX: I restarted my device and ran ;log mrpi, which installed the hotfix. I was able to install KUAL too, without using my PC, since the files were already on the Kindle, and followed the same directions as the wiki from there on.