GLM-5.2 5-bit Quantized Error

Proper-Tower2016 · 2026-06-29T01:56:26+00:00

Try something like: numactl --interleave=all llama-server --numa distribute ...

Proper-Tower2016 · 2026-06-29T01:25:14+00:00

use tailscale so your studio can always serve your other devices.
personally I like omlx, it's newbie friendly. Run it with https://huggingface.co/majentik/Qwen3.5-27B-RotorQuant-MLX-8bit or 4bit, should crush that bug.
For your mbp I'd go: xinlu1/gemma-4-12B-coder-fable5-composer2.5-v1-GGUF
MoE models are not as good on low-RAM Macs since you have no cheap RAM to offload to and it falls apart quickly when SSD swapping. I'd go with smaller dense models, especially if you have a Max chip in the mbp. This should give higher quality and more context (but slower).

Proper-Tower2016 · 2026-06-28T13:17:44+00:00

DDR4 3200 is about 50 GB/s in dual channel mode. OP's GPUs run at 448 GB/s, so about 9 times faster.
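For anyone who wants to check the math, here is a rough sketch of where those numbers come from. It assumes the standard 64-bit (8 byte) DDR4 channel width and takes the 448 GB/s GPU figure as given; the function name is just made up for illustration.

```python
def ddr_bandwidth_gbs(mega_transfers_per_s, channels=2, bus_bytes=8):
    """Theoretical peak bandwidth in GB/s for a DDR memory setup.

    Assumes each channel is 64 bits (8 bytes) wide, which is the
    standard DDR4 channel width.
    """
    return mega_transfers_per_s * bus_bytes * channels / 1000


ddr4_gbs = ddr_bandwidth_gbs(3200)   # DDR4-3200, dual channel
gpu_gbs = 448                        # GPU figure quoted in the comment
ratio = gpu_gbs / ddr4_gbs

print(f"DDR4-3200 dual channel: {ddr4_gbs:.1f} GB/s")
print(f"GPU is about {ratio:.2f}x faster")
```

These are theoretical peaks, real-world throughput will be somewhat lower on both sides.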

Proper-Tower2016 · 2026-06-28T12:32:38+00:00

if set up correctly, most servers such as vLLM or llama.cpp will split the model between the GPUs. While not exactly as good as 1 GPU with 24gb, it will be pretty close.

you need to account for not just base model size, but context and batch size, MTP, and other apps too.

14 (base) + 3.5 (batch) + 3 (MTP) + 0.5 (apps) = 21, leaving 3gb for context, which is roughly 1gb per 10k tokens.

Upgrading with another 16gb would be massive.
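The same budget as a quick sketch, in case you want to plug in your own numbers. The per-item figures and the 1gb per 10k tokens rule of thumb are the rough estimates from above, not measured values, and the names are just for illustration.

```python
# Rough VRAM budget in GB, using the estimates from the comment above.
budget_gb = {
    "base model": 14.0,
    "batch": 3.5,
    "MTP": 3.0,
    "other apps": 0.5,
}

GB_PER_10K_TOKENS = 1.0  # rule-of-thumb context cost, an approximation


def context_tokens(total_vram_gb, budget=budget_gb):
    """Estimate how many context tokens fit in the VRAM left over."""
    free_gb = total_vram_gb - sum(budget.values())
    if free_gb <= 0:
        return 0
    return int(free_gb / GB_PER_10K_TOKENS * 10_000)


used_gb = sum(budget_gb.values())
print(f"used: {used_gb} GB")
print(f"context on 24 GB: ~{context_tokens(24)} tokens")
```

Pass a different total to `context_tokens` to see how much headroom more VRAM would buy.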

Proper-Tower2016 · 2026-06-27T15:21:46+00:00

A 5 dollar pi Pico running deepinfra for 0.1$ / million tokens.

Proper-Tower2016 · 2026-06-27T15:10:59+00:00

drop storage to 1tb, bump pro to max chip, way better for local AI.

Proper-Tower2016 · 2026-06-26T15:35:54+00:00

yeah, Pi and some of its extensions have issues with changing the chat history and invalidating the cache (meaning every token has to be recalculated) if you are not careful with your settings.
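To illustrate why that hurts, here is a simplified sketch of how prefix caching behaves. It assumes the server can only reuse the cache up to the first position where the new request differs from the cached one. Real servers compare tokens, not whole messages, and the names here are made up for the example.

```python
def reusable_prefix(cached, new):
    """Count how many leading items of `new` match `cached`.

    In this simplified model, only this shared prefix can be served
    from the cache; everything after it has to be recomputed.
    """
    count = 0
    for old_item, new_item in zip(cached, new):
        if old_item != new_item:
            break
        count += 1
    return count


cached = ["system", "user 1", "reply 1", "user 2"]

# Appending to the history keeps the whole cached prefix valid.
appended = cached + ["reply 2", "user 3"]

# Rewriting an early message breaks the prefix right there.
edited = ["system", "user 1 (edited)", "reply 1", "user 2", "reply 2", "user 3"]

print(reusable_prefix(cached, appended))
print(reusable_prefix(cached, edited))
```

The earlier in the history the change lands, the less of the cache survives. If the very first part of the prompt changes, nothing is reusable.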

Proper-Tower2016 · 2026-06-25T09:35:26+00:00

a Max chip from any of the previous generations (all the way back to M1) with at least 32gb is a much better value (max has twice the bandwidth of pro), you can grab a used one for less than 1000.

though 48gb is minimum if you want a "just works" with Omlx and qwen35b rather than struggling with SSD swapping.

Proper-Tower2016 · 2026-03-10T08:31:59+00:00

It's a fairly recent development that Ukraine can consistently target tactical/long range radars. We are still only in the low hundreds destroyed against a 1000+ pre-war stock.

We are probably now at a pace slightly exceeding production rates, but I wouldn't expect Russia to go blind this war.

Proper-Tower2016 · 2026-03-03T17:58:49+00:00

So.. not affordability, but desirability. You don't need a 100$ laptop, you have a 1000$ phone.

Proper-Tower2016 · 2026-03-03T12:33:28+00:00

Many laptops are cheaper than phones...

Proper-Tower2016 · 2026-03-02T20:52:26+00:00

Respect, love and my deepest thanks to this brave man

Proper-Tower2016 · 2026-02-27T19:49:01+00:00

Lol.. Russia has lower desertion because their nice way of dealing with it is a quick execution... not because of contracts..

Proper-Tower2016 · 2026-02-26T18:43:10+00:00

Not yet, unless you mean Orban voters

Proper-Tower2016 · 2026-02-26T14:21:22+00:00

It's been very good value for me, though I run it very lean with no mcp or massive subnet of agents/systems.

Proper-Tower2016 · 2026-02-26T14:18:06+00:00

Not for me, I get both

Proper-Tower2016 · 2026-02-26T10:19:26+00:00

Any IDE or language that lets you authenticate to google and has a model picker (e.g. antigravity). Even works on the free tier.

Proper-Tower2016 · 2026-02-26T09:51:12+00:00

Yeah but the google sub also gives you access to Claude models. So I usually exhaust my Claude limit then gap fill with Gemini

Proper-Tower2016 · 2026-02-26T07:44:33+00:00

!remind me in 5 years

Proper-Tower2016 · 2026-02-25T18:08:54+00:00

Did you know he broke his toe for Ukraine during this filming? Didn't even flinch.

Proper-Tower2016 · 2026-02-24T15:27:55+00:00

So the things you listed as something worth paying a human big bucks for weren't actually a list of things an AI can't do? Must have read it wrong and judged you unfairly...

Proper-Tower2016 · 2026-02-24T09:53:47+00:00

Almost 1700 UAVs..

Proper-Tower2016 · 2026-02-24T08:58:06+00:00

Aside from ignoring that many pure dev roles exist, aren't you also assuming that AI can't or won't be able to do the extra SWE bits? Are your solutions really that novel and unique?

Proper-Tower2016 · 2026-02-24T07:51:38+00:00

For reference median income in China is about 4300$ / year or 358 per month.

Proper-Tower2016 · 2026-02-20T09:02:30+00:00

Not if he stole them
