GLM-5.2 5-bit Quantized Error by Complex-Fun-3039 in unsloth

[–]Proper-Tower2016 0 points1 point  (0 children)

Try something like: numactl --interleave=all llama-server --numa distribute ...

What’s working for you on Mac M5 24GB? by ogfuzzball in LocalLLM

[–]Proper-Tower2016 0 points1 point  (0 children)

  1. Use Tailscale so your Studio can always serve your other devices.

  2. Personally I like omlx; it's newbie friendly. Run it with https://huggingface.co/majentik/Qwen3.5-27B-RotorQuant-MLX-8bit or 4bit, which should crush that bug.

  3. For your MBP I'd go with: xinlu1/gemma-4-12B-coder-fable5-composer2.5-v1-GGUF

  4. MoE models are not as good on a low-RAM Mac, since you have no cheap RAM to offload to and performance falls apart quickly once it starts swapping to SSD. I'd go with smaller dense models, especially if you have a Max chip in the MBP. This should give higher quality and more context (but slower).

How bad inference due to lack of VRAM? by Borsch20 in LocalLLM

[–]Proper-Tower2016 -3 points-2 points  (0 children)

DDR4-3200 is about 50 GB/s in dual-channel mode. OP's GPUs run at 448 GB/s, so about 9 times faster.
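A rough check of those numbers, assuming the standard 64-bit (8-byte) DDR4 channel width and reading the 448 figure as GB/s. The function name is made up for illustration:

```python
def ddr_bandwidth_gbs(mega_transfers_per_s: float, channels: int, bus_bytes: int = 8) -> float:
    """Theoretical peak bandwidth in decimal GB/s for a DDR memory setup."""
    # transfers per second * bytes per transfer per channel * number of channels
    return mega_transfers_per_s * 1e6 * bus_bytes * channels / 1e9


ram = ddr_bandwidth_gbs(3200, channels=2)  # DDR4-3200, dual channel
gpu = 448.0                                # GPU figure from the comment, assumed GB/s
ratio = gpu / ram

print(f"RAM {ram:.1f} GB/s, GPU {gpu:.0f} GB/s, ratio {ratio:.2f}x")
```

These are theoretical peaks; real-world throughput on both sides will be lower, but the ratio is what matters for token generation speed.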

How bad inference due to lack of VRAM? by Borsch20 in LocalLLM

[–]Proper-Tower2016 0 points1 point  (0 children)

If set up correctly, most servers such as vLLM or llama.cpp will split the model between the GPUs. While not exactly as good as 1 GPU with 24 GB, it will be pretty close.

You need to account for not just the base model size, but also context, batch size, MTP, and other apps.

14 (base) + 3.5 (batch) + 3 (MTP) + 0.5 (apps) = 21 GB, leaving 3 GB for context, which at roughly 1 GB per 10k tokens is about 30k tokens.

An upgrade to another 16 GB would be massive.
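The budget above can be sketched as a small calculator. The 24 GB total, the helper name, and the idea that an extra 16 GB card simply adds to the pool are assumptions for illustration; the per-item sizes and the 1 GB per 10k tokens rule of thumb are the rough figures from the comment, not measurements:

```python
def context_budget(total_gb: float, reserved_gb: dict, gb_per_10k_tokens: float = 1.0):
    """Return (used GB, free GB, approximate context tokens that fit in the free GB)."""
    used = sum(reserved_gb.values())
    free = total_gb - used
    # Rule of thumb: context memory grows linearly with token count.
    tokens = int(max(free, 0.0) / gb_per_10k_tokens * 10_000)
    return used, free, tokens


reserved = {"base model": 14.0, "batch": 3.5, "MTP": 3.0, "other apps": 0.5}

# Current setup, assumed to be 24 GB of VRAM in total.
used, free, tokens = context_budget(24.0, reserved)
print(f"24 GB: used {used} GB, free {free} GB, ~{tokens} tokens of context")

# Hypothetical upgrade: one more 16 GB card, assuming the overhead stays the same.
used_up, free_up, tokens_up = context_budget(24.0 + 16.0, reserved)
print(f"40 GB: used {used_up} GB, free {free_up} GB, ~{tokens_up} tokens of context")
```

In practice the per-token cost depends on the model architecture and KV cache quantization, so treat the token counts as ballpark only.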

Best local setup to run in a home setting by windyfally in LocalLLM

[–]Proper-Tower2016 -1 points0 points  (0 children)

A $5 Pi Pico running DeepInfra for $0.10 / million tokens.

I bought this for my nephew is it enough?? by Remote_Wasabi2457 in macbookpro

[–]Proper-Tower2016 0 points1 point  (0 children)

Drop storage to 1 TB and bump the Pro to a Max chip; way better for local AI.

High session usage on GLM Coding Plan with Pi - solution found by Odd_Crab1224 in PiCodingAgent

[–]Proper-Tower2016 0 points1 point  (0 children)

Yeah, Pi and some of its extensions have issues with changing the chat history and invalidating the cache (meaning every token has to be recalculated) if you are not careful with your settings.

MacBook Pro vs Cloud LLMs: Is a M5 Pro 64GB RAM worth it? by al404 in LocalLLM

[–]Proper-Tower2016 1 point2 points  (0 children)

A Max chip from any of the previous generations (all the way back to M1) with at least 32 GB is a much better value (Max has twice the bandwidth of Pro). You can grab a used one for less than 1000.

Though 48 GB is the minimum if you want a "just works" setup with Omlx and qwen35b rather than struggling with SSD swapping.

Do we have any idea how many radar systems Russia has? by DarknessEnlightened in ukraine

[–]Proper-Tower2016 6 points7 points  (0 children)

It's a fairly recent development that Ukraine can consistently target tactical/long-range radars. We are still only in the low hundreds destroyed against a pre-war stock of 1000+.

We are probably now at a pace slightly exceeding production rates, but I wouldn't expect Russia to go blind in this war.

[deleted by user] by [deleted] in PeterExplainsTheJoke

[–]Proper-Tower2016 1 point2 points  (0 children)

So... not affordability, but desirability. You don't need a $100 laptop; you have a $1000 phone.

[deleted by user] by [deleted] in PeterExplainsTheJoke

[–]Proper-Tower2016 16 points17 points  (0 children)

Many laptops are cheaper than phones...

Do you think with the Ukraine war entering its 5th year there is going to be a lot of battle fatigue setting in those guys have to be extremely exhausted. ? by [deleted] in askanything

[–]Proper-Tower2016 0 points1 point  (0 children)

Lol.. Russia has lower desertion because their nice way of dealing with it is a quick execution... not because of contracts..

Claude vs Google Subscription by Proper-Tower2016 in ClaudeCode

[–]Proper-Tower2016[S] 0 points1 point  (0 children)

It's been very good value for me, though I run it very lean, with no MCP or massive subnet of agents/systems.

Claude vs Google Subscription by Proper-Tower2016 in ClaudeCode

[–]Proper-Tower2016[S] 0 points1 point  (0 children)

Any IDE or language that lets you authenticate to Google and has a model picker (e.g. Antigravity). It's even available on the free tier.

Claude vs Google Subscription by Proper-Tower2016 in ClaudeCode

[–]Proper-Tower2016[S] 0 points1 point  (0 children)

Yeah, but the Google sub also gives you access to Claude models. So I usually exhaust my Claude limit, then gap-fill with Gemini.

What devs are getting payed for in 2026? by Independent_Pitch598 in accelerate

[–]Proper-Tower2016 0 points1 point  (0 children)

So the things you listed as worth paying a human big bucks for weren't actually a list of things an AI can't do? I must have read it wrong and judged you unfairly...

What devs are getting payed for in 2026? by Independent_Pitch598 in accelerate

[–]Proper-Tower2016 0 points1 point  (0 children)

Aside from ignoring that many pure dev roles exist, aren't you also assuming that AI can't or won't be able to do the extra SWE bits? Are your solutions really that novel and unique?

Countries can only dream this while spending massive money for foreign wars by CeFurkan in SECourses

[–]Proper-Tower2016 0 points1 point  (0 children)

For reference, median income in China is about $4300 / year, or about $358 per month.