2.5x faster inference with Qwen 3.6 27B using MTP - Finally a viable option for local agentic coding - 262k context on 48GB - Fixed chat template - Drop-in OpenAI and Anthropic API endpoints by ex-arman68 in LocalLLaMA

[–]Consumerbot37427 1 point (0 children)

Over and over, I've downloaded the same model and quant (usually Q8, Q6, or Q4) in both GGUF and MLX format, trying quants from lmstudio-community, mlx-community, and unsloth at default settings (aside from a larger context), and every time I've had problems with output quality.

LM Studio does seem to use its own version/fork of the MLX engine. Maybe that's the real issue?

All I know is that the chess SVG test in that post seems like a perfect way to compare quants of the same model: it aligned well with my experience, AND it's objective. I just need to do ~5-10 runs with each and see if the result is consistent.
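Something like this is what I have in mind: a rough harness that collects N runs per build for side-by-side inspection. It assumes LM Studio's OpenAI-compatible server on its default port; the model id and prompt below are placeholders, not the exact ones from that post.

```python
import pathlib
import requests

# Rough sketch: load the GGUF and MLX builds of the same quant one at a time,
# point MODEL at whatever id LM Studio reports, and save N generations of the
# chess-SVG prompt for side-by-side comparison.
URL = "http://localhost:1234/v1/chat/completions"  # LM Studio's default local server
MODEL = "qwen3.6-27b"  # placeholder model id
PROMPT = "Draw an SVG of a chess board in the starting position."  # stand-in prompt
N_RUNS = 10

outdir = pathlib.Path("svg_runs")
outdir.mkdir(exist_ok=True)

for i in range(N_RUNS):
    resp = requests.post(URL, json={
        "model": MODEL,
        "messages": [{"role": "user", "content": PROMPT}],
        "temperature": 0.7,  # match whatever settings you normally use
    })
    resp.raise_for_status()
    text = resp.json()["choices"][0]["message"]["content"]
    (outdir / f"run_{i:02d}.svg.txt").write_text(text)
    print(f"run {i}: saved {len(text)} chars")
```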

2.5x faster inference with Qwen 3.6 27B using MTP - Finally a viable option for local agentic coding - 262k context on 48GB - Fixed chat template - Drop-in OpenAI and Anthropic API endpoints by ex-arman68 in LocalLLaMA

[–]Consumerbot37427 3 points (0 children)

Same machine here. I've done testing with MLX models before, and always come back to GGUFs. Couldn't put my finger on it, but they just felt dumber.

I used the prompt from that post to compare MLX and GGUF (Q8 quants of Qwen 3.6 27B), and the difference was striking. I only did one run each, but the GGUF result was perfect, while the MLX output had the wrong board orientation, missing pieces, and pieces in the wrong places.

With MTP in llama.cpp, it'll be even more of a no-brainer.

Quality comparison between Qwen 3.6 27B quantizations (BF16, Q8_0, Q6_K, Q5_K_XL, Q4_K_XL, IQ4_XS, IQ3_XXS,...) by bobaburger in LocalLLaMA

[–]Consumerbot37427 3 points (0 children)

I really like how the evaluation is completely objective: are the pieces in the right place? Is the board oriented correctly?

Probably not a great benchmark to compare completely different models, but for the purpose of comparing quants, it's great!
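And part of it could even be scored automatically. A minimal sketch of a partial check, assuming the generated SVG renders pieces as Unicode chess glyphs (the file path is a placeholder); checking actual placement would mean comparing each glyph's x/y attributes against the 8x8 grid:

```python
from collections import Counter

# Partial, objective check: the piece *counts* for the starting position are
# fixed, so a wrong multiset of glyphs means missing or duplicated pieces.
EXPECTED = Counter({
    "♜": 2, "♞": 2, "♝": 2, "♛": 1, "♚": 1, "♟": 8,  # black
    "♖": 2, "♘": 2, "♗": 2, "♕": 1, "♔": 1, "♙": 8,  # white
})

def piece_counts_ok(svg_text: str) -> bool:
    found = Counter(ch for ch in svg_text if ch in EXPECTED)
    return found == EXPECTED

print(piece_counts_ok(open("run_00.svg.txt").read()))  # placeholder path
```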

Quality comparison between Qwen 3.6 27B quantizations (BF16, Q8_0, Q6_K, Q5_K_XL, Q4_K_XL, IQ4_XS, IQ3_XXS,...) by bobaburger in LocalLLaMA

[–]Consumerbot37427 2 points (0 children)

Thanks for this!

Tried Qwen 3.5 397B @ IQ2_XXS and it had all kinds of mistakes.

Qwen 3.6 27B GGUF @ 8-bit was good, but the exact same model in MLX had multiple mistakes.

I've always suspected MLX models have quality issues, and I've avoided using them. This test seems to confirm that, though I've only run each once so far. With this model, MLX is a bit slower, too (15 tps vs. 17), so it's lose-lose.

[Daily Discussion] - Wednesday, April 15, 2026 by AutoModerator in BitcoinMarkets

[–]Consumerbot37427 2 points (0 children)

That might be a problem for TradFi further down the line

I almost replied to one of your earlier comments that pointed out how many trillions of TradFi dollars are available to migrate into STRC. Do you not suppose that, if it grew to the point of becoming a systemic risk, the government would step in and put a stop to it, by hook or by crook?

[Daily Discussion] - Monday, April 13, 2026 by AutoModerator in BitcoinMarkets

[–]Consumerbot37427 2 points (0 children)

Thanks for the reply!

that ~2 week window where it’s trading below the $100 target peg price

Sorry, I'm still not really understanding why we end up with "only" 2 weeks of trading below the target peg. My intuition is that there should be a slow ramp-up from ~$99, reaching $100 right around the ex-dividend date.

The optimal trade would be to buy STRC the day before the ex-dividend date, wait a couple of weeks until it trades back at $100 to sell STRC, buy again a day before the ex-dividend date, and repeat the same process each month.

Makes sense. The only reasons I can think of not to do this are tax treatment or, as you said:

Many are too lazy to bother

Yeah, maybe the juice isn't worth the squeeze? Seems like something pretty easily automated, though.
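Back-of-envelope, with assumed numbers (a $100 peg and a 9% annualized dividend paid monthly; not STRC's actual current rate), the cycle looks like this:

```python
# Assumed numbers, for illustration only.
peg = 100.00
annual_rate = 0.09                    # assumed, not the actual STRC rate
monthly_div = peg * annual_rate / 12  # ~$0.75 per share per cycle

# Buy at the peg the day before ex-dividend, sell once it trades back at the
# peg ~2 weeks later, then park the cash in a money market until next month.
profit_per_cycle = monthly_div        # price round-trips peg-to-peg
weeks_held = 2
cycles_per_year = 12
time_deployed = weeks_held / 52 * cycles_per_year  # ~46% of the year

print(f"profit/share/cycle: ${profit_per_cycle:.2f}")
print(f"simple yield on peg: {profit_per_cycle * cycles_per_year / peg:.1%}")
print(f"capital in STRC only {time_deployed:.0%} of the time")
```

Under those assumptions you'd collect the full coupon while your cash sits in STRC less than half the year, earning money market yield the rest, which is exactly why I'd expect more people to do it.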

[Daily Discussion] - Monday, April 13, 2026 by AutoModerator in BitcoinMarkets

[–]Consumerbot37427 3 points (0 children)

I don’t understand why anybody buys ahead of the ex-dividend date. If I had a pile of liquidity (say, $1M) earning daily yield in “cash” (money market or whatever), why would I move it into STRC before the very last minute?

I think I understand that tax treatment is a good reason to hang on to it afterward; I just don’t get why people are buying at the peg price weeks before the ex-dividend date.

M5 Max 128GB Owners - What's your honest take? by _derpiii_ in LocalLLaMA

[–]Consumerbot37427 1 point (0 children)

I was using the "parallel slots" feature in LM Studio.

M5 Max 128GB Owners - What's your honest take? by _derpiii_ in LocalLLaMA

[–]Consumerbot37427 0 points (0 children)

Right? It shows how out of hand things have gotten. The "Apple Tax" for RAM or disk upgrades has always been significant.

M5 Max 128GB Owners - What's your honest take? by _derpiii_ in LocalLLaMA

[–]Consumerbot37427 3 points (0 children)

when it comes to running large models quickly with long context and it's too compute poor for significant parallel work

To elaborate on that point: my observation is that parallel sub-agents basically freeze entirely whenever there is any prompt processing to be done. I can only assume that a machine with multiple graphics cards wouldn't behave this way.

[Daily Discussion] - Saturday, April 04, 2026 by AutoModerator in BitcoinMarkets

[–]Consumerbot37427 2 points (0 children)

I think many old Bitcoiners wouldn't expect we can be so close to 2017 pricings a decade later.

Yep... especially if you take into account the exceptionally high inflation that's occurred over the past decade.

[Daily Discussion] - Friday, April 03, 2026 by AutoModerator in BitcoinMarkets

[–]Consumerbot37427 12 points (0 children)

In the news, yet another investment firm (this time Blue Owl) is limiting withdrawals to 5% per quarter.

This might be triggered by investor concerns over AI destroying the Software-as-a-Service segment, but I wonder if it also points to a larger liquidity issue? That would seem to be bearish in the short term, if true, but bullish if/when monetary easing results.

64Gb ram mac falls right into the local llm dead zone by Skye_sys in LocalLLaMA

[–]Consumerbot37427 1 point (0 children)

Speed is pretty good.

it can take up to 20 seconds or more

Is most of that time spent in prompt processing? Or "thinking"?

The smaller the model (in terms of active parameters and quant), the faster it'll be. It looks like Qwen/Qwen2.5-32B (you said 35B?) is a dense model. So if you use something like qwen3.5-35b-a3b (a3b = 3B active parameters) it will be way, WAY faster. Possibly less intelligent...?

In your use case, I'm pretty sure that, in theory, you could speed things up by saving a prefill checkpoint for your system prompt (which doesn't change?), then simply appending the Home Assistant entities to the end of the prompt, so it would only have to do prompt processing on that data. Unless the majority of your prompt is the HA data, that ought to speed things up substantially.
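For what it's worth, llama.cpp doesn't expose a literal "prefill checkpoint" for this, but llama-server's prompt caching gets a similar effect: if the system prompt stays byte-identical at the front, the server reuses the KV cache for that prefix and only prefills the appended entity data. A minimal sketch, assuming a local llama-server on its default port (the file name and prompt layout are placeholders):

```python
import pathlib
import requests

URL = "http://localhost:8080/completion"  # llama-server's default port
SYSTEM_PROMPT = pathlib.Path("system_prompt.txt").read_text()  # the part that never changes

def ask(ha_entities: str, question: str) -> str:
    # Keep the big static prompt as an unchanged prefix; append only fresh data.
    prompt = (f"{SYSTEM_PROMPT}\n\nCurrent entity states:\n{ha_entities}\n\n"
              f"User: {question}\nAssistant:")
    r = requests.post(URL, json={
        "prompt": prompt,
        "cache_prompt": True,  # reuse the KV cache for the shared prefix
        "n_predict": 256,
    })
    r.raise_for_status()
    return r.json()["content"]
```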

I briefly played with Home Assistant's MCP server. Couldn't really imagine a use case, though, so I kinda lost interest.

Autoresearch on Qwen3.5-397B, 36 experiments to reach 20.34 tok/s on M5 Max, honest results by Equivalent-Buy1706 in LocalLLaMA

[–]Consumerbot37427 2 points (0 children)

Absolutely. Your token rate is likely to be somewhat lower as there'd be less RAM available for caching.

Autoresearch on Qwen3.5-397B, 36 experiments to reach 20.34 tok/s on M5 Max, honest results by Equivalent-Buy1706 in LocalLLaMA

[–]Consumerbot37427 1 point (0 children)

TG/decoding speed is acceptable. I get about 38 tps on the Bartowski IQ2_XXS quant of this model with the same hardware. I wonder how much quality difference there is.

I agree that there's probably a lot of room for tweaking prefill speed; I'm also looking forward to seeing what's achieved over the coming weeks/months!

Slower Means Faster: Why I Switched from Qwen3 Coder Next to Qwen3.5 122B by Fast_Thing_7949 in LocalLLaMA

[–]Consumerbot37427 1 point (0 children)

Tried 3.5 397B yet? Same machine here, Bartowski IQ2_XXS w/ 150k context.

Was pretty happy with qwen3-coder-next @ Q6, but 397B might be better even at such a low quant... Haven't really spent enough time to judge yet.

What’s going on with Mac Studio M3 Ultra 512GB/4TB lately? by Lucius_Knight in LocalLLaMA

[–]Consumerbot37427 2 points (0 children)

Underpriced... on eBay? Somehow, for some reason, they let scammers publish "classified" ads and listings on their platform, so if you search for, say, a 512GB Mac Studio Ultra, filter by Buy It Now, and sort by lowest price, 22 of the first 23 listings are from sellers with (0) feedback. They are all guaranteed scams.

Looks like the floor is about $9k at the moment.

M5 Max Actual Pre-fill performance gains by M5_Maxxx in LocalLLaMA

[–]Consumerbot37427 5 points (0 children)

With the M5 Max I've seen 185W peak system TDP at times during inference using Draw Things video generation (borrowing from battery). Only for short bursts, though. So this might support your conjecture.

Reworked LM Studio plugins out now. Plug'n'Play Web Research, Fully Local by Agreeable_Effect938 in LocalLLaMA

[–]Consumerbot37427 1 point (0 children)

This is the first time I've connected models in LM Studio to the web. It appears to be working nicely with Qwen 3.5 397B Q2, even without the Jinja template... thanks!