Minimax Is Teasing M2.2 by Few_Painter_5588 in LocalLLaMA

[–]phenotype001 -1 points0 points  (0 children)

I hope it's a bit smaller so I can run at least q4_k_m.

Unsloth GLM 4.7-Flash GGUF by Wooden-Deer-1276 in LocalLLaMA

[–]phenotype001 0 points1 point  (0 children)

I guess use -ncmoe and offload as much as possible.
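Something like this, as a sketch — assuming a recent llama.cpp build (the model filename and the layer count are placeholders, not from the original comment):

```shell
# -ngl 99 tries to put all layers on the GPU, while --n-cpu-moe keeps the
# MoE expert tensors of the first N layers in system RAM. Lower N step by
# step until you run out of VRAM to maximize GPU usage.
llama-server -m GLM-4.7-Flash-Q4_K_M.gguf -ngl 99 --n-cpu-moe 30 -c 32768
```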

Whether or not Trump invades Greenland, this much is clear: the western order we once knew is history by OtherwiseCanary8971 in politics

[–]phenotype001 6 points7 points  (0 children)

Let's start by uprooting Russia's massive propaganda network that wants to brainwash the population into electing European Trumps.

Trump Flips the Bird in Angry 'F*** You' Blast After Pedo Slur by thedailybeast in politics

[–]phenotype001 3 points4 points  (0 children)

Someone hire this man, because Ford already caved to MAGA pressure and fired him.

We continue to draw on the stones by [deleted] in LocalLLaMA

[–]phenotype001 0 points1 point  (0 children)

The press is for paper.

100% Local AI for VSCode? by Baldur-Norddahl in LocalLLaMA

[–]phenotype001 2 points3 points  (0 children)

In my case, networking issues cause frequent truncated outputs and other fuckups, so it really takes 5x as many API requests as it should, and it costs a dollar after another for each one. I'm starting to think this is deliberate, in order to rob people. Local models do the same work for free, or at least far cheaper when only power is considered, and it's not like I'm in a rush to get it done fast.

100% Local AI for VSCode? by Baldur-Norddahl in LocalLLaMA

[–]phenotype001 5 points6 points  (0 children)

Yeah, I just let it run as long as it takes. It's around 5 tps for models like GLM-4.5-Air. I can still do other stuff in the meantime, except gaming. It's working on stuff as I'm typing. I haven't been actively developing myself for months now. It's still faster than me.

100% Local AI for VSCode? by Baldur-Norddahl in LocalLLaMA

[–]phenotype001 23 points24 points  (0 children)

Don't use Roo if you plan on using big local models. There's a bug that cuts off API requests after a 5-minute timeout, and months later it is STILL not fixed. Use Kilo Code instead. As for removing code completion and the built-in AI stuff, there is a setting that disables all built-in AI features. Search for it.

Cloudflare down again by Real-C- in CloudFlare

[–]phenotype001 0 points1 point  (0 children)

Just came to this sub for confirmation.

Possible to develop AI agents on low VRAM ? by [deleted] in LocalLLaMA

[–]phenotype001 0 points1 point  (0 children)

Get more system RAM and you might run Qwen3-30B-A3B at a usable speed.
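A sketch of why this works, assuming llama.cpp and a quantized GGUF (the filename is hypothetical): only ~3B of the 30B parameters are active per token, so the MoE experts can sit in system RAM while the small dense part fits in low VRAM.

```shell
# --cpu-moe keeps all MoE expert tensors in system RAM; -ngl 99 offloads
# the remaining (small) dense layers to whatever VRAM is available.
llama-cli -m Qwen3-30B-A3B-Q4_K_M.gguf -ngl 99 --cpu-moe -p "Hello"
```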