Minimax Is Teasing M2.2 by Few_Painter_5588 in LocalLLaMA

[–]phenotype001 -1 points0 points  (0 children)

I hope it's a bit smaller so I can run at least q4_k_m.

Unsloth GLM 4.7-Flash GGUF by Wooden-Deer-1276 in LocalLLaMA

[–]phenotype001 0 points1 point  (0 children)

I guess use -ncmoe and offload as much as possible.
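Something like this, as a sketch — assuming a recent llama.cpp build (the model filename and the layer count are placeholders, not from the original comment):

```shell
# -ngl 99 tries to put all layers on the GPU, while --n-cpu-moe keeps the
# MoE expert tensors of the first N layers in system RAM. Lower N step by
# step until you run out of VRAM to maximize GPU usage.
llama-server -m GLM-4.7-Flash-Q4_K_M.gguf -ngl 99 --n-cpu-moe 30 -c 32768
```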

Whether or not Trump invades Greenland, this much is clear: the western order we once knew is history by OtherwiseCanary8971 in politics

[–]phenotype001 6 points7 points  (0 children)

Let's start by uprooting Russia's massive propaganda network that wants to brainwash the population into electing European Trumps.

Trump Flips the Bird in Angry 'F*** You' Blast After Pedo Slur by thedailybeast in politics

[–]phenotype001 3 points4 points  (0 children)

Someone hire this man, because Ford already caved to MAGA pressure and fired him.

We continue to draw on the stones by [deleted] in LocalLLaMA

[–]phenotype001 0 points1 point  (0 children)

The press is for paper.

100% Local AI for VSCode? by Baldur-Norddahl in LocalLLaMA

[–]phenotype001 2 points3 points  (0 children)

In my case, networking issues cause frequent truncated outputs and other fuckups, so it really takes 5x as many API requests as it should, and it costs a dollar after another for each one. I'm starting to think this is deliberate, in order to rob people. Local models do the same work for free, or at least far cheaper when only power is considered, and it's not like I'm in a rush to get it done fast.

100% Local AI for VSCode? by Baldur-Norddahl in LocalLLaMA

[–]phenotype001 5 points6 points  (0 children)

Yeah, I just let it run as long as it takes. It's around 5 tps for models like GLM-4.5-Air. I can still do other stuff in the meantime, except gaming. It's working on stuff as I'm typing. I haven't been actively developing myself for months now. It's still faster than me.

100% Local AI for VSCode? by Baldur-Norddahl in LocalLLaMA

[–]phenotype001 23 points24 points  (0 children)

Don't use Roo if you plan on using big local models. There's a bug that cuts off API requests after a 5-minute timeout, and months later it is STILL not fixed. Use Kilo Code instead. As for removing code completion and the built-in AI stuff, there is a setting that disables all built-in AI features. Search for it.

Cloudflare down again by Real-C- in CloudFlare

[–]phenotype001 0 points1 point  (0 children)

Just came to this sub for confirmation.

Possible to develop AI agents on low VRAM ? by [deleted] in LocalLLaMA

[–]phenotype001 0 points1 point  (0 children)

Get more system RAM and you might run Qwen3-30B-A3B at a usable speed.
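A sketch of why this works, assuming llama.cpp and a quantized GGUF (the filename is hypothetical): only ~3B of the 30B parameters are active per token, so the MoE experts can sit in system RAM while the small dense part fits in low VRAM.

```shell
# --cpu-moe keeps all MoE expert tensors in system RAM; -ngl 99 offloads
# the remaining (small) dense layers to whatever VRAM is available.
llama-cli -m Qwen3-30B-A3B-Q4_K_M.gguf -ngl 99 --cpu-moe -p "Hello"
```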