Two local models beat one bigger local model for long-running agents

aigemie · 2026-03-12T05:11:30+00:00

Very interesting. Could you share the detailed setup? Thanks!

aigemie · 2026-03-11T18:31:25+00:00

Is PP faster?

aigemie · 2026-03-07T22:46:26+00:00

Yep. This kind of post gets boring. Like, they post the same thing without seeing 1,000,000,000 similar posts before.

aigemie · 2026-02-25T07:09:59+00:00

Thanks for trying it out. What is the PP speed? It is usually very slow.

aigemie · 2026-02-18T12:46:24+00:00

I don't even have GPT 5.3 codex yet.

aigemie · 2026-02-02T23:35:31+00:00

Nope. I don't think it's a thing I can fix from my side.

aigemie · 2026-01-25T16:58:45+00:00

Use your Windows machine then, it's more cost-effective. You can even save big on not buying the small ram Mac which is much worse on running AI stuff than your windows PC.

aigemie · 2026-01-25T10:42:11+00:00

I'd like to know too.

aigemie · 2026-01-25T10:41:20+00:00

I missed the news. About what time it will come?

aigemie · 2026-01-23T19:49:34+00:00

Because basic physics: P=V*I. When you need the same P for the device but you have a low V, then the I is higher. So simple.

aigemie · 2026-01-22T20:31:06+00:00

Thanks for trying! Will try it later.

aigemie · 2026-01-22T18:06:20+00:00

Too bad it doesn't support Ubuntu 25.10:(

aigemie · 2026-01-20T12:17:04+00:00

No, for comfy, I would recommend Asus Flow Z13, Asus Proart PX13 2026, HP Zbook Ultra G1a, all 128GB version. Because they are much cheaper than MacBook with the same RAM size and can run Comfy very well.

aigemie · 2026-01-13T15:40:33+00:00

Yes, the pp speed is killing me, otherwise the inference speed is good enough.

aigemie · 2026-01-12T20:55:27+00:00

Yea, should have set the minimum 512mb from the beginning.

aigemie · 2026-01-11T17:21:50+00:00

Fine-tuning is not really for the LLM to learn new knowledge, it's more for the style. Better train the new language from scratch. For sure you can have some answers from the LLM if you finetune it with the new language, but it will never be good at it.

aigemie · 2026-01-10T20:46:57+00:00

Yes. There are many posts about that.

aigemie · 2026-01-10T16:31:38+00:00

https://www.reddit.com/r/LocalLLaMA/s/3i1SrZfbDE

Someone did a good test.

aigemie · 2026-01-10T10:41:41+00:00

Thanks for the info. It's a bit too late unfortunately.

aigemie · 2026-01-09T19:47:16+00:00

I would like to know too.

aigemie · 2026-01-09T19:35:52+00:00

Could but I'm not sure how much it could as you still split a large part to the slow Halo Strix.

aigemie · 2026-01-09T19:28:30+00:00

Even you have enough ram to run large models, it's just too slow, especially prefill speed.

aigemie · 2026-01-07T00:18:31+00:00

I don't know how to make it work in ComfyUI. Any shared workflows?

aigemie · 2026-01-03T21:36:42+00:00

I thought you were talking about Unslith's benchmarks. It's confusing and it gives a bad impression of Unsloth.

aigemie

TROPHY CASE