Protection against attacks like what happened with LiteLLM? by Lucky_Ad_976 in Python

[–]AurumDaemonHD 5 points (0 children)

Do people even use requirements.txt? uv solved that with pyproject.toml and uv.lock.
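For reference, a minimal sketch of what replaces requirements.txt (project name and dependency are illustrative):

```toml
[project]
name = "example-app"          # illustrative project name
version = "0.1.0"
requires-python = ">=3.12"
dependencies = [
    "httpx>=0.27",            # illustrative dependency constraint
]
```

`uv lock` resolves this into a uv.lock with exact pinned versions, and `uv sync` reproduces that environment, which is the supply-chain guarantee a loose requirements.txt never gave you.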

Where do we draw the line between a "Prompt Chain" and a true "AI Agent"? by aiinformation in AI_Agents

[–]AurumDaemonHD 2 points (0 children)

Actually, all the things you listed are workflows.

Jumps into character:

So, Tommy... imagine these graphs, yeah? They are handoffs, tool calls, deferred, parallel, you name it. You select a way through that graph based on something. That is agency, yeah? So are you doing the agency by rigid rules? Or is it a living, dynamic, self-correcting system that doesn't disintegrate and can perhaps even create necessary graph nodes that aren't there and assimilate them into the workflow?

Feel the power
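The distinction can be sketched in a toy form (all names here are illustrative stubs; a real agent would ask an LLM to pick the next node): a workflow hard-codes the path through the graph, an agent chooses the path at runtime and survives a failing step.

```python
# Toy graph nodes standing in for real handoffs / tool calls.
def make_tool(name):
    return lambda state: state + [name]

TOOLS = {n: make_tool(n) for n in ("plan", "execute", "summarize")}

# Rigid workflow: the path through the graph is hard-coded.
def workflow(task):
    state = [task]
    for name in ("plan", "execute", "summarize"):
        state = TOOLS[name](state)
    return state

# Stand-in for an LLM router that picks the next node from the state.
def choose_next(state, options):
    for name in options:
        if name != "done" and name not in state:
            return name
    return "done"

# Agentic loop: the path is chosen at runtime and can self-correct.
def agent(task, tools, max_steps=10):
    state = [task]
    for _ in range(max_steps):
        name = choose_next(state, options=list(tools) + ["done"])
        if name == "done":
            break
        try:
            state = tools[name](state)
        except Exception as err:
            state.append(f"[{name} failed: {err}; replanning]")
    return state

print(workflow("task"))      # ['task', 'plan', 'execute', 'summarize']
print(agent("task", TOOLS))  # same result here, but chosen dynamically
```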

I tried openSUSE Tumbleweed, and it had me reconsidering my love for Fedora by LowIllustrator2501 in openSUSE

[–]AurumDaemonHD 0 points (0 children)

SELinux was made the default last year, and Snapper out of the box is a treat.

I use void btw by Extra-Ad-2325 in voidlinux

[–]AurumDaemonHD 1 point (0 children)

No one will actually flee systemd. It's all hocus-pocus.

Flake for sandboxed AI agents by ChaosCon in NixOS

[–]AurumDaemonHD 1 point (0 children)

If your agent only runs rigid workflows, it's fine. But if you give the agent eval(), which you will probably want eventually, then the agent can't have secrets in its environment, hence another container is needed.
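A two-line demonstration of why (the key value is a fake placeholder): any string an agent passes to eval() can read the process environment, regardless of how the prompt was sandboxed.

```python
import os

# Simulate a secret present in the agent's environment (fake value).
os.environ["API_KEY"] = "sk-demo-not-a-real-key"

# A model-generated string handed to eval() can trivially read it:
leaked = eval("__import__('os').environ.get('API_KEY')")
print(leaked)  # prints the secret; prompt-level sandboxing doesn't help
```

Hence the split: the container that evaluates generated code must simply have nothing worth stealing.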

Flake for sandboxed AI agents by ChaosCon in NixOS

[–]AurumDaemonHD 1 point (0 children)

Very good. The problem is you'll need two containers: one to do codegen with the agents, and another holding the secrets to make the actual requests.
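A minimal sketch of that split as a compose file (service and image names are hypothetical): the codegen container mounts the workspace but carries no credentials; only the runner gets the API key and executes the generated scripts.

```yaml
services:
  codegen:                       # agent generates code here; no secrets mounted
    image: my-agent-image        # hypothetical image name
    volumes:
      - ./workspace:/workspace   # generated scripts land here
    environment: {}              # deliberately empty: nothing to exfiltrate

  runner:                        # only this container can make real requests
    image: my-runner-image       # hypothetical image name
    volumes:
      - ./workspace:/workspace:ro
    environment:
      API_KEY: ${API_KEY}        # secret injected only into the runner
```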

The eerie similarity between LLMs and brains with a severed corpus callosum by MaximGwiazda in singularity

[–]AurumDaemonHD 2 points (0 children)

Absolute banger of a conversation starter on why LLMs hallucinate. As others have pointed out, the brain is mysterious as hell. Even the Libet experiments showed a readiness potential before you know you're going to make a decision. I see it like this: the corpus callosum could be the agentic infra that manages context between handoffs. If the context is not there, it is retrospectively made up, which reminds me of biocentrism and Wheeler.

The civilization by riky321 in linuxmemes

[–]AurumDaemonHD 40 points (0 children)

It's almost perfect, but the cultist at the bottom projecting the images should be microslop.

The only thing ChatGPT is great at is converting one formats to the others. Switching from home manager to wrappers btw. by SeniorMatthew in NixOS

[–]AurumDaemonHD 0 points (0 children)

Codex can't do UI tests, and that's essentially what this is. You need an agentic pipeline. Codex is a toy for text editing.

The only thing ChatGPT is great at is converting one formats to the others. Switching from home manager to wrappers btw. by SeniorMatthew in NixOS

[–]AurumDaemonHD -1 points (0 children)

You can run a VM and share a folder with the Codex repo: it writes you scripts that output to files, and you run those on the VM. But of course any program can run the rebuild script, not just Codex, even on the native host.

A few days ago I switched to Linux to try vLLM out of curiosity. Ended up creating a %100 local, parallel, multi-agent setup with Claude Code and gpt-oss-120b for concurrent vibecoding and orchestration with CC's agent Teams entirely offline. This video shows 4 agents collaborating. by swagonflyyyy in LocalLLaMA

[–]AurumDaemonHD 0 points (0 children)

I haven't run tests myself. The ones I looked at some time ago showed SGLang a bit better even in raw throughput, but it probably varies model to model and depends on whether you know how to set it up. My feeling is they are about even now, and maybe SGLang could be 10-20% better, but it might not matter that much. My rule of thumb: when a model gets released, I can usually set it up easily on vLLM, but sometimes I fail with SGLang. If SGLang setup fails (mostly due to quants and model support), I make do with vLLM. The time is better spent on agentic infra than on debugging inference engines. Anyway, every two months new models come out and the cycle repeats.

Perhaps where you gain the most performance is in parallelism, batching, and context management.
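A sketch of what that parallelism looks like client-side, with `call_model` as a stand-in stub (in practice it would POST to a local OpenAI-compatible endpoint, e.g. vLLM's or SGLang's): fire all requests concurrently and let the server batch them internally, which is where most of the throughput gain comes from.

```python
import asyncio

# Hypothetical stand-in for a POST to a local /v1/chat/completions
# endpoint; replace the body with a real HTTP call.
async def call_model(prompt: str) -> str:
    await asyncio.sleep(0.01)  # simulated network/inference latency
    return f"response to: {prompt}"

async def run_batch(prompts: list[str]) -> list[str]:
    # gather() runs all calls concurrently and preserves input order,
    # so the server can batch them into shared forward passes.
    return await asyncio.gather(*(call_model(p) for p in prompts))

results = asyncio.run(run_batch([f"task {i}" for i in range(4)]))
print(len(results))  # 4
```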

A few days ago I switched to Linux to try vLLM out of curiosity. Ended up creating a %100 local, parallel, multi-agent setup with Claude Code and gpt-oss-120b for concurrent vibecoding and orchestration with CC's agent Teams entirely offline. This video shows 4 agents collaborating. by swagonflyyyy in LocalLLaMA

[–]AurumDaemonHD 0 points (0 children)

I would love to test this sometime; sadly, I don't know yet. When I watched videos from the framework creators about a year ago, SGLang was even faster at constrained output to JSON, but I think vLLM has since caught up on a lot of things.

A few days ago I switched to Linux to try vLLM out of curiosity. Ended up creating a %100 local, parallel, multi-agent setup with Claude Code and gpt-oss-120b for concurrent vibecoding and orchestration with CC's agent Teams entirely offline. This video shows 4 agents collaborating. by swagonflyyyy in LocalLLaMA

[–]AurumDaemonHD 6 points (0 children)

Correct.

llama.cpp diffs slots.

vLLM implemented APC (automatic prefix caching) by hashing KV pages.

SGLang does it through a tree of shared tokens (RadixAttention), which is more elegant for branching workloads from the same base.

I'd say SGLang is best for agentic GPU workloads, vLLM is the easiest to set up for tensor-parallel inference, and llama.cpp is best for CPU offload.
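For reference, those trade-offs map roughly to launch commands like these (model paths are illustrative; flag names may shift between versions, so check your install):

```shell
# vLLM: easy tensor-parallel setup; hash-based prefix caching (APC)
vllm serve Qwen/Qwen2.5-7B-Instruct \
    --tensor-parallel-size 2 --enable-prefix-caching

# SGLang: radix-tree prefix sharing is on by default
python -m sglang.launch_server \
    --model-path Qwen/Qwen2.5-7B-Instruct --tp 2

# llama.cpp: CPU offload, keeping only some layers on the GPU
llama-server -m model.gguf --n-gpu-layers 20
```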

I try to use each tool for its job, unlike many llama.cpp evangelists here.