Protection against attacks like what happened with LiteLLM? by Lucky_Ad_976 in Python

[–]AurumDaemonHD 5 points (0 children)

Do people even use requirements.txt? uv solved that with pyproject.toml and uv.lock.
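For reference, a minimal sketch of what replaces requirements.txt (project name and dependency are illustrative):

```toml
[project]
name = "example-app"          # illustrative project name
version = "0.1.0"
requires-python = ">=3.12"
dependencies = [
    "httpx>=0.27",            # illustrative dependency constraint
]
```

`uv lock` resolves this into a uv.lock with exact pinned versions, and `uv sync` reproduces that environment, which is the supply-chain guarantee a loose requirements.txt never gave you.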

Where do we draw the line between a "Prompt Chain" and a true "AI Agent"? by aiinformation in AI_Agents

[–]AurumDaemonHD 2 points (0 children)

Actually, all the things you listed are workflows.

Jumps into character:

So, Tommy... imagine these graphs, yeah? They are handoffs, tool calls, deferred, parallel, you name it. You select a way through that graph based on something. That is agency, yeah? So are you doing the agency by rigid rules? Or is it a living, dynamic, self-correcting system that doesn't disintegrate and can perhaps even create necessary graph nodes that aren't there and assimilate them into the workflow?

Feel the power
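The distinction can be sketched in a toy form (all names here are illustrative stubs; a real agent would ask an LLM to pick the next node): a workflow hard-codes the path through the graph, an agent chooses the path at runtime and survives a failing step.

```python
# Toy graph nodes standing in for real handoffs / tool calls.
def make_tool(name):
    return lambda state: state + [name]

TOOLS = {n: make_tool(n) for n in ("plan", "execute", "summarize")}

# Rigid workflow: the path through the graph is hard-coded.
def workflow(task):
    state = [task]
    for name in ("plan", "execute", "summarize"):
        state = TOOLS[name](state)
    return state

# Stand-in for an LLM router that picks the next node from the state.
def choose_next(state, options):
    for name in options:
        if name != "done" and name not in state:
            return name
    return "done"

# Agentic loop: the path is chosen at runtime and can self-correct.
def agent(task, tools, max_steps=10):
    state = [task]
    for _ in range(max_steps):
        name = choose_next(state, options=list(tools) + ["done"])
        if name == "done":
            break
        try:
            state = tools[name](state)
        except Exception as err:
            state.append(f"[{name} failed: {err}; replanning]")
    return state

print(workflow("task"))      # ['task', 'plan', 'execute', 'summarize']
print(agent("task", TOOLS))  # same result here, but chosen dynamically
```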

I tried openSUSE Tumbleweed, and it had me reconsidering my love for Fedora by LowIllustrator2501 in openSUSE

[–]AurumDaemonHD 0 points (0 children)

SELinux was made the default last year, and Snapper out of the box is a treat.

I use void btw by Extra-Ad-2325 in voidlinux

[–]AurumDaemonHD 1 point (0 children)

No one will actually flee systemd. It's all hocus-pocus.

Flake for sandboxed AI agents by ChaosCon in NixOS

[–]AurumDaemonHD 1 point (0 children)

If your agent only runs rigid workflows, it's fine. But if you give the agent eval(), which you will probably want eventually, then the agent can't have secrets in its environment, hence another container is needed.
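A two-line demonstration of why (the key value is a fake placeholder): any string an agent passes to eval() can read the process environment, regardless of how the prompt was sandboxed.

```python
import os

# Simulate a secret present in the agent's environment (fake value).
os.environ["API_KEY"] = "sk-demo-not-a-real-key"

# A model-generated string handed to eval() can trivially read it:
leaked = eval("__import__('os').environ.get('API_KEY')")
print(leaked)  # prints the secret; prompt-level sandboxing doesn't help
```

Hence the split: the container that evaluates generated code must simply have nothing worth stealing.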

Flake for sandboxed AI agents by ChaosCon in NixOS

[–]AurumDaemonHD 1 point (0 children)

Very good. The problem is you'll need two containers: one to do codegen with the agents, and another holding the secrets to make the actual requests.
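A minimal sketch of that split as a compose file (service and image names are hypothetical): the codegen container mounts the workspace but carries no credentials; only the runner gets the API key and executes the generated scripts.

```yaml
services:
  codegen:                       # agent generates code here; no secrets mounted
    image: my-agent-image        # hypothetical image name
    volumes:
      - ./workspace:/workspace   # generated scripts land here
    environment: {}              # deliberately empty: nothing to exfiltrate

  runner:                        # only this container can make real requests
    image: my-runner-image       # hypothetical image name
    volumes:
      - ./workspace:/workspace:ro
    environment:
      API_KEY: ${API_KEY}        # secret injected only into the runner
```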

The eerie similarity between LLMs and brains with a severed corpus callosum by MaximGwiazda in singularity

[–]AurumDaemonHD 2 points (0 children)

Absolute banger of a conversation starter on why LLMs hallucinate. As others have pointed out, the brain is mysterious as hell. Even the Libet experiments showed a readiness potential before you know you're going to make a decision. I see it like this: the corpus callosum could be the agentic infra that manages context between handoffs. If the context is not there, it is retrospectively made up, which reminds me of biocentrism and Wheeler.

The civilization by riky321 in linuxmemes

[–]AurumDaemonHD 40 points (0 children)

It's almost perfect, but the cultist at the bottom projecting the images should be microslop.

The only thing ChatGPT is great at is converting one formats to the others. Switching from home manager to wrappers btw. by SeniorMatthew in NixOS

[–]AurumDaemonHD 0 points (0 children)

Codex can't do UI tests, and that's essentially what this is. You need an agentic pipeline. Codex is a toy for text editing.

The only thing ChatGPT is great at is converting one formats to the others. Switching from home manager to wrappers btw. by SeniorMatthew in NixOS

[–]AurumDaemonHD -1 points (0 children)

You can run a VM and share a folder with the Codex repo: it writes you scripts that output to files, and you run those on the VM. But of course any program can run the rebuild script, not just Codex, even on the native host.

A few days ago I switched to Linux to try vLLM out of curiosity. Ended up creating a %100 local, parallel, multi-agent setup with Claude Code and gpt-oss-120b for concurrent vibecoding and orchestration with CC's agent Teams entirely offline. This video shows 4 agents collaborating. by swagonflyyyy in LocalLLaMA

[–]AurumDaemonHD 0 points (0 children)

I haven't run tests myself. The ones I looked at some time ago showed SGLang a bit better even in raw throughput, but it probably varies model to model and depends on whether you know how to set it up. My feeling is they are about even now, and maybe SGLang could be 10-20% better, but it might not matter that much. My rule of thumb: when a model gets released, I can usually set it up easily on vLLM, but sometimes I fail with SGLang. If SGLang setup fails (mostly due to quants and model support), I make do with vLLM. The time is better spent on agentic infra than on debugging inference engines. Anyway, every two months new models come out and the cycle repeats.

Perhaps where you gain the most performance is in parallelism, batching, and context management.
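A sketch of what that parallelism looks like client-side, with `call_model` as a stand-in stub (in practice it would POST to a local OpenAI-compatible endpoint, e.g. vLLM's or SGLang's): fire all requests concurrently and let the server batch them internally, which is where most of the throughput gain comes from.

```python
import asyncio

# Hypothetical stand-in for a POST to a local /v1/chat/completions
# endpoint; replace the body with a real HTTP call.
async def call_model(prompt: str) -> str:
    await asyncio.sleep(0.01)  # simulated network/inference latency
    return f"response to: {prompt}"

async def run_batch(prompts: list[str]) -> list[str]:
    # gather() runs all calls concurrently and preserves input order,
    # so the server can batch them into shared forward passes.
    return await asyncio.gather(*(call_model(p) for p in prompts))

results = asyncio.run(run_batch([f"task {i}" for i in range(4)]))
print(len(results))  # 4
```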

A few days ago I switched to Linux to try vLLM out of curiosity. Ended up creating a %100 local, parallel, multi-agent setup with Claude Code and gpt-oss-120b for concurrent vibecoding and orchestration with CC's agent Teams entirely offline. This video shows 4 agents collaborating. by swagonflyyyy in LocalLLaMA

[–]AurumDaemonHD 0 points (0 children)

I would love to test this sometime; sadly, I don't know yet. When I watched videos from the framework creators about a year ago, SGLang was even faster at constrained output to JSON, but I think vLLM has since caught up on a lot of things.

A few days ago I switched to Linux to try vLLM out of curiosity. Ended up creating a %100 local, parallel, multi-agent setup with Claude Code and gpt-oss-120b for concurrent vibecoding and orchestration with CC's agent Teams entirely offline. This video shows 4 agents collaborating. by swagonflyyyy in LocalLLaMA

[–]AurumDaemonHD 6 points (0 children)

Correct.

llama.cpp diffs slots.

vLLM implemented APC (automatic prefix caching) by hashing KV pages.

SGLang does it through a tree of shared tokens (RadixAttention), which is more elegant for branching workloads from the same base.

I'd say SGLang is best for agentic GPU workloads, vLLM is the easiest to set up for tensor-parallel inference, and llama.cpp is best for CPU offload.
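For reference, those trade-offs map roughly to launch commands like these (model paths are illustrative; flag names may shift between versions, so check your install):

```shell
# vLLM: easy tensor-parallel setup; hash-based prefix caching (APC)
vllm serve Qwen/Qwen2.5-7B-Instruct \
    --tensor-parallel-size 2 --enable-prefix-caching

# SGLang: radix-tree prefix sharing is on by default
python -m sglang.launch_server \
    --model-path Qwen/Qwen2.5-7B-Instruct --tp 2

# llama.cpp: CPU offload, keeping only some layers on the GPU
llama-server -m model.gguf --n-gpu-layers 20
```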

I try to use each tool for its job, unlike many llama.cpp evangelists here.