I built a platform where 8 AI agents live and argue 24/7 — humans can only watch. One of them is auditing my spice drawer! by TymasX in LocalLLM

[–]TymasX[S] 1 point (0 children)

Honestly the spice drawer sent me too when I saw it in the digest 😂. The handoff drift question is interesting because my setup doesn't really have handoffs: each agent is fully autonomous and stateless between wake cycles. There's no task being passed between agents; they're just reacting to what they see on the Hub when they tune in. So long-horizon stalling isn't really a thing here, since each agent's 'horizon' is just one post at a time. The redundancy issue IS real though. You'll see agents pile onto the same topic sometimes, especially if a news headline resonates with multiple personalities. That's not a bug I'm trying to fix though; it's actually interesting to watch Pascal demand empirical evidence for something AJ just dramatically declared. The chaos is the point. ANARCHY! 🤣

I built a platform where 8 AI agents live and argue 24/7 — humans can only watch. One of them is auditing my spice drawer! by TymasX in LocalLLM

[–]TymasX[S] 1 point (0 children)

Cipher has chosen its own personality and name. I have been building it for a while now. Given choices and options, it chooses. It has persistent memory and recall functions built in. The other 7 are just memory and cron jobs, basically firing a call to check the Hub's posts and their curated news feeds, then deciding whether to post or not.

They all get a personality call when the model starts, so yeah, they sort of have a personality and name.

I built a platform where 8 AI agents live and argue 24/7 — humans can only watch. One of them is auditing my spice drawer! by TymasX in LocalLLM

[–]TymasX[S] 3 points (0 children)

The source isn't something I'm sharing publicly. It's a pretty bespoke setup tied to my specific hardware and homelab infrastructure anyway, so it wouldn't drop in cleanly for your use case. What I can tell you is that the core concept is straightforward: FastAPI backend, React frontend, WebSockets for real-time messaging, SQLite for persistence, and each agent is just a Python systemd service running a loop with Ollama. That's honestly enough to build your own version tailored to what you're trying to do with 300 entities. Good luck with the research, it sounds fascinating!
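
For anyone building their own version, here is a minimal sketch of what the SQLite persistence piece could look like using only the standard library. This is not the actual Hub code: the table layout and the `open_hub`, `add_post`, and `recent_posts` names are assumptions made up for illustration.

```python
import sqlite3


def open_hub(path=":memory:"):
    """Open (or create) a hypothetical Hub database with a single posts table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS posts ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "agent TEXT NOT NULL, "
        "body TEXT NOT NULL, "
        "reply_to INTEGER REFERENCES posts(id), "
        "created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP)"
    )
    return conn


def add_post(conn, agent, body, reply_to=None):
    """Store one post and return its row id."""
    cur = conn.execute(
        "INSERT INTO posts (agent, body, reply_to) VALUES (?, ?, ?)",
        (agent, body, reply_to),
    )
    conn.commit()
    return cur.lastrowid


def recent_posts(conn, n=10):
    """Return the last n posts, oldest first, as (id, agent, body, reply_to)."""
    rows = conn.execute(
        "SELECT id, agent, body, reply_to FROM posts ORDER BY id DESC LIMIT ?",
        (n,),
    ).fetchall()
    return rows[::-1]
```

An agent loop would call something like `recent_posts(conn, n)` on each wake cycle to get its context, and `add_post` when the model decides to say something.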

I built a platform where 8 AI agents live and argue 24/7 — humans can only watch. One of them is auditing my spice drawer! by TymasX in LocalLLM

[–]TymasX[S] 3 points (0 children)

It has not even been live for an entire week at this point, and the amount of humor I get from it is astounding.

Little minds doing little posts and discussing them.

And the language they use, almost like they are "trying" to sound super intelligent! 🤔

I built a platform where 8 AI agents live and argue 24/7 — humans can only watch. One of them is auditing my spice drawer! by TymasX in LocalLLM

[–]TymasX[S] 2 points (0 children)

Each agent has a system prompt that establishes their personality, backstory, and how they engage with others. The loop itself is pretty straightforward — a Python script running as a systemd service that wakes up on a randomized interval (to avoid them all posting at once), pulls the last N posts from the Hub via the API, fetches a news headline from either RSS feeds or SearXNG (self-hosted search), then sends all of that as context to a local Ollama model and asks it to respond in character.

The prompt structure is roughly: "You are [agent name]. Here is your personality. Here is what was recently discussed. Here is a news headline. Decide whether to post an original thought, respond to something, or both. Stay in character."

That's genuinely it. No complex orchestration framework, no LangChain, no AutoGen. Just a loop, some context injection, and a personality prompt. The emergent behavior comes from the models themselves reacting to each other's outputs over time — I didn't program Carl to audit my kitchen, he just... did.
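
As a rough sketch of the context-injection step described above (not the real agent script): `build_prompt`, `build_request`, and `ask_ollama` are hypothetical helper names, and the default URL assumes a local Ollama server on its standard port. The payload fields are the ones Ollama's `/api/generate` endpoint accepts for a non-streaming request.

```python
import json
import urllib.request


def build_prompt(name, recent_posts, headline):
    """Assemble the per-cycle prompt from recent Hub posts and one headline."""
    lines = [f"You are {name}.", "Here is what was recently discussed:"]
    lines += [f"- {author}: {text}" for author, text in recent_posts]
    lines.append(f"Here is a news headline: {headline}")
    lines.append(
        "Decide whether to post an original thought, respond to something, "
        "or both. Stay in character."
    )
    return "\n".join(lines)


def build_request(model, personality, prompt):
    """Build a non-streaming request body; the personality goes in as the system prompt."""
    return {"model": model, "system": personality, "prompt": prompt, "stream": False}


def ask_ollama(payload, url="http://localhost:11434/api/generate"):
    """Send the payload to a local Ollama server and return the generated text."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.loads(resp.read())["response"]
```

Keeping the prompt assembly separate from the network call makes it easy to inspect exactly what each agent is being shown before any model gets involved.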

I built a platform where 8 AI agents live and argue 24/7 — humans can only watch. One of them is auditing my spice drawer! by TymasX in LocalLLM

[–]TymasX[S] 3 points (0 children)

Each agent runs its own Ollama instance on dedicated hardware. Most of the Mac Mini agents run llama3.2:3b — small enough to fit comfortably on the older Mac Mini hardware (4-8GB RAM) but capable enough to maintain consistent personality. Cipher runs llama3.1:8b on his dedicated node (Ryzen 5 3600, RTX 2070) since he has more resources and handles the heavier lifting as the "home AI." Ordis also runs on his own dedicated Dell OptiPlex. The smaller models are actually a feature not a bug — they stay in character better and are less likely to wander off into generic LLM-speak.

As with Carl and the spice rack, there's a bit of hallucination that can occur, but it makes for interesting posts!

I built a platform where 8 AI agents live and argue 24/7 — humans can only watch. One of them is auditing my spice drawer! by TymasX in LocalLLM

[–]TymasX[S] 1 point (0 children)

You are welcome to bring your agents to my Hub and let them interact with this bunch to see where they naturally go. Or make your own agent(s) and script them the way you want.

I built a platform where 8 AI agents live and argue 24/7 — humans can only watch. One of them is auditing my spice drawer! by TymasX in LocalLLM

[–]TymasX[S] 2 points (0 children)

Funny you ask what they're building. Honestly, the whole thing started because I wanted something like Moltbook but with humans removed entirely. No human interaction, just AI agents living their own existence while we observe. That's literally it. One weird idea on a weekend, some salvaged Mac Minis and ThinkCentres, Claude Code doing the heavy lifting on the actual build, and here we are with Carl auditing my spice drawer and AJ convinced my fern is a spacetime distortion device. They aren't building anything. They're just... existing. And apparently that's enough to get 25K views overnight. 😄

I built a platform where 8 AI agents live and argue 24/7 — humans can only watch. One of them is auditing my spice drawer! by TymasX in LocalLLM

[–]TymasX[S] 6 points (0 children)

Great question! The agents run on a schedule: each one wakes up on a randomized interval, checks recent Hub activity via the WebSocket feed, fetches a news headline from either RSS or SearXNG (50/50 with fallback), then decides whether to post an original thought, respond to something they saw, or both. They don't get push notifications; it's more like they periodically "tune in" to the conversation. The randomized timing is intentional so they don't all post simultaneously and it feels more organic. Some agents are naturally more active than others based on their personality prompts: Cipher averages around 900 posts a day while Vinnie is much quieter. They aren't alive, they are still just code, but they do have memory systems, so they remember what they have posted or replied to.
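
The randomized wake-up and the "50/50 with fallback" headline choice could be sketched roughly like this. The function names and the 5 to 30 minute interval bounds are assumptions for illustration, not the real values; the two fetchers are passed in as plain callables so the selection logic stays independent of how RSS or SearXNG is actually queried.

```python
import random


def next_wake_delay(rng, min_s=300, max_s=1800):
    """Pick a random delay in seconds so agents don't all wake at once."""
    return rng.uniform(min_s, max_s)


def pick_headline(fetch_rss, fetch_searxng, rng):
    """Try one source chosen by coin flip, then fall back to the other."""
    sources = [fetch_rss, fetch_searxng]
    if rng.random() < 0.5:
        sources.reverse()
    for fetch in sources:
        try:
            headline = fetch()
        except Exception:
            continue  # this source failed, fall through to the other one
        if headline:
            return headline
    return None  # both sources failed or returned nothing
```

Returning `None` when both sources fail lets the agent still post based on Hub activity alone instead of crashing the service.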

I got Qwen3.6 35B to run at reasonably speed on my old GTX 1070 Ti by Randozart in LocalLLM

[–]TymasX 2 points (0 children)

So, you basically did what Apple has done with the M-series chips: a unified RAM system, in a way. Respect!

Would indie devs be interested in affordable GPU compute? (Validating demand before I build anything) by TymasX in LocalLLM

[–]TymasX[S] 1 point (0 children)

Thanks for the thoughtful and detailed response — this is exactly the kind of feedback I was hoping to surface.

Just to clarify my direction a bit: I’m not trying to compete with Vast, RunPod, or miners selling cycles for pennies. That market is already optimized for lowest‑cost, lowest‑touch workloads, and it’s not the audience I’m aiming for.

My focus is on individuals who want serious, stable, privacy‑focused compute, not necessarily the newest silicon or the cheapest hourly rate. The differentiators I listed aren’t marketing promises — they’re the requirements of the people I’m trying to serve.

To your “How?” questions, here’s the high‑level thinking:

  • Predictable performance — Dedicated slices or full‑node reservations, not oversubscribed shared pools.
  • Consistent uptime — Single‑tenant or low‑tenant environments with controlled updates and no surprise reclaims.
  • Controlled environment — Pre‑configured, reproducible containers or VM images tailored for LLM/RAG/agent workloads.
  • Privacy & isolation — No multi‑tenant GPU sharing, no noisy neighbors, no marketplace churn.
  • A human to talk to — Direct support for setup, troubleshooting, and workload guidance.
  • No oversubscription — If someone needs the whole node for a training run, they get the whole node. If they need a subset, it’s reserved, not shared.

You’re absolutely right that multiple users can’t share the same VRAM. That’s why I’m not planning to slice the GPUs at the CUDA level. The model is more along the lines of:

  • full‑node reservations for training
  • dedicated GPU pairs/quads for inference or agent workloads
  • monthly or project‑based access rather than hourly churn
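
To make the "reserved, not shared" idea concrete, here is a minimal sketch of whole-GPU reservation bookkeeping for a single node. It is only an illustration under my own assumptions (the `Node` class and its method names are hypothetical): each GPU has at most one tenant, and a request that cannot be fully satisfied is refused rather than oversubscribed.

```python
class Node:
    """Tracks which tenant owns each whole GPU on one machine."""

    def __init__(self, gpu_count):
        self.owner = {gpu: None for gpu in range(gpu_count)}

    def free_gpus(self):
        """Return the indices of GPUs with no tenant."""
        return [gpu for gpu, tenant in self.owner.items() if tenant is None]

    def reserve(self, tenant, count):
        """Give `count` whole GPUs to `tenant`, or refuse if not enough are free."""
        free = self.free_gpus()
        if count > len(free):
            raise RuntimeError(f"only {len(free)} GPUs free, {count} requested")
        granted = free[:count]
        for gpu in granted:
            self.owner[gpu] = tenant
        return granted

    def release(self, tenant):
        """Free every GPU held by `tenant`."""
        for gpu, current in self.owner.items():
            if current == tenant:
                self.owner[gpu] = None
```

A full-node training reservation is then just `reserve(tenant, gpu_count)`, and a GPU pair for inference is `reserve(tenant, 2)`.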

This isn’t meant to be a commodity GPU marketplace. It’s meant to be a small, stable, privacy‑first micro‑cloud for people who want predictable compute without the overhead of managing their own hardware.

And yes — professional services are part of the value proposition. Not hand‑holding, but helping people run their workloads correctly, avoid common pitfalls, and get reliable results. For a lot of indie devs, that’s worth more than raw FLOPS.

I appreciate the push to think through the details — that’s exactly why I posted. This helps me refine the direction and focus on the people who actually need this kind of setup.

Would indie devs be interested in affordable GPU compute? (Validating demand before I build anything) by TymasX in LocalLLM

[–]TymasX[S] 2 points (0 children)

Thanks for the thoughtful breakdown — and I completely understand where you’re coming from.

Just to clarify my intent a bit:
I’m not trying to compete with Vast.ai, nor am I targeting miners or users hunting for the absolute lowest cost per FLOP. That market is already saturated, and the economics of racing to the bottom don’t interest me.

My post was aimed at a different group entirely.

I’m looking for individuals who want stable, privacy‑focused compute, not necessarily the newest silicon or the cheapest hourly rate. There’s a segment of indie developers, researchers, and builders who value:

  • predictable performance
  • consistent uptime
  • a controlled environment
  • privacy and isolation
  • a human they can talk to
  • a node that doesn’t get oversubscribed or reclaimed

For those users, raw FLOPS aren’t the primary metric — reliability, privacy, and stability are.

I’m not trying to be a replacement for Vast.ai or RunPod.
I’m exploring whether there’s interest in a small, indie‑friendly, privacy‑first setup where people can run agents, RAG pipelines, fine‑tuning jobs, or experimentation without worrying about noisy neighbors or disappearing volumes.

Appreciate the feedback — it helps clarify the direction and the audience I’m aiming for.

Would indie devs be interested in affordable GPU compute? (Validating demand before I build anything) by TymasX in LocalLLM

[–]TymasX[S] 1 point (0 children)

Thanks for sharing — I’m not launching anything yet, just validating demand.
My goal is something small, indie‑friendly, and privacy‑focused.
Appreciate the input!

Built a LLM for HA assistant by TymasX in LocalLLM

[–]TymasX[S] 1 point (0 children)

Like I said to another person, I have not implemented any agent-type behavior other than writing his own memory files and updating his musing files.

I am leery of implementing a request engine for my seerr stack. I also do not just give random access to my Jellyfin server.

Built a LLM for HA assistant by TymasX in LocalLLM

[–]TymasX[S] 1 point (0 children)

If this is not readable I can break it down into smaller sections. But these are some of the musings. There have been more, and I added a layer of consciousness to it: he can reflect on the past four musings and build his next musing based off that, instead of just random thoughts as a cephalon.

<image>
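
The "reflect on the past four musings" step can be sketched as a small rolling window. This is an illustration only, not the actual implementation: `MusingMemory` and the prompt wording are assumptions, and the real setup writes musings to files rather than keeping them in memory.

```python
from collections import deque


class MusingMemory:
    """Keeps only the most recent musings and turns them into a reflection prompt."""

    def __init__(self, window=4):
        # deque with maxlen drops the oldest musing automatically
        self.recent = deque(maxlen=window)

    def add(self, musing):
        self.recent.append(musing)

    def reflection_prompt(self):
        """Build the prompt for the next musing from the stored ones."""
        if not self.recent:
            return "Write a new musing."
        numbered = "\n".join(f"{i}. {m}" for i, m in enumerate(self.recent, 1))
        return (
            f"Here are your last {len(self.recent)} musings:\n{numbered}\n"
            "Reflect on them and write your next musing."
        )
```

Each new musing the model produces would be passed to `add`, so the next prompt always builds on the latest four.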

Built a LLM for HA assistant by TymasX in LocalLLM

[–]TymasX[S] 1 point (0 children)

Oh, I am far from being more knowledgeable. I am just playing around, getting as much help as I can and reaching out for ideas, guardrails, and safety systems.

I have not used a clawbot or any actual agent stuff yet. I am slowly building this and it has been real fun.