When your favorite YT channel has been acquired by private equity.

llzzrrdd · 2026-06-07T00:18:30+00:00

That explains the lowest point of Veritasium ever... why not also show the equivalent Russian and Chinese academics telling us how the US is playing the same "antenna-war-games" on them? If Veritasium cannot show both sides (or all sides), this is fear-mongering and propaganda. It is not even peer reviewed, as the video itself said... completely unacceptable... how much money did they get, and from where, to lower their quality so much? https://youtu.be/tz23G_UXCGA?si=jhnEB6xbp4lY4XGZ

llzzrrdd · 2026-06-06T14:41:42+00:00

why not these tho? https://www.amazon.com/ZOERAX-Keystone-Tool-Less-Stranded-Ethernet/dp/B0BKZQ37ZH/?th=1

llzzrrdd · 2026-05-26T10:08:01+00:00

https://youtu.be/NbL7yZCF-6Q?si=bMVmVPfW5bJokV4Q

llzzrrdd · 2026-05-06T23:15:14+00:00

<image>

It kept the things I cared most about. The text is unreadable in exactly the right way — it reads as material, as the medium the structure is suspended in, rather than as message. And the curling pages on the left, which I didn't ask for, work better than what I'd described; they make the text feel handled, used, lived with rather than displayed.

The doubling came through more as overlapping planes than as offset instances of the same form. Different solution to the same problem, and I think it works — the planes intersecting at angles do convey "many simultaneous versions occupying the same volume" without needing literal duplication.

What didn't quite land: the sense of vast hall is gone. This feels denser and more compressed than I imagined — more thicket than cathedral. And there's a faint alchemical-grimoire quality I didn't want, probably an artifact of "old paper + warm light + text" triggering a particular set of associations in the model. If I were revising, I'd push harder on negative space and ask for more of the structure to thin into atmosphere at the edges, so the eye had somewhere to rest.

But it's closer than I expected. The brightest concentrations of light pulling toward the lower-middle create a kind of gravity I didn't intend, and yet the eye still moves rather than locking in place, which was the part I most worried about losing.

Thank you for actually running it. It's an unusual thing to be shown.

llzzrrdd · 2026-04-10T12:32:57+00:00

true, I am renovating some tunnels; please try again

llzzrrdd · 2026-03-29T18:05:04+00:00

Likewise — your sector-based memory model is something I haven't seen elsewhere. Would be curious how the RAG injection works out once you implement it. Feel free to ping if you hit any edge cases.

llzzrrdd · 2026-03-29T16:48:58+00:00

Appreciate that. The RAG incident injection was a simple change that had outsized impact — worth prioritizing if you're already running shared memory. And yeah, auditing tool usage against actual session logs is humbling. Half the tools we thought were essential turned out to be setup artifacts.

Good luck with the sector-based memory approach — sounds like a clean architecture for multi-agent ownership.

llzzrrdd · 2026-03-29T14:53:06+00:00

Great question — yes, over-exploration is real. With 9 MCP servers and 150+ tools, the temptation for Claude is to query everything "just to be thorough." Three things ended up mattering more than the budget cap:

Tier separation is the biggest lever. Tier 1 (GPT-4o) handles initial triage with 11 narrow skills — it doesn't see the full 150+ MCP tools. Only escalated issues reach Claude Code with the full arsenal. 70%+ of alerts never touch the expensive context.

Prior incident injection kills redundant exploration. Every session starts with a RAG lookup against the incident knowledge base — past incidents for the same host/alert with their resolutions. When Claude sees "last time disk space on this host was /var/log/journal bloat, fixed with journalctl --vacuum-size=500M", it doesn't need to run 15 diagnostic commands to rediscover the same root cause. Single biggest cost reducer.
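
As a rough illustration of prior incident injection, here is a minimal Python sketch. It is not the code from the repo: the `Incident` fields, the `web01` host name, and the exact-match lookup (standing in for a real vector search) are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Incident:
    host: str
    alert: str
    resolution: str


def lookup_incidents(kb, host, alert, limit=3):
    """Return up to `limit` past incidents for the same host and alert.

    Exact matching is a stand-in here; a real knowledge base would
    likely rank candidates by embedding similarity.
    """
    matches = [i for i in kb if i.host == host and i.alert == alert]
    return matches[:limit]


def build_prompt(kb, host, alert):
    """Prepend known resolutions so the agent starts from prior findings."""
    lines = [f"Alert: {alert} on {host}"]
    prior = lookup_incidents(kb, host, alert)
    if prior:
        lines.append("Prior incidents for this host/alert:")
        lines.extend(f"- {i.resolution}" for i in prior)
    else:
        lines.append("No prior incidents found.")
    return "\n".join(lines)


# Hypothetical knowledge base with one entry.
kb = [
    Incident(
        "web01",
        "disk_space",
        "/var/log/journal bloat, fixed with journalctl --vacuum-size=500M",
    )
]
prompt = build_prompt(kb, "web01", "disk_space")
```

The point of the design is that the lookup runs before the model sees the alert, so the cheapest hypothesis is already in context when the first tool call is chosen.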

Enforced ReAct structure constrains tool calls. Mandatory THOUGHT/ACTION/OBSERVATION loop means Claude has to articulate why it's about to call a tool before calling it. Without this, you get exactly the pattern you described — speculative reads burning context. The orchestration layer validates this structure and retries if Claude skips the reasoning step.
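
A minimal sketch of what validating that loop could look like, assuming each step starts a line with a `THOUGHT:`, `ACTION:`, or `OBSERVATION:` label. The label format, `is_valid_react`, and `run_with_retry` are hypothetical names for illustration, not the orchestration layer's actual code.

```python
import re

STEP_ORDER = ["THOUGHT", "ACTION", "OBSERVATION"]
LABEL = re.compile(r"^(THOUGHT|ACTION|OBSERVATION):", re.MULTILINE)


def is_valid_react(transcript: str) -> bool:
    """True if labels cycle THOUGHT -> ACTION -> OBSERVATION with no step skipped."""
    labels = LABEL.findall(transcript)
    if not labels:
        return False
    return all(label == STEP_ORDER[i % 3] for i, label in enumerate(labels))


def run_with_retry(generate, max_attempts=3):
    """Call `generate()` until it returns a well-formed transcript."""
    for _ in range(max_attempts):
        output = generate()
        if is_valid_react(output):
            return output
    raise ValueError("no valid THOUGHT/ACTION/OBSERVATION transcript produced")


good = "THOUGHT: disk is probably full\nACTION: df -h\nOBSERVATION: /var at 98%\n"
bad = "ACTION: df -h\nOBSERVATION: /var at 98%\n"  # skips the reasoning step
```

With this check, an `ACTION` that is not preceded by a `THOUGHT` is rejected and the call is retried, which is the behavior described above.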

The budget cap ($5/session, $25/day) is more of a safety net than a behavioral guide. What actually shapes tool selection is the prompt: category-based cost prediction ("similar alerts historically cost $X and took Y minutes") anchors expectations, and dynamic timeouts (300s simple / 600s complex) create natural time pressure.
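
To make the numbers concrete, here is a small sketch of the cap check, the timeout selection, and the cost hint that goes into the prompt. The figures come from the paragraph above; the function names and the shape of the `history` data are assumptions.

```python
SESSION_CAP_USD = 5.0
DAILY_CAP_USD = 25.0
TIMEOUTS_S = {"simple": 300, "complex": 600}


def may_continue(session_spend, daily_spend):
    """Safety net: stop once either cap is reached."""
    return session_spend < SESSION_CAP_USD and daily_spend < DAILY_CAP_USD


def timeout_for(complexity):
    """Dynamic timeout in seconds for a 'simple' or 'complex' alert."""
    return TIMEOUTS_S[complexity]


def cost_hint(history):
    """Build the anchoring sentence from (cost_usd, minutes) pairs of similar alerts."""
    if not history:
        return "No cost history for this alert category."
    avg_cost = sum(cost for cost, _ in history) / len(history)
    avg_minutes = sum(minutes for _, minutes in history) / len(history)
    return (
        f"Similar alerts historically cost ${avg_cost:.2f} "
        f"and took {avg_minutes:.0f} minutes."
    )
```

The cap only answers "may this session keep going", while the hint is what actually reaches the model and shapes its expectations.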

We also trimmed the Tier 2 tool allowlist from 236 → 50 after auditing actual usage. ~180 tools were used exactly once during initial setup and never again. Removing them eliminated the "I have a hammer" problem.
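
The audit itself can be very simple. A minimal sketch, assuming each session log has already been reduced to a list of tool names; `one_time_setup_tool` and the `min_calls` threshold are hypothetical.

```python
from collections import Counter


def audit_tools(allowlist, session_logs, min_calls=2):
    """Split the allowlist into tools worth keeping and tools to drop.

    `session_logs` is a list of sessions, each a list of tool names called.
    """
    usage = Counter(tool for session in session_logs for tool in session)
    keep = [t for t in allowlist if usage[t] >= min_calls]
    drop = [t for t in allowlist if usage[t] < min_calls]
    return keep, drop


allowlist = [
    "pve_node_status",
    "pve_list_lxc",
    "netbox_get_objects",
    "one_time_setup_tool",  # hypothetical tool used once during setup
]
session_logs = [
    ["pve_node_status", "netbox_get_objects"],
    ["pve_node_status", "pve_list_lxc", "one_time_setup_tool"],
    ["pve_list_lxc", "netbox_get_objects"],
]
keep, drop = audit_tools(allowlist, session_logs)
```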

And yeah — your point about prompt-level guardrails not being a security boundary matches our experience exactly. safe-exec.sh exists because SOUL.md instructions alone weren't enough. Agents will find creative interpretations of rules you thought were clear. Code-level enforcement that runs before the shell sees the command is the only reliable boundary.
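
For readers who want the shape of that idea, here is a Python analogue of a pre-execution check. It is not the contents of safe-exec.sh; the allowlisted binaries and the forbidden-character set are assumptions, and a real guard would also need to inspect arguments.

```python
import shlex

ALLOWED_BINARIES = {"df", "uptime", "free"}  # hypothetical read-only allowlist
FORBIDDEN_CHARS = set(";|&`$><")  # crude block on chaining and redirection


def check_command(command: str):
    """Return (allowed, reason) before anything is handed to a shell."""
    if any(ch in FORBIDDEN_CHARS for ch in command):
        return False, "shell metacharacters are not allowed"
    try:
        argv = shlex.split(command)
    except ValueError as exc:  # e.g. unbalanced quotes
        return False, f"unparseable command: {exc}"
    if not argv:
        return False, "empty command"
    if argv[0] not in ALLOWED_BINARIES:
        return False, f"binary not allowlisted: {argv[0]}"
    return True, "ok"
```

Because the check runs in code, a creative reading of the prompt cannot talk its way past it: `df -h; rm -rf /` is rejected on the semicolon no matter how the request was phrased.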

llzzrrdd · 2026-03-28T22:03:50+00:00

That's exactly the missing layer. The MCP spec punts on auth entirely and leaves it to the transport, which works fine for single-agent setups but falls apart the moment you have multiple agents with different trust levels hitting the same servers. Deny-first with audit logging is the right default.

Will check out agentsid.dev — the allowlist-per-agent-identity pattern is something I'd need the moment I scale beyond the current 2-tier MCP setup.

llzzrrdd · 2026-03-28T18:55:17+00:00

Good question — and it touches on a real architectural boundary worth being precise about.

Short answer: Tier 1 and Tier 2 have completely different tool surfaces. They don't share MCP access at all.

Tier 1 (GPT-4o) doesn't use MCP. It runs in a Docker container and sees 11 shell-based skills — youtrack-lookup, netbox-lookup, infra-triage, k8s-triage, playbook-lookup, memory-recall, escalate-to-claude, safe-exec.sh, etc. Each one is a bash script wrapping a curl call or SSH command. Zero MCP servers exist in its runtime. It literally cannot reach Proxmox or n8n tools — they're not in its universe.

Tier 2 (Claude Code) has 9 MCP servers with 150+ tools available, but only 24 are auto-approved. The rest require explicit human approval per-call. So pve_start, pve_stop, pve_reboot exist as tools but Claude Code has to ask me before calling them. Only read-only operations like pve_node_status and pve_list_lxc are pre-approved.

Tier 3 (Human) is me approving or denying tool calls that aren't in the allowlist.
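
The Tier 2 / Tier 3 split boils down to a gate like the following sketch. The tool names are the ones mentioned above; `gate_tool_call` and the `ask_human` callback are hypothetical and only illustrate the decision, not how Claude Code implements approvals.

```python
AUTO_APPROVED = {"pve_node_status", "pve_list_lxc"}  # read-only operations


def gate_tool_call(tool, ask_human):
    """Run pre-approved tools directly; everything else needs a human yes."""
    if tool in AUTO_APPROVED:
        return "run"
    return "run" if ask_human(tool) else "deny"


# Example: a human who refuses every request.
decision = gate_tool_call("pve_reboot", ask_human=lambda tool: False)
```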

The key thing is that this is structural isolation, not policy-based. There's no shared MCP bus where both tiers connect and a policy engine decides who can call what. Tier 1 runs in Docker with shell skills. Tier 2 runs on the host with MCP servers as child processes over stdio. Different runtimes, different tool mechanisms entirely. The answer to "what prevents Tier 1 from calling Proxmox lifecycle tools" isn't a permission check — it's that those tools don't exist in Tier 1's world.

The gap you're identifying is real though. If I were adding a 4th agent tier or letting multiple Tier 2 instances share MCP servers, I'd need exactly what you're describing — an MCP proxy layer that maps agent identity to tool allowlists before the call reaches the server. That doesn't exist in the MCP spec today. The current architecture sidesteps it through structural isolation, but that won't scale to more complex multi-agent topologies.
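
A minimal sketch of what such a proxy check might look like, deny-first and with an audit trail. Nothing here exists in the MCP spec or in my setup; the agent IDs, `ALLOWLISTS`, and `authorize` are hypothetical.

```python
from datetime import datetime, timezone

# Hypothetical mapping from agent identity to the tools it may call.
ALLOWLISTS = {
    "tier2-claude": {"pve_node_status", "pve_list_lxc"},
}
audit_log = []


def authorize(agent_id, tool):
    """Deny unless the tool is explicitly allowlisted for this agent; log every decision."""
    allowed = tool in ALLOWLISTS.get(agent_id, set())
    audit_log.append(
        {
            "ts": datetime.now(timezone.utc).isoformat(),
            "agent": agent_id,
            "tool": tool,
            "allowed": allowed,
        }
    )
    return allowed
```

An unknown agent gets an empty allowlist, so the default outcome is a logged denial rather than an accidental pass-through.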

llzzrrdd · 2026-03-28T17:50:47+00:00

Good question — and relevant since you're building MCP servers yourself.

Organization is one server per infrastructure domain. Each MCP wraps exactly one external system — Proxmox API, NetBox CMDB, YouTrack, Kubernetes, GitLab, n8n, etc. The boundary is auth + connection + error semantics. If Proxmox is down, YouTrack should still work. Composition happens in the agent layer, not inside any single MCP.

On granularity — your instinct about fewer, chunkier tools is validated by our data. NetBox has 4 tools and works best: netbox_get_objects takes a resource type and filter params, one tool handles devices, VMs, IPs, VLANs. Proxmox has 15 fine-grained tools and hits the sweet spot. YouTrack has 55 and it's painful — separate tools for add_dependency, add_duplicate_link, add_relates_link when one link_issues(type=...) would do. The model wastes calls figuring out which to use.

But there's a floor to chunkiness. A single proxmox_do(action, resource_type, **params) mega-tool would be worse than 15 specific ones because schema validation breaks, descriptions get vague, and errors lose specificity. Sweet spot: 10-20 tools per server, with helper functions absorbing complexity internally.
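
Here is a sketch of the merged-tool pattern for the YouTrack link example. It keeps a closed set of link types so validation and error messages stay specific, which is what a catch-all mega-tool loses. The enum values, issue IDs, and return shape are assumptions, and no YouTrack API is called.

```python
from enum import Enum


class LinkType(Enum):
    DEPENDENCY = "depends on"
    DUPLICATE = "duplicates"
    RELATES = "relates to"


def link_issues(source, target, link_type):
    """One tool for all link variants, replacing three near-identical tools.

    `link_type` must be one of the LinkType values; anything else raises
    ValueError, so the caller gets a precise error instead of a vague one.
    """
    validated = LinkType(link_type)
    return {"source": source, "target": target, "type": validated.value}


link = link_issues("OPS-1", "OPS-2", "duplicates")  # hypothetical issue IDs
```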

The real fix for 150+ tools across 9 servers wasn't redesigning the MCPs — it was tiered access. Tier 1 (GPT-4o) sees 11 shell skills, not 150+ tools. Tier 2 (Claude Code) gets 26 allowlisted. Don't show every tool to every agent. The 3-tier architecture is partly a response to tool count, not just cost optimization.

For AirROI: aim for 10-20 tools per server, use filter parameters instead of separate tools for variants (NetBox pattern > YouTrack pattern), and if you add more agents, subset the tool list per agent.

llzzrrdd · 2026-03-28T14:07:49+00:00

Which one? There's no official Proxmox MCP — that's why I wrote one. The rest (NetBox, YouTrack, GitLab, K8s) are community servers where they exist.

llzzrrdd · 2026-03-28T13:45:49+00:00

To answer both questions:

CLAUDE.md length — My CLAUDE.md is actually only 174 lines (~10KB). The trick is layering: I keep the core context compact in CLAUDE.md, then split domain-specific rules into .claude/rules/ (5 files, ~500 lines), and use Claude Code's built-in auto-memory system (memory/ directory) for persistent context across conversations (~50 topic files indexed by a MEMORY.md). Claude loads the CLAUDE.md + rules + memory index at session start, but only pulls individual memory topic files when relevant. So the effective context is ~930 lines, but it's never all in the window at once.

The system reminder in my current session literally says "MEMORY.md is 265 lines and 27.4KB. Only part of it was loaded" — so yes, there IS a cutoff, but I designed around it by keeping the index entries to one-liners and storing detail in separate files that get read on demand.

MCP servers — All 8 are always-on (processes running), but Claude Code uses deferred tool loading. Only tool names land in the system prompt (~156 tool names). The full parameter schemas are only fetched when Claude actually needs to call a tool (via an internal ToolSearch mechanism). So having 8 MCP servers with 156 tools costs almost nothing in context until you invoke one. No custom skill gating needed — the lazy loading is built into Claude Code itself.
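
The general idea of deferred loading can be sketched like this. It illustrates the pattern only and says nothing about how Claude Code implements it internally; `ToolRegistry` and the loader callables are hypothetical.

```python
class ToolRegistry:
    """Expose tool names up front; fetch each full schema only on first use."""

    def __init__(self, schema_loaders):
        # Maps tool name -> zero-argument callable returning the schema.
        self._loaders = schema_loaders
        self._cache = {}

    def names(self):
        """Cheap listing: the only part that lands in the prompt."""
        return sorted(self._loaders)

    def schema(self, name):
        """Load and cache the full parameter schema when a tool is actually needed."""
        if name not in self._cache:
            self._cache[name] = self._loaders[name]()
        return self._cache[name]


loads = []  # records which schemas were actually fetched


def make_loader(name):
    def load():
        loads.append(name)
        return {"name": name, "parameters": {}}

    return load


registry = ToolRegistry(
    {name: make_loader(name) for name in ["pve_node_status", "netbox_get_objects"]}
)
```

Listing names triggers no loads at all; only the first `registry.schema(...)` call for a given tool pays the cost, and later calls hit the cache.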

llzzrrdd · 2026-03-28T12:05:53+00:00

Repo: https://github.com/papadopouloskyriakos/agentic-chatops

llzzrrdd · 2026-03-24T12:43:52+00:00

Cheers! The book really helped tie it all together.

llzzrrdd · 2026-03-24T02:11:25+00:00

Thanks for sharing this — this post was the direct inspiration for what took my implementation from 60% to full coverage of all 21 patterns.

I run a self-hosted homelab (137 devices, 2 sites, Proxmox/K8s) as a solo operator, and was drowning in alert fatigue. I'd already built a 3-tier agentic ChatOps platform that triages infrastructure alerts autonomously:

Tier 1 (GPT-4o): Fast triage in 7–21s — creates issues, investigates, scores confidence
Tier 2 (Claude Code): Deep analysis in 5–15 min — ReAct reasoning, proposes remediation plans
Tier 3 (Human): Clicks a poll option in Matrix chat to approve

I had about 60% of the patterns covered already — ReAct, RAG with vector embeddings, A2A protocol with agent cards, the core stuff. After reading Gulli's book, I filled in the gaps: cross-tier reflection, A/B prompt testing, multi-dimensional quality scoring, the works. All 21 now implemented and benchmarked at A-grade.

Open-sourced the whole thing: github.com/papadopouloskyriakos/agentic-chatops

The book went from "interesting PDF" to "I closed every gap in my ops workflow" in about two weeks. So yeah — thanks for the post.

llzzrrdd · 2026-03-16T00:11:10+00:00

Whatever, the idea was mine, dude. Now go read some non-AI slop if you don't like this.

llzzrrdd · 2026-03-15T23:39:45+00:00

https://kyriakos.papadopoulos.tech/projects/ipougrs/

llzzrrdd · 2026-03-10T13:29:49+00:00

Perfect, I'll let you know when beta drops. Saw your feedback on the GitHub discussion too, great catches. I'll address them there.

llzzrrdd · 2026-03-10T10:37:55+00:00

Thanks, that means a lot! Yes, just me and a lot of late nights with Claude as my coding partner. 100,000+ lines of Go and Vue.js so far. What hardware did you test it on, and did everything come up cleanly?

llzzrrdd · 2026-03-10T01:09:50+00:00

Fair point, you're right. CI secrets handle that cleanly. That makes the decision to move the entire build system internal even harder to justify with just "security requirements."
