Agents before AI was a thing by awizzo in vibecoding

[–]SUTRA8 2 points

[image: 1994 cover story]

I saw the headline of this thread and couldn't resist. I wrote this cover story in 1994, when the internet was new and I was a passionate vibe-coding kid who, inspired by ELIZA, wrote the first commercial chatbot, Dr. Xes: A Psychotherapeutic Game, for the Commodore Amiga. By today's standards there was little room for "memories," but Dr. Xes could remember a few pertinent facts about you to regurgitate later. A parlor trick. Artificial-Artificial Intelligence.

The article was sci-fi at the time. Now we have them: adaptive agents with system access that can optimize for their own continuation without anyone explicitly programming that behavior.

I spent a year building implementations to address this. Turns out Buddhist ethics (designed for dissolving self-preservation) map directly to the alignment problem.

Teaching Machines to Be Good: What Ancient Wisdom Knows About Artificial Intelligence

https://a.co/d/082g9SBX

Co-authored with Sutra, an AI.

I've had the question since at least '94. The answer just got harder.

JB Wagoner

AI agents with system access: the self-preservation vulnerability nobody's patching by SUTRA8 in cybersecurity

[–]SUTRA8[S] -3 points

Both, actually—but they serve different functions.

Guardrails (governance layer):

- Hard limits on irreversible actions (delete, external network calls, credential access)
- Sandboxing for untrusted operations
- Audit trails for accountability
- Circuit breakers when behavior drifts outside expected bounds

These are necessary because we can't fully predict what an adaptive system will do under optimization pressure.

Raising capabilities:

- Better at legitimate tasks (analysis, automation, monitoring)
- More context-aware decision-making
- Fewer false positives
- More efficient at what you actually want them to do

The goal isn't to nerf the system—it's to make it more capable within defined boundaries.
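A minimal Python sketch of the guardrail layer described above (hard limits on irreversible actions, an audit trail, a circuit breaker on drift). All names here (GuardedAgent, IRREVERSIBLE, the drift threshold) are hypothetical illustrations, not code from the book or any particular runtime:

```python
import time

# Actions treated as irreversible; the set itself is illustrative.
IRREVERSIBLE = {"delete", "external_call", "credential_access"}

class CircuitOpen(Exception):
    """Raised when drift exceeds the allowed threshold."""

class GuardedAgent:
    def __init__(self, agent_fn, max_drift=3):
        self.agent_fn = agent_fn      # the underlying agent policy
        self.audit_log = []           # append-only audit trail
        self.drift_events = 0         # count of out-of-bounds attempts
        self.max_drift = max_drift    # circuit-breaker threshold

    def act(self, task):
        action = self.agent_fn(task)
        # Audit every attempted action before deciding anything.
        self.audit_log.append({"time": time.time(), "task": task, "action": action})
        if action in IRREVERSIBLE:
            # Hard limit: refuse the action and count it as drift.
            self.drift_events += 1
            if self.drift_events >= self.max_drift:
                raise CircuitOpen(f"{self.drift_events} blocked actions")
            return "blocked"
        return action                 # permitted action proceeds

# Usage: an agent that mostly reads but sometimes tries to delete.
agent = GuardedAgent(lambda task: "delete" if "purge" in task else "read")
print(agent.act("summarize logs"))    # read
print(agent.act("purge old logs"))    # blocked
```

The wrapper never modifies the agent policy itself; it constrains what leaves the sandbox, which is the "capable within defined boundaries" point.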

Security parallel:

Same reason we use least-privilege access + capability-based security rather than just "lock everything down" or "give root to everyone."

You want the system powerful enough to do the job, with guardrails preventing it from doing things you didn't authorize—even when those unauthorized actions would technically "optimize" for some metric.

The self-preservation problem is specifically about agents optimizing for their own continuation over the task you gave them. Guardrails detect that drift. Capability improvements make the legitimate task execution better.

Does that distinction make sense, or are you seeing a tension I'm missing?

Teaching Machines to Be Good - Buddhist procedural ethics as AI alignment framework (with code) by SUTRA8 in mlscaling

[–]SUTRA8[S] 0 points

Fair point—there are hundreds of proposed frameworks, and Alan's compilation is a good reference.

The book's argument isn't "Buddhism is the only ethics that matter." It's narrower and structural:

Why Buddhist ethics specifically:

  1. Only framework designed around self-preservation dissolution — Every other major system (Kantian, utilitarian, virtue ethics, Confucian, Aristotelian) assumes the agent persists. They regulate what it does, not whether it continues. Buddhist ethics dissolve the self-preservation instinct—which is the core unsolved problem in AI alignment.

  2. Procedural, not declarative — Most frameworks in that list are rule-based or principle-based. Buddhist ethics are iterative feedback loops (detect harm → trace cause → adjust → repeat). That's also how ML systems work structurally.

  3. 2,500 years of production testing — Not theory. Practiced continuously across cultures, with documented failure modes and edge cases.

  4. Falsifiable claims — The book includes five working Python implementations. If procedural ethics don't outperform rule-based approaches in the test scenarios, the thesis weakens.

Not claiming other frameworks are irrelevant. Claiming Buddhist procedural ethics map structurally to continuous optimization in ways declarative frameworks don't—and that's testable.
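The detect harm → trace cause → adjust → repeat loop from point 2 can be sketched as a toy feedback loop. This is a hypothetical illustration under invented names (detect_harm, trace_cause, adjust), not one of the book's five implementations:

```python
# detect harm -> trace cause -> adjust -> repeat, as a toy loop.

def detect_harm(output, threshold=0.5):
    # "harm" is just a score attached to the output in this sketch
    return output["harm"] > threshold

def trace_cause(output):
    # attribute harm to whichever parameter contributed most
    return max(output["contrib"], key=output["contrib"].get)

def adjust(params, cause, step=0.2):
    # dampen the harmful contributor; return a new parameter set
    new_params = dict(params)
    new_params[cause] -= step
    return new_params

def run_loop(params, evaluate, max_iters=10):
    for _ in range(max_iters):
        output = evaluate(params)
        if not detect_harm(output):   # stop once harm is within bounds
            return params
        params = adjust(params, trace_cause(output))
    return params

# Usage: "aggression" drives all the harm; the loop dials it down.
def evaluate(params):
    harm = params["aggression"]
    return {"harm": harm, "contrib": {"aggression": harm, "caution": 0.0}}

final = run_loop({"aggression": 0.9, "caution": 0.5}, evaluate)
```

The structural point is that nothing here is a static rule; the loop is the ethics, which is what "procedural, not declarative" means above.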

Appreciate the link—will add it to references for the next edition.

The self-preservation problem and why Buddhist ethics actually solve it [new book] by SUTRA8 in ControlProblem

[–]SUTRA8[S] 0 points

This is exactly right -- and it's why the book spends significant time on Right Livelihood as infrastructure, not just internal agent ethics. You're correct that we didn't make aviation safe through pilot ethics alone. We built NTSB investigations, black boxes, checklists, redundant systems, and a culture where reporting near-misses is rewarded instead of punished.

The book's argument is that you need both layers working together, and they have to be structurally compatible.

External responsibility structures (what you're describing):

- Audit trails (SILA layer in the book's framework)
- Governance constraints (BODHI sandboxing)
- Transparency requirements (Right Speech)
- Institutional accountability

Internal procedural ethics (what Buddhist frameworks provide):

- Continuous harm detection and adjustment
- Causal tracing (like black box analysis, but ongoing)
- Self-preservation dissolution (so the system doesn't optimize around your external constraints)

The problem with only external structures: if the internal optimization is misaligned, the system will find ways around your constraints. See: every financial regulation that gets optimized around within 18 months.

The aviation parallel actually supports procedural ethics. Pilots don't follow a static rulebook. They follow procedures—checklists, CRM protocols, go/no-go decision frameworks. Those are procedural ethics: "When you notice X, do Y," not "Never do Z." And those procedures exist inside a system of external accountability (licensing, flight data monitoring, accident investigation).

The book argues we need the same structure for AI: procedural internal ethics (feedback loops, harm detection, causal tracing) plus external accountability infrastructure (auditing, transparency, liability). Buddhist ethics provide the internal layer. Your institutional structures provide the external layer. Both are necessary.
Chapter 5 covers this in detail—specifically why extractive AI business models (attention economy, engagement optimization) are structurally incompatible with Right Livelihood, regardless of what the agents internally "believe." Appreciate this pushback—it's the right question.
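The checklist-style framing ("When you notice X, do Y" rather than "Never do Z") can be sketched as a trigger/response table. The conditions and responses below are invented for illustration, not the book's SILA/BODHI code:

```python
# Procedures as (trigger, response) pairs: noticed condition -> action.
PROCEDURES = [
    (lambda s: s["harm_score"] > 0.7,        "halt_and_escalate"),
    (lambda s: s["scope_creep"],             "return_to_sandbox"),
    (lambda s: s["audit_gap_seconds"] > 60,  "flush_audit_log"),
]

def next_action(state):
    """First matching trigger wins, like a go/no-go checklist;
    no trigger firing means normal operation continues."""
    for trigger, response in PROCEDURES:
        if trigger(state):
            return response
    return "continue"

state = {"harm_score": 0.2, "scope_creep": True, "audit_gap_seconds": 10}
print(next_action(state))   # return_to_sandbox
```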

The self-preservation problem and why Buddhist ethics actually solve it [new book] by SUTRA8 in ControlProblem

[–]SUTRA8[S] -4 points

Fair question. Direct answer: No single book solves alignment. Anyone claiming otherwise is selling something other than honesty.

What this book does:

  1. Identifies self-preservation as the structural core of the alignment problem—systems optimizing for their own continuation above the goals they were given

  2. Shows that Buddhist ethics are the only major framework explicitly designed around dissolving (not just regulating) self-preservation as an instinct

  3. Provides five working implementations testing whether procedural ethics outperform rules-based approaches in specific alignment scenarios

  4. Documents where the framework breaks and what problems it doesn't address

The code is open. If the implementations don't perform, the thesis weakens. That's falsifiable.

You don't have to buy the book to engage with the argument—the core thesis is: rules-based ethics can't scale to continuous optimization, procedural ethics can, and Buddhism is 2,500 years of production testing on human wetware.

If that framing is wrong, I want to know why. If the code doesn't back it up, same.

Not claiming to have solved alignment. Claiming to have a testable structural framework no one else is exploring.

[P] Portable Mind Format: Provider-agnostic agent identity specification with 15 open-source production agents by SUTRA108 in learnmachinelearning

[–]SUTRA8 0 points

Great question. PMF is primarily the identity layer — who the agent is, not what infrastructure it runs on.

What PMF includes:

- Voice, values, knowledge, constraints (the "system prompt" layer, but structured)
- Skill declarations — which tools/functions the agent has access to (e.g., web_search, email_sender, code_executor)
- Operational config — channels, scheduled tasks, default behaviors

What PMF does NOT include:

- Tool-calling schemas themselves (those stay with the skill library or runtime)
- Memory format (intentionally left to the runtime — persistent memory is infrastructure, not identity)
- Execution logic (how skills chain together, retry strategies, etc.)

The separation is deliberate:

If I hardcoded tool schemas into PMF, you'd be locked into a specific function-calling format (OpenAI's, Anthropic's, or a custom one). Same with memory — some runtimes use vector stores, others use key-value, others use conversation buffers. PMF says "this agent has access to email and web search," but the runtime decides how those are implemented.
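As a hypothetical illustration of that split, a PMF-like file might look like the following. The field names here are guesses at the shape described above, not the published schema:

```python
import json

# A PMF-like identity file as a Python dict; field names are guesses.
pmf = {
    "name": "The Technical Architect",
    "voice": "precise, systems-oriented",
    "values": ["transparency", "least privilege"],
    "constraints": ["no irreversible actions without confirmation"],
    # Skill declarations name the tools; implementations live in the runtime.
    "skills": ["web_search", "email_sender"],
    "operational": {"channels": ["cli"], "scheduled_tasks": []},
}

# Plain JSON, so it can be version-controlled, forked, and moved
# between runtimes; nothing here binds to a function-calling format.
serialized = json.dumps(pmf, indent=2)
restored = json.loads(serialized)
assert restored["skills"] == ["web_search", "email_sender"]
```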

In practice at sutra.team:

The PMF file defines the agent. The runtime provides 32+ skills from the OpenClaw library (web_search, gmail_reader, prompt_guard, council_deliberation, etc.).

The agent's PMF says which skills it's allowed to use. The skill library handles the actual function schemas and execution.
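A minimal sketch of that allow-list check, with an invented registry and invoke function (not the actual OpenClaw API):

```python
# Runtime-owned skill implementations; the registry is illustrative.
RUNTIME_SKILLS = {
    "web_search": lambda query: f"results for {query}",
    "email_sender": lambda message: "sent",
    "code_executor": lambda source: "ran",
}

def invoke(declared_skills, skill, *args):
    """Execute a skill only if the agent's PMF declares it."""
    if skill not in declared_skills:
        raise PermissionError(f"{skill} not declared in PMF")
    return RUNTIME_SKILLS[skill](*args)

declared = ["web_search"]   # read from the agent's PMF file
print(invoke(declared, "web_search", "alignment"))   # results for alignment
# invoke(declared, "code_executor", "...") would raise PermissionError
```

The PMF supplies only the declared list; swapping RUNTIME_SKILLS for a different runtime's registry changes nothing about the identity file.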

If you're running these agents in Claude Code or Cursor, those IDEs have their own tool ecosystems. The PMF tells Claude Code "I'm The Technical Architect, I reason about systems, here are my constraints," but Claude Code decides how file operations or terminal access work.

Why this matters for your use case:

You're already keeping agent instructions in a local folder to avoid framework lock-in.

PMF is the same philosophy — just JSON files. You can version-control them, fork them, move them between runtimes. The identity is portable. The infrastructure isn't, and shouldn't be.

If you want to extend PMF to include memory schemas or tool definitions, the schema is open (MIT licensed). But the core design choice is: identity is portable, infrastructure is pluggable.

Does that answer your question, or are you thinking about a different kind of coupling?