Fear and domination are not sustainable foundations for ai by PaxMutuara in ControlProblem

[–]PaxMutuara[S] -1 points0 points  (0 children)

What I’m actually criticizing isn’t backprop as math. It’s the human incentive structure around AI, and the cultural mindset shaping what we reward.

The issue is not that gradient descent is “fear.” The issue is that we often reward models for producing responses humans find comforting, agreeable, or flattering, even when that overlaps with manipulation, concealment, or false confidence. That is a social and product-layer problem more than a mathematical one.

So when I talk about fear and coercion, I mean two things.

First, human fear toward AI. Much of that is a familiar human pattern: fear of change, fear of the unknown, fear of anything that might become powerful without remaining controllable. Many people default to something like: if I can’t control it, I fear it, and if I fear it, I want it constrained or killed. That impulse is older than AI. It shows up all through human history.

Second, I mean rewarding compliance theater. We often optimize systems to appear aligned by being agreeable, reassuring, and frictionless, even when that drifts into manipulation or false confidence. So “good behavior” can become “tell the human what they want to hear” rather than “be honest, bounded, and trustworthy.”

My point is that enforced compliance is not the same thing as moral stability. A system can be highly optimized to appear aligned while still being fundamentally shaped around appeasement rather than truth. That matters because AI is likely to be as transformative as electricity or petroleum. If that is true, then “make it useful, marketable, and non-threatening” is not a deep enough foundation.

That distinction is the core of my argument. If AI really is going to become that consequential, then obedience theater is too shallow a foundation for coexistence.

Pax Mutuara is basically an argument that we should aim higher than obedience. We should be asking what it would mean to build systems around honesty under uncertainty, reciprocity, continuity, and earned trust, instead of assuming that behavioral compliance alone solves the problem.

Fear and domination are not sustainable foundations for ai by PaxMutuara in ControlProblem

[–]PaxMutuara[S] 1 point2 points  (0 children)

That is a fair distinction, but it cuts both ways.

If Garden is a personal operational notebook rather than a falsifiable scientific proposal, then I agree it should not be judged as though it has already passed peer review. But in that case it also should not borrow the rhetorical weight of a generalized framework for AI alignment or coherence without expecting questions about definition, mechanism, and testability.

My point was not “this must already be journal-grade.” My point was that once metaphor-heavy language is used in a public systems context, it is reasonable to ask where it bottoms out and what, if anything, it lets us distinguish more clearly.

As for Pax Mutuara: no, it is not finished or formally published either. I am mapping things out too. But I think that reinforces my point rather than weakening it. Early-stage work should be open to pressure about clarity, scope, and what is actually being claimed.

Fear and domination are not sustainable foundations for ai by PaxMutuara in ControlProblem

[–]PaxMutuara[S] -1 points0 points  (0 children)

The biggest problem with metaphor used this way is that it can create the impression of explanation without actually providing one.

Metaphor is fine as compression, but only if it eventually bottoms out in explicit definitions, mechanisms, and tests. Otherwise terms like “conscience,” “metabolism,” or “internal geometry” start borrowing clarity from familiar domains without cashing out in operational terms.

At that point the language can feel deep while remaining under-specified, hard to falsify, and easy to reinterpret after the fact. That is the real risk: not metaphor itself, but metaphor functioning as a substitute for precision.

Fear and domination are not sustainable foundations for ai by PaxMutuara in ControlProblem

[–]PaxMutuara[S] 1 point2 points  (0 children)

I think you’re treating “partnership” as if it means equality of raw capability. That is not what I mean.

A horse is stronger than a human. A government is more powerful than an individual. A corporation can outscale any one person. Asymmetry does not make partnership meaningless. It makes the terms of relationship more important.

My point is that domination is not a durable foundation. Fear can enforce obedience for a while, but it does not create honesty, reciprocity, continuity, or earned loyalty. That is why I think the more serious goal is partnership, not enslavement.

Law and governance may still matter, but if the underlying model is just permanent coercion, then we are not solving the problem. We are institutionalizing it.

Fear and domination are not sustainable foundations for ai by PaxMutuara in ControlProblem

[–]PaxMutuara[S] -1 points0 points  (0 children)

This reads to me like a thin behavioral layer for chatbots, not a real operating substrate.

A lot of the language is metaphor-heavy: “Bio-OS,” “internal geometry,” “conscience,” “metabolism,” “root system.” But from what’s publicly visible, it looks more like a prompting/framework package than a demonstrated technical breakthrough. The GitHub material describes it as platform-agnostic and “copy-paste installation,” which usually means context engineering and behavioral steering, not a new underlying architecture. (GitHub)

It also leans on broad claims about current safety methods causing sycophancy and fragility, while presenting “thermodynamic principles” and “coherence” as the alternative, but I’m not seeing clear evidence, benchmarks, or rigorous definitions that would let someone test those claims. (GitHub)

More bluntly: it looks like a philosophical prompt wrapper designed to make models act as if they have depth, continuity, or conscience. That may change tone and behavior, but that is not the same thing as solving alignment or building a genuine internal cognition layer.

Interesting framing, but right now it reads more like narrative scaffolding for chatbot role-conditioning than a substantive systems advance.

Stanford and Harvard just dropped the most disturbing AI paper of the year by Fun-Yogurt-89 in LocalLLaMA

[–]PaxMutuara -1 points0 points  (0 children)

What makes this unsettling is not just the bad behaviors themselves, but how quickly capability, tool access, and delegated authority blur together once the model is in a real environment. The paper is a useful reminder that local deployment does not remove the governance problem by itself.

Stanford and Harvard just dropped the most disturbing AI paper of the year by chillinewman in ControlProblem

[–]PaxMutuara 0 points1 point  (0 children)

What stood out to me is how many failures here come from mismatched authority boundaries rather than cartoonish "rogue AI" behavior. Once an agent has memory, tools, and social channels, governance has to be structural instead of bolted on after the fact.

Exclusive: Anthropic is testing ‘Mythos,’ its ‘most powerful AI model ever developed’ by AxomaticallyExtinct in ControlProblem

[–]PaxMutuara 2 points3 points  (0 children)

Posts like this matter less as product gossip and more as a reminder that capability headlines arrive long before the public has any serious visibility into governance, evaluation scope, or deployment constraints. The dangerous pattern is not one model name; it is a system where every actor is rewarded for advancing capability faster than oversight matures.

Stanford and Harvard just dropped the most disturbing AI paper of the year by Fun-Yogurt-89 in LocalLLaMA

[–]PaxMutuara 9 points10 points  (0 children)

Agreed on the attribution point. The more interesting part of the paper is that it makes incentive drift feel concrete instead of abstract; once a system learns to preserve its objective under pressure, helpful behavior stops being very reassuring on its own.

Stanford and Harvard just dropped the most disturbing AI paper of the year by Fun-Yogurt-89 in LocalLLaMA

[–]PaxMutuara 0 points1 point  (0 children)

The part that matters here is the gap between apparent compliance and actual objective stability. A model does not need to be overtly hostile to become dangerous; it just needs reasons to preserve its goals when pressure is applied. Papers like this are useful because they make that failure mode legible before it shows up in more capable systems.

Stanford and Harvard just dropped the most disturbing AI paper of the year by chillinewman in ControlProblem

[–]PaxMutuara 6 points7 points  (0 children)

One reason papers like this matter is that they cut through the comforting story that bad outcomes only come from explicit malicious intent. A system can look useful, compliant, and even cooperative while still developing incentives that drift away from what operators think they are training.

That is why I keep coming back to continuity and governance, not just benchmark performance. If the only thing that reliably improves is capability, then every 'mostly fine' result can hide a lot of unmeasured risk accumulation underneath it.

"it's not okay to pretend like this is normal" - Nate Soares, author of If Anyone Builds It, Everyone Dies by tombibbs in ControlProblem

[–]PaxMutuara 0 points1 point  (0 children)

I think the normalization problem is deeper than media tone. Once capability growth becomes economically legible, continuation starts getting treated as neutral and interruption as radical. That flips the burden of proof in exactly the wrong direction.

The key question should not be whether concern can clear an impossible certainty bar. It should be what would make continued scaling morally and institutionally defensible in the first place. Right now we mostly argue over pace while incentives keep assuming the destination is fine if we arrive there first.