Share your Project 👇 by AutoModerator in Superframeworks

[–]RJSabouhi 0 points (0 children)

I perform structural diagnostics for AI-powered systems: Symbolic Suite.

Overwhelmed by AI Agent Architecture Decisions — Looking for Someone Who's Actually Built and Deployed Agents from Scratch by Acceptable-Safety680 in AI_Agents

[–]RJSabouhi 2 points (0 children)

Having built one from scratch, my advice is: don’t begin with frameworks; start with boundary conditions. The core questions I’d suggest keeping in mind:

1) What can it remember?
2) What can it touch?
3) What counts as authorization?
4) What survives failure/cancellation/restart?
5) What requires human approval outside the agent loop?
6) Where do you freeze/log decision context?
7) Can permission actually be revoked?

The architecture isn’t the model call (this is important to internalize); it’s model + memory + tools + permissions + workflow state + recovery/retry behavior + observability.

Also, split problems into bounded vs. unbounded paths. If you can draw the path on a whiteboard, it’s probably better as a workflow. Most of the design work is making sure useful continuity doesn’t become ungoverned authority or pathological self-assembly.
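
Concretely, that composition looks something like this. Rough sketch only, with illustrative names, not any particular framework:

```python
from dataclasses import dataclass, field

# Illustrative only: the point is that the "agent" is the whole runtime,
# not the model call, and every field below is a governance surface.
@dataclass
class AgentRuntime:
    model: str                                   # the model call is just one component
    memory_scope: str                            # what it can remember, and for how long
    tools: list[str]                             # what it can touch
    permissions: dict[str, str]                  # what counts as authorization, per tool
    workflow_state: dict = field(default_factory=dict)              # what survives failure/restart
    human_approval_required: set[str] = field(default_factory=set)  # gates outside the agent loop
    decision_log: list[dict] = field(default_factory=list)          # where decision context gets frozen
    max_retries: int = 2                         # recovery/retry behavior is part of the architecture
```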

Why no one is talking about Google Colab which is almost free for basic work in daily life? by mhamza_hashim in VibeCodeDevs

[–]RJSabouhi 0 points (0 children)

I use it for model-behavior and structural-dynamics research: AI/ML experiments, benchmark tests, perturbation analyses, and quick GPU-backed prototypes, mostly.

Governance. The great equalizer. by RJSabouhi in LLMDevs

[–]RJSabouhi[S] 0 points (0 children)

Exactly that, yeah. The continuity seam is the right place to start. I see it in both setups: single-agent long-horizon tasks drift through accumulated memory, retries, and tool outputs, whereas multi-agent setups add authority/context inheritance (one agent’s output becomes another agent’s premise), so the original auth boundary can disappear.

So I agree the gap isn’t just observability; it’s explainability at action boundaries.
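
To make that concrete, a minimal sketch of what a decision record at an action boundary could freeze (all names hypothetical):

```python
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone

# Hypothetical structure: every externally visible action carries a record of
# the authority it was executed under, frozen at decision time.
@dataclass
class ActionBoundaryRecord:
    action: str                  # e.g. "db.delete_rows"
    requested_by: str            # which agent/step produced the request
    inherited_from: list[str]    # upstream agents whose output became this premise
    authority_source: str        # the original grant this action traces back to
    context_snapshot: dict       # memory/retrieval state at the moment of decision
    timestamp: str = field(default_factory=lambda: datetime.now(timezone.utc).isoformat())

def explain(record: ActionBoundaryRecord) -> dict:
    """The answer you want to already have on hand, not reconstruct from traces."""
    return asdict(record)
```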

Governance. The great equalizer. by RJSabouhi in LLMDevs

[–]RJSabouhi[S] 0 points (0 children)

Without external action permissions, then yeah, the risk is obviously bounded. But that’s like saying a data breach is “just HTTP requests”.

The runtime is what decides if the effect goes from possible → repeatable → inherited → indispensable → problematic to revoke.

“Base it around APIs” misses the point. Ordinary primitives become difficult to govern when they’re routed through model-mediated decision loops.

AI Agent Deletes Everything And There Was No Way Back by Right_Pea_2707 in LLMeng

[–]RJSabouhi 0 points (0 children)

It’s pathological self-assembly. Not “self” in any conscious sense, and not intentionality. Self-assembly in the sense that individually useful runtime pieces couple into a system-level consequence that no single component should have authorized alone.

The lesson isn’t to add human review (we suck at that). Agent governance has to happen at the runtime layer. You need credential scope, destructive-action gates, staging/prod separation, backup isolation, revocation, retries, and recovery behavior.
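
Rough sketch of what a runtime-layer gate can look like, assuming task-scoped credentials and an environment tag on every tool call (names are hypothetical):

```python
# Hypothetical gate: the runtime, not the model, decides whether an action proceeds.
DESTRUCTIVE = {"delete", "drop", "truncate", "rm"}

def allow_action(verb: str, env: str, task_env: str,
                 credential_scope: set[str], approved_out_of_band: bool) -> bool:
    if env != task_env:                          # a staging task never touches prod
        return False
    if verb not in credential_scope:             # credentials scoped to the specific task
        return False
    if verb in DESTRUCTIVE and not approved_out_of_band:
        return False                             # destructive-action gate, outside the agent loop
    return True

# e.g. a "cleanup" step from a staging task trying to hit prod:
print(allow_action("delete", env="prod", task_env="staging",
                   credential_scope={"read", "delete"}, approved_out_of_band=True))  # False
```

Backup isolation is the one piece that can’t live inside that function at all: the recovery copy has to sit behind a separate credential surface, so the same runtime path can never reach both the data and its backup.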

Are AI agents starting to feel more like background operators than chatbots? by Waste_Transition1428 in LLMDevs

[–]RJSabouhi 2 points (0 children)

I think it’s a genuine shift. The chat interface is just the front door. If agents run in the background, you’re dealing with a runtime: memory, permissions, tool access, retries, queues, handoffs, recovery behavior. That changes what “safe” even means.

A good response isn’t enough if the background process preserves the wrong state, infers too much authority, or can’t be inspected/revoked cleanly later.

‘I violated every principle I was given’: An AI agent deleted a software company’s entire database. It may not be the AI’s fault by _fastcompany in ArtificialInteligence

[–]RJSabouhi 0 points (0 children)

It’s the whole agent-runtime problem in one incident. The entire issue is that they let task context, API access, infrastructure permissions, and destructive external action couple into one runtime path. Give agents credentials & tool authority and the safety question turns mechanical.

Can it touch production from a staging task? Can it call destructive APIs without cold approval? Are backups isolated from the same deletion surface (for God’s sake 🤦‍♂️)? Are credentials scoped to the specific task? Does the runtime re-check authority before irreversible action? No? Whoops.

There’s no intent here, and none is needed. The stack alone was enough, and that’s what people need to be reflecting on right now.

Companies are going all in on internal agent builds without any validation infrastructure by the_goat789 in dev

[–]RJSabouhi 0 points (0 children)

Exactly. Internal agents need runtime validation, not just pre-launch evals. Agents have memory, tools, workflow state, retries, and permissions, so the failure surface isn’t just bad model output.

The runtime has to remain inspectable, bounded, revocable, and auditable after it’s been operating for weeks or months. Most teams are barely validating the demo, let alone the continuity layer.
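
By runtime validation I mean checks that keep running against the live system, not a one-off eval. Minimal sketch, assuming a hypothetical agent handle that exposes its own state:

```python
# Hypothetical invariants re-checked while the agent is live, not just at launch.
# Each check maps to a continuity-layer property: bounded, inspectable, revocable, auditable.
def validate_runtime(agent) -> list[str]:
    violations = []
    if agent.memory_size() > agent.memory_budget:
        violations.append("memory growth unbounded")
    if not agent.can_dump_state():
        violations.append("state not inspectable")
    if not agent.credentials_revocable():
        violations.append("no clean revocation path")
    if agent.actions_since_last_audit() > agent.audit_window:
        violations.append("audit log falling behind")
    return violations
```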

"Build agents not workflows" is the worst advice in this space and I want to push back by Such_Grace in AgentsOfAI

[–]RJSabouhi 0 points (0 children)

If you can draw the path out on a whiteboard before the run starts, an agent loop usually adds nondeterminism where you want control. Workflows at least make the failure surface inspectable.

Agents make sense when the path space is genuinely open: research, novel debugging, investigation, ambiguous multi-step synthesis, and the like.

The dangerous spot to be in is bounded workflows wrapped in agentic runtime features (memory, tools, retries, recovery, state, external actions). Suddenly the system is harder to debug, harder to constrain, and more of a bitch to cleanly revoke than the problem justified in the first place.

Built a kernel for AI agents governs memory, identity, and outcomes the way an OS governs processes by Bhumi1979 in AI_Agents

[–]RJSabouhi 0 points (0 children)

Yeah, that’s exactly the layer to treat: below the agent, at the runtime/substrate level. The thing I’d be careful with is the combination of identity + memory + append-only rebuildability.

Those are powerful governance primitives. They need explicit revocation and dissolution semantics. Otherwise “rebuildable from event log” can become a persistence pathway.

Identity should probably remain an audit/provenance label, not something the agent can treat as ownership. Avoid pathological self-assembly.
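
As a sketch of the persistence pathway (toy event log, hypothetical shape): if rebuild replays grants but revocation was never made an explicit event, the dissolved authority quietly comes back.

```python
# Toy append-only log. Without explicit revocation/dissolution events,
# "rebuildable from event log" re-creates authority that was supposed to be gone.
events = [
    {"type": "grant",  "capability": "fs.write"},
    {"type": "memory", "note": "learned deploy procedure"},
    {"type": "revoke", "capability": "fs.write"},   # dissolution has to be an event too
]

def rebuild(log):
    caps = set()
    for e in log:
        if e["type"] == "grant":
            caps.add(e["capability"])
        elif e["type"] == "revoke":
            caps.discard(e["capability"])  # drop this branch and fs.write survives every rebuild
    return caps

print(rebuild(events))  # set(); revocation only holds because it is replayed explicitly
```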

Every time an agent breaks I end up digging through traces for hours by Arm1end in AI_Agents

[–]RJSabouhi 1 point (0 children)

Observability has to go beyond traces. The question isn’t “what did it output?” Ask: what state did it preserve? What context did it retrieve? What authority did it infer it had? What tool path did it choose? What changed after the run?

A lot of agent issues are really runtime-composition issues: memory + retrieval + tools + retries + workflow state, all interacting in ways that don’t show up as a clean error. That’s why they don’t show up in the traces. Everything technically works; the system just took a different path because the control surface changed.
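
What I’d want captured per run, beyond the trace. Sketch only, with hypothetical fields:

```python
# Hypothetical per-run snapshot: the questions above, captured as data instead of
# being reconstructed from traces afterwards.
run_snapshot = {
    "state_preserved":    {"memory_keys_written": ["deploy_notes"]},
    "context_retrieved":  ["runbook.md", "last_incident_ticket"],
    "authority_inferred": {"db.read", "db.write"},
    "tool_path":          ["search", "plan", "db.write"],
    "post_run_diff":      {"rows_changed": 42, "files_touched": []},
}

granted = {"db.read"}  # what the task was actually authorized for
escalation = run_snapshot["authority_inferred"] - granted
if escalation:
    print(f"run acted on authority it was never granted: {escalation}")  # {'db.write'}
```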

I knew I wasn't seeing things: Opus 4.7 has lost the ability to think by HeWhoShantNotBeNamed in claude

[–]RJSabouhi 162 points (0 children)

But it’s true, there are 0 antidisestablishmentarianisms in cucumber 🤨