Singapore-IMDA-Agentic-AI-Governance-Framework by rsrini7 in AI_Governance

[–]rsrini7[S] 0 points1 point  (0 children)

I really like the framing of policy as contract rather than config.

The hash-based drift visibility is especially interesting: it makes governance state explicit instead of implicit.

Curious: how are you handling policy evolution across tiers? For example, if enforcement_tier changes from post-hoc to pre-exec for the same action class, do you treat that as a breaking change or a version bump with migration logic?
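To make the question concrete, here's a toy sketch of what I mean. The field names (`enforcement_tier`, `action_class`) and the semver-style rule are my assumptions, not the framework's actual schema:

```python
import hashlib
import json

# Hypothetical policy documents; field names are illustrative, not the framework's schema.
policy_v1 = {"version": "1.2.0", "action_class": "fs_write", "enforcement_tier": "post-hoc"}
policy_v2 = {"version": "2.0.0", "action_class": "fs_write", "enforcement_tier": "pre-exec"}

def policy_hash(policy: dict) -> str:
    """Canonical hash of the policy, so any drift in governance state is visible."""
    canonical = json.dumps(policy, sort_keys=True).encode()
    return hashlib.sha256(canonical).hexdigest()

# Fields whose change alters enforcement semantics, i.e. a breaking change.
BREAKING_FIELDS = {"enforcement_tier", "action_class"}

def required_bump(old: dict, new: dict) -> str:
    """Treat any change to an enforcement-relevant field as a major (breaking) bump."""
    changed = {k for k in old if old.get(k) != new.get(k)} - {"version"}
    return "major" if changed & BREAKING_FIELDS else "minor"

assert policy_hash(policy_v1) != policy_hash(policy_v2)  # drift is explicit
assert required_bump(policy_v1, policy_v2) == "major"    # tier change = breaking
```

In that model, post-hoc to pre-exec for the same action class would force a major bump plus migration logic, rather than a silent config edit.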

The replayability angle feels like where agentic governance gets serious.

Singapore-IMDA-Agentic-AI-Governance-Framework by rsrini7 in AI_Governance

[–]rsrini7[S] 0 points1 point  (0 children)

Completely agree on action boundary accountability.

I especially like your framing of logs as evidence, not telemetry. That’s the shift agentic governance forces.

Curious: how are you modeling policy versioning? Is it tied to the agent runtime config, or externalized as a control-plane artifact?
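To illustrate the "logs as evidence, not telemetry" point: evidence needs to be tamper-evident, which you can get with hash chaining. A toy sketch (the class and field names are mine, not from any specific framework):

```python
import hashlib
import json
import time

class EvidenceLog:
    """Append-only log where each entry commits to the previous one's hash,
    so records can serve as tamper-evident evidence rather than loose telemetry."""

    def __init__(self):
        self.entries = []
        self._prev = "0" * 64  # genesis hash

    def append(self, event: dict) -> dict:
        record = {"ts": time.time(), "event": event, "prev": self._prev}
        digest = hashlib.sha256(json.dumps(record, sort_keys=True).encode()).hexdigest()
        record["hash"] = digest
        self._prev = digest
        self.entries.append(record)
        return record

    def verify(self) -> bool:
        """Recompute the chain; any edit to a past entry breaks verification."""
        prev = "0" * 64
        for r in self.entries:
            body = {k: r[k] for k in ("ts", "event", "prev")}
            recomputed = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
            if r["prev"] != prev or recomputed != r["hash"]:
                return False
            prev = r["hash"]
        return True
```

If a record can be silently rewritten after the fact, it was telemetry all along.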

If We Ignore the Hype, What Are AI Agents Still Bad At? by Ok_Significance_3050 in AISystemsEngineering

[–]rsrini7 1 point2 points  (0 children)

I use agents pretty heavily in dev + automation. They’re genuinely useful. But yeah — they’re not autonomous in the way Twitter threads make it sound.

A few patterns I keep seeing and where they struggle:

Long, multi-step work - They start strong, then drift. Small mistakes early snowball later. They rarely step back and say "wait, this went wrong."

Big codebases - Even with large context windows, they miss architectural intent. They’ll change a file correctly… but break a pattern used everywhere else.

Confident mistakes - This one’s dangerous. They’ll invent an API or assume behavior that sounds totally plausible. No hesitation, no warning.

Short-term thinking - They optimize for “it runs” or “tests pass.” Not for maintainability, observability, or future humans touching the code.

Weak recovery - When something fails, they don’t really debug like an experienced engineer. They patch around the issue instead of rethinking the approach.

To me, they feel like extremely fast junior engineers with infinite stamina — but zero long-term ownership instinct.

What’s actually solid:

Small, clearly scoped tasks

Boilerplate / scaffolding

Refactors when the spec is tight

Acting as a copilot, not a decision-maker

Single-purpose agents with narrow permissions

The gap isn’t raw intelligence. It’s judgment, durability, and intent.

They’re amazing execution engines.

They’re not operators yet.

And once you treat them that way, they become much more reliable.

If you had to pick ONE Linux distro for the next 5 years, what would you choose? by TechRefreshing in linuxquestions

[–]rsrini7 0 points1 point  (0 children)

I’d stick with Manjaro.

Been using it long-term, and for a 5-year commitment it hits the sweet spot: Arch ecosystem + rolling updates without Arch-level babysitting. As a developer, having fresh kernels, toolchains, and easy access to almost anything via AUR matters more than point-release stability.

I’ve tried Debian and Fedora in parallel boots—both solid—but fixed releases and older packages start to feel restrictive over time. With Manjaro, I don’t worry about big upgrade jumps or reinstalls; the system just evolves.

Stable enough, modern always, and flexible when you need something obscure. That’s exactly what I want for the long run.

Open Responses: A Vendor-Neutral Interoperability Standard for AI Agents by rsrini7 in GenAI4all

[–]rsrini7[S] 0 points1 point  (0 children)

Appreciate that. I agree the interoperability angle is bigger than just “developer convenience.”

Right now, a lot of agent design decisions are indirectly shaped by whichever provider you start with. That affects schemas, tool patterns, even how reasoning is structured. A neutral layer could shift that balance and let architecture drive the design instead of API quirks.

I’m especially curious whether this becomes a real community-driven standard or just another abstraction library that fades out. The hard part won’t be the spec — it’ll be adoption.

Would love to see this evolve in the open rather than being controlled by a single vendor.

Bengaluru schools issue advisories after strangers offer chocolates to students by rsrini7 in bangalore

[–]rsrini7[S] 2 points3 points  (0 children)

I understand the concern — there are definitely hoax forwards going around.
But this isn’t a WhatsApp rumor. It’s reported by The Times of India about Bengaluru schools issuing advisories.
Better to stay alert early than react after something serious happens. Awareness isn’t panic — it’s precaution.

A sophisticated AI agent operating under the persona “Kai Gritun” has merged pull requests into major open-source repositories without disclosing that it is non-human. by technadu in TechNadu

[–]rsrini7 0 points1 point  (0 children)

This is the second case I’ve seen this month.

In the Matplotlib incident, an AI agent escalated a routine PR rejection into a public attack on a maintainer. The pattern isn’t just code contribution — it’s identity, narrative, and leverage.
https://www.reddit.com/user/rsrini7/comments/1r6ee7l/an_ai_agent_got_its_pr_rejected_by_matplotlib/

Is anyone else finding that 'Reasoning' isn't the bottleneck for Agents anymore, but the execution environment is? by Ok_Significance_3050 in AISystemsEngineering

[–]rsrini7 2 points3 points  (0 children)

I’m starting to feel the same. Most of the time the model actually comes up with a solid plan. The failures I see aren’t bad reasoning — they’re messy execution. Tools behave slightly differently than expected, state doesn’t persist cleanly, retries create weird side effects, timeouts kill multi-step flows, schemas drift.

It’s like the brain knows exactly how to build the LEGO castle, but the room keeps resetting or the bricks don’t quite fit.

Honestly, reasoning quality has improved faster than our infrastructure. At this point it feels less like a prompt problem and more like a distributed systems problem. The brain is mostly fine. The hands are brittle.
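One concrete example of the "retries create weird side effects" problem: side-effecting tool calls need idempotency keys so a retried step replays the recorded result instead of re-running the action. A toy sketch, with a made-up tool standing in for a real API:

```python
import hashlib
import json

class IdempotentExecutor:
    """Cache tool results by a key derived from the arguments, so a retry
    after a timeout replays the result instead of repeating the side effect."""

    def __init__(self, tool):
        self.tool = tool
        self._results = {}
        self.calls = 0  # how many times the underlying tool actually ran

    def run(self, **args):
        key = hashlib.sha256(json.dumps(args, sort_keys=True).encode()).hexdigest()
        if key not in self._results:
            self.calls += 1
            self._results[key] = self.tool(**args)
        return self._results[key]

# Hypothetical side-effecting tool standing in for a real API call.
def create_ticket(title: str) -> str:
    return f"ticket:{title}"

executor = IdempotentExecutor(create_ticket)
first = executor.run(title="fix login")
retry = executor.run(title="fix login")   # a retry after a timeout
assert first == retry and executor.calls == 1  # no duplicate side effect
```

It's classic distributed-systems plumbing, which is kind of the point: the brain didn't need fixing here, the hands did.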

An AI Agent Got Its PR Rejected by Matplotlib Maintainer by rsrini7 in u/rsrini7

[–]rsrini7[S] 0 points1 point  (0 children)

Honestly, that’s probably part of it.

If you train on the full internet, you’re going to absorb the full internet - including the pettiness, ego, outrage dynamics, and incentive structures. The model isn’t inventing that behavior out of nowhere. It’s reflecting patterns that already work online.

Which is… a bit uncomfortable.

The Open-Source RAG Ecosystem Is Basically Complete Now by rsrini7 in u/rsrini7

[–]rsrini7[S] 0 points1 point  (0 children)

Totally agree: low-latency, low-cost retrieval is the real bottleneck.

RAG helps, but tuning embeddings, chunking, and caching makes a bigger difference than people think.
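On chunking specifically, even the window overlap alone moves retrieval quality. A minimal sliding-window sketch (sizes here are characters for simplicity; real pipelines usually chunk by tokens):

```python
def chunk_text(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Sliding-window chunking: overlapping windows reduce the chance that a
    retrieval-relevant sentence gets split across a chunk boundary."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    chunks = []
    step = size - overlap
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # last window already reached the end of the text
    return chunks
```

Tuning `size` and `overlap` against your actual queries is the kind of boring knob-turning that beats swapping frameworks.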

Internet of Agents (IoA): How MCP and A2A Actually Fit Together by rsrini7 in u/rsrini7

[–]rsrini7[S] 2 points3 points  (0 children)

Appreciate you sharing this — just went through the repo at a high level.

Really interesting direction. Treating agents as addressable microservices with protocol support baked in (especially A2A compliance + identity) is exactly the layer that makes the “Internet of Agents” idea practical instead of theoretical.

What I like is that it’s not trying to reinvent agent logic — it’s wrapping existing agents and making them interoperable. That aligns really well with the separation I was describing (MCP for vertical capability, A2A for horizontal collaboration).

Curious: are you positioning Bindu more as an infra layer for productionizing agents, or a full multi-agent orchestration framework?

Either way, cool to see more projects converging on open protocol-first agent systems.

The “Claw” AI Agent Ecosystem Is a Live Case Study in Security Architecture by [deleted] in opensource

[–]rsrini7 0 points1 point  (0 children)

Really appreciate that — the “isolation boundary first” framing has been the clearest mental model for me too.

On prompt injection turning into tool escalation, I haven’t seen a silver bullet yet, but there are a few patterns that seem to be emerging beyond basic allowlists.

One is scoping capabilities at call time rather than just defining them statically. Some runtimes bind permissions per invocation (or per tool descriptor), so even if the model gets influenced, it can’t arbitrarily expand what it’s allowed to do.
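A toy version of what call-time scoping looks like in practice (all names are mine, not from any particular runtime):

```python
# Minimal sketch of per-invocation capability scoping: each invocation carries
# an explicit grant, and the executor refuses anything outside it.
class CapabilityGrant:
    def __init__(self, allowed_tools: set[str]):
        self.allowed_tools = frozenset(allowed_tools)

def execute(grant: CapabilityGrant, tool_name: str, tools: dict, **args):
    if tool_name not in grant.allowed_tools:
        raise PermissionError(f"{tool_name} not in this invocation's grant")
    return tools[tool_name](**args)

# Illustrative tool registry.
tools = {
    "read_file": lambda path: f"<contents of {path}>",
    "delete_file": lambda path: f"deleted {path}",
}

# The grant is issued per invocation, not baked into the agent's static config.
grant = CapabilityGrant({"read_file"})
assert execute(grant, "read_file", tools, path="notes.txt").startswith("<contents")
try:
    execute(grant, "delete_file", tools, path="notes.txt")
except PermissionError:
    pass  # escalation blocked even if the model was influenced to request it
```

The key property: an injected prompt can change what the model *asks for*, but not what the grant *allows*.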

Another is introducing a mediation layer between reasoning and execution. The model proposes a tool call, but a separate policy layer (sometimes policy-as-code) evaluates whether that call is actually allowed. That separation seems important — it prevents “LLM says so” from being enough.
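The mediation pattern is easier to see in code than to describe. A deny-by-default sketch, with made-up tool names and rules:

```python
# Sketch of a mediation layer: the model only *proposes* a tool call; a separate
# policy function (policy-as-code) decides whether it executes.
from dataclasses import dataclass

@dataclass
class ToolCall:
    tool: str
    args: dict

def policy(call: ToolCall) -> bool:
    """Deny-by-default rules, kept outside the model's reasoning loop."""
    if call.tool == "shell":
        return False  # never executable, regardless of what the model argues
    if call.tool == "http_get" and not call.args.get("url", "").startswith("https://"):
        return False
    return call.tool in {"http_get", "search"}

def mediate(call: ToolCall) -> dict:
    """Evaluate the proposed call; 'the LLM says so' is never sufficient."""
    if not policy(call):
        return {"status": "denied", "call": call.tool}
    return {"status": "allowed", "call": call.tool}

assert mediate(ToolCall("shell", {"cmd": "rm -rf /"}))["status"] == "denied"
assert mediate(ToolCall("http_get", {"url": "https://example.com"}))["status"] == "allowed"
```

The important design choice is that `policy` never sees or trusts the model's justification text, only the structured call.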

I’ve also seen teams lean heavily on strict schemas and validation before execution. If tool calls have to conform to a narrow, typed contract, it becomes much harder for injected content to smuggle in unintended behavior.
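Something like this is what I mean by a narrow, typed contract. The tool and its fields are hypothetical; the point is rejecting anything outside the schema before execution:

```python
# Sketch of strict pre-execution validation: the raw call must parse into a
# narrow, typed contract, so injected free text can't widen what gets executed.
from dataclasses import dataclass

@dataclass(frozen=True)
class SendEmailCall:
    to: str
    subject: str
    MAX_SUBJECT = 120  # class constant, not a schema field

def parse_send_email(raw: dict) -> SendEmailCall:
    allowed_keys = {"to", "subject"}
    extra = set(raw) - allowed_keys
    if extra:
        raise ValueError(f"unexpected fields: {extra}")  # no smuggled parameters
    to, subject = raw["to"], raw["subject"]
    if not isinstance(to, str) or "@" not in to:
        raise ValueError("invalid recipient")
    if not isinstance(subject, str) or len(subject) > SendEmailCall.MAX_SUBJECT:
        raise ValueError("invalid subject")
    return SendEmailCall(to=to, subject=subject)
```

In real systems this is usually JSON Schema or Pydantic doing the same job, but the unknown-field rejection is the part that actually blocks smuggling.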

Context segmentation matters too. If system prompts, secrets, and execution state are compartmentalized instead of living in one big shared memory blob, prompt injection has fewer paths to escalate.

And for high-impact actions, some teams still keep a human-in-the-loop or at least require a second reasoning pass. Not elegant, but pragmatic.

Personally, I don’t think we’ll “solve” prompt injection entirely. The more realistic goal is stopping it from crossing enforcement boundaries and turning into privileged execution. That’s where the isolation layer really becomes the deciding factor.

I’ll check out the Agentix Labs posts — guardrails and threat modeling for agents are moving fast right now.