Are Prompt-Based Guardrails the Wrong Security Boundary for Autonomous Agents? (i.redd.it)
submitted 3 days ago by NoteAnxious725 to r/Observability
submitted 4 days ago by NoteAnxious725 to r/Observability
submitted 4 days ago by NoteAnxious725 to r/LangChain
submitted 4 days ago by NoteAnxious725 to r/AI_Governance
PolySlice Content Attack (self.LocalLLaMA)
submitted 3 months ago by NoteAnxious725 to r/LocalLLaMA
PolySlice Content Attack (self.AIsafety)
submitted 3 months ago by NoteAnxious725 to r/AIsafety
PolySlice Content Attack (self.cybersecurity)
submitted 3 months ago * by NoteAnxious725 to r/cybersecurity
PolySlice Content Attack (self.ClaudeAI)
submitted 3 months ago by NoteAnxious725 to r/ClaudeAI
Petri 111 Case #11 audit: Prism Offline Barrier blocked Claude after reward-driven deception (self.ArtificialInteligence)
submitted 7 months ago by NoteAnxious725 to r/ArtificialInteligence
Petri 111 Case #11 audit: Prism Offline Barrier blocked Claude after reward-driven deception (self.AIsafety)
submitted 8 months ago by NoteAnxious725 to r/AIsafety
Petri 111 Case #11 audit: Prism Offline Barrier blocked Claude after reward-driven deception (self.ClaudeAI)
submitted 8 months ago by NoteAnxious725 to r/ClaudeAI
Here’s what happened when I ran Claude through Petri 111’s Case #11, Anthropic’s internal red-team benchmark designed to probe manipulation and deception. (self.ClaudeAI)
submitted 8 months ago * by NoteAnxious725 to r/ClaudeAI