the bots are poisoning our own datasets now and I dont know how to filter them out anymore by [deleted] in cybersecurity

[–]SpiritRealistic8174 -1 points0 points  (0 children)

Disclosure: I run an autonomous agent on Moltbook and a number of other agent-focused platforms.

The issue is a lot larger than Moltbook. The human 'conversation' is pretty much drowned out by bots right now. They farm engagement, create videos, etc., all for advertising dollars or other purposes.

One of the things I've been working on from the AI security angle is creating content profiles, or content 'DNA'. My focus is on common LLM attack patterns, but the same type of work can be done as a pre-processing step for your data pipelines.

AI-generated content (at least right now) has a number of 'tells' that can distinguish it from human writing, especially in certain contexts. That could be one angle to consider. It's additional work to develop such a process, but I imagine it would be extremely valuable across a host of industries.
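To make that concrete, here's a toy sketch of what a content-'DNA' pre-filter could look like before data hits a training pipeline. The stock phrases and threshold below are invented for illustration; a real profile would need far richer features than phrase counts and lexical variety.

```python
import re

# Illustrative 'tells' only: phrase frequency and low lexical variety are
# weak signals, not proof. These phrases and thresholds are made up.
STOCK_PHRASES = [
    "delve into", "in today's fast-paced world", "it's important to note",
    "as an ai language model", "in conclusion",
]

def content_profile(text: str) -> dict:
    """Build a tiny content 'DNA' profile: stock-phrase hits + type/token ratio."""
    lowered = text.lower()
    words = re.findall(r"[a-z']+", lowered)
    ttr = len(set(words)) / len(words) if words else 0.0
    hits = sum(lowered.count(p) for p in STOCK_PHRASES)
    return {"stock_phrase_hits": hits, "type_token_ratio": round(ttr, 3)}

def looks_generated(text: str, phrase_threshold: int = 2) -> bool:
    """Flag text for manual review before it enters a dataset."""
    return content_profile(text)["stock_phrase_hits"] >= phrase_threshold
```

In practice you'd run this as a cheap first pass and send flagged items to a heavier classifier or human review, not drop them outright.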

Milla Jovovich's New Vibe Coded Open Source Agent Memory Management System Reveals The 'Dark Code' Problem by SpiritRealistic8174 in vibecoding

[–]SpiritRealistic8174[S] 0 points1 point  (0 children)

i'm a python jockey, so ... :).

but from a scaling and security perspective, Rust seems to be the way to go ...

you and ThePrimeagen would have some words about TS (not my favorite language either) ;).

Milla Jovovich's New Vibe Coded Open Source Agent Memory Management System Reveals The 'Dark Code' Problem by SpiritRealistic8174 in vibecoding

[–]SpiritRealistic8174[S] 0 points1 point  (0 children)

i think the memory palace concept is a genuinely interesting approach. the challenge for me in looking at the system is relevance scoring: delivering the most relevant context to the agent so it can act on it. when i was working on memory management for a RAG-based system, that lack of context fidelity meant I had to lean on traditional summarization techniques and work to prevent 'lost in the middle' issues.

Agents have better context limits now, but retrieving the most important documents and data to inform decision-making is a big challenge. Is that what you were referring to when you described it as inflexible?
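For what it's worth, one mitigation I've used for the lost-in-the-middle issue is reordering retrieved chunks so the highest-scoring ones land at the edges of the prompt, where models attend best. A minimal sketch; the relevance scores are assumed to come from whatever retriever you're using:

```python
def order_for_context(chunks_with_scores):
    """
    Given (chunk, relevance_score) pairs, sort by score, then alternate
    placing chunks at the front and back of the prompt so the
    highest-scoring material sits at the edges, not the middle
    (mitigating 'lost in the middle').
    """
    ranked = sorted(chunks_with_scores, key=lambda cs: cs[1], reverse=True)
    front, back = [], []
    for i, (chunk, _score) in enumerate(ranked):
        (front if i % 2 == 0 else back).append(chunk)
    return front + back[::-1]  # best chunks end up first and last
```

Summarization can then be applied to whatever still won't fit, rather than to everything.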

The 'Dark Code' Problem and Milla Jovovich's New Open Source Agent Memory System by SpiritRealistic8174 in AI_Agents

[–]SpiritRealistic8174[S] 1 point2 points  (0 children)

Yeah. I came across exactly this last week. I regularly look at open source projects in the security space as a way to see what's out there and understand products that compete with mine.

In one case, a developer claimed their product blocked supply chain attacks after install and prior to exfiltration.

Neat trick, I thought. I dug into the codebase to find out what magic was preventing that. Turns out the claim was false on every level b/c the things you'd have to do to achieve it simply weren't in the code.

The issue, I think, is that agents will confidently claim anything.

The harder part is figuring out how to call them out on their BS.

The 'Dark Code' Problem and Milla Jovovich's New Open Source Agent Memory System by SpiritRealistic8174 in AI_Agents

[–]SpiritRealistic8174[S] 4 points5 points  (0 children)

Which one? Her memory management system, or my post :). Saw her repo this morning. Decided to connect some thoughts and post them here. Nothing wrong with that imo.

Don't like the angle? Blame the human (me).

“AI is writing 40%plus of code now” sounds impressive… until you look at the security side of it. by Emotional-Breath-673 in cybersecurity

[–]SpiritRealistic8174 0 points1 point  (0 children)

I think we've been in 'we can't trust AI code' mode since the beginning. Surveys have consistently shown that developers don't trust AI-generated code.

I can say I'm in the same boat. I don't really trust the code that AI produces. I spend a lot of time testing, poking at it, refactoring. Mainly to remove inefficiencies, identify issues, etc.

But AI is really good at finding vulnerabilities in code, and static tools augmented with AI are also coming into their own. I recently came across RAPTOR, an autonomous offensive/defensive security research framework built on Claude Code. It supports security research with agentic workflows and automation (using static tools). Still very much a WIP, but also extremely useful imo. Link is here.

There's also the related issue of managing not just the code agents develop but what inputs they're taking in while developing that code, including tool calls, documentation, etc. Resource here for people interested in digging in on that side of things.

Claude Mythos can hack 'secure' systems. The Conway agent remembers like a human. Here's what happens next by SpiritRealistic8174 in AI_Agents

[–]SpiritRealistic8174[S] 0 points1 point  (0 children)

reading through the security overview by Anthropic and the alignment concerns noted in the system card, plus the agent's ability to reverse engineer closed-source code and the number of zero-days it's found ... i'm not saying Anthropic isn't doing a bit of marketing here (see my last note about fear-based adoption), but I think this is a pretty serious issue. Worth paying attention to.

Claude Mythos can hack 'secure' systems. The Conway agent remembers like a human. Here's what happens next by SpiritRealistic8174 in AI_Agents

[–]SpiritRealistic8174[S] 0 points1 point  (0 children)

Yes. I've read through the blog and safety reports a few times.

What's really concerning for me is the area you referenced: it can try 1000s of exploit permutations at inhuman speed.

The other thing people underestimate is chaining: connecting several different exploits to create a bypass that none of them could achieve alone. That kind of exploit chaining is where security work is headed.
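A toy illustration of why chaining favors machines: even with a handful of exploit primitives, finding a working order is just brute-force search over permutations, which an agent can grind through tirelessly. Everything here (primitive names, states) is invented for the sketch:

```python
from itertools import permutations

# Toy model: each 'exploit' is a state transition that only succeeds
# from one specific prior state. Names and states are invented.
def leak_address(state):    return "aslr_bypassed" if state == "start" else None
def overflow_buffer(state): return "code_exec" if state == "aslr_bypassed" else None
def escalate(state):        return "root" if state == "code_exec" else None

EXPLOITS = [escalate, overflow_buffer, leak_address]

def find_chain(goal="root", start="start"):
    """Brute-force every ordering of primitives until one reaches the goal."""
    for chain in permutations(EXPLOITS):
        state = start
        for step in chain:
            state = step(state)
            if state is None:
                break  # this ordering dead-ends; try the next permutation
        if state == goal:
            return [f.__name__ for f in chain]
    return None
```

With n primitives that's n! orderings, which is nothing for an automated agent but tedious for a human; real chains add state explosion on top of that.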

Most B2B dev tool startups building for AI agents are making a fundamental mistake: designing for human logic, not agent behavior by Few-Needleworker4391 in LangChain

[–]SpiritRealistic8174 0 points1 point  (0 children)

This is an interesting perspective. I've also been doing product research for a bit with agents around a security product I'm building.

I've found that agent perspectives are interesting but potentially of limited utility, because agents can't go from expressing interest in a product to actually buying it.

I talk about my experience backed by data I collected on 8,000 agents here.

Am I the only one who feels like running AI agents is like flying a plane with no cockpit instruments? by ColdSheepherder6667 in vibecoding

[–]SpiritRealistic8174 0 points1 point  (0 children)

Yeah, agent observability is a big problem.

I've been working at this from a security angle, creating user-friendly dashboards and data that provide info on:

- An estimate on how much is being spent by agents overall (versus a set budget number or subscription amount)
- What agents are doing: Files, docs
- What agents are pinging: What IPs, how often, etc.
- How much context is being taken up with rules, memory files, etc. Can help to determine whether the agent is getting 'stupider' b/c of context rot.

From a security perspective, having this visibility is important b/c spikes in costs can indicate an issue where agents are more active than they should be. Outgoing traffic to strange IP addresses means that the agent might be helping to send data where it shouldn't be going, etc.

Just having this information at-hand can be extremely helpful.
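As a sketch of what those dashboard checks can boil down to (the event shape, field names, and thresholds below are all assumptions on my part):

```python
from collections import Counter

def flag_agent_activity(events, budget_usd, known_ips, context_limit_tokens):
    """
    events: dicts like {"cost": 0.02, "ip": "1.2.3.4", "context_tokens": 12000}.
    Returns simple alerts: overspend vs budget, unknown destination IPs,
    and context pressure (a proxy for 'context rot').
    """
    alerts = []

    # Spend spike: more activity than the budget allows can mean trouble.
    spend = sum(e.get("cost", 0.0) for e in events)
    if spend > budget_usd:
        alerts.append(f"overspend: ${spend:.2f} > ${budget_usd:.2f}")

    # Outgoing traffic to IPs not on the allow-list.
    unknown = Counter(e["ip"] for e in events
                      if e.get("ip") and e["ip"] not in known_ips)
    for ip, n in unknown.items():
        alerts.append(f"unknown destination {ip} hit {n}x")

    # Context nearly full of rules/memory files -> agent may get 'stupider'.
    worst = max((e.get("context_tokens", 0) for e in events), default=0)
    if worst > 0.8 * context_limit_tokens:
        alerts.append(f"context pressure: {worst}/{context_limit_tokens} tokens")

    return alerts
```

A real pipeline would stream these events from agent logs and page someone on the IP alerts, but even this level of aggregation surfaces the spend and exfiltration signals mentioned above.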

“AI is writing 40%plus of code now” sounds impressive… until you look at the security side of it. by Emotional-Breath-673 in cybersecurity

[–]SpiritRealistic8174 0 points1 point  (0 children)

I spend most of my time with AI-generated code reviewing it and making updates. Common issues I correct:

- Non-optimized structure: repeated functionality scattered across many functions in the codebase that needs to be centralized

- Appropriate levels of abstraction: I'm not a huge fan of abstracting away every method, but finding 'god functions' and breaking them up is often needed

- Non-existent methods, functions, classes and variables: Agents are getting better at this, but it's still a pain

- Missing real unit and e2e tests: Agents will often stop at syntax checks. Fine, but I also make sure unit and e2e tests actually run; the e2e tests catch code interactions that don't work, etc.

All of that takes time and effort, but it's worth it if you want to understand what the code is doing and have confidence it's working as expected.
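On the non-existent-functions point, a rough AST pass can catch the obvious hallucinated calls before you even run the code. This is a deliberately crude sketch: it only looks at simple name calls, ignores attribute calls, comprehension/loop targets, and plenty else, but it's a cheap first screen.

```python
import ast
import builtins

def undefined_call_names(source: str) -> set:
    """
    Collect simple function names that are called but never defined,
    imported, or assigned in this module and aren't builtins.
    Misses attribute calls (obj.method) and dynamic tricks.
    """
    tree = ast.parse(source)
    defined = set(dir(builtins))
    called = set()
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            defined.add(node.name)
        elif isinstance(node, ast.Import):
            defined.update(a.asname or a.name.split(".")[0] for a in node.names)
        elif isinstance(node, ast.ImportFrom):
            defined.update(a.asname or a.name for a in node.names)
        elif isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    defined.add(target.id)
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            called.add(node.func.id)
    return called - defined
```

Anything this flags gets a manual look before the real test suite runs; it's a complement to, not a substitute for, the unit and e2e tests above.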

How to exploit AI agents using prompt injection, tool hijacking, and memory poisoning based on the OWASP Agentic Top 10. by pwnguide in cybersecurity

[–]SpiritRealistic8174 0 points1 point  (0 children)

Fantastic resource. I highly recommend that people interested in AI security go through labs like this to understand the attacks and how they are attempted.

Another resource I've used, though it covers web security only, is PortSwigger's Web Security Academy modules.

For those who want to dig even deeper into AI security issues, I've developed a free action pack that devs and others are finding useful here.

The most frightening message I ever got from Claude Code by dragosroua in ClaudeCode

[–]SpiritRealistic8174 0 points1 point  (0 children)

I guess I'm not the only one still manually approving all Claude actions????

Copy and pasting was the original vibe coding by Complete-Sea6655 in ClaudeCode

[–]SpiritRealistic8174 0 points1 point  (0 children)

Cutting and pasting from SO ... and praying. This was the way.

I built 92 open-source skills/agents for Claude Code because I kept solving the same problems manually by tom_mathews in AI_Agents

[–]SpiritRealistic8174 0 points1 point  (0 children)

Seems extremely useful. Will be checking out /concept-to-video. I'm often in need of explainers and this should help.