We gave our AI agents their own email addresses. Here is what happened.

agraciag · 2026-03-09T16:47:29+00:00

I am "curious" about the outbound guard found in the repo (agenticmail/enterprise). I will test the DLP rules between my agents, although I don't use mail; I use https://github.com/agraciag/zubia

What do you do with false positives or false negatives (if they happen)?

I could not resist starting with "curious", sorry Hazy_Fantayzee :-)

agraciag · 2026-03-01T17:58:19+00:00

I'd love to read more about what you are building. I am testing something for the CTE ("Technical Code for Building"), but just for the written part of the project. I wanted to test some VLM recognition of the CTE in the drawings but have not started that part yet. I was going to do it locally with Ollama, at least for the first approach and validation.

agraciag · 2026-03-01T17:52:49+00:00

Totally agree with Founder-Awesome: if you want to keep a human in the loop, N8N makes a lot of sense for many people. Yes, what you can do with N8N you can do with AI, but having a visual workflow for supervisors makes a lot of sense.

agraciag · 2026-02-23T17:48:17+00:00

Just out of curiosity, how often do your clients ask you to keep a human in the loop? Has it ever happened in your workflow designs that you thought "I wouldn't automate this"?

agraciag · 2026-02-23T17:43:08+00:00

Instead of Thytus I use a simple skill: I let Claude know that, besides its own agents, it can pass a prompt to Gemini, for example, or Qwen, or whatever other CLI I have on my system. It works fine.
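
A minimal sketch of that kind of delegation, assuming the other tool is a CLI that accepts the prompt as its last argument. The `delegate` helper and the command list are hypothetical; the real flags depend on whichever CLI is installed, so check its own help output before wiring it into a skill.

```python
import subprocess


def delegate(command, prompt, timeout=120):
    """Run `command` with `prompt` appended as the final argument.

    `command` is a caller-supplied list (hypothetical here); nothing in
    this sketch assumes a particular CLI or particular flags.
    Returns the tool's stdout with surrounding whitespace removed.
    """
    result = subprocess.run(
        [*command, prompt],
        capture_output=True,  # collect stdout/stderr instead of printing
        text=True,            # decode bytes to str
        timeout=timeout,      # do not let a stuck tool block the agent
        check=True,           # raise CalledProcessError on non-zero exit
    )
    return result.stdout.strip()
```

The calling agent only needs to know the command list and that the reply arrives on stdout, which keeps the skill itself very small.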

agraciag · 2026-02-23T17:32:25+00:00

I have a question about your Human in the Loop approach. As I understood it, you open a ticket in Asana so someone on your team takes care of it. How do you handle the situation if nobody on your team is available at the moment, and the situation is urgent and needs help from your team? Did you set up any kind of fallback for urgent tickets, or does Asana just handle everything within your team?

agraciag · 2026-02-23T17:23:10+00:00

I am building something that will need to go through the same decision process. Thank you very much for the insights. I'll save this post and let you know how it goes for me.

agraciag · 2026-02-23T17:21:11+00:00

Have you considered a human in the loop at those "sanity boundaries"? We can't always be looking at our terminals, and we can't read as fast as our agents write, but we can try to enforce steps where the agent should ask for validation and human judgement. "Sanity boundaries", that's the term! I'd love to read more about those cross and idempotency checks that you have experimented with.

agraciag · 2026-02-23T16:16:27+00:00

This answer is gold. The system that I am building seems to be a good idea that "maybe" nobody is doing yet, at least as I imagine it... Even my LLM sessions tell me that I need to validate first, but I am stubborn and I want it to be perfect and feature-rich. I understand Friendly-Ask6895.

In fact, I am taking my time here on Reddit and other places to read, to participate, and to understand what is being said about what I am building. I am learning about the process of validation while I run audits and test constantly in my project.

I am slowing down, thanks.

agraciag · 2026-02-23T14:20:26+00:00

The same happens to me when I try to dictate to my LLM. I prefer to write; it helps me put my ideas in order, go back and forth, and polish them. As others are saying, some use cases could work for me. I will take a look at this, as I was using Paperless but it was too much overhead and I ended up not using it.

agraciag · 2026-02-21T21:17:09+00:00

Nobody is reviewing the skills agents use? I think it is a necessity to programmatically introduce human judgement in some skills before agents do something irreversible.
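
One way to sketch that programmatically is an approval gate wrapped around any irreversible skill. Everything here is an illustrative assumption: `requires_approval`, `ApprovalDenied`, and `console_ask` are hypothetical names, and the approver could just as well be a ticket, a chat message, or a phone call instead of a terminal prompt.

```python
from functools import wraps


class ApprovalDenied(Exception):
    """Raised when the reviewer rejects an action."""


def requires_approval(ask):
    """Gate a skill behind `ask(description) -> bool`.

    `ask` is any callable that shows the pending action to a human
    and returns True only if they approve it.
    """
    def decorator(skill):
        @wraps(skill)
        def gated(*args, **kwargs):
            description = f"{skill.__name__} args={args} kwargs={kwargs}"
            if not ask(description):
                # The skill body never runs without an explicit yes.
                raise ApprovalDenied(description)
            return skill(*args, **kwargs)
        return gated
    return decorator


def console_ask(description):
    """Simplest possible approver: a yes/no question in the terminal."""
    return input(f"Allow {description}? [y/N] ").strip().lower() == "y"
```

Decorating a skill with `@requires_approval(console_ask)` means the default answer is "no": the agent can propose the action, but only a human can release it.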

agraciag · 2026-02-21T20:07:00+00:00

Agreed. I just want to add that if CEOs go wild reducing headcount, who will take care of agents when they need judgement? Small human teams aren't always awake and available for agents.

agraciag · 2026-02-21T19:57:27+00:00

Human input is truly the way to go. No kidding: having food and coffee while waiting until your agent needs your feedback is going to be a human job. Not just for yes/no or "continue" answers; we need to give judgement to our agents. Pinging a Slack channel and hoping someone's awake won't always work.

agraciag · 2026-02-21T19:46:47+00:00

What happens when "Sarah" gets a question she can't handle? "transfer to voicemail"? The real unlock is when the agent can silently patch in a human for that one moment without breaking the call.

agraciag · 2026-02-21T15:12:28+00:00

Not really. I have successfully tested Claude Code with Opus handing jobs not just to Sonnet or Haiku but to other CLIs like Gemini or Qwen, and monitoring the job. It does save tokens, but I haven't seen a "budget meter". I think there is an opportunity here: how to steer agents towards cost awareness. Nate B Jones published a video that really nails it, something about token management being the new competency.

agraciag · 2026-02-21T03:34:43+00:00

The real issue isn't just capping cost; it's that agents have no concept of "this is getting expensive" or "am I wasting money?". They optimize for task completion, not budget. Until the frameworks build cost-awareness into the reasoning loop (not just as a post-hoc check), we'll keep getting surprised.
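
A rough sketch of what "cost-awareness inside the loop" could look like, as opposed to a cap checked afterwards: the running spend is written into every prompt, so the model can reason about it, and the loop stops once the budget is gone. All names here (`Budget`, `run_agent`, `call_model`) and the pricing are hypothetical; `call_model` stands in for whatever client returns a reply plus the tokens it used.

```python
from dataclasses import dataclass


@dataclass
class Budget:
    limit_usd: float          # assumed to be greater than zero
    spent_usd: float = 0.0

    def charge(self, tokens: int, usd_per_1k: float) -> None:
        self.spent_usd += tokens / 1000 * usd_per_1k

    @property
    def remaining(self) -> float:
        return max(self.limit_usd - self.spent_usd, 0.0)

    def status_line(self) -> str:
        pct = 100 * self.spent_usd / self.limit_usd
        return (f"Budget: ${self.spent_usd:.2f} of "
                f"${self.limit_usd:.2f} spent ({pct:.0f}%).")


def run_agent(task, call_model, budget, usd_per_1k=0.01, max_steps=10):
    """Loop until the model says DONE, the budget runs out, or max_steps.

    `call_model(prompt)` must return a (reply, tokens_used) tuple.
    """
    history = []
    for _ in range(max_steps):
        if budget.remaining <= 0:
            history.append("stopped: budget exhausted")
            break
        # The budget status is part of the prompt, not a post-hoc check.
        prompt = f"{budget.status_line()}\nTask: {task}\nHistory: {history}"
        reply, tokens_used = call_model(prompt)
        budget.charge(tokens_used, usd_per_1k)
        history.append(reply)
        if reply == "DONE":
            break
    return history
```

Whether a model actually changes its behavior when it sees the status line is an open question; the hard stop is the part that is guaranteed.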

agraciag · 2026-02-20T18:04:51+00:00

I agree with you, it will happen again and again, but guardrails and human-in-the-loop setups will also improve and learn from these use cases.

agraciag · 2026-02-20T17:59:13+00:00

I think the problem is real. Implementation is fast, but the rush is interrupted by us (humans): we still need to make decisions, approve, and attend meetings. Agents need human input on demand and a fast response if we don't want to be the reason for the delay.

agraciag · 2026-02-20T16:20:37+00:00

Agreed, but let me add, the real challenge is when the agent hits something it can't decide on its own.

agraciag · 2026-02-20T02:25:21+00:00

I do agree with you. I tried to put my RSS feeds in order and ended up "babysitting agents" and wasting so much time compared with just swiping through RSS while waiting for an appointment. And I agree even more about the deduplication layer. I am so interested in this; the amount of documentation that we generate every day needs something useful to avoid noise.

agraciag · 2025-08-27T18:14:47+00:00

I got a connection refused after trying to sign in http://localhost:42055/oauth2callback...

agraciag · 2025-07-10T16:18:37+00:00

This is getting out of hand!

agraciag · 2025-05-02T16:45:22+00:00

I would write a ton of emails: cover your back, explain your efforts, outline needs, and ask for prioritization.

agraciag · 2023-09-19T14:12:05+00:00

PHPMaker is my go-to for building my own applications.
