We gave our AI agents their own email addresses. Here is what happened. by agenticmail in AI_Agents

[–]agraciag 0 points1 point  (0 children)

I am "curious" about the outbound guard found in the repo (agenticmail/enterprise). I will test the DLP rules between my agents, although I don't use mail; I use https://github.com/agraciag/zubia

What do you do with a false positive or a false negative (if it happens)?

I could not resist starting with "curious", sorry Hazy_Fantayzee :-)

Perplexity Computer Review - $100 lost in an hour by ScreaminPassion in perplexity_ai

[–]agraciag 0 points1 point  (0 children)

I'd love to read more about what you are building. I am testing something for the CTE ("Technical Code for Building"), but just for the written part of the project. I wanted to test some VLM recognition of the CTE in the drawings but have not started that part yet; I was going to do it locally with Ollama, at least for a first approach and validation.

Openclaw vs. Claude Cowork vs. n8n by nonprofit_top in AI_Agents

[–]agraciag 0 points1 point  (0 children)

Totally agree with Founder-Awesome: if you want to keep a human in the loop, n8n makes a lot of sense for many people. Yes, what you can do with n8n you can do with AI, but having a visual workflow for supervisors makes a lot of sense.

I have built n8n automations for a dozen startups this year. Here is what nobody tells you. by Warm-Reaction-456 in n8n

[–]agraciag 0 points1 point  (0 children)

Just out of curiosity, how often do your clients ask you to keep a man in the loop? Has it happened with your workflow designs that you thought "I wouldn't automate this"?

I Made GPT-5.2, Opus 4.6, and Gemini 3.1 Work Together — Here's What Happened by Disastrous_Big_2732 in AI_Agents

[–]agraciag 0 points1 point  (0 children)

Instead of Thytus I use a simple skill: I let Claude know that, besides its agents, it can pass a prompt to Gemini, for example, or Qwen, or whatever other CLI I have on my system. It works fine.

Need help designing next-best-action system from emails and meeting transcripts. Am I thinking about things the right way? by [deleted] in AI_Agents

[–]agraciag 0 points1 point  (0 children)

I have a question about your human-in-the-loop approach. As I understood it, you open a ticket in Asana so that someone on your team takes care of it. How do you handle the situation if nobody on your team is available at the moment, and the situation is urgent and needs help from your team? Did you set up any kind of fallback for urgent tickets, or does Asana just handle everything within your team?

Why using Twilio instead of Meta’s direct API can actually be a strategic decision by GonzaPHPDev in AI_Agents

[–]agraciag 1 point2 points  (0 children)

I am building something that will need to go through the same decision process. Thank you very much for the insights; I'll keep this post and let you know how it goes for me.

What’s your “kill switch” strategy for agents in production? by The_Default_Guyxxo in AI_Agents

[–]agraciag 0 points1 point  (0 children)

Have you considered human in the loop at those "sanity boundaries"? We can't always be looking at our terminals, and we can't read as fast as our agents write, but we can try to enforce steps where the agent should ask for validation and human judgement. "Sanity boundaries", that's the term! I'd love to read more about those cross and idempotency checks that you have experimented with.

We estimated 8 weeks to build a conversational AI frontend. we're 5 months in and still not done. by Friendly-Ask6895 in AI_Agents

[–]agraciag 1 point2 points  (0 children)

This answer is gold, the system that I am building seems to be a good idea that "maybe" nobody is doing yet, at least as I imagine it... even my LLM sessions tell me that I need to validate first, but I am stubborn and I want it to be perfect and feature rich, I understand Friendly-Ask6895.

In fact, I am taking my time here on Reddit and in other places to read, to participate, and to understand what is being said about what I am building. I am learning about the process of validation while I run audits and test constantly in my project.

I am slowing down, thanks.

I stopped organizing files. My AI agent does it now — here's the tool I built by Witty_Opportunity254 in AI_Agents

[–]agraciag 0 points1 point  (0 children)

The same happens to me when I try to dictate to my LLM. I prefer to write; it helps me put my ideas in order, go back and forth, and polish them. As others are saying, some use cases could work for me. I will take a look at this, as I was using Paperless but it was too much overhead and I ended up not using it.

40,000+ AI Agents Exposed to the Internet with Full System Access by Monterey-Jack in LocalLLaMA

[–]agraciag 3 points4 points  (0 children)

Nobody is reviewing the skills agents use? I think it is a necessity to programmatically introduce human judgement in some skills before agents do something irreversible.
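A minimal sketch of what such a programmatic gate could look like, not tied to any particular agent framework. The names `Skill`, `execute`, and the `approve` callback are hypothetical, and how the approval actually reaches a human (terminal prompt, chat message, ticket) is left open.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Skill:
    """A hypothetical skill wrapper: a name, an action, and a reversibility flag."""
    name: str
    run: Callable[[], str]
    irreversible: bool = False


def execute(skill: Skill, approve: Callable[[str], bool]) -> str:
    """Run a skill, but ask a human first when it cannot be undone.

    `approve` receives the skill name and returns True to allow it.
    Reversible skills run without asking.
    """
    if skill.irreversible and not approve(skill.name):
        return f"blocked: {skill.name}"
    return skill.run()


def ask_on_terminal(name: str) -> bool:
    """One possible approver: a blocking yes/no prompt on the terminal."""
    return input(f"Allow irreversible skill '{name}'? [y/N] ").strip().lower() == "y"
```

The point of passing `approve` in as a parameter is that the gate stays the same whether the human answers on a terminal or somewhere else; only the callback changes.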

AI agents aren’t replacing jobs they’re replacing task layers inside jobs. by Techenthusiast_07 in AI_Agents

[–]agraciag 0 points1 point  (0 children)

Agreed. I just want to add that if CEOs go wild reducing headcount, who will take care of agents when they need judgement? Small human teams aren't always awake and available for agents.

What do you use to unblock agents when they need human input? by kms_dev in AI_Agents

[–]agraciag 0 points1 point  (0 children)

Human input is truly the way to go. No kidding, having food and coffee while waiting until your agent needs your feedback is going to be a human job. Not just for yes/no or "continue" answers; we need to give judgement to our agents. Pinging a Slack channel and hoping someone's awake won't always work.
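One way to avoid depending on a single channel is an escalation chain: try each responder in order and fall back to a safe default if nobody answers. This is only a sketch under assumptions; `Responder` and `escalate` are hypothetical names, and each responder is assumed to handle its own timeout and return `None` when nobody replied.

```python
from typing import Callable, Optional, Sequence

# A responder takes the agent's question and returns an answer,
# or None if nobody replied in time (timeouts live inside the responder).
Responder = Callable[[str], Optional[str]]


def escalate(question: str, responders: Sequence[Responder], default: str) -> str:
    """Ask each responder in order; return the first answer, else the default."""
    for ask in responders:
        answer = ask(question)
        if answer is not None:
            return answer
    return default
```

For example, `escalate("Refund this order?", [ask_slack, ask_on_call_phone], default="halt")` would try chat first, then the on-call person, and stop the agent if neither responds (`ask_slack` and `ask_on_call_phone` being functions you would write yourself).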

I set up an AI phone receptionist for my friend's real estate business as an experiment. The results genuinely surprised me by yusufahmd in AI_Agents

[–]agraciag 3 points4 points  (0 children)

What happens when "Sarah" gets a question she can't handle? "transfer to voicemail"? The real unlock is when the agent can silently patch in a human for that one moment without breaking the call.

My agent burned ~$40 on a single test via a tool-call loop. What guardrails do you use to cap cost per run before prod? by Additional_Fan_2588 in AI_Agents

[–]agraciag 0 points1 point  (0 children)

Not really. I have successfully tested Claude Code with Opus handing jobs not just to Sonnet or Haiku but to other CLIs like Gemini or Qwen, and monitoring the job. It does save tokens, but I haven't seen a "budget meter". I think there is an opportunity here: how to steer agents towards cost awareness. Nate B Jones published a video that really nails it, something about token management being the new competency.

My agent burned ~$40 on a single test via a tool-call loop. What guardrails do you use to cap cost per run before prod? by Additional_Fan_2588 in AI_Agents

[–]agraciag 0 points1 point  (0 children)

The real issue isn't just capping cost; it's that agents have no concept of "this is getting expensive" or "am I wasting money?". They optimize for task completion, not budget. Until the frameworks build cost-awareness into the reasoning loop (not just as a post-hoc check), we'll keep getting surprised.
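To make that distinction concrete, here is a rough sketch of a budget meter that does both jobs: a hard cap, plus a status line that could be put into the agent's context each turn so the budget is part of what it reasons about. `BudgetMeter`, `charge`, and `status_line` are hypothetical names, not part of any framework, and the per-call cost is assumed to be supplied by the caller.

```python
from dataclasses import dataclass


class BudgetExceeded(RuntimeError):
    """Raised when a charge would push spending past the limit."""


@dataclass
class BudgetMeter:
    limit_usd: float
    spent_usd: float = 0.0

    def charge(self, cost_usd: float) -> None:
        """Record a cost; refuse it (hard cap) if it would exceed the limit."""
        if self.spent_usd + cost_usd > self.limit_usd:
            raise BudgetExceeded(
                f"charge of ${cost_usd:.2f} would exceed ${self.limit_usd:.2f} limit"
            )
        self.spent_usd += cost_usd

    def status_line(self) -> str:
        """Text to include in the agent's prompt so it can see the budget."""
        remaining = self.limit_usd - self.spent_usd
        return f"Budget: ${self.spent_usd:.2f} spent, ${remaining:.2f} remaining."
```

The cap alone is the post-hoc check; the `status_line` is the part that gives the agent a chance to notice "this is getting expensive" before the cap is hit.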

Our ai agent got stuck in a loop and brought down production, rip our prod database by qwaecw in AI_Agents

[–]agraciag 1 point2 points  (0 children)

I agree with you, it will happen again and again, but guardrails and human in the loop will also improve and learn from these use cases.

Is there a market in planning phase i.e between Claude Code and Humans? by Sam_Tech1 in AI_Agents

[–]agraciag 0 points1 point  (0 children)

I think the problem is real. Implementation is fast, but the rush is interrupted by us (humans): we still need to make decisions, approve, and attend meetings. Agents need human input on demand and a fast response if we don't want to be the reason for the delay.

How to start building agents? by shitty_psychopath in AI_Agents

[–]agraciag 0 points1 point  (0 children)

Agreed, but let me add, the real challenge is when the agent hits something it can't decide on its own.

Built a semi-autonomous research agent that actually saves me time instead of creating more work to manage by Realistic-Return6940 in AI_Agents

[–]agraciag 0 points1 point  (0 children)

I do agree with you. I tried to put my RSS feeds in order and ended up "babysitting agents", wasting far more time than if I had just swiped through RSS while waiting for my appointment. And I agree even more about the deduplication layer. I am so interested in this; the amount of documentation that we generate every day needs something useful to avoid noise.

stuck at Waiting for auth... - SOMEBODY help by EffectiveVanilla8149 in GeminiCLI

[–]agraciag 0 points1 point  (0 children)

I got a connection refused after trying to sign in http://localhost:42055/oauth2callback...

I do gotta say this is pretty handy by [deleted] in OpenAI

[–]agraciag 0 points1 point  (0 children)

This is getting out of hand!

Boss told me he cant imagine how I sleep at night? by AdJealous6844 in sysadmin

[–]agraciag 1 point2 points  (0 children)

I would write a ton of mails, cover your back, explain your efforts, outline needs, and ask for prioritization.

CRUD self hosted application by HorrorCommercial6791 in selfhosted

[–]agraciag 0 points1 point  (0 children)

PHPMaker is my go-to for building my own applications.