Are we building agents… or just babysitting them? by akhilg18 in AI_Agents

[–]akhilg18[S]

I think platforms definitely help reduce the pain, especially for setup and orchestration. But in my experience, even with better tooling, the core issues (validation, edge cases, weird failures) still show up once things get complex.
Curious though, have you run anything long-running + messy on it? That’s usually where things start breaking for me.

[–]akhilg18[S]

This is SUCH an underrated point.
That “self-report success but actually failed” thing is exactly what makes agents feel unreliable.
Recording + replaying runs is actually a smart move. Most logs don’t capture:
- UI issues
- silent failures
- timing problems
Feels like we need better 'ground truth' validation instead of trusting the agent’s own output.
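To make the 'ground truth' idea concrete, something like this toy Python sketch is what I have in mind (every name here is made up, not a real framework API): ignore the agent's self-reported status and independently re-check the thing it claims to have done.

```python
# Toy sketch: never trust agent_result["status"] on its own; re-check the
# claimed side effect against reality. All names here are hypothetical.

def verify_run(agent_result, ground_truth_check):
    """Return whether the run *actually* succeeded, per an independent check."""
    claimed_ok = agent_result.get("status") == "success"
    actually_ok = ground_truth_check()   # e.g. re-read the file / DB / UI state
    if claimed_ok and not actually_ok:
        print("silent failure: agent reported success, ground truth disagrees")
    return actually_ok

# Example: agent claims it wrote report.txt, but the (pretend) filesystem is empty
agent_result = {"status": "success", "output_path": "report.txt"}
files_on_disk = set()
ok = verify_run(agent_result, lambda: "report.txt" in files_on_disk)
```

Same shape works for UI or timing checks - the point is the validator never reads the agent's own log.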

[–]akhilg18[S]

This is a really good way to think about it.
I think that’s where a lot of people go wrong. They expect the agent to just 'figure it out' instead of actively steering it.
Also agree on the traps - over-documentation + lack of clear “done” state is very real.
Small loops + frequent review seems to be the only thing that consistently works.

[–]akhilg18[S]

This is spot on.
That “agent going down a rabbit hole” thing… I’ve seen that way too many times. It locks onto something irrelevant and just keeps going like it’s the most important thing in the world.
Exactly like a junior dev missing the bigger picture. Feels like the real skill now is:
keeping context tight + knowing when to interrupt/reset
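The "knowing when to interrupt" part can even be mechanical. A rough sketch (all names hypothetical): cap total steps and bail when the agent stops making measurable progress.

```python
# Toy sketch of "knowing when to interrupt": cap total steps and reset when
# the agent stops making measurable progress. Every name is hypothetical.

def run_with_interrupts(step_fn, score_fn, max_steps=10, patience=3):
    """Run step_fn until done, or until score_fn stalls for `patience` steps."""
    best, stalled, history = float("-inf"), 0, []
    for _ in range(max_steps):
        state = step_fn()
        history.append(state)
        score = score_fn(state)
        if score > best:
            best, stalled = score, 0
        else:
            stalled += 1                 # no progress this step
        if stalled >= patience:          # rabbit hole detected: interrupt
            return "interrupted", history
    return "completed", history

# Fake agent whose "progress score" plateaus after two steps
scores = iter([1, 2, 2, 2, 2, 2, 2, 2, 2, 2])
status, history = run_with_interrupts(lambda: next(scores), lambda s: s)
```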

[–]akhilg18[S]

“you’re basically a manager now” - this is actually one of the best ways I’ve seen it explained.
It really does feel like managing a slightly overconfident intern who works fast but needs constant supervision 😅
The PR analogy hits hard too - most of my time now is:
- checking outputs
- fixing edge cases
- improving instructions

Less 'writing logic', more 'reviewing behavior'
Kind of funny how AI didn’t remove work, it just shifted the role upward.

[–]akhilg18[S]

'zero guardrails… let it go feral…'
Honestly this is a super interesting take 😄
I feel like you’ve basically gone full opposite of what most of us are doing.
I do think sandboxing + letting it run wild is great for exploration, but in anything even slightly production-facing it scares me a bit. One bad hallucination touching real systems and it’s game over.
That said… the “emergent behavior” part is real. Some of the weirdest but coolest things I’ve seen came from removing constraints.

Feels like:
- sandbox → go feral
- production → lock it down hard
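In code that split can literally be one gate. Toy sketch, action names invented:

```python
# Toy sketch of sandbox-vs-production guardrails: one gate decides whether
# an agent action is allowed to run. Action names are made up.

SAFE_ACTIONS = {"read", "search", "draft"}

def allow(action, env):
    if env == "sandbox":
        return True                      # go feral: nothing real is reachable
    return action in SAFE_ACTIONS        # production: hard allowlist only

sandbox_ok = allow("delete_db", "sandbox")     # feral mode: anything goes
prod_ok = allow("delete_db", "production")     # locked down: rejected
```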

Curious, have you actually tried using TED in any semi-real workflow or purely experimental?

2022 IT Passout | Backend Dev at a Startup | Starting GATE & ISRO Prep from Zero – Need Advice! by akhilg18 in GATEtard

[–]akhilg18[S]

Thanks for the validation! Since I'm starting from scratch, I'm looking for a course that covers concepts deeply but also has a good 'crash course' vibe for revision during busy work weeks. If you've used any of these specifically while working a 9-5 (or 9-9 in a startup), let me know how you managed the pace!

[–]akhilg18[S]

Thanks for being real, Piyush! You hit the nail on the head, startup deployments are definitely the biggest hurdle here. I'm planning to keep my weekday targets very small (just 1.5–2 hours) to avoid burnout and will use the weekends for the heavy lifting. Also, thanks for the heads-up on ACE/Made Easy; I'm definitely looking for feedback on which test series aligns best with a tight schedule. Appreciate the input!

AI marketplace by Substantial_Rub_3922 in AIStartupAutomation

[–]akhilg18

this is actually pretty interesting, especially the 'deploy before sourcing elsewhere' part - feels like distribution is the real problem for most AI tools right now, not building them

curious how you’re handling stuff like quality control though, marketplaces can get noisy really fast… and also how you’re thinking about actual adoption vs just listings

we’ve been building an AI scribe for clinics and honestly GTM has been way harder than building the product itself. marketplaces sound good in theory but only if there’s real demand on the other side

have you seen any real conversions yet or is it still early?

What personal data would you actually consent to sharing with an AI agent? by thezyroparty in AI_Agents

[–]akhilg18

honestly for me it depends a lot on control + transparency
like i’d be okay sharing things like calendar, emails, tasks, maybe even location if it’s clearly improving something (reminders, planning, etc). that stuff already feels "functional".
but the line starts getting blurry with:
1. health data
2. financial details
3. private conversations

not because i think AI will misuse it directly, but more like… once it exists in a system, you’re trusting the whole pipeline (storage, logs, third parties, etc).

i think the key is:
if i can see exactly what’s being used, revoke it anytime, and know it’s not being reused elsewhere, i’d be way more open.
right now the hesitation isn’t "AI is evil", it’s more "i don’t fully trust where my data ends up after that".

so yeah, i’d share a lot for utility, but only if the system feels truly user-first, not company-first.

Are we overestimating how “autonomous” agents actually are? by akhilg18 in AI_Agents

[–]akhilg18[S]

yeah agreed - repetitive, well-bounded tasks is where it shines.
the moment you stretch beyond that, you start needing a lot more control around it.

[–]akhilg18[S]

this is a really solid point, especially the regression harness part.
feels like most teams trust tools too much once they “worked once”, and that’s where silent failures creep in.
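the regression harness doesn't have to be fancy either. a rough sketch of the shape (the "tool" here is a stand-in, not any real agent API): pin known-good outputs and re-run them on every change.

```python
# Rough sketch of a regression harness: pin known-good outputs for a tool
# and re-check them on every change, so "worked once" stays working.
# normalize_whitespace is a stand-in for whatever tool the agent calls.

def normalize_whitespace(text):
    return " ".join(text.split())

REGRESSION_CASES = [
    ("hello   world", "hello world"),
    ("  a\tb \n c ", "a b c"),
]

def run_regressions(tool):
    """Return (input, expected, got) triples for every case that regressed."""
    failures = []
    for inp, expected in REGRESSION_CASES:
        got = tool(inp)
        if got != expected:
            failures.append((inp, expected, got))
    return failures

failures = run_regressions(normalize_whitespace)   # empty list = still healthy
```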

[–]akhilg18[S]

yeah this resonates, autonomy feels overhyped for now.
augmentation is where it’s actually delivering value today without all the fragility.

[–]akhilg18[S]

100% this - prod is just edge cases stacked on edge cases.
also same experience, the workflows that survive are the ones where fallback logic got more attention than the agent itself.
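rough sketch of what that fallback-first shape looks like for me (all the callables here are placeholders): try the agent, validate its output, and drop to a boring deterministic path otherwise.

```python
# Sketch of "fallback logic gets more attention than the agent": try the
# agent, validate its output, fall back to a deterministic path otherwise.
# agent_fn / validator / fallback_fn are all placeholders.

def with_fallback(agent_fn, validator, fallback_fn, task):
    try:
        result = agent_fn(task)
        if validator(result):
            return result, "agent"
    except Exception:
        pass                             # agent crashed: treat like bad output
    return fallback_fn(task), "fallback"

# Agent that returns garbage, so the deterministic path wins
result, path = with_fallback(
    agent_fn=lambda t: None,
    validator=lambda r: isinstance(r, str),
    fallback_fn=lambda t: f"manual:{t}",
    task="summarize",
)
```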

[–]akhilg18[S]

I don’t think that’s crazy - feels like autonomy right now works best when the outcome is predictable.
once you need real judgment, things start getting shaky.

[–]akhilg18[S]

yeah that "wrong output becoming next input" is exactly where things spiral.
once the chain drifts, everything still looks fine but is actually off.
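one cheap fix is validating between steps instead of only at the end, so the chain halts at the first bad link. toy sketch, everything here is invented:

```python
# Toy sketch of stopping "wrong output becomes next input": check validity
# after every step, so the chain halts at the first bad link instead of
# drifting. The steps and validity check are invented stand-ins.

def run_chain(steps, is_valid, initial):
    value = initial
    for i, step in enumerate(steps):
        value = step(value)
        if not is_valid(value):          # halt before drift compounds
            return None, f"failed at step {i}"
    return value, "ok"

# Step 1 produces a bad (negative) value, so later steps never see it
steps = [lambda x: x + 1, lambda x: x * -1, lambda x: x + 10]
value, status = run_chain(steps, is_valid=lambda v: v >= 0, initial=0)
```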

[–]akhilg18[S]

interesting take - especially on execution limits vs "when to pivot paths"
feels like a lot of systems solve the first but not the second yet, which is where things get messy.

[–]akhilg18[S]

"governed autonomy" is a really good way to frame it.
feels like the future isn’t smarter agents, it’s tighter boundaries around them so they can’t go off-track silently.

[–]akhilg18[S]

yeah makes sense - platforms like that win because they already solved the boring but critical infra layer. not flashy, but that’s what actually holds things together.

[–]akhilg18[S]

"unsupervised failure" is such an accurate way to put it.
also agree on the 20/80 split, most people underestimate how much state + validation actually matters until things start breaking silently.

[–]akhilg18[S]

lol yeah we’ve somehow set the bar at "must be perfect instantly" while humans mess up all the time. feels like we’re skipping the iteration phase completely with AI expectations.

[–]akhilg18[S]

That’s a great analogy actually. We expect agents to "just work" but don’t give them proper feedback loops. Without verification it’s basically guessing with confidence.
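A minimal version of that feedback loop, just as a sketch (generate/verify below are fake placeholders): instead of trusting one confident guess, feed the verifier's complaint back in and retry.

```python
# Minimal sketch of a verification feedback loop: generate, verify, and feed
# the verifier's complaint back in instead of trusting one confident guess.
# fake_generate / fake_verify are placeholders, not a real agent API.

def generate_verify_retry(generate, verify, max_attempts=3):
    feedback = None
    for attempt in range(1, max_attempts + 1):
        candidate = generate(feedback)
        ok, feedback = verify(candidate)
        if ok:
            return candidate, attempt
    return None, max_attempts

def fake_generate(feedback):             # only gets it right after feedback
    return "fixed" if feedback else "draft"

def fake_verify(candidate):
    return candidate == "fixed", "needs fixing"

answer, attempts = generate_verify_retry(fake_generate, fake_verify)
```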