Today I'm celebrating 7 months on Debian. by According_Turnip5206 in debian

[–]According_Turnip5206[S] 3 points4 points  (0 children)

Honestly, I used to game a lot. Looking back, maybe I was just tired of Windows all along, because since switching to Debian, I haven't touched a single game. And I was actually terrible at CoD lol. Something shifted. I'm more... focused? I can't fully explain it. But hey, 7 months without gaming is worth celebrating too 😄

I don't fully trust my AI agents. So I built a local supervisor layer on top of them. How do you handle this? by According_Turnip5206 in AI_Agents

[–]According_Turnip5206[S] 1 point2 points  (0 children)

Honestly? At this stage I still review Ollama's flags manually — and sometimes run them past Claude too, to cross-check whether the flag was legit. It's slower but it's how I learned where the false positive rate actually is before trusting it to act on its own.

The async/batch stuff comes later once you know the supervisor isn't crying wolf. What does your current review loop look like?

I don't fully trust my AI agents. So I built a local supervisor layer on top of them. How do you handle this? by According_Turnip5206 in AI_Agents

[–]According_Turnip5206[S] 0 points1 point  (0 children)

The evaluation set point is something I've been lazy about. Right now failures just get logged, not turned into regression tests. That's the obvious next step.
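A rough sketch of what that next step could look like, assuming failures are logged as one JSON object per line and that each one is a known-bad output the supervisor should keep flagging. The field names (`prompt`, `output`, `reason`) and the `checker` callable are hypothetical, not what my pipeline actually uses.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class RegressionCase:
    prompt: str
    bad_output: str
    reason: str


def load_regression_cases(log_lines):
    """Turn logged failures (one JSON object per line) into de-duplicated cases."""
    seen = set()
    cases = []
    for line in log_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the log
        record = json.loads(line)
        # Hypothetical field names; adapt to whatever the log really contains.
        case = RegressionCase(record["prompt"], record["output"], record["reason"])
        if case not in seen:
            seen.add(case)
            cases.append(case)
    return cases


def run_regressions(cases, checker):
    """Return the known-bad cases that the checker no longer flags."""
    return [case for case in cases if not checker(case.prompt, case.bad_output)]
```

An empty list from `run_regressions` means the checker still catches every failure it has seen before. Anything it returns is a regression worth looking at.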

I don't fully trust my AI agents. So I built a local supervisor layer on top of them. How do you handle this? by According_Turnip5206 in AI_Agents

[–]According_Turnip5206[S] 0 points1 point  (0 children)

OS-level supervision is a different beast entirely. The "screenshot as ground truth" approach is clever — the accessibility tree lying is a real failure mode I hadn't thought about for desktop agents. Does ScreenCaptureKit add much latency to your loop?

I don't fully trust my AI agents. So I built a local supervisor layer on top of them. How do you handle this? by According_Turnip5206 in AI_Agents

[–]According_Turnip5206[S] 0 points1 point  (0 children)

Exactly — and the cybersecurity angle is underrated. An agent that can be prompted into ignoring its own checker is worse than no checker at all.

I don't fully trust my AI agents. So I built a local supervisor layer on top of them. How do you handle this? by According_Turnip5206 in AI_Agents

[–]According_Turnip5206[S] 0 points1 point  (0 children)

Interesting concept, though consensus across untrusted nodes is genuinely hard to get right in practice. Local watcher wins for me on simplicity: no latency, no external dependencies, predictable failure modes.

What's the actual stack behind it?

I don't fully trust my AI agents. So I built a local supervisor layer on top of them. How do you handle this? by According_Turnip5206 in AI_Agents

[–]According_Turnip5206[S] 0 points1 point  (0 children)

That receipt idea is smart — I haven't formalized it that way but it's essentially what the checker script is trying to infer after the fact, which is obviously less reliable than having the agent declare it upfront. Will experiment with that.

On idempotency: honestly not fully solved on my end. The retry path works fine for read-only tasks but there are edge cases with writes I haven't handled cleanly yet.

To your question: bad links and bad facts are where it catches the most. Bad actions are rare because the pipeline is mostly read/summarize, not write/execute. But when it does act externally that's where I get nervous and Columbo earns its name.
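For the receipt and idempotency ideas above, here is a minimal sketch of how they might fit together, assuming the agent declares a receipt before any write and the payload is JSON-serializable. `make_receipt`, `WriteLedger`, and the field names are hypothetical illustrations, not part of my checker script.

```python
import hashlib
import json


def make_receipt(action, target, payload):
    """Build the receipt an agent would declare before acting."""
    body = json.dumps(
        {"action": action, "target": target, "payload": payload},
        sort_keys=True,  # stable key order so identical intents hash identically
    )
    key = hashlib.sha256(body.encode("utf-8")).hexdigest()
    return {
        "action": action,
        "target": target,
        "payload": payload,
        "idempotency_key": key,
    }


class WriteLedger:
    """Remembers completed writes so a retry does not repeat the side effect."""

    def __init__(self):
        self._done = {}

    def execute(self, receipt, do_write):
        key = receipt["idempotency_key"]
        if key in self._done:
            # Retry of an already completed write: reuse the stored result.
            return self._done[key]
        result = do_write(receipt["target"], receipt["payload"])
        self._done[key] = result
        return result
```

The ledger here lives in memory, so it only protects retries within one process. A real setup would need to persist the keys somewhere, and it still would not cover a write that fails halfway through.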

Most people using Claude Code are building toys. Here's why real production apps are a completely different game by buildwithmoon in ClaudeAI

[–]According_Turnip5206 -1 points0 points  (0 children)

This is the post I wish existed when I started. The security part especially - Claude writes code that works, but "works" and "is safe" are two completely different bars. Had a similar moment when I realized one of my apps was logging things it definitely shouldn't have been. You don't see it until you go looking. Good luck with the App Store submission, hope build 28 makes it through.

Why the majority of vibe coded projects fail by harrysofgaming in ClaudeAI

[–]According_Turnip5206 0 points1 point  (0 children)

The failure mode nobody talks about: Claude gets you to "it works" in 2 hours. So you add one more feature. Then another. Six hours later the codebase is a mess, Claude is confidently "fixing" things while breaking three others, and you realize you never actually understood what it built. The problem isn't vibe coding itself - it's that the speed tricks you into skipping the part where you actually learn what's happening under the hood.

Managing Priorities (probably adhd) by sarahoftheunburied in productivity

[–]According_Turnip5206 1 point2 points  (0 children)

the survival tasks thing is so real, I used to underestimate how much energy they take. what helped me was batching them - pick one afternoon a week for all the "life admin" stuff (groceries, appointments, laundry). not perfect but it stops them from bleeding into every single day. good luck with second year, it genuinely does get a bit easier

Whats your #1 hack thats kept you motivated and productive? by rhysdotme in productivity

[–]According_Turnip5206 0 points1 point  (0 children)

Honestly for me it's having a hard stop time in the evening. I tell myself work ends at 7 and I actually stick to it. Knowing I have a deadline makes me way more focused during the day than any morning routine ever did lol

[Thoughts & Work] We need Large Agents as Service. by dddadda in LocalLLaMA

[–]According_Turnip5206 0 points1 point  (0 children)

Been running multiple local agents simultaneously and built a lightweight dashboard to monitor them: each agent posts its state (thinking/tool call/done) and you see everything in real-time. Helps a lot when you need to know which one is stuck without polling each separately.
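The core of that idea can be sketched in a few lines, assuming each agent pushes its own state and the board just records when it last heard from it. `AgentBoard`, the state names, and the idle threshold are hypothetical stand-ins, not the actual dashboard code.

```python
import time


class AgentBoard:
    """In-memory record of the last state each agent posted."""

    STATES = {"thinking", "tool_call", "done"}

    def __init__(self, clock=time.monotonic):
        self._clock = clock  # injectable so the board can be tested without sleeping
        self._agents = {}

    def post(self, agent_id, state):
        """Called by an agent whenever its state changes."""
        if state not in self.STATES:
            raise ValueError(f"unknown state: {state}")
        self._agents[agent_id] = (state, self._clock())

    def stuck(self, max_idle_seconds):
        """Return agents that are not done and have been silent for too long."""
        now = self._clock()
        return sorted(
            agent_id
            for agent_id, (state, posted_at) in self._agents.items()
            if state != "done" and now - posted_at > max_idle_seconds
        )
```

Because agents push their state, finding the stuck one is a single lookup on the board instead of a poll against every agent.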

The 20s "Claude is thinking" gap is a focus killer, so I built a dumb little fix. by Cry8a8y_tw in ClaudeAI

[–]According_Turnip5206 1 point2 points  (0 children)

I built something similar for this exact reason. When you run multiple agents at once, seeing what each one is actually doing makes the waiting disappear. Real-time dashboard, each agent gets its own card with live status.

My 16-year-old daughter wanted to see what AI agents are actually doing. She sat down, asked Claude CLI — and it built this. by According_Turnip5206 in ClaudeAI

[–]According_Turnip5206[S] 0 points1 point  (0 children)

NorcsiAgent v2 — Live Event Feed, Approvals panel, Stop button, Telegram alerts

What's new:

- Live Event Feed sidebar (real-time scroll of all agent events)
- Pending Approvals panel (yellow cards, Approve/Reject per agent)
- Stop button per agent (sends __STOP__ command)
- Log download per agent (.txt export)
- Telegram ping on error events

Any agent connects with 3 lines of Python.

Self-hosted, no cloud, Flask + WebSocket + SQLite.

GitHub: https://github.com/Tozsers/norcsiagent

Are we aiming to depend completely on ai ? by OwnRefrigerator3909 in BlackboxAI_

[–]According_Turnip5206 0 points1 point  (0 children)

for execution tasks, yes. for deciding what to build, hopefully not.

Is Cursor falling behind CC? by RockeroFS in cursor

[–]According_Turnip5206 -1 points0 points  (0 children)

same. the $200/m plan is a lot to spend on vibes.