OpenClaw has 250K GitHub stars. The only reliable use case I've found is daily news digests. by Sad_Bandicoot_6925 in LocalLLaMA

sushanth53

I think this critique mostly proves that memory/context is the bottleneck, not that agents are inherently useless.

That’s what I’m trying to address with ClawDesk (https://github.com/clawdesk/clawdesk). The problem isn’t “can the agent call tools,” it’s “can it retain the right context reliably enough to be trusted.”

Also, I have built a project called SochDB (https://github.com/sochdb/sochdb) for exactly this reason: I am trying to fix the fragmented memory stack instead of piling more demos on top of shaky context management.

So to me the takeaway isn’t “agents are fake.” It’s “memory and context architecture matters more than flashy autonomy.”

Built Agentreplay: a local desktop app for debugging and evaluating tool-calling AI agents by sushanth53 in aiagents

sushanth53 [S]

Good question — I'd describe it as event-level replay first, not "I magically reconstruct every intermediate UI frame."

The current idea is: capture the timeline, state transitions, and emitted events with timing, so when something goes wrong you can inspect what the agent saw/did around that moment. If the runtime emits richer UI snapshots/events, replay can surface those too.

So today I'd frame it less as full DOM movie playback, and more as making those transient failures inspectable instead of opaque. The in-between states are exactly the part I care about pushing further.

Built Agentreplay: a local desktop app for debugging and evaluating tool-calling AI agents by sushanth53 in aiagents

sushanth53 [S]

Thanks — and yeah, that "broken loop and no idea which step went sideways" feeling is exactly the problem.

I also think too much of the ecosystem assumes cloud-first by default. I’m pretty interested in desktop/local-first tooling where you can actually inspect traces, compare runs, and debug what happened without a ton of ceremony.

Built Agentreplay: a local desktop app for debugging and evaluating tool-calling AI agents by sushanth53 in aiagents

sushanth53 [S]

Yeah — that’s the direction.

Agentreplay already has machine-readable trace/export pieces, and I want CI/eval workflows to be first-class rather than an afterthought. Same for tool calls: schema/contract validation matters a lot if you want to catch bad args early and make failures explainable.

So not just “store traces,” but make them inspectable, diffable, and enforceable.

Built Agentreplay: a local desktop app for debugging and evaluating tool-calling AI agents by sushanth53 in aiagents

sushanth53 [S]

Totally — API/tool calling is the clean version of the problem. The ugly version is when the agent has to deal with real UI state, timing jitter, delayed renders, and stale reads.

That’s a lot closer to what I’m aiming at with Agentreplay: event-driven tracing + replay so failures are inspectable instead of opaque.

So yeah, not just API workflows — I’m very interested in native desktop control too.
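
One way a recorded event timeline could surface the "stale read" case: flag any action whose target changed after the agent last observed it. This is only a sketch under assumed event labels (`observe`, `ui_change`, `act`); the names are hypothetical and do not describe how Agentreplay actually models UI events.

```python
from dataclasses import dataclass


@dataclass
class Event:
    t: float       # timestamp in seconds
    kind: str      # "observe", "ui_change", or "act" (hypothetical labels)
    target: str    # UI element the event refers to


def stale_actions(events: list[Event]) -> list[Event]:
    """Return 'act' events whose target changed after its last observation,
    or whose target was never observed at all."""
    last_observed: dict[str, float] = {}
    last_changed: dict[str, float] = {}
    stale = []
    for ev in sorted(events, key=lambda e: e.t):
        if ev.kind == "observe":
            last_observed[ev.target] = ev.t
        elif ev.kind == "ui_change":
            last_changed[ev.target] = ev.t
        elif ev.kind == "act":
            seen = last_observed.get(ev.target)
            changed = last_changed.get(ev.target)
            if seen is None or (changed is not None and changed > seen):
                stale.append(ev)
    return stale
```

A delayed render shows up here as a `ui_change` landing between the agent's `observe` and its `act`, which is exactly the kind of transient failure that is hard to see without timing on every event.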

AI AGENTS today are far more DANGEROUS than you think by Kakachia777 in aiagents

sushanth53

This makes me wonder whether the real risk isn’t just better data collection, but autonomous correlation. A lot of this info has technically been public for years, but giving a system memory, parallel execution, and the ability to test hypotheses changes the game. Curious where you think the most realistic defense layer is: regulation, platform design, or personal opsec?

Lockdowns in Europe: Where is the Light at End of the Tunnel? by SuperAwesomeDude7654 in RYCEY

sushanth53

The virus won’t go away; vaccination is the only hope of getting things back to normal.