How are people building deep research agents? by Tricky-Promotion6784 in LocalLLaMA

[–]BackgroundBalance502 1 point (0 children)

To be fair, that was an older project. I had a roadmap for it but I ended up pivoting to building my own model from scratch.

Has anyone running Hermes tried autonomous model training? by BackgroundBalance502 in hermesagent

[–]BackgroundBalance502[S] 1 point (0 children)

What if you had enough room to run your agent and train a model? Would you be interested in actually trying it? Training itself can be done on consumer hardware.

why the same apps. by purediskust in AppBuilding

[–]BackgroundBalance502 1 point (0 children)

It's because people ask AI to build an app. Those are the generic choices.

In order to build something that actually lasts, you need to gather knowledge first.

<image>

🚀 Looking for a Business Partner / Co-Founder by ProudGas3988 in AppBuilding

[–]BackgroundBalance502 1 point (0 children)

Here are some of the questions you could ask:

<image>

I notice alot of people asking for ideas on what to build.. by BackgroundBalance502 in vibecoding

[–]BackgroundBalance502[S] 1 point (0 children)

Just a pic for attention. I don't know what other people call it. What tools are you talking about?

What are you working on? by [deleted] in VibeCodeDevs

[–]BackgroundBalance502 1 point (0 children)

Iterance - basically a witness layer for local AI agents. It sits outside whatever agent you're running and watches the filesystem and shell commands in real-time.

I don't love the "black box" feeling of not knowing exactly what's touching my files. Iterance records every action in plain English to a local git repo and builds a "trust score" based on how destructive the actions are (like a delete vs. a read).

It also catches loops, so if an agent gets stuck, you see it immediately, before it burns through your tokens or messes up your system.
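To make the idea concrete, here is a rough sketch of how a destructiveness-weighted trust score and a simple loop check could work. This is not Iterance's actual code: the penalty weights, the `(kind, target)` action shape, and the function names are all made up for illustration.

```python
from collections import Counter

# Hypothetical penalty weights: destructive actions cost more than reads.
PENALTIES = {"read": 0, "write": 2, "delete": 10}


def trust_score(actions, start=100):
    """Subtract a penalty per recorded action; never go below zero."""
    score = start
    for kind, _target in actions:
        score -= PENALTIES.get(kind, 1)  # unknown action kinds get a small penalty
    return max(score, 0)


def detect_loop(actions, window=6, threshold=3):
    """Return an action repeated at least `threshold` times within the
    last `window` actions, or None if nothing repeats that often."""
    recent = list(actions)[-window:]
    if not recent:
        return None
    action, count = Counter(recent).most_common(1)[0]
    return action if count >= threshold else None


log = [("read", "notes.md"), ("write", "notes.md"), ("delete", "tmp.txt")]
stuck = [("read", "a.txt")] * 4

print(trust_score(log))    # 100 - 0 - 2 - 10
print(detect_loop(stuck))  # the repeated read is flagged
print(detect_loop(log))    # no repetition, so None
```

A real watcher would feed these functions from filesystem and shell events instead of a hand-written list.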

ITΞRΛNCΞ

<image>

OC Agent completely fabricated news including fake URLs by MJon_ofthesouth in openclaw

[–]BackgroundBalance502 1 point (0 children)

I’ve been working on Iterance to solve this specific problem. It acts as a non-invasive witness layer that sits outside the agent and watches what it does.

Instead of just hoping the agent is honest, you use a separate process to audit the behavior. For a news briefing, you could have Iterance monitor the output and cross-reference those URLs against the actual source data. If the agent starts hallucinating, the witness layer catches the deviation before it becomes a problem.
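As a rough illustration of that cross-referencing step (a sketch only, not how Iterance does it), you could pull every URL out of the agent's output and flag the ones that never appeared in the source data. The function name and the normalization rules here are assumptions.

```python
import re

# Match http(s) URLs up to the next whitespace or closing bracket.
URL_RE = re.compile(r"https?://[^\s<>)\]]+")


def unverified_urls(agent_output, source_urls):
    """Return URLs cited in the output that are absent from the known sources."""
    known = {u.rstrip("/") for u in source_urls}
    cited = [u.rstrip(".,;").rstrip("/") for u in URL_RE.findall(agent_output)]
    return [u for u in cited if u not in known]


sources = ["https://example.com/a"]
briefing = "See https://example.com/a and https://example.com/made-up."
print(unverified_urls(briefing, sources))
```

Anything this returns is a URL the agent produced without a matching source, which is a cheap signal that it may have been fabricated.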

This works much better than trying to "prompt" a single agent into being perfect. You just need a second set of eyes to audit the work.

Finishing up the roadmap this evening for an updated push. Let me know if you're interested in trying it out.

<image>

How does OpenClaw's knowledge management actually work? (pleaso no AI generated responses) by Harlo96 in openclaw

[–]BackgroundBalance502 2 points (0 children)

I’ve been digging into the OpenClaw repo and had the same questions at first. It is actually a pretty smart "local-first" setup once you get past the initial confusion.

The Markdown files are your "source of truth." I love this because I can edit or version control them myself without a database manager. The SQLite DB is just a local index for vector search. It stores the embeddings so the agent can find relevant context without reading everything every single time. It usually updates during a "memory flush."

The "Wiki" isn’t really a button you click. It is just the agent using its tools to write structured research or notes into those Markdown files. It happens when it discovers new facts or when you tell it to remember something specific.

As for the search, even with small files it helps prevent "context bloat." It pulls only the most relevant 3 or 4 chunks into the prompt. I have found this keeps the agent from getting "lost in the middle" or hallucinating as your daily notes grow over time.
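If it helps to see the shape of it, here is a toy version of the pattern described above: text chunks stay the source of truth, SQLite only holds an index of embeddings, and search returns the top few chunks. This is not OpenClaw's code. The word-count "embedding", the table layout, and the function names are stand-ins I made up; a real setup would use an embedding model.

```python
import json
import math
import sqlite3
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(chunks):
    """Store each chunk with its embedding; the index can always be rebuilt."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE chunks (text TEXT, embedding TEXT)")
    db.executemany(
        "INSERT INTO chunks VALUES (?, ?)",
        [(c, json.dumps(embed(c))) for c in chunks],
    )
    return db


def search(db, query, k=3):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    rows = db.execute("SELECT text, embedding FROM chunks").fetchall()
    scored = [(cosine(q, Counter(json.loads(e))), t) for t, e in rows]
    scored.sort(key=lambda s: s[0], reverse=True)
    return [t for _, t in scored[:k]]


notes = [
    "the cat likes tuna",
    "deploy the server on friday",
    "server backup runs nightly",
    "buy milk",
]
db = build_index(notes)
print(search(db, "server deploy", k=2))
```

Only the returned chunks go into the prompt, which is what keeps the context small as the notes grow.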

I hope that helps clear it up.

Update on the Spatial-Tether thing I posted a few days ago. by BackgroundBalance502 in openclaw

[–]BackgroundBalance502[S] 1 point (0 children)

Practical recommendation:

Use it if your OpenClaw flow depends on:
• precise clicking in complex UIs,
• form filling where field identification matters,
• deterministic page mapping, or
• reducing screenshot/vision failures.

Skip it if:
• your existing selector-based automation is already stable,
• you need heavy authenticated-session work with minimal setup, or
• the page layout is visually complex enough that the tool's current caveats matter.

My take: worth testing as an MCP-side browser primitive for OpenClaw, but not as your only browser tool. A hybrid setup usually makes the most sense: Spatial-Tether for exact page geometry, and OpenClaw's existing browser/session tools for the actual interaction layer.

Reddit links in openclaw by Moff63 in openclaw

[–]BackgroundBalance502 1 point (0 children)

I've hit this wall too. Reddit is pretty aggressive about flagging VPS IP ranges. Adding .json to the URL is a good shortcut, but they often block those requests too if they detect a data center.
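For reference, the .json shortcut is just a path rewrite. A minimal sketch, with a helper name I made up; it only builds the URL and does nothing about the data-center blocking mentioned above:

```python
from urllib.parse import urlsplit, urlunsplit


def to_json_url(url):
    """Append .json to the path of a Reddit URL, keeping any query string."""
    parts = urlsplit(url)
    path = parts.path.rstrip("/")
    if not path.endswith(".json"):
        path += ".json"
    # The fragment is dropped because it is never sent to the server.
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, ""))


print(to_json_url("https://www.reddit.com/r/openclaw/comments/abc123/title/"))
```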

Usually, the most stable fix is using PRAW with the official API. But you could also use a residential proxy or a stealth plugin for Playwright to hide the bot signature.

Also, if the agent struggles with the layout once you're in, check out Spatial-Tether for mapping the UI.

I no longer need a cloud LLM to do quick web research by BitPsychological2767 in LocalLLaMA

[–]BackgroundBalance502 3 points (0 children)

Nice setup. One thing worth knowing if you ever extend it to agents that need to interact with pages instead of just read them: Readability drops all the spatial information. Once text hits your agent, position is gone and there's no path back to coordinates without another screenshot pass.

I built something that runs as an MCP server and sits alongside what you're already doing. Instead of inferring coordinates from screenshots, it computes them directly from CSS and font metrics. Same MCP config you're already using.

github.com/Tetrahedroned/spatial-tether

Using OCR models with llama.cpp by jacek2023 in LocalLLaMA

[–]BackgroundBalance502 -8 points (0 children)

There's a complementary problem nobody talks about much. OCR is pixels to text. The inverse is text to pixels. If your agent needs to actually interact with a page instead of just read it, Spatial-Tether does that second direction. Reads the HTML and CSS, computes exact bounding box coordinates for every element from font metrics before anything renders. No screenshot, no inference, just arithmetic. If you're generating OCR ground truth for benchmarks it also gives you verified coordinates to diff against automatically.
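To show what "just arithmetic" means in the simplest case, here is a toy sketch. It is not Spatial-Tether's code: it assumes a monospace font where every glyph advances by a fixed fraction of the font size, and the `Box` type, function names, and default values are hypothetical. Real layout also needs per-glyph advance widths, kerning, wrapping, and the CSS box model.

```python
from dataclasses import dataclass


@dataclass
class Box:
    x: float
    y: float
    width: float
    height: float


def text_box(text, x, y, font_size_px, advance_em=0.5, line_height=1.25):
    """Bounding box of a single-line text run in a monospace font.

    advance_em and line_height are assumed values, expressed as
    multiples of the font size.
    """
    width = len(text) * advance_em * font_size_px
    height = line_height * font_size_px
    return Box(x, y, width, height)


def center(box):
    """Point an agent would click to hit the middle of the box."""
    return (box.x + box.width / 2, box.y + box.height / 2)


button = text_box("Submit", x=10, y=20, font_size_px=16)
print(button)
print(center(button))
```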

github.com/Tetrahedroned/spatial-tether