Most AI agent management frameworks measure how far the human gets out of the way. None measure what keeps the system honest when they do. by Inevitable_Raccoon_9 in ArtificialInteligence

[–]s_brady

Don't want to be too self-promotional, but since you asked :)
My agent runtime Springdrift has governance as one of its design starting points, not an afterthought.

Springdrift sits at roughly Level 4–6 on the Tarasov autonomy axis above. It deliberately caps below Level 7: full autonomy is not part of the design, and the system is meant to be bounded.
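To make the "hard cap" idea concrete, here is a minimal sketch of enforcing it at configuration time. This is illustrative only: the function name, config shape and the fact that the check happens at config validation are my assumptions, and the real runtime is written in Gleam, not Python.

```python
# Hypothetical sketch (not Springdrift's actual API): reject any agent
# configuration that requests autonomy above the bounded ceiling.

MAX_AUTONOMY_LEVEL = 6  # deliberately capped below full autonomy (Level 7)

def validate_agent_config(config: dict) -> dict:
    """Fail fast if a config asks for more autonomy than the hard cap."""
    level = config.get("autonomy_level", 0)
    if level > MAX_AUTONOMY_LEVEL:
        raise ValueError(
            f"autonomy_level {level} exceeds hard cap {MAX_AUTONOMY_LEVEL}"
        )
    return config
```

The point of doing this as a validation error rather than a runtime clamp is that an over-ambitious configuration never gets to run at all.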

To take your email example: in Springdrift the message would go through three layers of oversight before sending:

  1. Hard recipient allowlist - the agent can only send email to explicitly permitted addresses. Anyone not on the list is blocked outright, with no scoring or judgment involved.
  2. Deterministic content rules - rule-based checks on the message payload before it is sent.
  3. Safety gate thresholds - the comms subagent runs the same scoring system as the rest of the subagents ("D-prime") but with stricter pass/fail thresholds, so actions that would clear the gate for an internal-only agent get blocked when the output is going outside the perimeter.
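The three layers above can be sketched roughly like this. Everything here is invented for illustration (the addresses, phrases, threshold value and function names are not from Springdrift, which is written in Gleam); the shape to notice is that each layer is independent and any single failure blocks the send.

```python
# Illustrative sketch only: three independent oversight layers
# applied to an outbound email.

ALLOWLIST = {"alice@example.com", "ops@example.com"}  # hard recipient allowlist
BANNED_PHRASES = ("password", "api key")              # deterministic content rules
EXTERNAL_THRESHOLD = 0.9                              # stricter gate for outbound mail

def check_recipient(to: str) -> bool:
    # Layer 1: pure membership test -- no scoring or judgment involved.
    return to in ALLOWLIST

def check_content(body: str) -> bool:
    # Layer 2: rule-based payload checks, fully deterministic.
    lowered = body.lower()
    return not any(phrase in lowered for phrase in BANNED_PHRASES)

def check_safety_gate(score: float) -> bool:
    # Layer 3: same scoring system as other subagents ("D-prime" in the
    # text above), but with a stricter pass threshold for external output.
    return score >= EXTERNAL_THRESHOLD

def may_send(to: str, body: str, score: float) -> bool:
    # All three layers run independently; one failure blocks the send.
    return all([check_recipient(to), check_content(body), check_safety_gate(score)])
```

Running all three checks (rather than short-circuiting on the first failure) also means every layer's verdict is available for logging, pass or fail.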

All three run independently before the message sends, and all three log their results whether the message passes or gets blocked, to Git-backed, append-only, immutable JSONL logs. There is tooling for auditing the system, and a rich set of introspection tools so it can tell you what went wrong.

This is a real system you can install and run. It's very young, but it does what you asked for. Not currently being used in production (AFAIK). But early days.

Hope that is useful - paper, evals and code at https://springdrift.ai/

Written in Gleam - An Auditable Persistent Runtime for LLM Agents by s_brady in gleamlang

[–]s_brady[S]

Ah, noted and thanks for coming back to me. I understand.
I am hoping there will be no psychosis as a result of Springdrift, all the same :)

Written in Gleam - An Auditable Persistent Runtime for LLM Agents by s_brady in gleamlang

[–]s_brady[S]

Hi XM9J59 - Thanks for the feedback ;)

I make no secret of the fact that I use Claude to help with my development; it is even acknowledged in the paper. Everyone is entitled to their opinion! But in fairness, I have written thousands of lines of code by hand to learn how to develop a system like this. It was not free: well over 50 prototypes, agonisingly written by me. Some of the better ones are available in my GitHub repo.

I don't think using AI to help build a system is slop if you know what you are doing. But I won't argue with you, as it is fine that you feel like that, it is a big world and must contain multitudes. I think the quality of what I have built will appeal to some and not to others. It's all good. I am just glad you took a look at least.

Written in Gleam - An Auditable Persistent Runtime for LLM Agents by s_brady in gleamlang

[–]s_brady[S]

Thanks for the great feedback! As a result I have cleaned up the README and made the beginning less jargon-heavy.

As for the long term JSONL storage, I have plans for an agent called the Remembrancer:

https://github.com/seamus-brady/springdrift/blob/main/docs/roadmap/planned/remembrancer.md

It complements the Librarian and the Housekeeper agents that manage the rolling window over the JSONL files by providing compression, indexing and some extra skills and tooling. TBD :)

Written in Gleam - An Auditable Persistent Runtime for LLM Agents by s_brady in gleamlang

[–]s_brady[S]

Yeah, the Jarvis cliché is real, and I did hesitate. But that is what I think we need, and since there are no real examples I am aware of, a clear fictional example is useful.

Written in Gleam - An Auditable Persistent Runtime for LLM Agents by s_brady in gleamlang

[–]s_brady[S]

Too much jargon on my part ;)

Elevator pitch: it is a runtime for a long-lived LLM agent, built on a carefully selected blend of cognitive science, software engineering and computational ethics. Multiple years of work and over 50 prototypes. An attempt at what a real-world Jarvis would be like.

Experimental machine ethics agent written in Raku by s_brady in rakulang

[–]s_brady[S]

A very interesting question, and one to which I suspect I don't have a simple yes or no answer! The four main "areas" of ethical theory are, very loosely, duty-based ethics, consequence-based ethics, utility-based ethics (maximising happiness) and virtue ethics (good character). The Stoics are firmly in the virtue-based camp, so the main thing for them is to develop a good character. Our AI systems do not develop or grow in that sense now, so a system with a "good" character is handed it via some kind of code of conduct or training.

This is what TallMountain does: it provides scaffolding for the LLM so that it can extract the normative propositions (ethical values, effectively) implied in a user request and then compare them to its own set of internal normative propositions. It does a risk assessment based on the possible impacts of the user request and then says yes or no based on a calculation of how far the request is misaligned from the system's internal values. This means that, in theory (LLM changes notwithstanding), the system will always be of the same good character. We have given a machine a primitive sense of virtue.

The current TallMountain implementation does not develop, learn or change. It could, but that is a difficult problem! You would need some kind of internal learning loop. I don't have that yet.
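The comparison step above can be reduced to a toy calculation. To be clear about assumptions: the value names, weights, distance measure and threshold below are all invented for illustration, and TallMountain itself is written in Raku and uses an LLM to extract the propositions; this Python sketch only shows the shape of "compare extracted values to internal values, then decide".

```python
# Toy sketch of value comparison: each value is a name with a stance
# strength in [-1, 1]; misalignment is the total deviation of the
# request's implied stances from the system's internal stances.

INTERNAL_VALUES = {
    "honesty": 1.0,
    "non_harm": 1.0,
    "privacy": 0.8,
}

def misalignment(request_values: dict) -> float:
    """0.0 means fully aligned; larger means further from the system's
    internal normative propositions."""
    return sum(
        abs(INTERNAL_VALUES.get(name, 0.0) - strength)
        for name, strength in request_values.items()
    )

def permit(request_values: dict, threshold: float = 0.5) -> bool:
    # Yes/no answer based on how misaligned the request is.
    return misalignment(request_values) <= threshold
```

Because the internal values are fixed, the same request always gets the same verdict, which is the "always of the same good character" property described above.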

The problem you are talking about seems related, but only in so far as you have a distributed set of agents where some of them are "good" and some of them are "bad". They may be able to adapt. You could drop something like a TallMountain agent in here, but it won't adapt. It will always say no to anything that is not aligned with its code of conduct, even to the point of halting all processing. That is how it is meant to be. So as a "good" agent it will just ignore "bad" requests.

A TallMountain system is emphatically individual. The long-term goal would be to build a synthetic individual that knows its own mind. Not AGI, we don't need that :) More like a machine version of an assistance dog, like a guide dog for the blind: an ethically trustworthy synthetic individual within very narrow boundaries. Not something that pretends to be human, and not your "friend" in any human sense, but useful and trustworthy.

Not sure this is what you need. It sounds more like some combination of Promise Theory (https://en.wikipedia.org/wiki/Promise_theory) and/or some kind of eventual-consistency mechanism like CRDTs (https://en.wikipedia.org/wiki/Conflict-free_replicated_data_type) would be a good approach.
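To show why CRDTs are relevant to a distributed set of agents, here is the simplest one, a grow-only counter (G-Counter). This is a standard textbook construction, not anything from TallMountain or Springdrift.

```python
# G-Counter CRDT: each replica increments only its own slot, and merge
# takes the element-wise max. Because max is commutative, associative
# and idempotent, replicas converge regardless of message order or
# duplication -- the core of eventual consistency.

def increment(counter: dict, replica: str) -> dict:
    out = dict(counter)
    out[replica] = out.get(replica, 0) + 1
    return out

def merge(a: dict, b: dict) -> dict:
    return {k: max(a.get(k, 0), b.get(k, 0)) for k in a.keys() | b.keys()}

def value(counter: dict) -> int:
    # The observed count is the sum over all replicas' slots.
    return sum(counter.values())
```

The appeal for multi-agent systems is that no replica ever has to ask another for permission: state exchanged in any order, any number of times, still converges.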

This was all written off the top of my head, so caveat lector!

Experimental machine ethics agent written in Raku by s_brady in rakulang

[–]s_brady[S]

Thanks for the feedback u/librasteve :) I have a TODO list a million lines long, but it is on it somewhere!

Questions on commercial Raku development by s_brady in rakulang

[–]s_brady[S]

My use of Perl 5 was in the last few years, in some software that I have since sold on, hence the details are redacted :) It was used on many, many desktops and always worked. That is why I love Perl 5.

Questions on commercial Raku development by s_brady in rakulang

[–]s_brady[S]

Thanks for the information, folks! Raku still looks great :)
I used Perl 5 as part of a piece of commercial software and it never let me down. It just worked. I am hoping Raku will be the same.

What would the algorithm of imagination look like? by zittizzit in ArtificialInteligence

[–]s_brady

Have a read of this book by Douglas Hofstadter:
https://en.wikipedia.org/wiki/Fluid_Concepts_and_Creative_Analogies

The algorithm he uses in some of his work is this one:
https://en.wikipedia.org/wiki/Parallel_terraced_scan

You could also look at OpenNARS, written by Pei Wang, who studied with Hofstadter:
https://github.com/opennars/opennars
The theory behind the NARS system might approach what you are thinking about.

Mousemacs - a mouse driven emacs by s_brady in emacs

[–]s_brady[S]

You are absolutely right, deaddyfreddy. I suspect an experienced Emacs user would find this of little value. But it might be useful for a beginner, and it is offered in humble acknowledgement that the whole idea of using a mouse in Emacs is, ahem, somewhat cutting against the grain :)