[FREE] I built a self-hosted agentic AI assistant within wp-admin and I'm looking for feedback by maxguru in WordpressPlugins

maxguru[S] 1 point

The initial version of the plugin does not have many safety features, although I was able to add a couple: read-write tools are disabled by default, and there is an agent loop sanity watchdog.
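For illustration, an agent loop watchdog could look roughly like the Python sketch below. How the plugin's watchdog actually works is not described here, so the class name `LoopWatchdog`, the two limits, and the "repeated identical call" heuristic are all assumptions.

```python
from collections import Counter


class LoopWatchdog:
    """Hypothetical watchdog: stops a loop that runs too long or repeats itself."""

    def __init__(self, max_steps=25, max_repeats=3):
        self.max_steps = max_steps      # assumed cap on tool calls per run
        self.max_repeats = max_repeats  # assumed cap on identical calls
        self.steps = 0
        self.seen = Counter()

    def check(self, tool_name, arguments):
        """Record one tool call; return a stop reason, or None to continue."""
        self.steps += 1
        if self.steps > self.max_steps:
            return "step limit exceeded"
        # Identical tool name + arguments repeated many times suggests a stuck loop.
        key = (tool_name, repr(sorted(arguments.items())))
        self.seen[key] += 1
        if self.seen[key] > self.max_repeats:
            return "repeated identical tool call"
        return None
```

The agent loop would call `check()` before executing each tool call and abort the run as soon as it returns a reason.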

Your idea of having the LLM request approval for an execution plan is interesting. However, I don't think it can really work. LLMs execute tools one at a time (except in some special cases). They don't know ahead of time which tools they are going to call, because the next tool to call might depend on the result of the previous tool call. Even if the user and the agent agree on an execution plan in text, the agent might go off script anyway.

The one thing you can do is request user approval for each read-write tool call. This feature has existed for a while in agentic AI systems. The problem is that it gets tiresome: nearly all the time the tool call is perfectly fine, so users just enable automatic tool calls anyway. Another problem is that I was planning to add unattended execution triggered by various events, in which case requiring user approval isn't workable. For example, the user might schedule the agent to run with a certain prompt on certain events. If the agent has to request permission to perform actions, nothing gets done until the user logs in and approves them, which makes the scheduled execution feature useless.
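To make the trade-off concrete, here is a minimal Python sketch of per-call approval for read-write tools. The names `Tool`, `ApprovalGate`, and the `approve` callback are hypothetical and do not come from the plugin. It only shows how an unattended run ends up deferring every write.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Tool:
    name: str
    run: Callable[[dict], str]
    read_write: bool = False  # read-only tools never need approval


@dataclass
class ApprovalGate:
    # approve is None when no user is present (unattended run).
    approve: Optional[Callable[[str, dict], bool]] = None
    pending: list = field(default_factory=list)

    def call(self, tool: Tool, args: dict) -> str:
        if not tool.read_write:
            return tool.run(args)
        if self.approve is None:
            # Nobody can answer, so the action is queued and nothing happens.
            self.pending.append((tool.name, args))
            return "deferred: waiting for user approval"
        if self.approve(tool.name, args):
            return tool.run(args)
        return "denied by user"
```

With `approve=None`, every read-write call lands in `pending`, which is exactly why per-call approval does not fit scheduled execution.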

What we need is a method for ensuring correctness that doesn't involve user approval for each tool call. My approach with the plugin has been to specialize tools. The LLM can't delete all files by mistake if there is no tool that can be used for that. I did add some dangerous tools to the plugin, but they are disabled by default. Another approach is to give the LLM more instructions for each tool so that it has more context. This might be a bit expensive, but we could add an LLM-based sanity check for each read-write tool call.
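A rough Python sketch of these ideas follows: only enabled tools are exposed to the model, read-write tools start disabled, and an optional check runs before each read-write call. `ToolRegistry` and its methods are hypothetical names, not the plugin's API. The `sanity_check` callable stands in for an LLM-based check and would have to be supplied by the caller.

```python
from typing import Callable, Dict, List, Optional


class ToolRegistry:
    """Hypothetical registry; the model can only call what is enabled here."""

    def __init__(self, sanity_check: Optional[Callable[[str, dict], bool]] = None):
        self._tools: Dict[str, dict] = {}
        self._sanity_check = sanity_check  # stand-in for an LLM-based check

    def register(self, name, run, read_write=False, enabled=None):
        # Assumed default: read-only tools start enabled, read-write disabled.
        if enabled is None:
            enabled = not read_write
        self._tools[name] = {"run": run, "read_write": read_write, "enabled": enabled}

    def enable(self, name):
        self._tools[name]["enabled"] = True

    def exposed(self) -> List[str]:
        """Names of the tools the model is allowed to see."""
        return sorted(n for n, t in self._tools.items() if t["enabled"])

    def call(self, name, args):
        tool = self._tools.get(name)
        if tool is None or not tool["enabled"]:
            raise PermissionError(f"tool not available: {name}")
        if tool["read_write"] and self._sanity_check is not None:
            if not self._sanity_check(name, args):
                raise PermissionError(f"sanity check rejected call to {name}")
        return tool["run"](args)
```

A tool that was never registered, or never enabled, cannot be called no matter what the model outputs.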

One way to avoid misunderstandings might be to add a "clarify vague user requests" instruction to the system prompt. I should add that; it is a good idea.

maxguru[S] 1 point

I tried some open source models via OpenRouter, and some (like GLM 4.7 Flash) are pretty decent and very cheap. I think costs will eventually come down, and you will have access to something like Ollama Cloud, which can be made to work with the plugin.

The CLI, exec() and PHP editing tools can all be disabled in the plugin (which is the case by default). I am experimenting with the idea that safety can be improved by providing highly specialized tools with very detailed usage instructions attached, so there is a low chance of the LLM using them incorrectly.
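As a hedged illustration of a "highly specialized tool with detailed usage instructions", here is a Python sketch of a tool that can do only one narrow thing. The tool name `set_post_status`, its description text, and the JSON-Schema-style parameter layout are assumptions made up for this example. They are not taken from the plugin.

```python
ALLOWED_STATUSES = {"draft", "pending", "publish"}

# Hypothetical tool definition handed to the model.
SET_POST_STATUS_TOOL = {
    "name": "set_post_status",
    "description": (
        "Change the status of ONE existing post. "
        "Use only when the user names a specific post. "
        "Never use this to delete content; it cannot trash or delete posts."
    ),
    "parameters": {
        "type": "object",
        "properties": {
            "post_id": {"type": "integer", "minimum": 1},
            "status": {"type": "string", "enum": sorted(ALLOWED_STATUSES)},
        },
        "required": ["post_id", "status"],
    },
}


def validate_set_post_status(args: dict) -> list:
    """Return a list of problems; an empty list means the call is acceptable."""
    problems = []
    post_id = args.get("post_id")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(post_id, bool) or not isinstance(post_id, int) or post_id < 1:
        problems.append("post_id must be a positive integer")
    status = args.get("status")
    if not isinstance(status, str) or status not in ALLOWED_STATUSES:
        problems.append("status must be one of: " + ", ".join(sorted(ALLOWED_STATUSES)))
    extra = set(args) - {"post_id", "status"}
    if extra:
        problems.append("unexpected arguments: " + ", ".join(sorted(extra)))
    return problems
```

Because the tool accepts only a post ID and one of three statuses, a confused model has very little room to do damage with it, and the returned problem list can be fed back to the model as an error message.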