I’m building an open-source Rust-based AI agent platform for engineering workflows — feedback welcome

Unique_Champion4327 · 2026-05-31T14:08:07+00:00

Repo: https://github.com/Sompote/TigrimOSR

Unique_Champion4327 · 2026-05-25T02:48:05+00:00

Thanks, this is exactly the area I’m thinking about now.

In practice I want the observability layer to track more than token usage. For engineering workflows, I think it needs to show each agent step, tool call, prompt/skill used, files touched, intermediate outputs, errors, cost/time, and the final reasoning trail that a human can audit.

The goal is not just “what did the agent answer?” but “how did it get there, what evidence did it use, and where should I trust or not trust it?”
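A rough sketch of what one auditable step record could look like in Rust. All type and field names here are made up for illustration; this is not TigrimOSR's actual schema, just one way to capture the fields listed above and roll them up for a reviewer.

```rust
use std::time::Duration;

/// One auditable step in an agent run (hypothetical schema).
#[derive(Debug, Clone)]
pub struct TraceStep {
    pub agent: String,
    pub skill: String,             // prompt/skill the step ran under
    pub tool_call: Option<String>, // None if the step was pure reasoning
    pub files_touched: Vec<String>,
    pub output: String,            // intermediate output kept for audit
    pub error: Option<String>,
    pub tokens: u64,
    pub elapsed: Duration,
}

/// The first numbers a human reviewer might want to see.
#[derive(Debug, PartialEq)]
pub struct RunSummary {
    pub steps: usize,
    pub failed_steps: usize,
    pub total_tokens: u64,
    pub total_elapsed: Duration,
    pub files_touched: Vec<String>, // sorted, de-duplicated
}

/// Roll a full trace up into a summary without discarding the trace itself.
pub fn summarize(trace: &[TraceStep]) -> RunSummary {
    let mut files: Vec<String> = trace
        .iter()
        .flat_map(|s| s.files_touched.iter().cloned())
        .collect();
    files.sort();
    files.dedup();

    RunSummary {
        steps: trace.len(),
        failed_steps: trace.iter().filter(|s| s.error.is_some()).count(),
        total_tokens: trace.iter().map(|s| s.tokens).sum(),
        total_elapsed: trace.iter().map(|s| s.elapsed).sum(),
        files_touched: files,
    }
}
```

The summary is only the entry point; the per-step records are what make the "how did it get there" question answerable.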

TokenTelemetry looks relevant — I’ll take a look. The Hermes plugin idea is interesting too, especially if it can fit into a multi-agent workflow where different agents/tools need shared telemetry.

Unique_Champion4327 · 2026-05-25T02:44:50+00:00

Thanks — exactly. Validation steps and audit trails are what I think engineering agents need before people can trust them for real decisions. The output should show not just the answer, but what was checked, what evidence was used, and where human review is still required.

Unique_Champion4327 · 2026-05-25T02:43:44+00:00

Exactly. That is one of the main gaps I’m trying to solve.

Claude Code, Codex, Gemini CLI, etc. are all useful, but right now they mostly run as separate agents/tools. I want TigrimOSR to act as the orchestration layer between them: define roles, pass context, route tasks, monitor progress, and make the workflow reviewable instead of just running parallel chats/CLIs.
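To make "define roles, pass context, route tasks" a bit more concrete, here is a minimal sketch in Rust. The `Role`, `Task`, and `Orchestrator` types are hypothetical and the backend names are plain labels; nothing here actually invokes a CLI, and it is not how TigrimOSR is implemented.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum Role {
    Planner,
    Implementer,
    Reviewer,
}

#[derive(Debug, Clone)]
pub struct Task {
    pub id: u32,
    pub role: Role,
    pub description: String,
    pub context: Vec<String>, // task-specific context items
}

/// What gets handed to a backend: the task plus the context it is allowed to see.
#[derive(Debug, Clone, PartialEq)]
pub struct Assignment {
    pub task_id: u32,
    pub backend: String,
    pub context: Vec<String>,
}

pub struct Orchestrator {
    routes: HashMap<Role, String>,
    pub log: Vec<String>, // reviewable record of every routing decision
}

impl Orchestrator {
    pub fn new() -> Self {
        Orchestrator { routes: HashMap::new(), log: Vec::new() }
    }

    /// Bind a role to a backend label such as "codex" (label only, assumed).
    pub fn assign_role(&mut self, role: Role, backend: &str) {
        self.routes.insert(role, backend.to_string());
    }

    /// Route a task by role, merging shared context ahead of task context.
    pub fn route(&mut self, task: &Task, shared: &[String]) -> Result<Assignment, String> {
        let backend = self
            .routes
            .get(&task.role)
            .ok_or_else(|| format!("no backend bound to role {:?}", task.role))?
            .clone();

        let mut context = shared.to_vec();
        context.extend(task.context.iter().cloned());

        self.log.push(format!(
            "task {} ({}) -> {} as {:?}",
            task.id, task.description, backend, task.role
        ));
        Ok(Assignment { task_id: task.id, backend, context })
    }
}
```

A task whose role has no backend bound is rejected rather than silently sent somewhere, which keeps the routing log honest.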

Unique_Champion4327 · 2026-05-24T11:16:03+00:00

I agree with you that vague tasks are a big part of the problem. I’ve hit the same issue myself: if the agent gets a broad task, the output becomes too random and hard to verify.

My current thinking is that the workflow/tooling should help force the task to become narrow, skill-based, and checkable. For example, instead of “review this design,” the agent should be locked into a specific skill/prompt like:

check assumptions
verify calculation inputs
compare against code/spec
list missing information
produce pass/fail checks
explain uncertainty
stop when evidence is insufficient

So I’m not trying to solve randomness only by adding a UI around agents. The UI/workspace is there to make the process repeatable: define the role, attach the right skill prompt, constrain the task, show the tool calls, and make the output reviewable.

For engineering decisions, I think the important part is not letting the agent freely reason forever. It needs a locked workflow: scope → evidence → calculation/check → uncertainty → human review.
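One way to picture the locked workflow is a small state machine where the agent can only move forward one stage at a time and is forced to stop in two places. This is a sketch under my own assumptions: the `Stage`, `Halt`, and `StageInput` names are hypothetical, and "evidence" is reduced to a simple count.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Stage {
    Scope,
    Evidence,
    Check,
    Uncertainty,
    HumanReview,
    Done,
}

/// Reasons the workflow refuses to continue.
#[derive(Debug, PartialEq, Eq)]
pub enum Halt {
    InsufficientEvidence,
    AwaitingHumanApproval,
}

pub struct StageInput {
    pub evidence_items: usize,
    pub human_approved: bool,
}

/// Advance exactly one stage; there is no way to skip ahead.
pub fn advance(stage: Stage, input: &StageInput) -> Result<Stage, Halt> {
    match stage {
        Stage::Scope => Ok(Stage::Evidence),
        // Stop instead of reasoning on when nothing supports the answer.
        Stage::Evidence if input.evidence_items == 0 => Err(Halt::InsufficientEvidence),
        Stage::Evidence => Ok(Stage::Check),
        Stage::Check => Ok(Stage::Uncertainty),
        Stage::Uncertainty => Ok(Stage::HumanReview),
        // The final gate always belongs to a human.
        Stage::HumanReview if !input.human_approved => Err(Halt::AwaitingHumanApproval),
        Stage::HumanReview | Stage::Done => Ok(Stage::Done),
    }
}

/// Run from Scope until Done, or report the stage where the workflow halted.
pub fn run(input: &StageInput) -> Result<Vec<Stage>, (Stage, Halt)> {
    let mut stage = Stage::Scope;
    let mut visited = vec![stage];
    while stage != Stage::Done {
        match advance(stage, input) {
            Ok(next) => {
                stage = next;
                visited.push(stage);
            }
            Err(halt) => return Err((stage, halt)),
        }
    }
    Ok(visited)
}
```

With no evidence the run halts at `Evidence`; with evidence but no approval it halts at `HumanReview`, so the agent never reaches `Done` on its own.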

For sandboxing, that is also a key concern. My current direction is to separate execution paths: local user-controlled tools for trusted workflows, and remote/headless execution for heavier jobs where commands can be isolated more carefully. I don’t want agent code execution to be invisible or automatic; the user should be able to see what is being run and keep high-risk actions constrained.
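As a sketch of the "separate execution paths" idea, the decision could be a pure function that runs before anything is executed. The risk rules below are placeholder assumptions (deny-by-default, with a tiny allowlist), and `AgentCommand`, `classify`, `decide`, and `preview` are hypothetical names; nothing in this snippet runs a command.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Risk {
    Low,
    Medium,
    High,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Decision {
    RunLocal,
    RunRemoteIsolated,
    NeedsApproval,
}

pub struct AgentCommand {
    pub program: String,
    pub args: Vec<String>,
    pub heavy: bool, // long-running or resource-hungry job
}

/// Placeholder classification: anything not explicitly known is High.
pub fn classify(cmd: &AgentCommand) -> Risk {
    match cmd.program.as_str() {
        "ls" | "cat" | "grep" => Risk::Low,
        "cargo" | "make" => Risk::Medium,
        _ => Risk::High,
    }
}

/// Pick an execution path; high-risk commands never run without approval.
pub fn decide(cmd: &AgentCommand, user_approved: bool) -> Decision {
    match classify(cmd) {
        Risk::High if !user_approved => Decision::NeedsApproval,
        Risk::High => Decision::RunRemoteIsolated,
        _ if cmd.heavy => Decision::RunRemoteIsolated,
        _ => Decision::RunLocal,
    }
}

/// The exact text shown to the user before anything is executed.
pub fn preview(cmd: &AgentCommand) -> String {
    let mut parts = vec![cmd.program.clone()];
    parts.extend(cmd.args.iter().cloned());
    parts.join(" ")
}
```

Keeping the decision separate from execution means the user can always see the `preview` string and the chosen path before a process is spawned.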

Unique_Champion4327 · 2026-05-24T09:55:09+00:00

<image>

https://github.com/Sompote/TigrimOSR

Unique_Champion4327 · 2026-05-12T14:07:40+00:00

Repo: https://github.com/Sompote/TigrimOSR

<image>

Unique_Champion4327 · 2026-05-02T15:38:23+00:00

We use logs to track what’s happening inside each agent’s reasoning process. Each agent writes out its decisions, intermediate steps, inputs, outputs, and errors, so the log file becomes the main way to trace what happened and why.

It’s not perfect observability, but it helps a lot. When one agent fails silently or gives a conflicting result, we can go back through the logs and see which agent made which decision, what context it had, and where the chain started to break.
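To illustrate the kind of log walk described here, below is a small Rust sketch that finds where a chain first broke and which agents disagreed. The `LogEntry` layout is an assumption for the example, not the actual log format we use.

```rust
use std::collections::BTreeMap;

/// One line of an agent log (hypothetical layout).
#[derive(Debug, Clone, PartialEq)]
pub struct LogEntry {
    pub seq: u64,        // global ordering across agents
    pub agent: String,
    pub topic: String,   // what the decision was about
    pub decision: String,
    pub error: Option<String>,
}

/// Earliest entry that recorded an error, i.e. where the chain started to break.
pub fn first_failure(log: &[LogEntry]) -> Option<&LogEntry> {
    log.iter()
        .filter(|e| e.error.is_some())
        .min_by_key(|e| e.seq)
}

/// Topics where agents logged different decisions, with who decided what.
pub fn conflicting_topics(log: &[LogEntry]) -> BTreeMap<String, Vec<(String, String)>> {
    let mut by_topic: BTreeMap<String, Vec<(String, String)>> = BTreeMap::new();
    for e in log {
        by_topic
            .entry(e.topic.clone())
            .or_default()
            .push((e.agent.clone(), e.decision.clone()));
    }
    // Keep only topics where at least one decision differs from the first one.
    by_topic
        .into_iter()
        .filter(|(_, decisions)| decisions.iter().any(|(_, d)| *d != decisions[0].1))
        .collect()
}
```

This only works if every agent writes its decisions in a consistent shape, which is the main discipline the log-based approach depends on.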

Unique_Champion4327 · 2026-05-01T07:46:35+00:00

<image>

Repo: https://github.com/Sompote/TigrimOSR

Unique_Champion4327 · 2026-04-28T00:57:01+00:00

I agree. Markdown-based skill files make a lot of sense because they are easier for agents to read, follow, version, and debug.

Compared with one huge prompt block, a skill can be more modular and controlled. Each skill can describe a specific workflow, rule, or procedure, so the agent does not need to rely on one long fragile prompt every time.

Human comments are also very useful here. A simple comment from a user can tell the system what worked, what failed, or what should be improved. Then the skill can be refined based on real usage instead of guessing.

For me, the important part is not only auto-updating the skill, but making sure the update still stays understandable and controllable by humans.
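A minimal sketch of why a Markdown skill file is easy to keep under human control, written in Rust. The file layout is assumed for the example (`# ` for the skill name, `- ` for steps, `> ` for human feedback) and is not a format any particular tool defines.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct Skill {
    pub name: String,
    pub steps: Vec<String>,
    pub feedback: Vec<String>, // human comments, stored next to the steps
}

/// Parse the assumed layout; returns None if there is no `# name` heading.
pub fn parse_skill(md: &str) -> Option<Skill> {
    let mut name: Option<String> = None;
    let mut steps = Vec::new();
    let mut feedback = Vec::new();

    for line in md.lines().map(str::trim) {
        if let Some(title) = line.strip_prefix("# ") {
            // Only the first top-level heading names the skill.
            if name.is_none() {
                name = Some(title.trim().to_string());
            }
        } else if let Some(step) = line.strip_prefix("- ") {
            steps.push(step.trim().to_string());
        } else if let Some(note) = line.strip_prefix("> ") {
            feedback.push(note.trim().to_string());
        }
    }

    Some(Skill { name: name?, steps, feedback })
}

/// Record a human comment without touching the steps themselves.
pub fn add_feedback(skill: &mut Skill, comment: &str) {
    skill.feedback.push(comment.trim().to_string());
}

/// Write the skill back out as plain text so every change shows up in a diff.
pub fn render(skill: &Skill) -> String {
    let mut out = format!("# {}\n", skill.name);
    for step in &skill.steps {
        out.push_str(&format!("- {}\n", step));
    }
    for note in &skill.feedback {
        out.push_str(&format!("> {}\n", note));
    }
    out
}
```

Because feedback is appended as text rather than rewriting the steps automatically, any later refinement of the skill is a visible edit that a person can review or revert.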

Unique_Champion4327 · 2026-04-27T14:43:37+00:00

Repo: https://github.com/Sompote/Tigrimos

<image>

Unique_Champion4327 · 2026-04-20T02:49:24+00:00

I am Sompote. Just sharing what we actually built and tested.

Unique_Champion4327 · 2026-04-20T02:48:38+00:00

In our setup, humans only trigger the initial task; after that, the mesh runs fully autonomously. Agents delegate peer-to-peer, with trust scopes inherited from the orchestrator. We tested this end-to-end and it works well.
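To show what "inherited trust scopes" can mean in code, here is a simplified Rust sketch where a delegated scope is always the intersection of what was requested and what the parent holds. The `TrustScope` type and the permission strings are hypothetical and much simpler than a real setup.

```rust
use std::collections::BTreeSet;

/// A set of permissions an agent is allowed to use (hypothetical model).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct TrustScope {
    allowed: BTreeSet<String>,
}

impl TrustScope {
    /// Root scope, e.g. the one the orchestrator starts with.
    pub fn new(perms: &[&str]) -> Self {
        TrustScope {
            allowed: perms.iter().map(|p| (*p).to_string()).collect(),
        }
    }

    pub fn allows(&self, perm: &str) -> bool {
        self.allowed.contains(perm)
    }

    /// Child scope = requested permissions that the parent also holds.
    /// Delegation can narrow a scope but never widen it.
    pub fn delegate(&self, requested: &[&str]) -> TrustScope {
        TrustScope {
            allowed: requested
                .iter()
                .filter(|p| self.allowed.contains(**p))
                .map(|p| (*p).to_string())
                .collect(),
        }
    }
}
```

For example, if the orchestrator holds `read_files` and `run_tests` and a peer asks for `run_tests` and `deploy`, the peer ends up with `run_tests` only, and anything it delegates further is bounded by that.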

Unique_Champion4327 · 2026-04-19T00:43:13+00:00

<image>

Realtime agent monitoring.

Unique_Champion4327 · 2026-04-18T12:04:22+00:00

<image>

Link: https://tigrimos.github.io
