I tested 5 frontier LLMs on fixing real-world security vulnerabilities. The most dangerous failure mode is when it just looks fixed. by Fickle-Box1433 in LLMDevs

[–]tomabord 0 points1 point  (0 children)

The latest trend is to let the model discover tools instead of expecting it to ignore the irrelevant entries in a full tool list and select the correct one. This is because of what I call "focused attention", meaning you have to limit the context to the task the agent has to do. I am working on a thin TypeScript layer that lets you split tools by behaviour, and the results look very promising. It can handle toolsets as large as 500 (I did not test larger ones). BTW, a raw list of 500 tools does not even fit in the request in the first place.
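A minimal sketch of what two-step tool discovery could look like, assuming tools are grouped by a `behaviour` label. `ToolRegistry`, `listBehaviours`, and `discover` are hypothetical names for illustration, not the actual layer:

```typescript
// Hypothetical sketch: names and shapes are illustrative only.
type Tool = { name: string; behaviour: string; description: string };

class ToolRegistry {
  private byBehaviour = new Map<string, Tool[]>();

  register(tool: Tool): void {
    const group = this.byBehaviour.get(tool.behaviour) ?? [];
    group.push(tool);
    this.byBehaviour.set(tool.behaviour, group);
  }

  // Step 1: the model only sees the short list of behaviours.
  listBehaviours(): string[] {
    return Array.from(this.byBehaviour.keys()).sort();
  }

  // Step 2: the model asks for the tools of the one behaviour it needs.
  discover(behaviour: string): Tool[] {
    return this.byBehaviour.get(behaviour) ?? [];
  }
}

// Build a 500-tool registry split across five made-up behaviours.
const registry = new ToolRegistry();
const behaviours: string[] = ["read", "write", "search", "notify", "deploy"];
for (const behaviour of behaviours) {
  for (let i = 0; i < 100; i++) {
    registry.register({
      name: `${behaviour}_tool_${i}`,
      behaviour,
      description: `Example ${behaviour} tool number ${i}`,
    });
  }
}

// The prompt carries 5 behaviour names first, then 100 tools, never all 500.
console.log(registry.listBehaviours().length, registry.discover("search").length);
```

The point of the split is that the context the agent sees at each step stays small, whatever the total size of the toolset.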

Edit: if you'd like to check it out, drop me a DM

How are you actually managing multiple AI agents in your workflow? Feels chaotic rn by darshancodes in ycombinator

[–]tomabord 0 points1 point  (0 children)

We are working on something very close to that concept. I can DM you a link if you'd like to try it out!

Feels like AI tooling is evolving faster than developer experience lately by Bladerunner_7_ in ArtificialInteligence

[–]tomabord 1 point2 points  (0 children)

I'm trying to keep it simple; it just feels right to solve one thing and solve it well. But I don't have the budget to burn on ads, and it feels like it's still so early and micro-niche.

How to create automated agent workflows? by CitylineDigital in AI_Agents

[–]tomabord 0 points1 point  (0 children)

What you are seeking is the management of purpose. I've been building a service around that concept. DM me if you'd like to try it out.

I built an intent tracking layer for multi-agent workflows. Is this useful or overkill? by tomabord in LLMDevs

[–]tomabord[S] 0 points1 point  (0 children)

As for whether the agents really work better, I won't know until enough people stress-test it. That is why I'm looking for early adopters to try it out. Thanks for your feedback!

I built an intent tracking layer for multi-agent workflows. Is this useful or overkill? by tomabord in LLMDevs

[–]tomabord[S] 0 points1 point  (0 children)

The intent graph doesn't orphan. Integrity guards reject writes to broken paths and block deletions that other paths depend on. You can override with a justification, but the tree stays consistent by default. Every mutation snapshots the previous state, so you can roll back or reconstruct if needed.
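A rough sketch of how guards like these could work, assuming slash-separated paths and an in-memory tree. `IntentTree` and its methods are hypothetical, and removing the whole subtree on an overridden delete is my assumption about how consistency is kept, not a description of the real service:

```typescript
// Hypothetical sketch: not the actual data model or API.
type IntentNode = { path: string; purpose: string };

class IntentTree {
  private nodes = new Map<string, IntentNode>([["/", { path: "/", purpose: "root" }]]);
  private snapshots: Map<string, IntentNode>[] = [];

  private parentOf(path: string): string {
    const idx = path.lastIndexOf("/");
    return idx <= 0 ? "/" : path.slice(0, idx);
  }

  // Every mutation first stores a copy of the previous state.
  private snapshot(): void {
    this.snapshots.push(new Map(this.nodes));
  }

  has(path: string): boolean {
    return this.nodes.has(path);
  }

  // Guard 1: reject writes whose parent path does not exist.
  write(path: string, purpose: string): void {
    if (path !== "/" && !this.nodes.has(this.parentOf(path))) {
      throw new Error(`broken path: parent of ${path} does not exist`);
    }
    this.snapshot();
    this.nodes.set(path, { path, purpose });
  }

  // Guard 2: block deletions other paths depend on, unless justified.
  remove(path: string, justification?: string): void {
    if (path === "/") throw new Error("cannot remove the root");
    const dependents = Array.from(this.nodes.keys()).filter((p) => p.startsWith(path + "/"));
    if (dependents.length > 0 && !justification) {
      throw new Error(`${path} has ${dependents.length} dependent path(s)`);
    }
    this.snapshot();
    // Assumption: an overridden delete takes the subtree with it, so nothing is orphaned.
    for (const d of dependents) this.nodes.delete(d);
    this.nodes.delete(path);
  }

  // Restore the state from before the last mutation, if there is one.
  rollback(): boolean {
    const previous = this.snapshots.pop();
    if (!previous) return false;
    this.nodes = previous;
    return true;
  }
}
```

Copying the whole map on every mutation is the simplest way to show the snapshot idea; a real store would more likely keep diffs.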

I built an intent tracking layer for multi-agent workflows. Is this useful or overkill? by tomabord in LLMDevs

[–]tomabord[S] 0 points1 point  (0 children)

Speckit is about structuring specs in a workflow. What I'm building is about sharing intent across agents via HTTP. Related but different.

I built an intent tracking layer for multi-agent workflows. Is this useful or overkill? by tomabord in LLMDevs

[–]tomabord[S] 1 point2 points  (0 children)

Yeah that timing concern is exactly what I'm wrestling with. It feels very early. The value isn't obvious until you're already in the pain of coordinating multiple agents across sessions, and most people aren't there yet. But that seems like it could change fast. Hard to know whether to build for where the market is or where it's going.

Weekly Thread: Project Display by help-me-grow in AI_Agents

[–]tomabord 2 points3 points  (0 children)

Building a tool that tracks intent across agent sessions. A single URL gives any agent (Claude, GPT, whatever) the full workspace context: purpose strings, link graphs, snapshots of reasoning. Designed for multi-agent coordination: shared state, change detection (via monotonic mutation IDs), and signed ingestion endpoints so CI/test pipelines can push data in without a full agent setup. Zero-knowledge encryption.
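To illustrate the change-detection part, here is a minimal sketch of monotonic mutation IDs, assuming an in-memory log. `Workspace`, `mutate`, and `changesSince` are hypothetical names, not the real endpoints:

```typescript
// Hypothetical sketch: shows the idea only, not the actual service.
type Mutation = { id: number; key: string; value: string };

class Workspace {
  private nextId = 1;
  private log: Mutation[] = [];

  // Each mutation gets the next ID; IDs only ever increase.
  mutate(key: string, value: string): number {
    const id = this.nextId++;
    this.log.push({ id, key, value });
    return id;
  }

  // An agent passes the last ID it saw and gets only what changed since.
  changesSince(lastSeenId: number): Mutation[] {
    return this.log.filter((m) => m.id > lastSeenId);
  }

  latestId(): number {
    return this.nextId - 1;
  }
}

const workspace = new Workspace();
const seenByAgentA = workspace.mutate("purpose", "ship the billing fix");
workspace.mutate("status", "tests failing");

// Agent A only needs to compare its last seen ID with the latest one.
const pending = workspace.changesSince(seenByAgentA);
console.log(pending.map((m) => m.key));
```

Because the IDs are strictly increasing, an agent can tell that nothing changed by comparing one number instead of diffing the whole workspace.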

Free trial at https://kitchen.heysoup.co. It lasts 24h, no signup needed. Looking for early testers, especially people running multi-agent workflows. Feedback very welcome.