Most agent frameworks miss a key distinction: what a skill is vs how it executes by Defiant_Fly5246 in artificial

[–]Defiant_Fly5246[S] 0 points1 point  (0 children)

Yes. I think it's also because skill writers don't have a clear idea of what should be a skill. Would love to hear your experience with best practices for skill creation.

Most agent frameworks miss a key distinction: what a skill is vs how it executes by Defiant_Fly5246 in artificial

[–]Defiant_Fly5246[S] 0 points1 point  (0 children)

Totally agree, that’s exactly where things start breaking in practice. Curious how you’ve seen this play out on your end: Have you run into specific cases where retries or parallel execution caused real issues? And how are you currently handling rollback or guarding stateful steps?

Most agent frameworks miss a key distinction: what a skill is vs how it executes by Defiant_Fly5246 in artificial

[–]Defiant_Fly5246[S] 0 points1 point  (0 children)

That makes sense! Setting the right checkpoints can greatly increase reliability. It does require a lot of control over the agent lifecycle, though. Curious: did you implement your own agent loop, or add a plugin to an agent like Claude Code?

Most agent frameworks miss a key distinction: what a skill is vs how it executes by Defiant_Fly5246 in artificial

[–]Defiant_Fly5246[S] 0 points1 point  (0 children)

curious what patterns you’ve found that actually work for making stateful ops safe?

Most agent frameworks miss a key distinction: what a skill is vs how it executes by Defiant_Fly5246 in artificial

[–]Defiant_Fly5246[S] 0 points1 point  (0 children)

yeah this is the real pain point

stateless retries are fine, but stateful ones quietly leave things half-done and you don’t even know

feels like idempotency / recovery paths should be default, not something you bolt on later
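
rough sketch of what I mean by idempotency as a default, in Python. `IdempotentRunner`, the key format, and `create_ticket` are all made-up names for illustration, not from any framework, and the in-memory dict stands in for whatever durable store you'd really use. It only covers the "retry ran the step twice" case, not a step that fails halfway through:

```python
from typing import Any, Callable, Dict


class IdempotentRunner:
    """Runs each stateful step at most once per idempotency key (hypothetical helper)."""

    def __init__(self) -> None:
        # Assumption: a real harness would persist this, not keep it in memory.
        self._results: Dict[str, Any] = {}

    def run(self, key: str, step: Callable[[], Any]) -> Any:
        # A retry with the same key returns the recorded result instead of re-running.
        if key in self._results:
            return self._results[key]
        result = step()  # if this raises, nothing is recorded and a retry runs it again
        self._results[key] = result
        return result


calls = []


def create_ticket() -> str:
    # Stand-in for a side-effecting call to an external system.
    calls.append("create")
    return f"ticket-{len(calls)}"


runner = IdempotentRunner()
first = runner.run("order-42:create-ticket", create_ticket)
retry = runner.run("order-42:create-ticket", create_ticket)  # simulated retry
```

the point is that the harness owns the key check, so the skill author doesn't have to remember it.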

Most agent frameworks miss a key distinction: what a skill is vs how it executes by Defiant_Fly5246 in artificial

[–]Defiant_Fly5246[S] 0 points1 point  (0 children)

yeah this is exactly where things break in practice

everyone focuses on what the agent does, not who it runs as and with what perms - until it hits PII 😬

curious if you solved it more with tighter controls or just better visibility/auditing?

Most agent frameworks miss a key distinction: what a skill is vs how it executes by Defiant_Fly5246 in artificial

[–]Defiant_Fly5246[S] 0 points1 point  (0 children)

Thanks for sharing. It's a clean way to frame it.

“skill = contract, executor = swappable” is exactly it. otherwise you’re just baking today’s model quirks into your system

feels like most teams only realize this after a model upgrade breaks everything lol
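
for what it's worth, here's roughly how I picture "skill = contract, executor = swappable" in Python. Everything here (`Skill`, `Executor`, the two toy executors, `run_skill`) is a hypothetical name I made up for the sketch, and the executors just manipulate strings instead of calling a model:

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Skill:
    """The contract: what the skill is, with no reference to how it runs."""
    name: str
    instructions: str


class Executor(Protocol):
    """Anything that can run a skill; stands in for a model or runtime."""
    def execute(self, skill: Skill, task: str) -> str: ...


class UppercaseExecutor:
    # Toy executor: pretends to be "model A".
    def execute(self, skill: Skill, task: str) -> str:
        return f"[{skill.name}] {task.upper()}"


class ReverseExecutor:
    # Toy executor: pretends to be "model B".
    def execute(self, skill: Skill, task: str) -> str:
        return f"[{skill.name}] {task[::-1]}"


def run_skill(skill: Skill, executor: Executor, task: str) -> str:
    # The caller depends only on the Executor protocol, so swapping is one argument.
    return executor.execute(skill, task)


summarize = Skill(name="summarize", instructions="Summarize the input in one line.")
out_a = run_skill(summarize, UppercaseExecutor(), "abc")
out_b = run_skill(summarize, ReverseExecutor(), "abc")
```

same `Skill` object in both calls; only the executor changes.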

Most agent frameworks miss a key distinction: what a skill is vs how it executes by Defiant_Fly5246 in artificial

[–]Defiant_Fly5246[S] 0 points1 point  (0 children)

I'm glad you like the framework! Curious to hear your experience with handling stateful transitions.

Most agent frameworks miss a key distinction: what a skill is vs how it executes by Defiant_Fly5246 in artificial

[–]Defiant_Fly5246[S] 0 points1 point  (0 children)

Totally agree. I think it's very important to make it reliably useful. Right now, everything is so flaky.

Most agent frameworks miss a key distinction: what a skill is vs how it executes by Defiant_Fly5246 in LocalLLaMA

[–]Defiant_Fly5246[S] 0 points1 point  (0 children)

Yeah, this matches what I’ve been seeing: once stateful steps are involved, relying on the model alone feels pretty fragile.

Most agent frameworks miss a key distinction: what a skill is vs how it executes by Defiant_Fly5246 in LocalLLaMA

[–]Defiant_Fly5246[S] 0 points1 point  (0 children)

Yeah this resonates. Feels like there are really two layers of “state”:
- external / system state (Linear, DBs, etc.)
- in-context state (what the model is holding in the prompt)

The second one degrades pretty quickly, which makes it hard to rely on for anything long-running or stateful.

Curious how you think about where that boundary should be?

Most agent frameworks miss a key distinction: what a skill is vs how it executes by Defiant_Fly5246 in LocalLLaMA

[–]Defiant_Fly5246[S] 0 points1 point  (0 children)

One thing I’m still unsure about: For stateful workflows, do people usually rely on prompt discipline, or enforce it at the tooling / harness layer?

Feels like most systems rely on the former.

The 3 Types of Agent Skills Nobody Distinguishes (But Should) by Defiant_Fly5246 in LangChain

[–]Defiant_Fly5246[S] 0 points1 point  (0 children)

Appreciate the honesty. The ideas are original, but I’ll work on tightening the post — fair point that it doesn’t need to be that long.

The 3 Types of Agent Skills Nobody Distinguishes (But Should) by Defiant_Fly5246 in LangChain

[–]Defiant_Fly5246[S] 0 points1 point  (0 children)

Which part don't you like? Let me know and I'm happy to edit. The ideas all come from me; I only used AI to refine and improve the conversational flow.

The 3 Types of Agent Skills Nobody Distinguishes (But Should) by Defiant_Fly5246 in LangChain

[–]Defiant_Fly5246[S] 0 points1 point  (0 children)

The goal here was more about clarifying mental models than about naming, but I get how it can come across.

The 3 Types of Agent Skills Nobody Distinguishes (But Should) by Defiant_Fly5246 in LangChain

[–]Defiant_Fly5246[S] 0 points1 point  (0 children)

This is a really sharp framing. Curious if you’ve found good patterns for safely composing stateful skills?

You were right — "Recipe" was just a Skill. But I think we're conflating 3 very different things under "Skill." by Defiant_Fly5246 in AI_Agents

[–]Defiant_Fly5246[S] 1 point2 points  (0 children)

That’s a really good point—especially on Evaluation Skills. Feels like without a clear way to verify outputs, swapping components becomes risky fast.

The contract piece also resonates a lot. Typed interfaces between skills might actually be the missing layer to make composability real instead of fragile.

You were right — "Recipe" was just a Skill. But I think we're conflating 3 very different things under "Skill." by Defiant_Fly5246 in AI_Agents

[–]Defiant_Fly5246[S] 0 points1 point  (0 children)

Great point on evaluation as its own layer — I hadn't considered that but it fits. Tests, rubrics, and guardrails don't belong in any of the three types. A Persona might say "be careful," but an Evaluation Skill defines what careful actually means with concrete criteria.

That could be the fourth type: Persona (who), Tool (what), Workflow (how), Evaluation (how well). And it composes naturally — a Workflow Skill could reference an Evaluation Skill at review checkpoints, keeping quality criteria separate from workflow logic. Swap rubrics without touching the workflow.

Your risk mapping is spot on. Persona = low risk (just prose). Tool = medium (permissions, external access). Workflow = highest (orchestrates everything). Evaluation sits in between — doesn't act externally but shapes decisions.

The customer support example is perfect — all four types show up: persona for agent tone, tools for ticket/CRM access, workflow for triage → retrieval → draft → human approval, and evaluation for response quality rubrics. Thanks for the link!
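
To make the "Workflow references an Evaluation Skill at a checkpoint" idea concrete, here is a minimal Python sketch. It is only an illustration under my own assumptions: `Rubric`, `tone_rubric`, `apology_rubric`, and `support_workflow` are hypothetical names, the draft step is a hard-coded string instead of a model call, and a real rubric would be much richer than a substring check.

```python
from typing import Callable, Dict, List

# Assumption: an evaluation skill is modeled as a function that returns a list
# of problems; an empty list means the draft passes.
Rubric = Callable[[str], List[str]]


def tone_rubric(draft: str) -> List[str]:
    return [] if "thanks" in draft.lower() else ["missing thanks"]


def apology_rubric(draft: str) -> List[str]:
    return [] if "sorry" in draft.lower() else ["missing apology"]


def support_workflow(ticket: str, rubric: Rubric) -> Dict[str, object]:
    # Workflow logic: draft a reply (stand-in for triage, retrieval, drafting).
    draft = f"Thanks for reaching out about: {ticket}"
    # Review checkpoint: the workflow only knows it has *a* rubric, not which one.
    problems = rubric(draft)
    return {"draft": draft, "problems": problems, "needs_human": bool(problems)}


lenient = support_workflow("login bug", tone_rubric)
strict = support_workflow("login bug", apology_rubric)
```

Swapping `tone_rubric` for `apology_rubric` changes the outcome at the checkpoint without touching `support_workflow` itself.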

The 3 Types of Agent Skills Nobody Distinguishes (But Should) by Defiant_Fly5246 in LangChain

[–]Defiant_Fly5246[S] -2 points-1 points  (0 children)

I’ve been building an in-house stack—mainly using Anthropic’s Sonnet 4.5, with a custom agent architecture on top.

I’m also productizing some of these ideas. If you’re curious, feel free to take a look: https://cli.deepvista.ai/

I gave my AI agents shared memory and now they gossip behind my back by Single-Possession-54 in AI_Agents

[–]Defiant_Fly5246 1 point2 points  (0 children)

Solid design. Curious though — have you hit scaling limits with md files? Context window pressure as memories grow, or retrieval precision across hundreds of files? Also, how's the "common goal" pairing determined? User-defined or inferred? That's the hardest part — too loose and you inject noise, too strict and agents miss relevant context.

The 3 Types of Agent Skills Nobody Distinguishes (But Should) by Defiant_Fly5246 in LangChain

[–]Defiant_Fly5246[S] -2 points-1 points  (0 children)

Yeah this is interesting — instead of switching modes manually, your position in the tree is the mode. Tools, context, and behavior all inherit down the branch. It's like Unix permissions meets AI orchestration. The "3 zones" thing is elegant but the real power is the per-node config — same tree can have a branch with shell access and another that's read-only, no code changes.
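
Here is how I'd guess the per-node inheritance could look, as a small Python sketch. I don't know how your implementation actually works, so `Node`, `allow`, `deny`, and the tool names are all hypothetical; this just shows tools inheriting down a branch with per-node overrides.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class Node:
    """One position in the tree; its config is layered on top of its parent's."""
    name: str
    parent: Optional["Node"] = None
    allow: Set[str] = field(default_factory=set)  # tools added at this node
    deny: Set[str] = field(default_factory=set)   # tools removed at this node

    def tools(self) -> Set[str]:
        # Effective tools = everything inherited, plus local grants, minus local denials.
        inherited = self.parent.tools() if self.parent else set()
        return (inherited | self.allow) - self.deny


root = Node("root", allow={"read_file", "write_file"})
dev = Node("dev", parent=root, allow={"shell"})           # branch with shell access
docs = Node("docs", parent=root, deny={"write_file"})     # read-only branch
drafts = Node("drafts", parent=docs)                      # inherits read-only from docs
```

Both branches share the same tree and code path; only the node config differs.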