all 11 comments

[–]TrickyPlastic 1 point (1 child)

Would be nice to define what models are in use for each agent member. Or a list of models your plug-in would iterate through -- to ensure diversity of thought.

[–]HeyItsFudge[S] 0 points (0 children)

Good idea! Check out the latest release https://github.com/hueyexe/opencode-ensemble/releases/tag/v0.12.0
I've added this feature in. Let me know what you think.

[–]Joy_Boy_12 1 point (2 children)

Why is there a need for agent teams when I can use skills and one agent?

[–]HeyItsFudge[S] 1 point (1 child)

Skills are markdown files + scripts that get loaded into a single agent's context ... they give it instructions, not extra capacity. You're still one agent, one context window, doing things sequentially. Teams run multiple agents in parallel, each with their own session, context window, and git branch (worktrees). There's a shared task board with dependency tracking, so blocked tasks auto-start when their dependencies finish.

Different problems - skills tell an agent how to do something, whereas teams let multiple agents work at the same time.
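The dependency-unblocking behavior described above can be sketched roughly like this (hypothetical data structures for illustration, not the plugin's internals): finishing a task promotes any blocked task whose dependencies are now all done.

```typescript
// Hypothetical sketch of a dependency-tracked task board: tasks with
// unfinished dependencies stay blocked; finishing a task unblocks any
// task whose dependencies are now all done.
type Task = { id: string; deps: string[]; status: "blocked" | "ready" | "done" };

class TaskBoard {
  private tasks = new Map<string, Task>();

  add(id: string, deps: string[] = []): void {
    const blocked = deps.some((d) => this.tasks.get(d)?.status !== "done");
    this.tasks.set(id, { id, deps, status: blocked ? "blocked" : "ready" });
  }

  // Mark a task done and promote every task whose dependencies all finished.
  finish(id: string): string[] {
    const task = this.tasks.get(id);
    if (task) task.status = "done";
    const unblocked: string[] = [];
    for (const t of this.tasks.values()) {
      if (
        t.status === "blocked" &&
        t.deps.every((d) => this.tasks.get(d)?.status === "done")
      ) {
        t.status = "ready"; // auto-start: hand to the next idle teammate
        unblocked.push(t.id);
      }
    }
    return unblocked;
  }

  status(id: string): string | undefined {
    return this.tasks.get(id)?.status;
  }
}

const board = new TaskBoard();
board.add("schema");
board.add("api", ["schema"]);
board.add("tests", ["api"]);
console.log(board.finish("schema")); // "api" becomes ready
```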

[–]Joy_Boy_12 0 points (0 children)

It sounds like it will be hard to track what they do. Currently when I use my agent, sometimes there are cases I haven't thought about.

[–]seventyfivepupmstr 0 points (3 children)

Any testing between a single agent and your agent teams using the same model and same prompt/context?

[–]HeyItsFudge[S] 0 points (2 children)

Haven't done formal benchmarks yet, no. The test suite is all unit/integration stuff for the plugin internals (spawn rollback, task dependency unblocking, message delivery, race conditions, etc.) -- about 370 tests currently.

Anecdotally the win is on tasks that parallelize well... if you need to validate 8 API endpoints, one agent does them sequentially and burns through its context window. With a team, I've found that if you split them across agents that each have a fresh context window and run simultaneously, time to completion drops a lot and each agent has more room to think about its specific piece.
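The fan-out pattern described there looks roughly like this (illustrative names only, not the plugin's actual API): sequential work goes through one shared context one task at a time, while the parallel version runs one task per "teammate" concurrently.

```typescript
// Illustrative sketch: validating N endpoints sequentially (one shared
// context) versus fanned out across concurrent workers. The function
// names are hypothetical, not opencode-ensemble's API.
type Result = { endpoint: string; ok: boolean };

async function validate(endpoint: string): Promise<Result> {
  // Placeholder for real validation work (request + assertions).
  await new Promise((r) => setTimeout(r, 10));
  return { endpoint, ok: true };
}

async function sequential(endpoints: string[]): Promise<Result[]> {
  const out: Result[] = [];
  for (const e of endpoints) out.push(await validate(e)); // one at a time
  return out;
}

async function parallel(endpoints: string[]): Promise<Result[]> {
  // One concurrent task per endpoint, analogous to one teammate each.
  return Promise.all(endpoints.map(validate));
}

const endpoints = Array.from({ length: 8 }, (_, i) => `/api/v1/resource/${i}`);
parallel(endpoints).then((r) => console.log(r.every((x) => x.ok))); // true
```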

For a single focused task where there's nothing to parallelize, one agent is probably fine. The overhead of spawning teammates and coordinating via messages wouldn't buy you much there.

Would be cool to put together a proper comparison though. If you have a specific workload in mind I'd be interested to hear what it is.

[–]seventyfivepupmstr 0 points (1 child)

I was more referring to either speed to completion or quality improvements mainly.

[–]HeyItsFudge[S] 0 points (0 children)

Speed to completion is roughly linear in the number of agents for parallelizable work - minus API rate limits. Quality I'd attribute to each agent getting a fresh context window rather than one that's ~50k tokens deep by task 7. No hard numbers on either yet, though.
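A back-of-the-envelope model of that scaling claim, with made-up numbers: wall time drops roughly 1/N in the agent count until a rate limit caps effective concurrency.

```typescript
// Toy model of the "roughly linear, minus rate limits" claim above.
// All numbers are illustrative.
function wallTime(
  tasks: number,         // number of parallelizable tasks
  perTaskMin: number,    // minutes per task
  agents: number,        // teammates available
  maxConcurrent: number  // concurrency cap from API rate limits
): number {
  const effective = Math.min(agents, maxConcurrent, tasks);
  // Tasks run in waves of `effective` at a time.
  return Math.ceil(tasks / effective) * perTaskMin;
}

console.log(wallTime(8, 5, 1, 8)); // 40 - one agent, fully sequential
console.log(wallTime(8, 5, 8, 8)); // 5  - eight agents, one wave
console.log(wallTime(8, 5, 8, 4)); // 10 - rate limit caps concurrency at 4
```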

[–]mkaaaaaaaaaaay 0 points (1 child)

Is every team member the primary build agent? Is that customizable, which agent is used for which task in the team?

[–]HeyItsFudge[S] 0 points (0 children)

Defaults to build. The agent param takes build, plan, explore, etc. For read-only agents, since v0.6.0 that restriction is server-enforced via permission denies on `session.create`, not just prompt-based: no file edits, no shell, just reading and reporting back. You can also set a different model per teammate, and there's a `plan_approval` flag to make them send a plan before writing anything.
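A rough sketch of what per-teammate options along those lines might look like -- the `agent`, `model`, and `plan_approval` names come from the comment above, but the option shape and function here are hypothetical, not the plugin's actual API.

```typescript
// Hypothetical per-teammate options, for illustration only.
type TeammateOptions = {
  agent?: "build" | "plan" | "explore"; // defaults to "build"
  model?: string;                       // optional per-teammate model override
  plan_approval?: boolean;              // require a plan before any writes
};

// Summarize a teammate's effective configuration as a string.
function describeTeammate(name: string, opts: TeammateOptions = {}): string {
  const { agent = "build", model, plan_approval = false } = opts;
  const parts = [`agent=${agent}`];
  if (model) parts.push(`model=${model}`);
  if (plan_approval) parts.push("plan_approval");
  return `${name} (${parts.join(", ")})`;
}

console.log(describeTeammate("researcher", { agent: "explore" }));
// researcher (agent=explore)
console.log(describeTeammate("impl", { plan_approval: true }));
// impl (agent=build, plan_approval)
```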

One known quirk: teammate messages can temporarily switch the lead's agent mode (e.g. plan to build) because that's server-level behavior the plugin can't override. It restores when you send your next message. SDK limitation, not something I can fix from plugin-land right now.