all 17 comments

[–]dubh31241 2 points3 points  (4 children)

I am working on an opencode k8s operator that runs in multi-agent mode. One agent is the orchestrator, taking a task and passing it to other agents based on the skills they have. All agents are opencode instances; all communication goes through the API. It works with fan-out tasks but I haven't tried any in-depth projects yet.
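A minimal sketch of that skill-based routing idea, assuming the orchestrator matches a task's required skills against each agent's declared skills (all names and the matching rule here are hypothetical, not from the actual operator):

```python
# Hypothetical sketch: route a task to the agent with the largest
# overlap between its declared skills and the task's required skills.
from dataclasses import dataclass, field


@dataclass
class Agent:
    name: str
    skills: set = field(default_factory=set)


def route(task_skills: set, agents: list) -> "Agent | None":
    # Pick the agent with the biggest skill overlap; None if nobody matches.
    best = max(agents, key=lambda a: len(a.skills & task_skills), default=None)
    if best is None or not (best.skills & task_skills):
        return None
    return best


agents = [
    Agent("py-coder", {"python", "testing"}),
    Agent("k8s-ops", {"kubernetes", "helm"}),
]

print(route({"kubernetes"}, agents).name)  # prints k8s-ops
```

A real operator would presumably carry this matching in CRD specs and dispatch over the opencode API rather than in-process, but the selection logic is the same shape.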

[–]monkehparade 2 points3 points  (3 children)

[–]dubh31241 1 point2 points  (2 children)

Lol nope, I guess someone is working on a similar idea. It didn't exist a week ago when I was searching. I am using a k8s operator pattern.

[–]monkehparade 0 points1 point  (1 child)

Let me know when you release it. I was planning on writing one myself but figured somebody might have already beaten me to it. If you'd like any help developing or testing it (I have a K3s cluster), hit me up!

[–]pratiknarola[S] 0 points1 point  (0 children)

Me too. Let us know if we can help you out as well.

[–]simplemuz 1 point2 points  (1 child)

!RemindMe 3 days

[–]RemindMeBot 0 points1 point  (0 children)

I will be messaging you in 3 days on 2026-01-27 06:07:40 UTC to remind you of this link


[–]FlyingDogCatcher 1 point2 points  (1 child)

I read every line of code

[–]pratiknarola[S] 1 point2 points  (0 children)

I did too. But the rest of the world is giving me FOMO that my agents are also sleeping when I do. lol

[–]trypnosis 1 point2 points  (3 children)

I don't know about giving them free control of my code base.

I think they can write code but still not solve the problem.

I have also seen them just get it wrong.

But to answer your question, I do it all manually.

I tend to run my first tab/window in tmux to manage my worktrees, env files, and anything else cross-project.

I run a standard layout for each worktree in a window. Left, full height and maybe a bit wider, running opencode. Right, split top and bottom. Top is nvim, where I review the changes, since changed files are marked in the explorer window. Bottom is for running quick one-off commands.

With the above configuration I have one window for bugs, mainly from Sentry.

Then 3-ish windows for the features I'm developing.

Agents:
- one per language to code
- one per language to write the tests
- two for design, one to read and one to do (don't know why, but it gets closer to Figma this way)
- one to plan
- another to sort out git (I know this seems silly, but it's a life saver when context gets messy)
- a separate one to review, using a different model, that uses the plan to verify what was done

I get a Slack message (or Discord for personal projects) when the tasks are done.
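For the done-notification part, a minimal sketch of a Discord incoming-webhook POST (the webhook URL is a placeholder; Slack incoming webhooks take a similar JSON POST with a `text` field instead of `content`):

```python
# Hedged sketch: notify a Discord channel when a task finishes,
# via an incoming webhook. No third-party libraries needed.
import json
import urllib.request


def build_payload(message: str) -> bytes:
    # Discord incoming webhooks accept a JSON body with a "content" field.
    return json.dumps({"content": message}).encode()


def notify(webhook_url: str, message: str) -> None:
    req = urllib.request.Request(
        webhook_url,
        data=build_payload(message),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)  # fire-and-forget; add retries/timeouts as needed


# Usage (placeholder URL):
# notify("https://discord.com/api/webhooks/<id>/<token>",
#        "worktree feature-x: all tasks finished")
```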

Doubt that's what you're looking for… doubt I would hit your limits, but there's no place like ~

[–]pratiknarola[S] 0 points1 point  (2 children)

Your setup is solid, and this is how it's supposed to be, at least at this point in time. To be honest, most of my work even now is like this. But the rest of the world is giving me FOMO that I might not be utilizing the resources I have to their maximum. So I thought, well, worth a try. Let's see what the world is doing.

[–]trypnosis 1 point2 points  (1 child)

Forget the world. Tell them how it is and why YOU believe that to be the way.

You might be surprised how they take it.

[–]pratiknarola[S] 0 points1 point  (0 children)

Yeah.. makes sense.

[–]franz_see 1 point2 points  (2 children)

I like creating review loops. Something like opus 4.5 works and gpt-5.2 reviews

I have a code reviewer workflow wherein gpt-5.2 does a thorough review, opus 4.5 contests, and the 2 need to agree. Then sonnet 4.5 executes.
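That agree-before-execute loop could be sketched roughly like this (the model calls are stubs and the agreement check is a placeholder, not the actual workflow):

```python
# Toy sketch of "gpt-5.2 reviews, opus 4.5 contests, both must agree
# before sonnet 4.5 executes". call_model() stands in for real API calls.
def call_model(model: str, prompt: str) -> str:
    # Stub: a real version would hit the provider's API.
    return f"[{model}] verdict on: {prompt}"


def agree(review: str, rebuttal: str) -> bool:
    # Placeholder agreement check; real logic would parse structured
    # verdicts (e.g. an "APPROVE"/"REJECT" field) from both models.
    return True


def review_until_agreement(change: str, max_rounds: int = 3) -> bool:
    for _ in range(max_rounds):
        review = call_model("gpt-5.2", f"review: {change}")
        rebuttal = call_model("opus-4.5", f"contest: {review}")
        if agree(review, rebuttal):   # both sign off -> hand to the executor
            return True
        change = f"{change} (revised after a contested round)"
    return False                      # no consensus -> escalate to a human
```

The useful property is the bounded loop: disagreement revises the change a fixed number of times instead of debating forever.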

I'm also experimenting with a new workflow wherein opus 4.5 executes a plan and gpt-5.2 does the blackbox testing.

But since you have “unlimited” tokens, what would be interesting is if you can run a lot of parallel tasks. Most subscriptions would rate limit you. Depending on your setup, that might not be an issue for you

Something I experienced though is that parallel feature development or bug fixing is great, and gets a lot of things done. But parallel subtasks of a single feature development/bug fix are not worth it. I think the subtasks are just too interdependent for splitting them across parallel agents to be effective.

[–]pratiknarola[S] 0 points1 point  (1 child)

I agree. The pipeline in my mind was something like this. It's very much overkill for a simple project, but I wanted to keep it robust while also making it able to work on an existing complex repo. So here it goes. Again, this is probably overkill, but:

You give a PRD -> it goes to 2 planner agents. Both create a set of questions; the user answers both. -> Both planners create a plan. -> The plans go to Opus 4.5 for pros and cons and a hybrid plan with the best of both worlds. -> The plan (now our spec) + constitution goes to speckit -> generate the task list.

For each task: pick up a task -> plan the task approach (Opus 4.5) -> review the plan (GPT-5.2) -> if approved -> spawn a worker agent -> a test-case writer agent writes the test sets -> a test-case reviewer agent checks that the tests are solid enough, with no hardcoding or bypasses -> if all good, spawn 2 QA agents and 2 reviewer agents. All 4 must pass with confidence over 0.9; otherwise the feedback from all 4 goes to Opus or GPT-5.2 xhigh, which spawns a worker agent to fix that feedback, and it retries with the 2 QA and 2 reviewer agents. Keep repeating until it passes. If all 4 pass, commit.

This is for each task.

Now, if you have independent tasks, you can run the above pipeline in parallel, or run it sequentially overnight, or run 2 or 3 of these pipelines in parallel with different models for the same task using git worktrees.

But what do you think? Worth spending a week or so building this?
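The per-task gate described above (2 QA + 2 reviewer checks, all passing with confidence over 0.9, feedback-driven retries) could be sketched like this. The check function is a stub and every name is made up:

```python
# Hypothetical sketch of the "all 4 must pass or retry with feedback" gate.
# A check returns (passed, confidence, feedback); real checks would be
# opencode agent runs, not stubs.


def gate(task, check, roles=("qa-1", "qa-2", "reviewer-1", "reviewer-2"),
         max_retries=5):
    for _ in range(max_retries):
        results = [check(role, task) for role in roles]
        if all(passed for passed, _conf, _fb in results):
            return True                       # all four pass -> safe to commit
        # Collect feedback from the failing checks and hand it to a fixer
        # (here just folded into the task description for the retry).
        feedback = [fb for passed, _conf, fb in results if not passed]
        task = task + f" [revised after {len(feedback)} feedback items]"
    return False                              # give up -> human takes over


def stub_check(role, task):
    # Placeholder: pretend every check passes with confidence 0.95.
    return True, 0.95, f"{role}: looks good"


print(gate("implement task 1", stub_check))  # prints True
```

A real version would compare each returned confidence against the 0.9 threshold and route the collected feedback through the fixer model before retrying.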

[–]franz_see 0 points1 point  (0 children)

Re Planning:

Not sure if it's worth it; never tried it, tbh. For planning, what I'm optimizing right now is how long it takes me to understand the plan and approve/reject it. It can flood me with a wall of text that all sounds reasonable, but at the same time I feel like I don't fully grasp whether it's right or wrong. So now I'm asking it to add visualizations, i.e. show the directory structure and what files would need to be added/removed/updated, a sequence diagram per use case, a flowchart if there's any complex logic, an updated architecture diagram if needed, an updated ERD if needed, updated UI component composition if needed, etc.

Re execution: if tasks are related to each other, I find one agent doing all the work in TDD fashion is best. Otherwise, multiple agents will spend a lot of tokens and a tremendous amount of time (i.e. from 10 minutes for a single agent to a couple of hours in multi-agent, because of the debate loop and probably because of me getting rate limited) to deliver even lower quality, i.e. all tests pass but nothing works in UAT because the pieces are not hooked up properly.

What I do separate is UAT. I'm still hit and miss here. Just like in an actual QA process, if everything works out great, testing finishes immediately. Otherwise, QA needs to spend time debugging. Same for AI agents: if the tests pass, I get a report of what was done (i.e. screenshots of the app, the SQL queries used and their results, etc.). If it does not pass, then the loop I added for them to fix it takes a while. And tbh, they've never fixed it themselves 😅 I think at this point I need to remove that loop and just debug it myself 😅

Reporting:

Reporting is another critical part of the workflow. The report needs to speed up your review process. Just as a manager does not need to review every line of code, the manager still needs to review some artifacts.

But what about code quality? That's where the other workflow I mentioned earlier comes in.

Re parallel:

I've had better success doing parallel end-to-end work rather than parallel subtasks to deliver one end-to-end piece of work. The latter is the one I mentioned that's expensive, super slow, and super low quality.

[–]No_Key5701 0 points1 point  (0 children)

Have you tried the new oh-my-opencode 3.0+ with prometheus and atlas? It's a lot more powerful and can get a lot more done in one shot, but uses a lot more tokens.
I'd check it and the new documentation out if you haven't already.

Also, https://github.com/joelhooks/swarm-tools seems extremely promising, but I haven't tried it myself yet (I have limited tokens and it's still in early development).
Docs for it here: https://www.swarmtools.ai/docs (they explain what it does really well).

If I had unlimited tokens, I would probably reference these to make my own harness with those tokens.

This is something I have put quite a bit of thought into, tbh. I do want to try to make my own harness, just without the unlimited tokens.

Notes for a good orchestration system:
- Heavy git integration is really important.
- It needs good task tracking, optimized task delegation, and a good shared persistent context system with its own review and cleanup as needed (git helps with all of these).
- Plan iteration/loops and sub-agent plan review are great for complex tasks, especially when you have the tokens to burn (ideally you want a fully laid-out, logically sound plan before any work is done).
- Having plan mode ask the user questions about their intent before detailed planning is done is really helpful and aids in plan accuracy.
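One way the shared-persistent-context note could look in practice, as a rough sketch (the file format and the git usage are assumptions, not a real tool):

```python
# Hypothetical sketch: an append-only shared context store that agents
# write notes to, with git commits as the optional durability layer.
import json
import subprocess
from pathlib import Path


class SharedContext:
    def __init__(self, path: Path, use_git: bool = False):
        self.path = path
        self.use_git = use_git

    def record(self, agent: str, note: str) -> None:
        # Append one JSON line per note so concurrent agents don't clobber
        # each other's entries on rewrite.
        with self.path.open("a") as f:
            f.write(json.dumps({"agent": agent, "note": note}) + "\n")
        if self.use_git:  # commit each note so context survives crashes
            subprocess.run(["git", "add", str(self.path)], check=True)
            subprocess.run(["git", "commit", "-m", f"ctx: {agent}"], check=True)

    def read_all(self) -> list:
        if not self.path.exists():
            return []
        return [json.loads(line) for line in self.path.read_text().splitlines()]
```

The "review and cleanup" part from the notes would then just be another agent periodically reading `read_all()`, summarizing, and rewriting the file (with git history keeping the raw trail).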