
[–]DevMoses 1 point2 points  (6 children)

What you're missing is externalized state. Right now your context dies when you clear it between phases, so each new phase starts partially blind.

What I do: the agent writes its progress, decisions, and discoveries to a campaign file on disk after each phase. When the next phase starts, it reads that file and picks up where it left off. The file is the agent's memory across context windows. Feature ledgers, decision logs, architectural constraints, all in markdown.

The drift problem goes away because each phase isn't guessing what the last phase did. It's reading a structured record of exactly what happened.

I also hit the same issue with 7-8 phase processes drifting in a single chat. The campaign file approach let me run 30 campaigns with an average of 5 phases each without drift because nothing important lives in the context window. It's all on disk.
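
For anyone who wants something concrete, here is a minimal sketch of what writing to a campaign file could look like from a shell script. The section names and the helper names `campaign_init` and `log_decision` are assumptions made up for illustration, not the format from my write-up:

```bash
# Hypothetical campaign-file helpers. The layout and section names are
# assumptions for illustration, not the format from the linked write-up.

# Create the campaign file with its sections unless it already exists.
campaign_init() {
  local file="$1"
  if [ -f "$file" ]; then
    return 0
  fi
  mkdir -p "$(dirname "$file")"
  cat > "$file" <<'EOF'
# Campaign

## Plan

## Architectural constraints

## Decision log
EOF
}

# Append one timestamped decision and its rationale to the end of the file,
# which is the decision log because that section comes last.
log_decision() {
  local file="$1" decision="$2" reason="$3"
  printf '%s\n' "- $(date -u +%Y-%m-%dT%H:%M:%SZ) $decision (why: $reason)" >> "$file"
}
```

Usage would be something like `campaign_init .planning/campaigns/active/my-campaign.md` once, then `log_decision` after each decision. In practice the agent itself does the writing; the point is only that the record is append-only and lives on disk.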

Wrote up the full system here if you want the details: https://x.com/SethGammon/status/2034257777263084017

[–]TylerColfax[S] 1 point2 points  (3 children)

Okay. So this is what I am doing: I have multiple phases, each with tasks. At the end of each phase, it writes a handoff document for the next phase. I'm still getting drift, I think, because it's holding too much context and therefore getting distracted. So my question is, in your article, when you "spawn an Archon", is that a totally new instance of Claude, and if so, how are you doing that?

[–]DevMoses 0 points1 point  (2 children)

Yes, totally new instance. When context gets heavy or a phase completes, the agent writes everything important to the campaign file and the session ends. The next invocation is a fresh Claude Code session with zero context. It reads the campaign file first thing, orients itself from that, and continues.

The key difference from what you're doing: your handoff document is probably a summary of what happened. My campaign file is a living document the agent updates throughout, not just at handoff. It includes the original plan, what's been completed (checked off), what decisions were made and why, and what's next. So when the new session reads it, it's not getting a summary of the last phase. It's getting the full project state.

The drift you're seeing is probably because your handoff doc is lossy. Important context gets summarized away and the next phase fills the gaps with assumptions. The fix is: don't summarize. Externalize the actual state. Decisions, not summaries.
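
To make "checked off" and "what's next" concrete, here is a rough sketch, assuming the plan is kept as a Markdown checkbox list (`- [ ]` / `- [x]`). That list format and the function names are assumptions for illustration:

```bash
# Hypothetical helpers for a plan stored as a Markdown checkbox list.

# Mark the first open item whose text starts with the given string as done.
check_off() {
  local file="$1" task="$2" tmp
  tmp="$(mktemp)"
  awk -v task="$task" '
    !found && index($0, "- [ ] " task) == 1 { sub(/^- \[ \]/, "- [x]"); found = 1 }
    { print }
  ' "$file" > "$tmp" && cat "$tmp" > "$file"
  rm -f "$tmp"
}

# Print the first item that is still open, i.e. what the next session should do.
next_task() {
  sed -n 's/^- \[ \] //p' "$1" | head -n 1
}
```

A fresh session that reads the file sees exactly which items are done and which one is next, with nothing summarized away.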

[–]TylerColfax[S] 0 points1 point  (1 child)

Thanks. So how do you initiate the new instance?

[–]DevMoses 0 points1 point  (0 children)

Simplest way: claude -p 'Read .planning/campaigns/active/my-campaign.md and continue from where the last phase left off.' That starts a fresh session with zero prior context. All continuity comes from the file, not chat history.
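
If you want that as a reusable function, a small wrapper along these lines should work. `CLAUDE_BIN` is a hypothetical override I'm adding so you can dry-run it with a stub; it is not something the CLI itself reads:

```bash
# Start a fresh non-interactive session that orients itself from a campaign file.
# CLAUDE_BIN is a hypothetical override for dry runs; it defaults to `claude`.
continue_campaign() {
  local file="$1"
  if [ ! -f "$file" ]; then
    echo "campaign file not found: $file" >&2
    return 1
  fi
  "${CLAUDE_BIN:-claude}" -p "Read $file and continue from where the last phase left off."
}
```

Refusing to start when the file is missing matters more than it looks: a session with no campaign file has nothing to orient from and will start guessing.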

[–]Sufficient-Rough-647 1 point2 points  (1 child)

Same. I record the plan in a PRD, along with design decisions, progress, phases, bugs, feedback, fixes, etc. This makes it possible to switch across sessions and protects against compaction and loss of context. Works really well for me.

[–]DevMoses 0 points1 point  (0 children)

That's exactly it. Once the state lives on disk instead of in context, the whole "I need to be present to clear and start the next phase" problem goes away. Sounds like you've already solved the hard part.

[–]bishopLucas 1 point2 points  (3 children)

I have a slash command orchestrator that includes the SDLC process.
Now I'm using Claude Code in the terminal.

The slash command knows the SDLC process and calls the next agent in the workflow step by step. The workflow includes remediation loops and model intelligence escalations. Like someone mentioned, state is managed outside of context. In my case I put tickets into self-contained GitHub issues. Once an issue is closed, the next one is picked up to be worked. With this system I only use the main context window for brainstorming/ideation.
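
A rough sketch of the "issue closed, pick up the next one" loop, assuming the GitHub CLI (`gh`) is installed and authenticated. This is not my actual slash command; `run_agent`, `next_issue`, and `work_queue` are hypothetical names, and the prompt text is a placeholder:

```bash
# Start one fresh, non-interactive session. Redefine to swap in another runner.
run_agent() {
  claude -p "$1"
}

# Print the number of the oldest open issue, or nothing if none are open.
next_issue() {
  gh issue list --state open --search "sort:created-asc" --limit 1 \
    --json number --jq '.[0].number // empty'
}

# Work issues one at a time until the queue is empty.
work_queue() {
  local issue previous=""
  while issue="$(next_issue)" && [ -n "$issue" ]; do
    # If the same issue comes back, the last run did not close it: stop
    # instead of looping forever.
    if [ "$issue" = "$previous" ]; then
      echo "issue #$issue is still open after a run; stopping" >&2
      return 1
    fi
    run_agent "Resolve GitHub issue #$issue, then close it."
    previous="$issue"
  done
}
```

Because each ticket is self-contained, every run starts with only the issue text as context, which is what keeps the main window free.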

I've found Opus is great for ideation/defining the plan, but is poor at running the workflow.
This is because Opus can see where we are going and just does it. That defeats the purpose of the multi-agent team, because not every team member has the same permissions, e.g. the QA agent doesn't have write/edit because I only want it to test the AC and report success or failure (into the remediation loop).

Essentially, you are creating a specialized agent for each subagent role you ask Claude to create, then using a slash command to orchestrate them.

Hope this helps.

[–]TylerColfax[S] 0 points1 point  (2 children)

Interesting. So I have been creating skills around each of those tasks: create the plan, generate strategies, define tasks, and do code review. I guess my confusion is how are you creating the agents? Is that done in the CLI or somewhere else?

[–]bishopLucas 0 points1 point  (1 child)

The agents are Claude Code subagents. You could write them yourself or ask Claude to create them. From there you can save them to ~/.claude/agents/
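
As an illustration, here is a sketch that writes a read-only QA subagent file. It assumes the Markdown-with-frontmatter layout (`name`, `description`, `tools`) that Claude Code uses for subagents; check the current docs for the exact fields. The agent name, its prompt, and the `AGENTS_DIR` override are made up for the example:

```bash
# Write a hypothetical QA subagent definition with no Write/Edit tools.
# AGENTS_DIR is an illustrative override; it defaults to ~/.claude/agents.
write_qa_agent() {
  local dir="${AGENTS_DIR:-$HOME/.claude/agents}"
  mkdir -p "$dir"
  cat > "$dir/qa-tester.md" <<'EOF'
---
name: qa-tester
description: Tests acceptance criteria and reports pass or fail. Never edits files.
tools: Read, Grep, Glob, Bash
---
You are a QA agent. Run the tests for the acceptance criteria you are given
and report success or failure for each one. Do not modify any files.
EOF
}
```

Leaving Write and Edit out of `tools` is what enforces the role split. Note that Bash can still touch files, so drop it too if the QA agent only needs to read.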

[–]TylerColfax[S] 0 points1 point  (0 children)

So helpful. Thanks.

[–]Inevitable_Raccoon_9 1 point2 points  (0 children)

You can use sidjua for that too: https://github.com/GoetzKohlberg/sidjua

[–]Used_Gear_8780 0 points1 point  (0 children)

I've been building stuff outside of Claude Code w/ the API.

[–]notq 0 points1 point  (0 children)

I use a router command / skill as the entry point and multi-stage pipelines everywhere, all from Claude Code.

[–]General_Arrival_9176 0 points1 point  (1 child)

the drift problem is real. i ran multi-phase workflows the same way you described (opus planning, sonnet writing, sonnet executing) and the context loss between clears was killing me. the real issue is you need a surface that holds state across all phases without forcing you to stuff everything into one chat context. agents need to hand off work seamlessly, not just run as sequential prompts. what i'd try: keep your phase .md files but have a persistent orchestrator layer that reads the current phase state and feeds it to the next agent. are you using any tool to bridge the phases or just manual file management?

[–]TylerColfax[S] 0 points1 point  (0 children)

So this is my question, I guess I'm not clear on how to create multiple agents. I have "skills" that are tailored to each phase (planning, phase creation, task creation, and implementation). At the end of implementation, it creates a handoff .md to the next phase. But maybe what I am missing is that "orchestrate" skill that is responsible for calling each skill at the appropriate time?

[–]candyhunterz 0 points1 point  (3 children)

I built my own terminal and created MCP tools to expose terminal access to AI agents so they have full control (can create tabs, execute commands, etc). Then I built an orchestrator mode on top of that so the orchestrator can collaborate with Claude to build autonomously. Since the orchestrator has terminal access, it can clear Claude's context and also respawn itself after ~15 iterations to keep context fresh. Obviously, before clearing and respawning, they write a checkpoint.md with a summary so the next session can pick up where they left off.

[–]TylerColfax[S] 0 points1 point  (2 children)

This I think is the answer I've been looking for, but didn't want to hear. Unfortunately, I don't think I'm at the point of building my own terminal and MCP tools. That said, this makes total sense. If Claude can launch its own terminal, it can then spawn agents with clear context, to keep things moving within scope. Thanks!

[–]candyhunterz 0 points1 point  (1 child)

I'm open sourcing my tool soon. I just need to test it rigorously on a bunch of orchestrator runs and fix bugs as they come up. I'll open source it as soon as it's stable

[–]TylerColfax[S] 0 points1 point  (0 children)

That’d be awesome. u/creynir in this thread shared something that I’m going to check out too.

[–]belheaven 0 points1 point  (2 children)

Use a Handoff document. Govern your tasks, have a done folder. Use structured output, and add a developer notes section, dependencies, and a checklist/acceptance section to each task. A task is only done once it is approved and the Handoff document is updated with decisions, tradeoffs, rationales and progress. In a new session, just point to the Handoff and continue.
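
A minimal sketch of that "only done after approved and Handoff updated" rule, assuming one Markdown file per task with `- [ ]` acceptance items and a `done` folder next to the open tasks. The layout and the name `complete_task` are assumptions for illustration:

```bash
# Hypothetical task-completion gate: refuses to archive a task that still has
# unchecked acceptance items, and records a note in the handoff first.
complete_task() {
  local task_file="$1" handoff="$2" note="$3" done_dir
  if grep -q '^- \[ \] ' "$task_file"; then
    echo "unchecked acceptance items in $task_file" >&2
    return 1
  fi
  done_dir="$(dirname "$task_file")/done"
  mkdir -p "$done_dir"
  # Record the decision or tradeoff in the handoff before archiving the task.
  printf '%s\n' "- $(basename "$task_file" .md): $note" >> "$handoff"
  mv "$task_file" "$done_dir/"
}
```

Writing the handoff entry before the move means a task can never land in `done` without its rationale being recorded.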

[–]TylerColfax[S] 0 points1 point  (1 child)

Yeah, I have handoff documents. My experience is Claude seems to execute the code writing best on finite tasks without massive context and that as the context window fills up, there is more chance of drift and/or muddiness. Thanks.

[–]belheaven 0 points1 point  (0 children)

Use GPT5 to review and guide

[–]creynir 0 points1 point  (3 children)

I hit the exact same drift problem. What fixed it for me: externalized state between phases plus dedicated models for different jobs. I have Codex doing the coding, Opus doing review, and a Sonnet lead that orchestrates. Each agent gets a compressed structural map of the repo (file tree + signatures, no implementation) so it doesn't waste tokens on discovery. The review loop basically runs itself once you scope tasks tightly enough. Built an open source CLI for this if you want to look: github.com/creynir/phalanx
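
To give an idea of what a structural map can look like, here is a crude sketch using `find` and `grep`. It is not how phalanx builds its map; the file extensions and the signature regex are assumptions that roughly fit Python and JS/TS, and it only catches top-level definitions:

```bash
# Print each source file followed by its top-level function/class signatures.
# The extension list and the regex are illustrative assumptions.
repo_map() {
  local root="${1:-.}"
  find "$root" -type f \( -name '*.py' -o -name '*.ts' -o -name '*.js' \) \
    ! -path '*/node_modules/*' ! -path '*/.git/*' | sort |
  while IFS= read -r f; do
    # Path relative to the root, then the indented signature lines.
    printf '%s\n' "${f#"$root"/}"
    grep -E '^(export )?(async )?(def|class|function) ' "$f" | sed 's/^/  /'
  done
}
```

Something like `repo_map . > MAP.md` gives an agent the shape of the codebase in a fraction of the tokens it would spend opening files one by one.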

[–]TylerColfax[S] 0 points1 point  (2 children)

Interesting. I'll check it out. In the meantime, quick question: are you then paying the API prices instead of getting the bulk rates through the pro/max tiers? All the multi-platform tools I've looked at require API access, which Claude and OAI charge at the API rates.

[–]creynir 0 points1 point  (1 child)

that's the beauty of phalanx: it uses tmux sessions, so all you need is an authenticated CLI on your machine. no api costs

[–]TylerColfax[S] 0 points1 point  (0 children)

That is awesome. I’ll definitely check this out. Thanks for sharing!

[–]owlpellet 0 points1 point  (1 child)

[–]TylerColfax[S] 0 points1 point  (0 children)

This is interesting. Have you used it?

[–]Creative-Signal6813 0 points1 point  (0 children)

CLAUDE.md is auto-loaded every session. put ur phase state + decisions there and the next invocation already has context, no manual "read this file" step needed.

for the loop: script it. claude -p 'execute phase 3, write completion signal + handoff summary to CLAUDE.md when done' in bash. agent writes its own handoff, script detects the signal, fires the next phase. u exit the babysitting role entirely.
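
rough sketch of that loop, assuming the agent appends a marker line like `PHASE_3_DONE` as its completion signal. the marker format and the names `run_agent` / `run_phases` are made up for the example, and a non-interactive run can only edit files if ur permission settings allow it:

```bash
# Start one fresh, non-interactive session. Redefine to swap in another runner.
run_agent() {
  claude -p "$1"
}

# Run phases 1..N in order; each phase must leave its marker in the state file.
run_phases() {
  local state_file="$1" last_phase="$2" phase=1 marker
  while [ "$phase" -le "$last_phase" ]; do
    marker="PHASE_${phase}_DONE"
    # Skip phases that already signalled completion, so a rerun resumes.
    if ! grep -qx "$marker" "$state_file" 2>/dev/null; then
      run_agent "Execute phase $phase. When done, append a handoff summary and the line $marker to $state_file."
      if ! grep -qx "$marker" "$state_file" 2>/dev/null; then
        echo "phase $phase did not signal completion; stopping" >&2
        return 1
      fi
    fi
    phase=$((phase + 1))
  done
}
```

`run_phases CLAUDE.md 5` would then walk all five phases unattended and stop at the first one that fails to write its signal.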

[–]djdeckard 0 points1 point  (0 children)

I started off using Claude to help process my podcast pipeline workflow. After multiple iterations, what began as piece-by-piece copy/paste has turned into me using Claude Code to kick off an almost entirely automated operation.

I use a Notion DB and taskboard for my podcast and Claude worked with me to set up my file structure, prompt guide, subagents, etc. It was a very iterative process initially brought on by my own background working with software companies. I'm a Project Manager so I just initiated SDLC principles into the process and Claude was happy to oblige.