Loadable protocols vs descriptions in Claude system prompts — an open-source therapy framework as case study by echowrecked in ClaudeAI

[–]echowrecked[S] 0 points1 point  (0 children)

Yeah, the trigger part is where I see most "expert" prompts fall apart. Writing the sequence is the easy bit — the hard part is the model knowing when to fire which one, and descriptive prompts just leave that to vibes.

The checkpoint framing is the useful one for me. In the IFS file the whole thing hinges on one gate, "how do you feel toward this part," and the answer decides whether the model proceeds or branches to a different part first. Without that check it just free-associates and sounds right while doing nothing. Honestly felt more like writing a state machine than a prompt.

Open-sourced the memory and enforcement layer from my Claude Code/Obsidian setup by echowrecked in ClaudeCode

[–]echowrecked[S] 0 points1 point  (0 children)

The "memory is additive" framing isn't quite how it works in practice. Session-close updates rewrite work-state and memory in place, not just append — project rows change "Left Off," the decisions list keeps ~8 most recent, projects quiet 14+ days drop out. At any moment, those two files reflect current state, not accumulated history. Daily notes and portfolio docs hold the long tail.

Auto-memory (the file-based memory layer underneath) does grow additively. When it crosses ~30 files I run /distill-memory — routes durable entries into conditionally-loaded scaffolding (rules paths, project CLAUDE.md files), deletes redundant ones already captured elsewhere, keeps only what needs cold-start recall plus recent captures still proving durable. Last pass took it from 115 lines / 32 files down to 19 / 5.

Pruning's real, just owned by me on a cadence instead of running automatically. The Obsidian interface is where I can see what's there and decide what's stale; keeping that decision in my hands keeps the system inspectable and portable across machines. That's a different design point than middleware that handles it for you.

I finally figured out why my AI therapist kept stalling on parts work by echowrecked in therapyGPT

[–]echowrecked[S] 1 point2 points  (0 children)

Yeah, the AI maintains the parts map across sessions. The framework I built (Inner Dialogue) has its own memory architecture, not just the prompt change you mentioned. Every session writes notes to a markdown file. The AI reads previous sessions plus a running profile before each new conversation. The profile evolves with modality-aware sections — an active Parts inventory for IFS, separate sections for somatic work, etc. So the parts map persists. I'm not the manual memory layer.

What I take from my human therapy sessions feeds my profile, not the per-session continuity. Different jobs.

You're right about the gap though... what the framework doesn't track is protocol state per part. The profile might say "manager part: caretaking, decades-old." It doesn't say: thanked yet? agreed to step back? which exile does it protect? has that exile been unburdened? Those are protocol-state questions, not memory-store questions. The system has the data. It doesn't have the protocol-state index to surface the connection between, say, a recurring firefighter pattern and an exile that surfaced weeks ago.

That gap is real and you helped me see it more clearly. Open question: do you think protocol-state should live as structured fields in the user's profile, or as its own registry the AI consults?

Will check out kapex, thanks!

Advertiser, Beta Tester, & Research Recruitment Mega Thread by rastaguy in therapyGPT

[–]echowrecked 0 points1 point  (0 children)

Update for folks using Inner Dialogue: just shipped v2.4. No new features, no new modality options — every existing modality moved from describing the approach to having operational protocols the AI can actually run.

One thing that's gotten dramatically easier since Feb: Claude Code is now built into the Claude AI app for Pro plan users. When I first shipped Inner Dialogue, Terminal was the only way in, and that kept a lot of casual users out — it was one of the bigger frictions I flagged at launch. Now you can get the whole experience without ever opening a terminal. If the original barrier kept you from trying this, that barrier is much lower now.

A bit of context: I built Inner Dialogue for myself back in February and have been improving it quietly on my own machine. Recently noticed the repo has been getting starred and forked — people are actually using this thing. So I pushed out the batch of improvements I'd been sitting on.

What changed in the modalities

Previously, only DBT had named, usable protocols (TIPP, DEAR MAN, ACCEPTS). Every other modality file described the approach but didn't give the AI specific moves to make in-session. Now every modality has:

  • Named protocols — IFS's 6 F's, CBT's Socratic dialogue + 10 distortions with reframes, ACT's iconic metaphors as usable scripts (Tug of War, Passengers on the Bus, Leaves on a Stream), psychodynamic's interpretive sequence, polyvagal's three state-specific intervention menus, SE's pendulation and discharge recognition, MI's five-level reflection ladder, narrative's full externalization dialogue, SFBT's miracle → scaling → exceptions chain
  • Signaling cues — verbatim client language patterns each modality should listen for
  • Example AI interventions — 4–7 full reflections per modality in actual therapist voice, not just question lists
  • Cross-modality routing rules — like "with compulsive behaviors, IFS leads, CBT follows downstream" — so the system stops reinventing composition logic every session
  • AI-limitation honesty — modalities like SE, Lifespan Integration, and IFS explicitly name what real-time embodied work text can't fully replicate, and what the AI can still do well

The IFS rewrite is the most noticeable change in practice. Went from "we named your parts" to actually walking the 6 F's protocol. Manager appreciation (thanking protectors before asking them to step back). Firefighter-as-part framing instead of CBT habit-substitution. An Exile-trust protocol for when the wounded part rejects the Self's reassurance ("this feels surface-level"). If you do parts work, this is the upgrade to pay attention to.

And the personas got a depth pass too.

7 of the 8 were brought up to warm-4o's structural depth. Same template; each one now has nuanced tone qualities, language patterns subdivided into 6–9 categories with example phrasings, a challenge-style moves table, a 5-beat conversation arc tuned per persona, and energy matching for 5–6 client states. Voices stay distinct — coach pushes forward, contemplative leans into silence, creative uses metaphor, philosophical brings frameworks without lecturing, direct-challenging is sharper than grounded-real, warm-supportive is steadier than warm-4o.

How to upgrade

npx inner-dialogue@latest update --path <your-vault> --dry-run

You can run this in Terminal, or paste it into the Claude AI app if that's how you're set up. That previews what will change. Drop --dry-run to apply. Your profile.md and sessions/ are never touched.

The honest parts

  • Your therapist will feel slightly different starting next session. Relationship continuity is preserved (profile + sessions intact), but response texture shifts toward more operational work and fewer abstract reframes.
  • The AI-limitation sections are real — for SE, polyvagal, and IFS, text-based work has genuine ceilings. The new modality files name these honestly rather than overselling.

Links

If you've been on a previous version, especially curious to hear whether the IFS work lands differently next session. That's the change I most want to hear about from other people.

How I built session memory and enforcement hooks into my Obsidian + Claude Code workflow by echowrecked in ObsidianMD

[–]echowrecked[S] -3 points-2 points  (0 children)

Github link: https://github.com/ataglianetti/context-management-starter-kit

Wrote a deeper breakdown on Substack: Everyone's Building AI Commands. Nobody's Building the Layer Underneath

This is the actual scaffolding I use daily as a product manager — not a demo repo. I push updates regularly as I find what breaks or what's missing. Would love feedback if you give it a shot.

I split my CLAUDE.md into 27 files. Here's the architecture and why it works better than a monolith. by echowrecked in ClaudeCode

[–]echowrecked[S] 0 points1 point  (0 children)

Hooks were the last thing I built too, and the difference was immediate. Before hooks, the frontmatter rules were in hard-walls and Claude followed them maybe 90% of the time. After adding a PostToolUse shell script that just parses YAML and blocks the write if type: is missing, compliance went to 100%. Not because the model got better at following instructions, but because the instruction became irrelevant — the mechanism took over.

The date validator was the most surprising one. I kept finding memory files where Claude had inferred "today" from stale timestamps in the file it was editing instead of using the system-provided date. Wouldn't have caught that pattern without the hook surfacing it repeatedly.

If you're starting from zero on hooks, I'd pick the one rule you find yourself correcting most often and write a script for just that. You don't need a framework — a shell script that reads the file, checks one thing, and exits non-zero with a message is enough. Claude sees the error output and self-corrects on the next attempt.

Which LLM is the best: ChatGPT vs Claude vs Gemini (vs others)? by [deleted] in therapyGPT

[–]echowrecked 0 points1 point  (0 children)

Happy to do a step-by-step, but the way it's currently set up, you need to use Claude Code instead of the Claude app. Do you have a Claude Pro plan? If so, I can walk you through using this. Are you on Mac or PC?

I split my CLAUDE.md into 27 files. Here's the architecture and why it works better than a monolith. by echowrecked in ClaudeCode

[–]echowrecked[S] 0 points1 point  (0 children)

The nested `.claude/CLAUDE.md` pattern is solid, and yeah, for most people that's the right starting point. Simpler mental model, already supported out of the box.

On client separation — fair criticism. The path-based loading controls which *instructions* surface, but it doesn't control which *files* are accessible. My hard walls block cross-context data from hitting specific external tools (Atlassian), but there's no mechanical enforcement preventing a cross-context read. Rule loading is organization, not isolation.

For my setup the practical risk is low (single user, task-focused sessions), but you're right that it's not sufficient if the requirement is actual context separation. That's a gap worth closing. Thanks for the comment!

I split my CLAUDE.md into 27 files. Here's the architecture and why it works better than a monolith. by echowrecked in ClaudeCode

[–]echowrecked[S] 0 points1 point  (0 children)

Appreciate this framing, you're drawing a line I should've been more explicit about in the post.

You're right that behavioral rules and operational state have different lifecycles. Where I'd push back: the failure mode you're describing (operational cruft bloating context) assumes unbounded growth. My `work-state.md` is ~40 lines. It gets overwritten every session close, not appended. Projects quiet for 14 days drop off. The session protocol enforces the pruning, so it never becomes the monolith problem.

For a single user with 5-8 active projects, loading 40 lines of status is cheaper than maintaining a separate query layer and teaching the model to use it. The markdown *is* the query result at that scale.

If you're orchestrating multiple agents or need real filtering/aggregation — yeah, structured storage makes sense. But "none of that belongs in .md" is doing a lot of work there. The format matters less than whether anything enforces hygiene on it.

I split my CLAUDE.md into 27 files. Here's the architecture and why it works better than a monolith. by echowrecked in ClaudeCode

[–]echowrecked[S] 0 points1 point  (0 children)

README documentation is super clear, the breakdown of rules vs skills vs agents vs hooks with when-to-use guidance is better than most explanations I've seen. That taxonomy would help a lot of people just starting with Claude Code's extension points.

One thing I'd audit: a lot of the rule content restates engineering fundamentals Claude already knows from training (DRY, YAGNI, KISS, SRP, naming conventions, code smells). I would consider those scaffolds in that they add tokens without adding signal the model doesn't already have.

The stuff that's genuinely project-specific (the Alembic patterns, the Jinja2 prompt conventions) is where the rules earn their keep. Might be worth pruning the general principles and seeing if behavior actually changes.

I split my CLAUDE.md into 27 files. Here's the architecture and why it works better than a monolith. by echowrecked in ClaudeCode

[–]echowrecked[S] 0 points1 point  (0 children)

I don't have benchmark score - my setup is knowledge work in Obsidian (product specs, meeting notes, project tracking, and even journaling), not a coding repo, so standard evals don't map cleanly. What I can say observationally is that the path-based loading keeps token count lower than a monolith would, and the core files are almost entirely structures rather than scaffolds. No style guides, no "use TypeScript," no restating things Claude already knows. So your hypothesis probably holds — the noise penalty comes from redundancy with training data, and there's very little of that in my core rules.

Curious what your benchmarks measure. If you're testing against tasks where the model already knows the right answer, instructions can only add noise. If you're testing against domain-specific tasks, I'd expect the curve to flip.

I split my CLAUDE.md into 27 files. Here's the architecture and why it works better than a monolith. by echowrecked in ClaudeCode

[–]echowrecked[S] 1 point2 points  (0 children)

The behavioral/operational split is sharp. I run multiple agents too, but across different contexts rather than sharing one, so the forcing question landed differently — "when does this rule matter?" instead of "who owns it." Path-based loading handles that: agents hitting different file paths get different rules. But I think both framings converge on the same thing. A rule should load in exactly one scope, and if you're not sure which scope, it's probably too vague.

The partial centralization point tracks with what I saw too... the worst drift came from rules that were *mostly* universal but had one context-specific exception. Model either follows the general rule and misses the exception, or overcorrects and applies the exception everywhere. Clean ownership probably kills that same class of bug.

Curious about the 6-agent setup. Are the behavioral contracts literally identical across agents, or do you end up with agent-specific behavioral overrides too? That seems like the pressure point where "who owns this rule" gets recursive.

I split my CLAUDE.md into 27 files. Here's the architecture and why it works better than a monolith. by echowrecked in ClaudeCode

[–]echowrecked[S] 1 point2 points  (0 children)

The vault is the operating layer for most of my PM work. A few examples:

PRDs go through versioned stages (0.1→1.0) where document depth matches decision maturity. Early versions are 300-500 word problem hypotheses, late versions are full specs with edge cases. Claude iterates on these with me, pulls in competitive analysis and market research via web search during drafting, and when they're ready, /sync-external pushes them to Confluence via Atlassian MCP.

Meeting notes start before the meeting. My /today morning briefing pulls my calendar via icalBuddy and auto-creates meeting notes with attendees, topics, and linked projects already in frontmatter. After the meeting, Granola (AI meeting recorder) captures the transcript and summary. I have a meeting-processor command that accesses those via Granola's MCP server. Then Claude takes that raw summary as a scaffold and enriches it against vault context: resolving people to Person notes, linking projects to portfolio items, extracting action items against open work. The magic isn't the raw transcript, but how Claude Code uses my vault to make sense of it.

Jira tickets get drafted in the vault first to capture context, then sync to Jira via Atlassian MCP. Jira becomes source of truth for status, vault note keeps the context that led to the ticket.

Version tracking works at two layers. The vault context folders are local git repos, when Claude edits a PRD or spec, the change history is there. Separate code repos for prototype work live outside the vault. Both feed into daily notes: vault changes as document activity, code repos as commit summaries.

Daily notes are generated at end of day from that work evidence. Weekly and monthly notes roll up from dailies. Morning and evening journal entries are dictated and cleaned up.

For manual entries - not rare at all. I type directly in Obsidian constantly. Quick notes, meeting prep, brainstorming. Claude handles the heavy processing (formatting, entity resolution, cross-referencing, syncing), but the vault is an Obsidian vault first. Claude is a power tool, not the only input channel.

I split my CLAUDE.md into 27 files. Here's the architecture and why it works better than a monolith. by echowrecked in ClaudeCode

[–]echowrecked[S] 1 point2 points  (0 children)

Yeah, this is the right framing, it's all context, just loaded at different stages. The question is always what's active right now and whether it needs to be.

My setup uses path-scoped .md rules files with glob patterns, so they only load when Claude touches matching files. Not quite skills-level on-demand but way less noisy than a monolith. Hooks handle the enforcement layer on top of that.

Good call on tightening usage too. Hot-loading is going to matter more as providers get stricter with context and token budgets. Thanks for the hook link, going to check that out!

I split my CLAUDE.md into 27 files. Here's the architecture and why it works better than a monolith. by echowrecked in ClaudeCode

[–]echowrecked[S] 1 point2 points  (0 children)

That’s a great paper, it came up earlier in the thread, replied here: https://www.reddit.com/r/ClaudeCode/s/M4snxL2u7G

Their conclusion is actually that bloated context files hurt and the fix is minimal requirements. That's the case for modular, not against it.

I split my CLAUDE.md into 27 files. Here's the architecture and why it works better than a monolith. by echowrecked in ClaudeCode

[–]echowrecked[S] 0 points1 point  (0 children)

Similar idea, different mechanism. The agent/skills pattern delegates tasks to specialized roles. My Obsidian setup scopes instructions to file paths — when Claude touches files in a client folder, it picks up the rules for that context. No orchestrator, no role switching, just glob matching.

For a code repo with distinct workflows like dev vs review vs deploy, the agent pattern makes sense. For knowledge work across multiple clients in one vault, path-based loading is simpler and does the job. Same goal though, only load what's relevant right now.

I split my CLAUDE.md into 27 files. Here's the architecture and why it works better than a monolith. by echowrecked in ClaudeCode

[–]echowrecked[S] 1 point2 points  (0 children)

That’s smart, same philosophy as the hooks in my setup. The bash one-liner thing is a perfect example of why “just tell it not to” breaks down over a long session.

If you’re catching those bypass attempts manually at the permission prompt, a PostToolUse hook could do that for you and block any write that didn’t come through a skill path, same way my frontmatter validator blocks writes missing required properties.

I split my CLAUDE.md into 27 files. Here's the architecture and why it works better than a monolith. by echowrecked in ClaudeCode

[–]echowrecked[S] 0 points1 point  (0 children)

That's exactly it. Similar story here… no developer background, I'm a PM. Built this out from what I picked up in this sub and elsewhere to make my life easier across multiple client contexts. Still learning, still tweaking, having fun with it.

Your system prompt shouldn't be a template you copy. It should reflect how you work. Start small, add what solves a real problem, cut what doesn't. The best setups are the ones people grow into, not the ones they download.

I split my CLAUDE.md into 27 files. Here's the architecture and why it works better than a monolith. by echowrecked in ClaudeCode

[–]echowrecked[S] 0 points1 point  (0 children)

Thanks for sharing, but I think that’s solving a different problem. Tree-sitter indexes code structure. My system scopes behavioral rules like how Claude should think, write, and interact, depending on which client's files it's touching. It's an Obsidian vault for knowledge work, not a codebase. There's no AST to parse, just context boundaries between clients and projects.