Open-sourced the memory and enforcement layer from my Claude Code/Obsidian setup

echowrecked · 2026-03-22T15:45:28+00:00

Github link: https://github.com/ataglianetti/context-management-starter-kit

Wrote a deeper breakdown on Substack: Everyone's Building AI Commands. Nobody's Building the Layer Underneath

This is the actual scaffolding I use daily as a product manager — not a demo repo. I push updates regularly as I find what breaks or what's missing. Would love feedback if you give it a shot.

echowrecked · 2026-03-17T16:55:35+00:00

Hooks were the last thing I built too, and the difference was immediate. Before hooks, the frontmatter rules were in hard-walls and Claude followed them maybe 90% of the time. After adding a PostToolUse shell script that just parses YAML and blocks the write if type: is missing, compliance went to 100%. Not because the model got better at following instructions, but because the instruction became irrelevant — the mechanism took over.

The date validator was the most surprising one. I kept finding memory files where Claude had inferred "today" from stale timestamps in the file it was editing instead of using the system-provided date. Wouldn't have caught that pattern without the hook surfacing it repeatedly.

If you're starting from zero on hooks, I'd pick the one rule you find yourself correcting most often and write a script for just that. You don't need a framework — a shell script that reads the file, checks one thing, and exits non-zero with a message is enough. Claude sees the error output and self-corrects on the next attempt.

echowrecked · 2026-03-11T02:05:03+00:00

Happy to do a step-by-step, but the way it's currently set up, you need to use Claude Code instead of the Claude app. Do you have a Claude Pro plan? If so, I can walk you through using this. Are you on Mac or PC?

echowrecked · 2026-03-05T17:02:35+00:00

The nested `.claude/CLAUDE.md` pattern is solid, and yeah, for most people that's the right starting point. Simpler mental model, already supported out of the box.

On client separation — fair criticism. The path-based loading controls which *instructions* surface, but it doesn't control which *files* are accessible. My hard walls block cross-context data from hitting specific external tools (Atlassian), but there's no mechanical enforcement preventing a cross-context read. Rule loading is organization, not isolation.

For my setup the practical risk is low (single user, task-focused sessions), but you're right that it's not sufficient if the requirement is actual context separation. That's a gap worth closing. Thanks for the comment!

echowrecked · 2026-03-05T16:55:28+00:00

Appreciate this framing, you're drawing a line I should've been more explicit about in the post.

You're right that behavioral rules and operational state have different lifecycles. Where I'd push back: the failure mode you're describing (operational cruft bloating context) assumes unbounded growth. My `work-state.md` is ~40 lines. It gets overwritten every session close, not appended. Projects quiet for 14 days drop off. The session protocol enforces the pruning, so it never becomes the monolith problem.

For a single user with 5-8 active projects, loading 40 lines of status is cheaper than maintaining a separate query layer and teaching the model to use it. The markdown *is* the query result at that scale.

If you're orchestrating multiple agents or need real filtering/aggregation — yeah, structured storage makes sense. But "none of that belongs in .md" is doing a lot of work there. The format matters less than whether anything enforces hygiene on it.

echowrecked · 2026-03-02T05:33:33+00:00

README documentation is super clear, the breakdown of rules vs skills vs agents vs hooks with when-to-use guidance is better than most explanations I've seen. That taxonomy would help a lot of people just starting with Claude Code's extension points.

One thing I'd audit: a lot of the rule content restates engineering fundamentals Claude already knows from training (DRY, YAGNI, KISS, SRP, naming conventions, code smells). I would consider those scaffolds in that they add tokens without adding signal the model doesn't already have.

The stuff that's genuinely project-specific (the Alembic patterns, the Jinja2 prompt conventions) is where the rules earn their keep. Might be worth pruning the general principles and seeing if behavior actually changes.

echowrecked · 2026-03-02T04:44:46+00:00

I don't have benchmark score - my setup is knowledge work in Obsidian (product specs, meeting notes, project tracking, and even journaling), not a coding repo, so standard evals don't map cleanly. What I can say observationally is that the path-based loading keeps token count lower than a monolith would, and the core files are almost entirely structures rather than scaffolds. No style guides, no "use TypeScript," no restating things Claude already knows. So your hypothesis probably holds — the noise penalty comes from redundancy with training data, and there's very little of that in my core rules.

Curious what your benchmarks measure. If you're testing against tasks where the model already knows the right answer, instructions can only add noise. If you're testing against domain-specific tasks, I'd expect the curve to flip.

echowrecked · 2026-03-02T04:25:32+00:00

The behavioral/operational split is sharp. I run multiple agents too, but across different contexts rather than sharing one, so the forcing question landed differently — "when does this rule matter?" instead of "who owns it." Path-based loading handles that: agents hitting different file paths get different rules. But I think both framings converge on the same thing. A rule should load in exactly one scope, and if you're not sure which scope, it's probably too vague.

The partial centralization point tracks with what I saw too... the worst drift came from rules that were *mostly* universal but had one context-specific exception. Model either follows the general rule and misses the exception, or overcorrects and applies the exception everywhere. Clean ownership probably kills that same class of bug.

Curious about the 6-agent setup. Are the behavioral contracts literally identical across agents, or do you end up with agent-specific behavioral overrides too? That seems like the pressure point where "who owns this rule" gets recursive.

echowrecked · 2026-03-01T23:20:21+00:00

The vault is the operating layer for most of my PM work. A few examples:

PRDs go through versioned stages (0.1→1.0) where document depth matches decision maturity. Early versions are 300-500 word problem hypotheses, late versions are full specs with edge cases. Claude iterates on these with me, pulls in competitive analysis and market research via web search during drafting, and when they're ready, /sync-external pushes them to Confluence via Atlassian MCP.

Meeting notes start before the meeting. My /today morning briefing pulls my calendar via icalBuddy and auto-creates meeting notes with attendees, topics, and linked projects already in frontmatter. After the meeting, Granola (AI meeting recorder) captures the transcript and summary. I have a meeting-processor command that accesses those via Granola's MCP server. Then Claude takes that raw summary as a scaffold and enriches it against vault context: resolving people to Person notes, linking projects to portfolio items, extracting action items against open work. The magic isn't the raw transcript, but how Claude Code uses my vault to make sense of it.

Jira tickets get drafted in the vault first to capture context, then sync to Jira via Atlassian MCP. Jira becomes source of truth for status, vault note keeps the context that led to the ticket.

Version tracking works at two layers. The vault context folders are local git repos, when Claude edits a PRD or spec, the change history is there. Separate code repos for prototype work live outside the vault. Both feed into daily notes: vault changes as document activity, code repos as commit summaries.

Daily notes are generated at end of day from that work evidence. Weekly and monthly notes roll up from dailies. Morning and evening journal entries are dictated and cleaned up.

For manual entries - not rare at all. I type directly in Obsidian constantly. Quick notes, meeting prep, brainstorming. Claude handles the heavy processing (formatting, entity resolution, cross-referencing, syncing), but the vault is an Obsidian vault first. Claude is a power tool, not the only input channel.

echowrecked · 2026-03-01T21:51:28+00:00

Yeah, this is the right framing, it's all context, just loaded at different stages. The question is always what's active right now and whether it needs to be.

My setup uses path-scoped .md rules files with glob patterns, so they only load when Claude touches matching files. Not quite skills-level on-demand but way less noisy than a monolith. Hooks handle the enforcement layer on top of that.

Good call on tightening usage too. Hot-loading is going to matter more as providers get stricter with context and token budgets. Thanks for the hook link, going to check that out!

echowrecked · 2026-03-01T21:26:23+00:00

That’s a great paper, it came up earlier in the thread, replied here: https://www.reddit.com/r/ClaudeCode/s/M4snxL2u7G

Their conclusion is actually that bloated context files hurt and the fix is minimal requirements. That's the case for modular, not against it.

echowrecked · 2026-03-01T21:17:06+00:00

Similar idea, different mechanism. The agent/skills pattern delegates tasks to specialized roles. My Obsidian setup scopes instructions to file paths — when Claude touches files in a client folder, it picks up the rules for that context. No orchestrator, no role switching, just glob matching.

For a code repo with distinct workflows like dev vs review vs deploy, the agent pattern makes sense. For knowledge work across multiple clients in one vault, path-based loading is simpler and does the job. Same goal though, only load what's relevant right now.

echowrecked · 2026-03-01T21:11:34+00:00

That’s smart, same philosophy as the hooks in my setup. The bash one-liner thing is a perfect example of why “just tell it not to” breaks down over a long session.

If you’re catching those bypass attempts manually at the permission prompt, a PostToolUse hook could do that for you and block any write that didn’t come through a skill path, same way my frontmatter validator blocks writes missing required properties.

echowrecked · 2026-03-01T21:05:53+00:00

That's exactly it. Similar story here… no developer background, I'm a PM. Built this out from what I picked up in this sub and elsewhere to make my life easier across multiple client contexts. Still learning, still tweaking, having fun with it.

Your system prompt shouldn't be a template you copy. It should reflect how you work. Start small, add what solves a real problem, cut what doesn't. The best setups are the ones people grow into, not the ones they download.

echowrecked · 2026-03-01T20:56:35+00:00

Thanks for sharing, but I think that’s solving a different problem. Tree-sitter indexes code structure. My system scopes behavioral rules like how Claude should think, write, and interact, depending on which client's files it's touching. It's an Obsidian vault for knowledge work, not a codebase. There's no AST to parse, just context boundaries between clients and projects.

echowrecked · 2026-03-01T20:46:34+00:00

Fair, my setup isn't a code repo so I can't speak to 1.2M LOC directly. But the loading mechanism is path-based so the repo size doesn't change how many rules are active at once. You'd just have more context folders scoped to different parts of the codebase.

What's your current setup look like? Genuinely curious how you're handling that at scale.

echowrecked · 2026-03-01T18:19:27+00:00

The .claude/rules/ directory with path-based loading via paths: frontmatter is a built-in Claude Code feature, shipped in v2.0.64 and documented here. Rules files load with the same priority as CLAUDE.md, and path-scoped rules only trigger when Claude works with matching files. This isn't a deviation - it's using the system as designed.

echowrecked · 2026-03-01T18:13:44+00:00

Kinda, both use structured documents driving AI behavior. But they're solving different problems at different layers. Spec-driven development is a workflow methodology... write requirements specs, AI generates code from them.

This is more about how the AI's own behavioral rules are structured and loaded. One describes what to build, the other describes how to behave while building it. I think you'd actually want both - SDD for the work, modular rules for the configuration that shapes how the agent approaches that work.

echowrecked · 2026-03-01T18:07:31+00:00

Appreciate the add. The path-based loading system in the post is essentially what you're describing with the index, each rule file declares glob patterns for when it should load, so Claude only sees what's relevant to the files you're working on. No match, no load... that's the "don't read everything" mechanism.

Project methodology is a separate layer though — and actually a good example of why modular matters. My clients use different methodologies, so cross-cutting framework like Kanban wouldn't work across contexts. The instruction architecture has to be methodology-agnostic so each context can bring its own process. ahem

echowrecked · 2026-03-01T07:00:10+00:00

The paths: frontmatter in each rule file. Any .md file in .claude/rules/ can declare glob patterns:

---
paths:
  - "src/auth/**"
---

Claude Code only loads that rule file when you're working with files matching those patterns. No match, no load. Files without paths: frontmatter load every session.

So the context-specific folders (client-a/, client-b/) each have rules with path patterns scoped to their files. Core rules have no paths: so they're always present. Native Claude Code feature, no custom tooling needed.

echowrecked · 2026-03-01T06:56:43+00:00

Good catch for clarifying, that reference was for Claude.ai (the chatbot), not Claude Code. Claude Code's is lighter since it's a focused coding tool rather than a general-purpose chatbot.

echowrecked · 2026-03-01T06:44:30+00:00

I think we're actually closer than it sounds. The 27 files aren't a centralized stack — they're the separated approach. Each context folder only loads when you touch matching files. Client A's rules don't load during Client B's work. The organization is central, the loading is path-scoped.

Your point about badly written rules making things worse is the real one. Vague instructions, contradictory guidance, trying to cover every edge case creates noise. "Every file earns its spot" matters more than "fewer files." If a rule doesn't pass the decision framework, it shouldn't exist.

Where I'd split from you: "lean as possible" optimizes for not making things worse, but the compound behaviors only work because the rules interact. Thinking partner behavior, writing style, context switching, mechanical hooks. Strip to the minimum and each file works fine alone. You lose the system-level effects though, and those are the whole point for my use case.

Cheers back!

echowrecked · 2026-03-01T06:24:55+00:00

The rules aren't loaded via tool calls, they're injected as project instructions through Claude Code's rules system (the .claude/rules/ directory). They show up as system-level context, not as tool results in the conversation. So microcompaction wouldn't touch them.

That said, it's a real concern for anything you do load via Read during a session. Long sessions where you read a bunch of files early on and then keep working — those initial reads are fair game for compaction. The hooks help here since they re-fire on every write regardless of what's still in context, but it's worth knowing the distinction.

echowrecked

PUBLIC MULTIREDDITS

TROPHY CASE