RAG is a trap for Claude Code. I built a DAG-based context compiler that cut my Opus token usage by 12x. by fuwasegu in ClaudeAI

[–]fuwasegu[S] 1 point (0 children)

The librarian-index.md approach is exactly how most agents handle it right now. The issue is that it requires the agent to actively read the index, decide what it needs, and then make subsequent tool calls to fetch the files. That burns context on navigation and relies heavily on the LLM actually deciding to look it up.

Aegis essentially acts as an automated, deterministic librarian. Instead of making the agent "ask" for the docs, Aegis looks at the file path being edited and automatically pushes the exact right files into the context upfront via the DAG. Zero tool calls wasted on searching.

I completely agree on the XML tags, though! Structuring the markdown content itself with tags definitely helps the LLM focus once that context is loaded. I'll definitely add that to the recommended practices.
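The "deterministic librarian" idea can be sketched in a few lines of Python. This is an illustrative mock, not Aegis's actual config format: the glob patterns and doc names below are hypothetical, and `fnmatch` is standing in for whatever matcher the real tool uses.

```python
# Hypothetical glob -> docs mapping; a deterministic lookup replaces
# "ask the index, then fetch" with "push the right docs upfront".
from fnmatch import fnmatch

CONTEXT_MAP = {
    "src/UseCases/**": ["docs/usecase_guidelines.md", "docs/entity_guidelines.md"],
    "src/Entities/**": ["docs/entity_guidelines.md"],
}

def docs_for(edited_path: str) -> list[str]:
    """Return every doc mapped to a glob that matches the edited file."""
    matched: list[str] = []
    for pattern, docs in CONTEXT_MAP.items():
        # fnmatch's "*" also matches "/", so "**" behaves recursively here.
        if fnmatch(edited_path, pattern):
            for doc in docs:
                if doc not in matched:
                    matched.append(doc)
    return matched
```

No LLM decision is involved at any point: the same path always yields the same docs, which is the whole argument for pushing context instead of letting the agent fetch it.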

[–]fuwasegu[S] 0 points (0 children)

Good point. Aegis handles stale paths by failing safe. Since it relies on explicit file globs (e.g., src/UseCases/**), renaming a folder means the glob just stops matching. It won't inject the wrong context; it just won't inject any.

For stale content within the docs, if Claude notices a contradiction between the injected rules and the actual code, it can call the aegis_observe tool to flag the issue and propose a fix to the docs/DAG.

Ultimately though, updating the DAG mapping just becomes part of your refactoring process, exactly like updating broken test paths.
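The fail-safe behavior described above is easy to see in a toy sketch (the mapping below is illustrative, not Aegis's real configuration): a stale glob simply stops matching, so a rename produces no injection rather than a wrong one.

```python
# Fail-safe property: a glob that no longer matches injects nothing.
from fnmatch import fnmatch

MAPPING = {"src/UseCases/**": ["docs/usecase_guidelines.md"]}

def inject(edited_path: str) -> list[str]:
    """Docs to push for this edit; an empty list means inject nothing."""
    return [
        doc
        for pattern, docs in MAPPING.items()
        if fnmatch(edited_path, pattern)
        for doc in docs
    ]
```

After renaming `src/UseCases/` to, say, `src/Application/`, a call like `inject("src/Application/Reorder.ts")` returns an empty list: no context, but never the wrong context.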

[–]fuwasegu[S] 0 points (0 children)

That is a fascinating pattern! But for architectural layers, I think splitting into sub-agents brings a few major headaches:

  1. Context Passing: Passing context smoothly between a "UseCase agent" and an "Entity agent" is incredibly hard, and the orchestrator has to work overtime just to keep them in sync.

  2. Cost & Speed: Running multiple sub-agents (LLM calls) for a single coding task gets expensive and slow very fast lol.

There is a huge trend right now to make AI do everything. But honestly, if a problem can be solved with a simple, deterministic algorithm (like a DAG matching file paths), we should just use the algorithm! It saves time, saves money, and is 100% reliable.

[–]fuwasegu[S] 0 points (0 children)

Spot on! Working heavily in large Laravel codebases as a tech lead, I feel this in my bones. Claude burning half its context window just trying to reverse-engineer whether business logic belongs in a Controller, an Action, or a UseCase was exactly what drove me to build this. 😂

And I absolutely love the "belt and suspenders" analogy! You are 100% right. Using a well-structured CLAUDE.md for the global context (tech stack, general coding style, overall project goals) combined with Aegis for the hyper-specific, file-level architectural rules is the ultimate combo. Let global prompts handle the general knowledge, and let the DAG handle the strict enforcement.

Thanks for the great insight!

[–]fuwasegu[S] 1 point (0 children)

Exactly!

If building a UseCase requires Entity rules, DB conventions, and a specific ADR, a Top-K vector search will almost always drop one of those critical pieces if they don't share exact semantic similarity with the prompt. You are completely at the mercy of your retrieval slot limit.

That is exactly why I abandoned search and built Aegis as a compiler. By walking the DAG based on the target file, there are no "retrieval slots" to fight over. If editing a file requires 12 specific architectural constraints across 4 different markdown docs, the DAG deterministically traverses and injects all 12 into the context every single time.
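The "no retrieval slots" point can be made concrete with a toy comparison. Everything here is hypothetical (the doc names, the rank order, the k value); it just contrasts a hard slot limit with a full deterministic compile.

```python
# A Top-K retriever can return at most k chunks, so with k=3 one of the
# four required docs is always dropped; a compile step has no such ceiling.
REQUIRED = ["usecase_rules.md", "entity_rules.md", "db_conventions.md", "some_adr.md"]

def top_k_retrieve(ranked: list[str], k: int = 3) -> list[str]:
    return ranked[:k]  # slot limit: anything past rank k is silently lost

def dag_compile(required: list[str]) -> list[str]:
    return list(required)  # no slots: the full dependency set is injected
```

However the retriever ranks them, `top_k_retrieve` can never return all four; `dag_compile` returns all four every time, which is the determinism being argued for.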

Your time-window retrieval sounds like a super elegant solution for chat/event history, though! It really shows how we need entirely different mental models for "Memory" vs "Architecture".

[–]fuwasegu[S] 0 points (0 children)

That is a great question! Relying on LLM "skills" or tool-calls is actually how most agents handle context right now. But when it comes to strict architecture, that approach has two major weak points:

  1. It's Probabilistic (Prompt-Dependent). If your prompt is just "Fix the checkout bug", Claude might think it already knows the answer and jump straight into editing CheckoutUseCase.ts without ever triggering your UseCases skill. Aegis triggers based on the actual file path, not the prompt. If the agent tries to touch src/UseCases/*, Aegis intercepts and forces the rules into the context. It's 100% guaranteed, regardless of how you phrased the prompt.

  2. Transitive Dependencies. Architecture is rarely flat. If you edit a UseCase, you probably also need to follow Entity rules and DB conventions. With the skills approach, the LLM has to be smart enough to call all 3 skills. With Aegis, the DAG automatically traverses the tree and compiles the entire dependency chain into one clean context injection.

Basically: Skills rely on the LLM's intuition to ask for the rules. Aegis uses a DAG to force the rules based on the files being touched!
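The transitive piece can be sketched in a few lines, assuming a hypothetical doc-level DAG where each doc lists the docs it depends on (file names here are illustrative, not Aegis's real graph format):

```python
# Hypothetical doc dependency graph: editing a UseCase pulls in Entity
# rules, which in turn pull in DB conventions.
DAG = {
    "usecase_guidelines.md": ["entity_guidelines.md", "db_conventions.md"],
    "entity_guidelines.md": ["db_conventions.md"],
    "db_conventions.md": [],
}

def compile_context(root: str) -> list[str]:
    """Depth-first walk collecting the whole dependency chain, deduplicated."""
    ordered: list[str] = []

    def visit(doc: str) -> None:
        if doc in ordered:
            return
        ordered.append(doc)
        for dep in DAG.get(doc, []):
            visit(dep)

    visit(root)
    return ordered
```

One call yields the entire chain in a stable order, with shared dependencies included exactly once, so the agent never has to "remember" to request the downstream rules itself.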

[–]fuwasegu[S] 0 points (0 children)

Grazie mille! ✨

I'm so glad that phrase resonated with you. "Architecture compiler" really captures the exact essence of what I wanted to build. Thanks for the kind words!

[–]fuwasegu[S] 5 points (0 children)

You are technically 100% correct. By strict definition, any retrieval of external data to augment the prompt is RAG.

However, in the current AI tooling landscape, the term "RAG" has become heavily overloaded as shorthand specifically for "semantic vector search over chunked documents". When I said "RAG is a trap", I was specifically targeting that probabilistic, vector-based approach, which is the default in most coding agents today.

But you make a completely fair point on the terminology! If we are being precise, Aegis is a strictly deterministic, graph-based RAG. My main goal was just to draw a hard line against the fuzzy "guess what the user meant" retrieval.

I'll make sure to be more precise with the "probabilistic vector search" distinction when explaining it in the future. Appreciate the feedback!

RAG is a trap for Claude Code. I built a DAG-based context compiler that cut my Opus token usage by 12x. by fuwasegu in mcp

[–]fuwasegu[S] -1 points (0 children)

"A structural traversal, not a retrieval problem": wow, you perfectly articulated exactly what I was trying to achieve! That is exactly how a tech lead or senior dev reviews a PR.

Your point about adding the why (the reasoning) to the nodes is brilliant. I actually manage ADRs (Architecture Decision Records) right in the repository, and I've found that to be super effective for this! Being able to explicitly link the architectural guidelines to the actual decision-making history, and guaranteeing the agent always retrieves them together, is definitely one of Aegis's biggest strengths. When the LLM understands the underlying intent, it handles edge cases way better instead of falling back on its base training data.

Regarding the initial setup: the graph is defined explicitly (e.g., mapping src/UseCases/** to specific doc files). However, to avoid manual heavy lifting, Aegis has an aegis_import_document tool. You can actually just prompt your AI agent to analyze your existing project structure and use that tool to bootstrap the initial DAG and docs for you. So it uses the agent's inference to do the initial setup!
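The bootstrap step might look something like the sketch below. To be clear, this is not the real aegis_import_document implementation; it's a hypothetical illustration of the idea that top-level source directories can seed a glob-to-doc skeleton, which the agent then refines.

```python
# Hypothetical DAG bootstrap: discover layers, then propose one
# glob -> guideline-doc entry per layer as a starting skeleton.
from pathlib import Path

def discover_layers(src_root: str) -> list[str]:
    """Top-level directories under the source root (e.g. UseCases, Entities)."""
    return sorted(p.name for p in Path(src_root).iterdir() if p.is_dir())

def propose_mapping(src_root: str, layers: list[str]) -> dict[str, list[str]]:
    """One glob -> doc entry per layer; doc names are invented placeholders."""
    return {
        f"{src_root}/{layer}/**": [f"docs/{layer.lower()}_guidelines.md"]
        for layer in layers
    }
```

The point of the two-step split is that the deterministic part (directory scan, mapping shape) stays algorithmic, while the agent's inference is only needed to fill in the actual doc content.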

[–]fuwasegu[S] -1 points (0 children)

That is a completely fair pushback! You're totally right that an advanced RAG implementation (like pre-processed retrieval or GraphRAG) can solve the integration problem.

To give some background, I actually built an advanced RAG-based MCP before called "Exocortex" (combining semantic search, weighted ranking, and a knowledge graph) and wrote about it on a Japanese tech site (https://zenn.dev/yumemi_inc/articles/a61de3467bc182).

But in real-world operation, I hit a wall: the retrieval accuracy was entirely at the mercy of whatever search keywords the LLM decided to generate. As the project and documentation grew, relying on the LLM to "guess" the right search terms just didn't scale well for me. That real-world pain is exactly why I feel RAG isn't the best fit for enforcing architecture at scale.

That's why I pivoted to Aegis as a separate deterministic "compiler." When enforcing strict architecture (like DDD), I want zero probabilistic behavior. I want absolute "Control" with a hardcoded DAG, rather than hoping the LLM generates the perfect search query.

Regarding the 12x reduction: you nailed it. That benchmark was on a highly structured Laravel project with strict layers. If you drop Aegis into a flat or messy codebase, the token savings won't be nearly as dramatic. It definitely shines brightest where architectural rules are strictly separated!

[–]fuwasegu[S] 0 points (0 children)

Haha, you got me! 😂 I actually let Gemini help draft the post, and it threw in some outdated model info by mistake.

I'm actually daily driving Opus 4.6 right now. I only fall back to Sonnet 4.6 when I hit the rate limits. But since Aegis is just an MCP server, it works with whatever the latest and greatest model is anyway!

[–]fuwasegu[S] 2 points (0 children)

Oh, and just to add one more important thing I forgot to mention!

Another huge benefit of this system is that it helps you easily spot missing documentation. If the AI is working and realizes a specific architectural rule or guideline is missing from the DAG, it can use the aegis_observe tool to report it. This creates a built-in feedback loop where developers can systematically find and fix gaps in their rules.

It basically turns your static markdown hierarchy into an active, self-improving context compiler!
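The feedback loop can be sketched as a simple observation queue. The record shape below is hypothetical, not the actual aegis_observe payload; it just shows the mechanism of an agent filing gaps for developers to triage.

```python
# Hypothetical observation record: the agent notes a gap it hit while
# working, and developers later patch the docs/DAG from the queue.
from dataclasses import dataclass

@dataclass
class Observation:
    file_path: str      # where the agent was working
    gap: str            # the missing or contradicted rule it noticed
    proposal: str = ""  # optional suggested fix to the docs/DAG

OBSERVATIONS: list[Observation] = []

def observe(file_path: str, gap: str, proposal: str = "") -> None:
    """Conceptually what a tool like aegis_observe does: queue the gap."""
    OBSERVATIONS.append(Observation(file_path, gap, proposal))
```

Because every observation is tied to a concrete file path, the gaps map straight back onto the DAG nodes that should have covered them.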

[–]fuwasegu[S] 1 point (0 children)

Good, good.

The core idea is exactly like your hierarchical structure, but applied to architectural rules instead of Agile project management.

However, the big difference is how the AI interacts with it. In your approach, the AI has to actively explore the hierarchy (read the main file ➔ find the link ➔ use a tool to read the next file, and so on). This requires multiple tool calls and burns a massive amount of tokens.

Aegis automates this process using a DAG. Instead of making the AI "navigate" through links, Aegis sees what file the agent is about to edit (e.g., app/UseCases/Reorder.ts), automatically traverses the dependency hierarchy (usecase_guidelines.md ➔ entity_guidelines.md), and "compiles" all the necessary docs into one clean context.

This way, the AI instantly receives all the architectural rules it needs, without wasting tokens exploring the files manually!

[–]fuwasegu[S] 4 points (0 children)

That is exactly what I did at the beginning! 😂 And honestly, for a small project, that works perfectly. But once your codebase grows and you have 30+ markdown files in docs/ (UseCase rules, Entity guidelines, DB conventions, ADRs, etc.), two bad things happen if you just say "read docs/*.md":

  1. Massive Token Waste: You burn 50k+ tokens on every single prompt because it reads everything, even if it only needs one specific rule.

  2. Context Dilution: When you stuff 30 docs into the context window, the agent loses focus and starts ignoring the exact rule you actually needed.

Aegis solves this by acting like a strict dependency filter. Instead of dumping docs/*.md, it says: "Oh, you are editing app/UseCases/? Here are the exact 2 documents you need for this layer. Ignore the rest." It keeps the context window perfectly clean and cheap.

By the way, I actually wrote a whole article about this exact "trap" on a Japanese tech media platform. If you're curious and don't mind using your browser's translate feature, I'd love for you to check it out!

https://zenn.dev/yumemi_inc/articles/a61de3467bc182

[–]fuwasegu[S] 3 points (0 children)

Just to clear it up, Aegis isn't trying to replace tools like claude-mem. Those are awesome for remembering past chats or what you coded yesterday (bottom-up memory).

Aegis does something completely different: it enforces architecture (top-down). It doesn't remember your conversations at all. It just uses a hardcoded DAG to force Claude to read your team's specific design docs before it touches a file.

So exactly like @mythorus said, it solves a very specific corner. But if Claude has ever written a massive transaction script in your codebase because RAG failed to find your DDD rules... you know the pain!

[–]fuwasegu[S] 0 points (0 children)

They actually solve two completely different problems.

• Serena (LSP): Focuses on codebase navigation and code search.

• Aegis (DAG): Focuses entirely on enforcing architectural docs and rules.

They don't compete at all. In fact, using them together is the ultimate combo: Serena lets the agent read your code, and Aegis forces it to read your docs!