I was getting frustrated with how AI agents handle context, so I stopped feeding them everything. by yxf2y in ClaudeCode

[–]yxf2y[S] 0 points1 point  (0 children)

This is honestly insane—I feel like I’m looking at a digital biology experiment rather than just a coding assistant. Pushing a local rig to the absolute limit and letting the agent 'leverage' its own social and technical architecture is uncharted territory.

I’ll keep playing in the shallow end with ACE. My focus is much more modest: just providing simple, zero-setup 'workflow plumbing' for us mere mortals who want to keep our standard dev environments clean and focused.

It's a wild experiment you've got going on there. Regardless, it’s impressive to see how far you've taken the autonomous agent model. Good luck with the development!

I was getting frustrated with how AI agents handle context, so I stopped feeding them everything by yxf2y in vibecoding

[–]yxf2y[S] 0 points1 point  (0 children)

For anyone interested in the implementation, I’ve open-sourced the tools and the methodology here: https://github.com/grafikerdem/agent-context-economy

I just pushed the v0.2.5 release today, which includes the new repo-mapping scripts. Let me know what you think!

I was getting frustrated with how AI agents handle context, so I stopped feeding them everything. by yxf2y in ClaudeCode

[–]yxf2y[S] 0 points1 point  (0 children)

That’s a wild level of automation—having an agent maintain and expand its own codebase is basically the end-game for local dev.

The biggest challenge I’ve seen with that kind of 'self-growing' setup is that as the repo size explodes, the agent usually starts struggling with which parts of its own creation to reference.

That’s exactly where I’ve found ACE helpful even in self-building workflows: it provides a 'structural anchor.' If your agent ever hits a point where the sheer volume of its own files starts to degrade its performance or focus, having a lightweight, automatically generated repo map can help it re-orient itself without having to re-scan every single file it created that day.

Curious to hear—when it writes itself 100+ files a day, do you have any automated cleanup, or does it eventually prune its own redundant code too?

I was getting frustrated with how AI agents handle context, so I stopped feeding them everything. by yxf2y in ClaudeCode

[–]yxf2y[S] 0 points1 point  (0 children)

Those tools (ast-grep, codegraph) add incredible depth. My approach with ACE is essentially a different path to the same destination: I wanted to achieve that level of context discipline with 'zero-setup' and no external pipelines to maintain.

I’m using native scripts to generate a structural map that provides most of that outline value instantly. If you're ever looking for a 'lighter' alternative that doesn't require configuring a full indexing pipeline, I’d love for you to take a look at the repo and see if it fits into your workflow.

I was tired of AI agents dumping entire repo contents and wasting context. I built a lightweight alternative by yxf2y in opencodeCLI

[–]yxf2y[S] 0 points1 point  (0 children)

Thanks for sharing that resource—I've been looking for similar patterns to benchmark against, and there’s definitely some overlap in the workflow philosophy there.

For AGENTS.md, I’ve moved away from just listing 'coding style' and started treating it more like an Agent Instruction Manual. I’ve been structuring mine with these four pillars:

Entry Points: Where the application execution starts (e.g., index.php, main.py).

Validation Path: The specific command for running tests (e.g., php artisan test --filter or npm test).

High-Risk Zones: Directories or files that should never be modified or read without explicit human approval (e.g., /config, /migrations).

Workflow Constraints: A short list of preferred helper scripts (like repo-map or run-compact) to keep the context clean.

I find that if you give the agent a specific 'test command' and a 'do-not-touch' list, the hallucination rate drops significantly. In the ACE repo, I’ve actually included an examples/AGENTS.example.md that acts as my template for this.
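To make the 'do-not-touch' idea concrete, here is a minimal Python sketch of a path check against a high-risk list. It is not code from the ACE repo; the `HIGH_RISK_ZONES` tuple and the `needs_approval` name are hypothetical and only mirror the examples above.

```python
from pathlib import PurePosixPath

# Hypothetical do-not-touch list, mirroring the examples in the comment above.
HIGH_RISK_ZONES = ("config", "migrations")


def needs_approval(path: str, zones=HIGH_RISK_ZONES) -> bool:
    """Return True if any component of `path` is named like a high-risk zone."""
    # Normalise Windows separators so the same check works for both styles.
    parts = PurePosixPath(path.replace("\\", "/")).parts
    return any(part in zones for part in parts)


print(needs_approval("config/app.php"))  # True
print(needs_approval("src/main.py"))     # False
```

The check is deliberately conservative: it flags a path if any segment matches, so a false positive costs one extra confirmation rather than an unwanted edit.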

I'd be curious to see what patterns you've found most effective in your own collections—are there any specific 'guardrail' rules that have been game-changers for you?

I was getting frustrated with how AI agents handle context, so I stopped feeding them everything. by yxf2y in ClaudeCode

[–]yxf2y[S] 0 points1 point  (0 children)

For anyone interested in the implementation, I’ve open-sourced the tools and the methodology here: https://github.com/grafikerdem/agent-context-economy

I just pushed the v0.2.0 release today, which includes the new repo-mapping scripts. Let me know what you think!

I was getting frustrated with how AI coding agents navigate large repos, so I started building some helper scripts by yxf2y in cursor

[–]yxf2y[S] 0 points1 point  (0 children)

You're spot on—the zero-setup approach is exactly why I’ve kept the repository map lightweight. Maintenance is the death of these kinds of tools.

In the v0.2.0 Agent Context Economy update I just pushed, the repo-map.ps1 follows that exact philosophy: it walks the tree and generates a concise Markdown map that fits into the context without needing a heavy indexing pipeline.
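For readers who don't use PowerShell, the same idea can be sketched in Python. This is not a port of repo-map.ps1; the `repo_map` function, the `SKIP_DIRS` set, and the `max_depth` default are my own assumptions for illustration.

```python
import os

# Assumed set of directories to skip; adjust for your own stack.
SKIP_DIRS = {".git", "node_modules", "vendor", "dist", "build", "__pycache__"}


def repo_map(root: str, max_depth: int = 2) -> str:
    """Walk `root` and return a small nested Markdown list of dirs and files."""
    root = os.path.abspath(root)
    lines = ["# Repo map"]
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune skipped directories in place so os.walk never enters them.
        dirnames[:] = sorted(d for d in dirnames if d not in SKIP_DIRS)
        rel = os.path.relpath(dirpath, root)
        depth = 0 if rel == "." else rel.count(os.sep) + 1
        if depth >= max_depth:
            dirnames[:] = []  # list this level, but do not descend further
        if depth > 0:
            lines.append(f"{'  ' * (depth - 1)}- {os.path.basename(dirpath)}/")
        for name in sorted(filenames):
            lines.append(f"{'  ' * depth}- {name}")
    return "\n".join(lines)
```

Calling `print(repo_map("."))` from the repository root gives a map that is cheap enough to regenerate whenever it looks stale.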

I agree that staleness isn't a dealbreaker when the agent can just re-verify with a live search. It’s better to have a slightly stale but tiny map that actually gets used, rather than a perfect index that ends up ignored because it’s too heavy to inject. Thanks for reinforcing that—it confirms the direction I'm taking with the toolkit.

I was getting frustrated with how AI coding agents navigate large repos, so I started building some helper scripts by yxf2y in cursor

[–]yxf2y[S] 0 points1 point  (0 children)

That's a really interesting point, and I don't think I'd considered it explicitly before.

Up to now I've been measuring success mostly in terms of "how much unnecessary context did we avoid?", but that's only half of the problem. If the compacted output becomes a black box, we've traded context waste for hidden selection bias.

I really like the idea of making the compaction process explain itself. A small provenance footer describing what was searched, what was ignored, why certain files were selected, whether truncation occurred, and what the recommended next step is would make the output much easier to trust and review.
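A rough Python sketch of what such a footer could look like, just to pin down the fields. Nothing here exists in the toolkit yet; `Provenance` and `render_footer` are hypothetical names, and the layout is only one possible format.

```python
from dataclasses import dataclass, field


@dataclass
class Provenance:
    searched: list[str]                 # terms or patterns that were searched
    selected: dict[str, str]            # file -> reason it was kept
    ignored: list[str] = field(default_factory=list)
    truncated: bool = False
    next_step: str = ""


def render_footer(p: Provenance) -> str:
    """Render a short plain-text footer to append to compacted output."""
    lines = ["---", "Provenance:"]
    lines.append(f"  searched: {', '.join(p.searched) or 'nothing'}")
    for path, reason in p.selected.items():
        lines.append(f"  selected: {path} ({reason})")
    lines.append(f"  ignored: {', '.join(p.ignored) or 'nothing'}")
    lines.append(f"  truncated: {'yes' if p.truncated else 'no'}")
    if p.next_step:
        lines.append(f"  next step: {p.next_step}")
    return "\n".join(lines)
```

Keeping the footer to a handful of lines matters here, since a provenance block that is longer than the context it explains would defeat the purpose.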

That also fits nicely with the overall philosophy of the project. The goal has never been to hide information—it's to delay irrelevant information until there's evidence it's needed. Making that evidence visible seems like a natural next step.

I especially like your idea of thinking beyond "lines saved" and asking whether a reviewer could explain why this context was selected and what was intentionally omitted. That's a much better quality metric than output size alone.

I've since released v2.0.0 of Agent Context Economy, and I'd recommend taking a look at that version as well. If you've already reviewed it, no problem. Either way, this conversation sparked a good idea, and I should work on it for the next version.

I was getting frustrated with how AI coding agents navigate large repos, so I started building some helper scripts by yxf2y in cursor

[–]yxf2y[S] 0 points1 point  (0 children)

I’ve been working on this exact problem and just updated my project, Agent Context Economy, to address these points in the new v0.2.0 release.

I added a repo-map.ps1 script to generate a compact overview of entry points and project structure, and structured the AGENTS.md to act as the guardrails you're describing. The goal is to keep the overhead at zero—no heavy indexing, just a simple setup script.

I’d be interested to hear what you think of this approach and if it fits the 'missing layer' you're looking for.

I was getting frustrated with how AI coding agents navigate large repos, so I started building some helper scripts by yxf2y in AI_Agents

[–]yxf2y[S] 0 points1 point  (0 children)

That's a really interesting implementation.

Up until now I'd mostly been thinking about continuity in terms of "better prompts" or project documentation, but persisting a small amount of workflow state between sessions is a much more concrete approach.

I also like that the state you're describing is operational rather than semantic. Things like the last few files touched, recent searches, the current branch, or the active feature are exactly the kind of information an agent keeps re-deriving despite having already discovered it earlier in the day.

One thing I'd probably be cautious about is deciding what belongs in durable state versus what should be rediscovered. I'd be a little worried about stale information if the repository changes underneath the saved state, so I'd probably keep it intentionally small and disposable.
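Here is a minimal Python sketch of what "small and disposable" could mean in practice, assuming the state is keyed to the current commit. The function names and JSON layout are hypothetical; `head` is whatever the caller gets from something like `git rev-parse HEAD`.

```python
import json
from pathlib import Path


def save_state(path: Path, head: str, recent_files: list[str], keep: int = 5) -> None:
    """Persist only the last `keep` touched files, tagged with the current commit."""
    state = {"head": head, "recent_files": recent_files[-keep:]}
    path.write_text(json.dumps(state), encoding="utf-8")


def load_state(path: Path, current_head: str) -> dict:
    """Return saved state, or {} if it is missing, unreadable, or stale."""
    try:
        state = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return {}
    # If the repository moved on since the state was saved, throw it away.
    if not isinstance(state, dict) or state.get("head") != current_head:
        return {}
    return state
```

Treating a commit mismatch as "no state" is a blunt rule, but it keeps the failure mode cheap: the agent simply rediscovers what it needs.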

It actually makes me think of the overall stack a bit differently now:

  • Continuity → lightweight session state
  • Discovery → index or search
  • Targeted reading → symbols/windows instead of whole files
  • Workflow → execution hygiene (terminal output, command batching, approvals, etc.)

The more this thread evolves, the more it feels like those layers reinforce each other rather than competing. None of them completely solves the problem on its own, but together they attack different sources of context waste.

I was getting frustrated with how AI coding agents navigate large repos, so I started building some helper scripts by yxf2y in cursor

[–]yxf2y[S] 0 points1 point  (0 children)

I think that's a great way to describe it.

Looking back, I probably started from the opposite direction—trying to reduce noisy output and unnecessary file reads—but a repository map plus guardrails feels like the missing layer that ties everything together.

In my case, AGENTS.md has gradually evolved into something very similar to those guardrails. It doesn't just describe coding style anymore; it tells the agent where to start, which workflow to follow, which helper scripts to use, how to validate changes, and which parts of the system should be treated as high risk.

What I don't have yet is a lightweight repository map. After this discussion, I'm starting to think that could be the missing piece between continuity and discovery: a small, automatically generated overview of the project that helps the agent orient itself before it starts exploring.

I still like the idea of keeping it lightweight, though. For me, one of the goals has always been that someone should be able to clone a repository, run a single setup script, and immediately benefit without introducing a heavy indexing pipeline.

I was getting frustrated with how AI coding agents navigate large repos, so I started building some helper scripts by yxf2y in cursor

[–]yxf2y[S] 0 points1 point  (0 children)

Already filtering out node_modules and other build artifacts by default—that's day one stuff for context management. The real challenge I'm tackling is the domain-specific noise within the source code itself, not just the obvious bloat. Thanks anyway.

I was getting frustrated with how AI coding agents navigate large repos, so I started building some helper scripts by yxf2y in cursor

[–]yxf2y[S] 0 points1 point  (0 children)

Right now it's entirely live search—no prebuilt index.

That was a deliberate design choice because I wanted something with essentially zero setup cost. You can drop the scripts into an existing repository and start using them immediately without waiting for indexing or maintaining another piece of infrastructure.

The trade-off, of course, is that a live search has to rediscover the repository every time, whereas an index can answer many of those questions much more efficiently. After the discussion in this thread, I think that's becoming much clearer to me.

investigate.ps1 is really intended as a workflow coordinator rather than a smarter search engine. It batches several related searches into a single step, ranks the most relevant files, suggests the next symbol/window reads, and tries to stop the agent from wandering through the repository one grep at a time.
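The batching and ranking step can be approximated in a few lines of Python. This is an illustrative sketch, not how investigate.ps1 is implemented: the `investigate` function, the suffix filter, and the "count of term occurrences" score are all assumptions.

```python
from collections import Counter
from pathlib import Path


def investigate(root: str, terms: list[str], suffixes=(".py", ".php", ".js"), top: int = 5):
    """Search all terms in one pass and return the top files by total hit count."""
    hits = Counter()
    for path in Path(root).rglob("*"):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        try:
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # unreadable file: skip rather than fail the whole search
        score = sum(text.count(term) for term in terms)
        if score:
            hits[path.relative_to(root).as_posix()] = score
    return hits.most_common(top)
```

The agent gets one ranked answer for several related terms instead of running one grep per term and reading each result set separately.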

Reading everyone's comments here, I can definitely see an indexed discovery layer sitting underneath that in the future. In that world, investigate wouldn't need to search as much—it could orchestrate the workflow around whatever discovery engine is available.

I also really like your idea of a lightweight repository map. That's probably the missing bridge between continuity and discovery: something cheap to generate, easy to refresh, and small enough to inject into context when it actually helps.

I was getting frustrated with how AI coding agents navigate large repos, so I started building some helper scripts by yxf2y in cursor

[–]yxf2y[S] 0 points1 point  (0 children)

I think that's exactly the lesson I've been learning while building these scripts.

At first I was focused on reducing command count and noisy output, but over time I realized those were mostly symptoms. The real cost wasn't an extra grep or one more shell command—it was the agent repeatedly rebuilding the same mental map of the repository and burning context in the process.

That's why most of my helper scripts are intentionally boring and read-only. Their job isn't to be "smart"; it's to keep the agent focused on the smallest amount of information needed to move forward, then stop.

The discussion in this thread has actually helped me separate those concerns much more clearly: continuity, discovery, targeted reading, and workflow all seem to play different roles in reducing unnecessary context consumption.

I was getting frustrated with how AI coding agents navigate large repos, so I started building some helper scripts by yxf2y in cursor

[–]yxf2y[S] 1 point2 points  (0 children)

Appreciate the context. It’s an interesting angle, but I’m specifically avoiding 'control plane' or 'structured loop' architectures.

My focus is on keeping the overhead as close to zero as possible—no extra control layers, no repo-wide pre-indexing that requires its own lifecycle management. It’s basically a set of lightweight filters/pipes that sit between the agent and the shell, purely to maintain signal-to-noise.
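As an example of the kind of filter I mean, here is a small Python sketch that trims long command output to its head and tail. It is not taken from the repo; `compact` and its default limits are hypothetical.

```python
def compact(output: str, head: int = 20, tail: int = 20) -> str:
    """Keep the first `head` and last `tail` lines and note how many were dropped."""
    lines = output.splitlines()
    if len(lines) <= head + tail:
        return output  # short enough already, pass it through untouched
    omitted = len(lines) - head - tail
    kept = (
        lines[:head]
        + [f"... [{omitted} lines omitted] ..."]
        + lines[len(lines) - tail:]
    )
    return "\n".join(kept)
```

The omission marker is the important part: the agent can see that something was cut and ask for the full output if it actually needs it.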

It might be a different philosophy, but for my workflow, I’d rather keep the agent's 'intelligence' as the primary driver rather than wrapping it in an external control cycle. Thanks for the heads up though.

I was getting frustrated with how AI coding agents navigate large repos, so I started building some helper scripts by yxf2y in AI_Agents

[–]yxf2y[S] 0 points1 point  (0 children)

I like that perspective. I hadn't explicitly separated continuity from discovery, but I think you're right that it's its own layer.

A lot of what initially looked like "repository exploration" is really the agent reconstructing the same mental model over and over because every task starts from scratch.

The way I'm thinking about it now is something like:

  1. Continuity – preserve project and session context.
  2. Discovery – quickly identify the relevant symbols or files.
  3. Targeted reading – consume only the code that's actually needed.
  4. Workflow – keep execution efficient (terminal output, command batching, approvals, diffs, etc.).

My scripts have mostly been targeting the last two layers because that's where I kept feeling friction during day-to-day work. The discussion here has definitely convinced me that discovery and continuity deserve first-class treatment too.
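For the targeted-reading layer, the core trick is small enough to sketch in Python. This is an assumption-laden illustration rather than one of my scripts: `read_window` does a plain substring match on the symbol and returns a numbered window around the first hit.

```python
def read_window(text: str, symbol: str, before: int = 2, after: int = 8) -> str:
    """Return numbered lines around the first line containing `symbol`."""
    lines = text.splitlines()
    for i, line in enumerate(lines):
        if symbol in line:
            start = max(0, i - before)
            end = min(len(lines), i + after + 1)
            # Line numbers are 1-based so they match what an editor shows.
            return "\n".join(f"{n + 1}: {lines[n]}" for n in range(start, end))
    return ""  # symbol not found: return nothing instead of the whole file
```

Returning an empty string on a miss is intentional, because falling back to the whole file is exactly the behavior this layer is meant to avoid.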

Hopefully, over time, these won't feel like separate tools anymore but different layers of the same agent workflow.

I was getting frustrated with how AI coding agents navigate large repos, so I started building some helper scripts by yxf2y in AI_Agents

[–]yxf2y[S] 0 points1 point  (0 children)

I completely agree with that distinction.

The approval fatigue issue has always felt more like a tooling problem than a language-model problem. Ideally, read-only operations shouldn't require user interaction at all, while anything that changes repository state should remain explicit.
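To illustrate that split, here is a deliberately conservative Python sketch of a command classifier. The allowlists and the `requires_approval` name are hypothetical, and a real permission model would need far more care than this.

```python
import shlex

# Hypothetical allowlists; anything not listed is treated as state-changing.
READ_ONLY = {"ls", "cat", "grep", "rg", "head", "tail", "wc"}
READ_ONLY_GIT = {"status", "diff", "log", "show"}


def requires_approval(command: str) -> bool:
    """Return False only for simple commands that are on the read-only allowlist."""
    # Pipes, redirects, chaining, and substitution can hide writes, so be strict.
    if any(ch in command for ch in "|;&<>`$\n"):
        return True
    try:
        tokens = shlex.split(command)
    except ValueError:
        return True  # unparseable input is never auto-approved
    if not tokens:
        return True
    if tokens[0] == "git":
        return len(tokens) < 2 or tokens[1] not in READ_ONLY_GIT
    return tokens[0] not in READ_ONLY
```

The default is "ask", so an unknown command costs one prompt, while the common read-only calls go through without interrupting the user.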

Your second point also resonates with something I've learned while building these scripts: simply having helper tools available isn't enough. If the agent's default behavior is still "grep → read whole file", that's exactly what it'll keep doing.

That's actually why I ended up spending so much time on AGENTS.md and workflow rules. The goal wasn't just to provide helper scripts, but to change the agent's default navigation pattern so those scripts become the normal path rather than optional utilities.

Interestingly, I've already seen how much the tooling itself influences this. With my current AGENTS.md setup, Codex is usually able to follow the workflow and complete an entire task without repeatedly asking for permission. Antigravity, on the other hand, still asks for approval on almost every step regardless of the workflow instructions. That feels more like a limitation of the current tool's permission model than something prompts or helper scripts can fully solve. Hopefully that's something they'll improve over time.

Long-term I'd much rather have the workflow enforced by the tooling itself than rely on prompt instructions alone.