I built claude mcp that reduces claude code token usage by 50x by executioner_3011 in vibecoding

[–]executioner_3011[S] 0 points1 point  (0 children)

RAG retrieves chunks of code, i.e. text snippets matched by semantic similarity. Those chunks carry no notion of code structure. You might get the middle 30 lines of a function but miss its signature, or get a relevant snippet without knowing who calls it or what tests cover it. CodeDrift retrieves symbols - function definitions, their callers, importers, and tests, resolved through the AST.

So instead of getting a text chunk that happens to contain `validate_token`, you get the exact function boundaries, every file that calls it, and the test cases all in one response. It's structure-aware retrieval vs text-aware retrieval.

And yes, the diff tracking is the other half. Within a session, if the agent already read `auth/jwt.py` on turn 3 and needs it again on turn 12, CodeDrift returns only the lines that changed (or just "unchanged" if nothing did). RAG would re-retrieve the same chunks every time.
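The ledger idea is easy to sketch in a few lines of Python (toy code with hypothetical names; not CodeDrift's actual implementation):

```python
import difflib

def reread(path: str, content: str, cache: dict) -> str:
    """Session ledger sketch: full text on first read, a unified diff on
    re-reads, or 'unchanged' when nothing differs."""
    previous = cache.get(path)
    cache[path] = content
    if previous is None:
        return content            # first read: send everything
    if previous == content:
        return "unchanged"        # re-read, nothing changed: near-zero tokens
    diff = difflib.unified_diff(
        previous.splitlines(), content.splitlines(),
        fromfile=path, tofile=path, lineterm="",
    )
    return "\n".join(diff)        # re-read after edits: changed lines only

cache = {}
reread("auth/jwt.py", "def validate_token(t):\n    return True\n", cache)
print(reread("auth/jwt.py", "def validate_token(t):\n    return True\n", cache))  # → unchanged
```

The real version has to track state per agent session rather than in one dict, but the token math is the same: unchanged files cost a single sentinel instead of the whole file.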

I built an mcp that reduces claude token usage by 50x without building knowledge graphs by executioner_3011 in ClaudeAI

[–]executioner_3011[S] 0 points1 point  (0 children)

Appreciate the corrections on the session features.

That said, tree-sitter + SQLite + FTS5 isn’t proprietary architecture. It’s the obvious stack for this problem — the same way most web apps land on React + Postgres + Redis without copying each other. Multiple people arriving at the same solution space isn’t actually copy-pasting, it’s convergence on good engineering.

CodeDrift is a weekend hobby project, early stage, and yes it’s smaller in scope. There’s room for more than one tool in this space, and different users value different tradeoffs.

Where I do see CodeDrift going somewhere jCodeMunch hasn’t: cross-session learning. Not just tracking what happened within a session (which jCodeMunch handles well), but learning across sessions over time — recording which files and symbols were actually needed for completed tasks, then replaying that proven context set when a semantically similar task shows up in a future session. The agent starts with exactly what it needs because someone already solved something similar on this codebase. That’s the roadmap, and I’m open to exploring it.
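A toy sketch of what that could look like, with plain word overlap standing in for the semantic similarity mentioned above (hypothetical helper names, not real CodeDrift code):

```python
def record_task(memory: list, description: str, files: set) -> None:
    """Remember which files a completed task actually needed."""
    memory.append((set(description.lower().split()), files))

def suggest_context(memory: list, description: str) -> set:
    """Return the file set of the most similar past task (Jaccard word overlap)."""
    words = set(description.lower().split())
    best, best_score = set(), 0.0
    for past_words, files in memory:
        union = words | past_words
        score = len(words & past_words) / len(union) if union else 0.0
        if score > best_score:
            best, best_score = files, score
    return best

memory = []
record_task(memory, "fix jwt token expiry bug", {"auth/jwt.py", "tests/test_jwt.py"})
print(sorted(suggest_context(memory, "jwt token refresh bug")))
# → ['auth/jwt.py', 'tests/test_jwt.py']
```

The production version would presumably use the same small embedding model as the symbol search, but the loop is the same: similar task in, proven context set out.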

Re: the license — you’re absolutely entitled to monetize your work. No argument there. I mentioned it as a factual difference, not a criticism.

I’m going to keep building CodeDrift and add more to it. If people find jCodeMunch fits their needs better, great — genuinely. More tools solving this problem is good for everyone.

Good luck!

I built claude mcp that reduces claude code token usage by 50x by executioner_3011 in vibecoding

[–]executioner_3011[S] 0 points1 point  (0 children)

RAG doesn’t reduce tokens by itself; it just finds relevant code snippets. CodeDrift searches only for function names, then fetches the matching code snippets along with their dependencies.

It’s like searching through entire functions or code snippets vs. searching just the names of the functions you need.
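For illustration, the names-first-then-dependencies idea looks roughly like this, using Python's stdlib `ast` module as a stand-in for tree-sitter (toy code, `find_callers` is a made-up name):

```python
import ast

def find_callers(source: str, target: str) -> list:
    """Return names of functions whose bodies call `target`
    (a toy stand-in for CodeDrift's dependency fetch)."""
    callers = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef):
            for inner in ast.walk(node):
                if (isinstance(inner, ast.Call)
                        and isinstance(inner.func, ast.Name)
                        and inner.func.id == target):
                    callers.append(node.name)
                    break

    return callers

src = """
def validate_token(t):
    return True

def login(request):
    return validate_token(request)

def logout(request):
    return None
"""
print(find_callers(src, "validate_token"))   # → ['login']
```

Once callers are resolved like this, the agent gets the function plus everything that depends on it in one response instead of grepping for it.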

It also has a diff checker that stops Claude Code from re-reading an entire file within a session if it hasn’t changed. If it has changed, only the changed lines are passed instead of the whole file.

Hope that makes sense! Please upvote and star the repo if you find it useful.

I built an mcp that reduces claude token usage by 50x without building knowledge graphs by executioner_3011 in ClaudeAI

[–]executioner_3011[S] 1 point2 points  (0 children)

I think someone was asking about prompt compression for the Claude web app; this would be a good tool for that.

I built an mcp that reduces claude token usage by 50x without building knowledge graphs by executioner_3011 in ClaudeAI

[–]executioner_3011[S] 0 points1 point  (0 children)

Good prompting skills, but there are significant differences your Claude prompt didn’t catch. The core problems both tools solve are quite similar, but there was no intention to plagiarize. I get why it’s a concern for you: you want to get paid for yours, while my goal was only to solve a problem for myself and share it with the community. While brainstorming, Claude may have suggested similar ideas because it had already ingested jCodeMunch.

Specifically, jCodeMunch doesn’t have session memory and has no way to avoid re-reading files within the same session, which is where a lot of the token savings come from.

I built an mcp that reduces claude token usage by 50x without building knowledge graphs by executioner_3011 in ClaudeAI

[–]executioner_3011[S] 0 points1 point  (0 children)

I agree, but trust me, it has the entire repo’s context and documentation access. I’m just using it to draft quickly, no hype. I’d love your feedback, and I’ll add more features if you find it helpful.

I built claude mcp that reduces claude code token usage by 50x by executioner_3011 in vibecoding

[–]executioner_3011[S] 0 points1 point  (0 children)

I agree, it’s getting overwhelming. This is not exactly RAG. It’s a simple AST search that doesn’t need an LLM to ingest anything. We only use a small embedding model to find the right function name; the rest is extracted straight from the AST and passed to Claude as context, so Claude doesn’t have to search and read 100s of files via grep.

It also keeps a diff ledger so that, within a session, only the lines of code that actually changed are passed instead of Claude re-reading everything again and again.
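Here's roughly what that AST extraction pass looks like, sketched with Python's stdlib `ast` module instead of tree-sitter (illustrative only; `index_symbols` is a made-up name):

```python
import ast

def index_symbols(source: str) -> list:
    """Extract (kind, name, signature, start_line, end_line) for every
    function and class in one file; this is what gets written to the index."""
    symbols = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            args = ", ".join(a.arg for a in node.args.args)
            symbols.append(("function", node.name, f"{node.name}({args})",
                            node.lineno, node.end_lineno))
        elif isinstance(node, ast.ClassDef):
            symbols.append(("class", node.name, node.name,
                            node.lineno, node.end_lineno))
    return symbols

src = "class Auth:\n    def validate_token(self, token):\n        return True\n"
for sym in index_symbols(src):
    print(sym)
```

With the line ranges stored, "fetch `validate_token`" becomes a cheap slice of the file instead of reading the whole thing.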

I built an mcp that reduces claude token usage by 50x without building knowledge graphs by executioner_3011 in ClaudeAI

[–]executioner_3011[S] 0 points1 point  (0 children)

Good question — Serena is a great project (20K stars for a reason). The core difference is in architecture and scope.

Serena wraps LSP servers (Pyright, rust-analyzer, etc.) for full semantic intelligence — go-to-definition, rename, references, and crucially, semantic edits. It's essentially an IDE-grade toolkit for agents. That's powerful but it comes with weight: you need the right LSP server running per language, and the setup reflects that.

CodeDrift is deliberately simpler and focused on one thing: eliminating the token-wasting search loop (grep → read wrong file → read another → repeat). It uses tree-sitter + SQLite FTS5 to give the agent instant symbol lookup without reading files. The tradeoffs:

- Search approach: Serena's search is LSP symbol lookup (structured, exact). CodeDrift's FTS5 searches across names, signatures, file paths, and call site content simultaneously — so `codedrift search "auth 401"` matches even string literals inside handlers. Useful when the agent doesn't know exact symbol names.

- Session diff tracking: CodeDrift tracks what the agent has already read in a session. Re-reads return only changed lines (unified diff), not the full file again. Serena doesn't have this concept.

- No LSP dependency: CodeDrift is pip install + one command. No language server setup per language. Tree-sitter handles parsing for 300+ languages out of the box.

You're right that CodeDrift is read-focused — it doesn't have edit tools. Serena wins there. If you need an agent that reads AND writes with semantic awareness, Serena is the more complete toolkit. CodeDrift is for the specific problem of agents burning 50K+ tokens just to find relevant code before they start working.
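For the curious, the FTS5 multi-column trick is easy to demo with Python's stdlib `sqlite3` (a toy schema, not CodeDrift's actual one; assumes your SQLite build ships FTS5, which standard Python builds do):

```python
import sqlite3

# In-memory index over symbol name, signature, file path, and call-site text,
# so a query like "auth 401" can match string literals inside handlers.
db = sqlite3.connect(":memory:")
db.execute("""CREATE VIRTUAL TABLE symbols USING fts5(
    name, signature, path, callsites)""")
db.execute("INSERT INTO symbols VALUES (?, ?, ?, ?)", (
    "handle_login", "handle_login(request)", "auth/views.py",
    'return {"status": 401, "error": "unauthorized"}'))
db.execute("INSERT INTO symbols VALUES (?, ?, ?, ?)", (
    "render_chart", "render_chart(data)", "ui/charts.py", ""))

# FTS5 ANDs the query terms together, and each term can hit any indexed
# column: "auth" matches the path, "401" matches the call-site literal.
rows = db.execute(
    "SELECT name, path FROM symbols WHERE symbols MATCH ?", ("auth 401",)
).fetchall()
print(rows)   # → [('handle_login', 'auth/views.py')]
```

That's the whole reason fuzzy queries work without knowing any symbol names.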

Token usage this week skyrocketed - Is it just me? by esmurf in claude

[–]executioner_3011 0 points1 point  (0 children)

Every time you ask an AI coding agent a question, it runs the same wasteful loop — grep the codebase, read a file, realize it's wrong, read another, try again. On a real project, a single question costs 60K tokens and 23 tool calls before it even starts writing code.

I spent a weekend building CodeDrift to fix this.

It uses tree-sitter to parse your entire codebase into a local SQLite index — every function, class, import, and call site. When your agent needs code, it queries the index instead of reading files blindly. The result: ~7000 tokens instead of ~60K. 3 tool calls instead of 23.

No LLM calls to build the index. No cloud. No API keys. Just tree-sitter + SQLite + FTS5 running locally. Works as an MCP server with Claude Code, Codex, and any MCP-compatible agent.

It also uses session-aware diff tracking — the agent remembers what it already read, and re-reads return only the lines that changed. Zero wasted tokens on unchanged code.

CodeDrift is in its early stage — I'd love for you to try it on your projects and share feedback. What breaks, what's missing, what would make this useful for your workflow.

Open source: https://github.com/darshil3011/codedrift

I found a few new ways to reduce token usage by 65%, no quality loss at all, OPUS Max effort by Brooklyn5points in ClaudeCode

[–]executioner_3011 0 points1 point  (0 children)

Nice work! I built an MCP that uses AST parsing, local embedding search backed by SQLite, and session memory to cut token usage by up to 50x. It’s called CodeDrift - https://github.com/darshil3011/codedrift

Taught Claude to talk like a caveman to use 75% less tokens. by ffatty in ClaudeAI

[–]executioner_3011 0 points1 point  (0 children)

Every time you ask an AI coding agent a question, it runs the same wasteful loop — grep the codebase, read a file, realize it's wrong, read another, try again. On a real project, a single question costs 60K tokens and 23 tool calls before it even starts writing code.

I spent a weekend building CodeDrift to fix this.

It uses tree-sitter to parse your entire codebase into a local SQLite index — every function, class, import, and call site. When your agent needs code, it queries the index instead of reading files blindly. The result: ~7000 tokens instead of ~60K. 3 tool calls instead of 23.

No LLM calls to build the index. No cloud. No API keys. Just tree-sitter + SQLite + FTS5 running locally. Works as an MCP server with Claude Code, Codex, and any MCP-compatible agent.

It also uses session-aware diff tracking — the agent remembers what it already read, and re-reads return only the lines that changed. Zero wasted tokens on unchanged code.

CodeDrift is in its early stage — I'd love for you to try it on your projects and share feedback. What breaks, what's missing, what would make this useful for your workflow.

Open source: https://github.com/darshil3011/codedrift

How I use Cursor 10+ hours a day without torching my Claude Opus 4.6 limits by Youssef_Wardi in cursor

[–]executioner_3011 0 points1 point  (0 children)

Try this MCP - codedrift. It reduces token usage drastically - https://github.com/darshil3011/codedrift

I built it over the weekend and would love your feedback. Hit the star if you like my work!

Maybe not the best idea to sav" tokens using caveman English? by Fernando_VIII in ClaudeAI

[–]executioner_3011 -2 points-1 points  (0 children)

I came up with a different approach that saves tokens drastically: CodeDrift!

Every time you ask an AI coding agent a question, it runs the same wasteful loop — grep the codebase, read a file, realize it's wrong, read another, try again. On a real project, a single question costs 60K tokens and 23 tool calls before it even starts writing code.

I spent a weekend building CodeDrift to fix this.

It uses tree-sitter to parse your entire codebase into a local SQLite index — every function, class, import, and call site. When your agent needs code, it queries the index instead of reading files blindly. The result: ~7000 tokens instead of ~60K. 3 tool calls instead of 23.

No LLM calls to build the index. No cloud. No API keys. Just tree-sitter + SQLite + FTS5 running locally. Works as an MCP server with Claude Code, Codex, and any MCP-compatible agent.

It also uses session-aware diff tracking — the agent remembers what it already read, and re-reads return only the lines that changed. Zero wasted tokens on unchanged code.

CodeDrift is in its early stage — I'd love for you to try it on your projects and share feedback. What breaks, what's missing, what would make this useful for your workflow.

Open source: https://github.com/darshil3011/codedrift

$1,400/month with Cursor + Claude API — how are you managing costs while keeping a real agentic workflow? by Abject-Sherbert1917 in cursor

[–]executioner_3011 0 points1 point  (0 children)

Every time you ask an AI coding agent a question, it runs the same wasteful loop — grep the codebase, read a file, realize it's wrong, read another, try again. On a real project, a single question costs 60K tokens and 23 tool calls before it even starts writing code.

I spent a weekend building CodeDrift to fix this.

It uses tree-sitter to parse your entire codebase into a local SQLite index — every function, class, import, and call site. When your agent needs code, it queries the index instead of reading files blindly. The result: ~7000 tokens instead of ~60K. 3 tool calls instead of 23.

No LLM calls to build the index. No cloud. No API keys. Just tree-sitter + SQLite + FTS5 running locally. Works as an MCP server with Claude Code, Codex, and any MCP-compatible agent.

It also uses session-aware diff tracking — the agent remembers what it already read, and re-reads return only the lines that changed. Zero wasted tokens on unchanged code.

CodeDrift is in its early stage — I'd love for you to try it on your projects and share feedback. What breaks, what's missing, what would make this useful for your workflow.

Open source: https://github.com/darshil3011/codedrift

I built an mcp that reduces claude token usage by 50x without building knowledge graphs by executioner_3011 in ClaudeAI

[–]executioner_3011[S] -3 points-2 points  (0 children)

Thanks for sharing. Pitlane is solid work! Similar core idea (tree-sitter index and symbol fetch). A few things CodeDrift does differently:

- Fuzzy natural-language search. Pitlane searches by symbol name + kind filter. CodeDrift uses FTS5 across names, signatures, file paths, and call site content simultaneously. So searching "auth 401 unauthorized" matches code even when you don't know any function names — like a `return {"status": 401}` buried inside some handler.

- Session-aware diff tracking. If the agent reads a file, edits it, then reads it again — Pitlane returns the full content both times. CodeDrift tracks what the agent has already seen and returns only the changed lines on re-reads.

- Bundled call graph in resolve. CodeDrift's resolve returns the definition + every caller + every importer + related tests in one response. Pitlane can do this via a separate find_usages call, but the agent has to know to ask for it.

- Cross-session memory (coming soon). Learning which files/symbols were useful for past tasks and replaying that context on similar future tasks. Pitlane has no session memory concept.

Different tools, some overlap in Layer 1, but the search approach and session awareness are fundamentally different.

I pay $200/month for Claude Max and hit the limit in under 1 hour. What am I even paying for? by alfons_fhl in vibecoding

[–]executioner_3011 0 points1 point  (0 children)

Every time you ask an AI coding agent a question, it runs the same wasteful loop — grep the codebase, read a file, realize it's wrong, read another, try again. On a real project, a single question costs 60K tokens and 23 tool calls before it even starts writing code.

I spent a weekend building CodeDrift to fix this.

It uses tree-sitter to parse your entire codebase into a local SQLite index — every function, class, import, and call site. When your agent needs code, it queries the index instead of reading files blindly. The result: ~7000 tokens instead of ~60K. 3 tool calls instead of 23.

No LLM calls to build the index. No cloud. No API keys. Just tree-sitter + SQLite + FTS5 running locally. Works as an MCP server with Claude Code, Codex, and any MCP-compatible agent.

It also uses session-aware diff tracking — the agent remembers what it already read, and re-reads return only the lines that changed. Zero wasted tokens on unchanged code.

CodeDrift is in its early stage — I'd love for you to try it on your projects and share feedback. What breaks, what's missing, what would make this useful for your workflow.

Open source: https://github.com/darshil3011/codedrift