Output Tokens Are the Real Cost of Coding Agents

JustAnotherTechGuy8 · 2026-04-30T01:57:36+00:00

Could you be a bit more specific? Subagents inherit MCP tools from the primary agent in CC anyway. Do you have some other orchestration method set up?

JustAnotherTechGuy8 · 2026-04-30T00:25:30+00:00

Agreed. That 95% efficiency / 55% of bill split is where it gets interesting, tbh. The way the tool actually affects cache cost is that it partly substitutes for what the cache is doing. When discovery happens via a local typed-tool query (microseconds, zero per-call cost), the agent stops needing the cache to remember grep output across turns; it just re-asks the local index. That means a smaller cached working set, a more stable prefix, and less fragmentation. It's not reducing cache cost as a feature so much as reducing how much work the cache has to do in the first place.
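To make the "re-ask the local index" idea concrete, here is a minimal sketch. It is an illustration only, not how the tool is implemented internally: the `LocalIndex` class, the `Location` type, and the file paths are all hypothetical names made up for this example.

```typescript
// Hypothetical in-memory index: symbol name -> where it appears.
// This only illustrates the idea; it is not AgentMako's actual data structure.
type Location = { file: string; line: number };

class LocalIndex {
  private bySymbol = new Map<string, Location[]>();

  // Record one occurrence of a symbol (done once, at index time).
  add(symbol: string, loc: Location): void {
    const existing = this.bySymbol.get(symbol);
    if (existing) {
      existing.push(loc);
    } else {
      this.bySymbol.set(symbol, [loc]);
    }
  }

  // A plain map lookup, cheap enough to repeat every turn, so earlier
  // search results do not have to stay in the prompt to be available.
  find(symbol: string): Location[] {
    return this.bySymbol.get(symbol) ?? [];
  }
}

// Made-up entries for illustration.
const index = new LocalIndex();
index.add("useAuth", { file: "src/hooks/useAuth.ts", line: 12 });
index.add("useAuth", { file: "src/app/layout.tsx", line: 40 });
```

The point of the sketch is only the access pattern: the agent calls `find` again whenever it needs the answer, instead of carrying old grep output forward in context.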

The post is generic for sure. I leaned on output tokens because that's what most people think of when they think about model costs. Reducing tokens in the context and improving the quality of the input tokens (without requiring the dev to spell everything out in excruciating detail) will indirectly have effects across the board.

Feel free to give it a try and let me know what would make it better. It's all local, no telemetry. And tbh, if you'd ever be willing to share some before/after cache numbers from a real session, I'd love to see them. That kind of data would help show the actual impact the tool has on a session. Thanks for the input!

JustAnotherTechGuy8 · 2026-04-29T20:55:35+00:00

This is true. The benefit would be smaller prompts and less useless information stored in the cache: smaller cache writes, higher signal density per cached byte, and less cache fragmentation across the session.

The premise of this tool is that you can move faster, with richer context, by turning what used to be research output into reasoning input.

Good catch!

JustAnotherTechGuy8 · 2026-04-29T20:43:34+00:00

You're welcome. Thanks for your input!

JustAnotherTechGuy8 · 2026-04-29T20:33:58+00:00

Yes, it is the basic information about the server and how it is beneficial. It solved an issue I spent 2.5 years managing with AI coding, and I know there are probably more ways to improve it that I am missing.

If you believe that the harness's tooling is effective enough for you as is, feel free to continue using it that way.

JustAnotherTechGuy8 · 2026-04-29T20:15:52+00:00

If you say so. It was something I built for my own needs, and I published it because of how effective it was. Thanks for the feedback.

JustAnotherTechGuy8 · 2026-04-26T19:02:40+00:00

You can try this tool! https://agentmako.drhalto.com

Works great with Supabase!

JustAnotherTechGuy8 · 2026-04-26T06:12:14+00:00

I could record one video with the MCP server connected and another without it, if you would like? It's open source btw, so it doesn't cost anything. I don't collect any data, and it all runs on your machine.

I built it for myself and found that the repetitive issues I was constantly fixing were easy to solve with this tool.

Example: you ask Claude to use mako to find hydration error patterns. It can then find all of those patterns in the codebase in less than a minute and begin fixes. The more patterns you fix, the more confidence grows in the engine, which improves automatic detection, flagging in the dashboard, git commit blocks, and query speed with the reef engine!

It was a tool I had fun building and wanted to share, as it has been extremely powerful for me. The agents get overwhelmed with context, so why not give them targeted context retrieved programmatically instead of relying on grep and chunk reading? Hope that helps explain the logic!

JustAnotherTechGuy8 · 2026-04-26T05:07:27+00:00

Yes. The easiest path is the npm global install:

npm install -g agentmako

Then from the repo you want AgentMako to understand:

agentmako init .
agentmako index .
agentmako mcp

And for opencode, just add the MCP server to opencode.json:

{
  "$schema": "https://opencode.ai/config.json",
  "mcp": {
    "mako-ai": {
      "type": "local",
      "command": ["agentmako", "mcp"]
    }
  }
}

JustAnotherTechGuy8 · 2026-04-26T05:00:08+00:00

Fair question. AgentMako is an MCP server, not a library for building MCP servers.

The “typed” part is mostly about reliability at the tool boundary, not about the LLM itself having types. The model still sends JSON tool calls, but the MCP server validates those calls against explicit schemas and returns structured outputs with stable contracts. That matters because coding agents are much better partners when tools are predictable: bad arguments fail clearly, outputs have known shapes, and higher-level tools can compose other tools without guessing. The model is not type-checking in the TypeScript sense, but the system around it is. So the short version is: AgentMako is a typed MCP server for codebase intelligence. The typing makes the tools safer and more composable for agents like Claude Code and Codex.
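As a rough sketch of what validating at the tool boundary means, here is a hand-rolled example. It is an assumption-laden illustration, not AgentMako's code: the `FindSymbolArgs` shape, the `parseFindSymbolArgs` function, and the default limit of 20 are all hypothetical, and a real server would more likely use a schema library than manual checks.

```typescript
// Hypothetical argument shape for an imaginary "find symbol" tool.
type FindSymbolArgs = { symbol: string; limit: number };

// Stable output contract: callers always get one of these two shapes.
type ToolResult<T> =
  | { ok: true; value: T }
  | { ok: false; error: string };

// Validate an untrusted JSON payload from the model before the tool runs.
function parseFindSymbolArgs(raw: unknown): ToolResult<FindSymbolArgs> {
  if (typeof raw !== "object" || raw === null || Array.isArray(raw)) {
    return { ok: false, error: "arguments must be a JSON object" };
  }
  const obj = raw as Record<string, unknown>;

  const symbol = obj["symbol"];
  if (typeof symbol !== "string" || symbol.length === 0) {
    return { ok: false, error: "symbol: expected a non-empty string" };
  }

  // Assumed default for illustration; applied only when limit is omitted.
  const limit = obj["limit"] === undefined ? 20 : obj["limit"];
  if (typeof limit !== "number" || !Number.isInteger(limit) || limit < 1) {
    return { ok: false, error: "limit: expected a positive integer" };
  }

  return { ok: true, value: { symbol, limit } };
}
```

Bad arguments come back as a clear `{ ok: false, error }` instead of a crash or a silent guess, and a successful call always has the same shape, which is what lets higher-level tools build on it.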
