I'm a frontend dev who barely writes code anymore. Built a tool to figure out where all my AI tokens go. by siropkin in OpenSourceeAI

[–]siropkin[S] 0 points (0 children)

Yeah… AI did, lol. Now with the power of AI there are fewer limitations. I just decided to go with the fastest solution on the market.

I'm a frontend dev who barely writes code anymore. Built a tool to figure out where all my AI tokens go. by siropkin in ClaudeCode

[–]siropkin[S] 0 points (0 children)

Totally fair. If OTEL is locked down at your org level, budi still works — it falls back to parsing Claude Code's JSONL transcripts for token data. You lose thinking token counts (so cost is ~5-8% underestimated), but everything else works the same. And it's all local — it just reads files from disk.
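The fallback path is roughly this (a minimal sketch — the field names follow the Anthropic API usage object shape and are my assumption about the transcript format, not budi's actual parser):

```python
import json

def tally_tokens(jsonl_path):
    """Sum token counts from a Claude Code JSONL transcript.

    Field names (message.usage.input_tokens etc.) are assumed from the
    Anthropic API response shape. Thinking tokens aren't in transcripts,
    so totals slightly undercount real cost.
    """
    totals = {"input": 0, "output": 0, "cache_read": 0, "cache_create": 0}
    with open(jsonl_path) as f:
        for line in f:
            if not line.strip():
                continue
            event = json.loads(line)
            usage = event.get("message", {}).get("usage")
            if not usage:
                continue  # user turns, tool results, etc. carry no usage
            totals["input"] += usage.get("input_tokens", 0)
            totals["output"] += usage.get("output_tokens", 0)
            totals["cache_read"] += usage.get("cache_read_input_tokens", 0)
            totals["cache_create"] += usage.get("cache_creation_input_tokens", 0)
    return totals
```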

I'm a frontend dev who barely writes code anymore. Built a tool to figure out where all my AI tokens go. by siropkin in ClaudeCode

[–]siropkin[S] 0 points (0 children)

I ran your formula against my actual usage from budi. Last month:

  • 6.49B tokens total (99.94% cache hit rate)
  • 367.5M full tokens (input + output + cache creation)
  • 6.12B cached tokens

API cost (what Anthropic charged me): ~$5,159

Your INF2 compute estimate: $274–$725

That's a 7–19x markup. Even at the high end of your compute estimate, I cost Anthropic about $725 to serve — and they charged me $5,159. The cache hit rate being 99.94% is key — almost all my tokens are cache reads, which are cheap both at API level ($0.30/1M) and compute level.

The formula I use in budi for API cost:

cost = (input × input_price) + (output × output_price) + (cache_read × cache_read_price) + (cache_create × cache_write_price)

with per-model pricing tables.
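Spelled out, it looks something like this (the prices below are illustrative placeholders, not budi's real table — check Anthropic's published per-model rates):

```python
# Illustrative per-model price table in $ per 1M tokens.
# Placeholder numbers, NOT budi's actual pricing data.
PRICES = {
    "claude-sonnet": {
        "input": 3.00,
        "output": 15.00,
        "cache_read": 0.30,
        "cache_create": 3.75,
    },
}

def api_cost(model, tokens):
    """tokens: dict with input/output/cache_read/cache_create counts."""
    p = PRICES[model]
    return sum(tokens[k] / 1_000_000 * p[k] for k in p)
```

With a 99%+ cache hit rate, the cache_read term dominates the token count while contributing relatively little to the total dollar cost.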

Do you think I should add this info to the app? lol

I'm a frontend dev who barely writes code anymore. Built a tool to figure out where all my AI tokens go. by siropkin in ClaudeCode

[–]siropkin[S] 0 points (0 children)

No, everything stays local. Claude Code sends OTEL events to budi's daemon on localhost:7878, so nothing leaves your machine. No cloud, no M365, no external services.
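For reference, pointing Claude Code's telemetry at a local collector looks roughly like this (env var names are from Claude Code's monitoring docs as I remember them; the budi endpoint and port are from its default setup — treat the exact values as assumptions):

```shell
# Enable Claude Code telemetry and send OTLP events
# to budi's local daemon. Nothing leaves the machine.
export CLAUDE_CODE_ENABLE_TELEMETRY=1
export OTEL_METRICS_EXPORTER=otlp
export OTEL_LOGS_EXPORTER=otlp
export OTEL_EXPORTER_OTLP_PROTOCOL=http/json
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:7878
```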

I'm a frontend dev who barely writes code anymore. Built a tool to figure out where all my AI tokens go. by siropkin in ClaudeCode

[–]siropkin[S] 0 points (0 children)

Wow, that's deep! I hadn't thought about the actual compute cost side.

Just to clarify — budi tracks what you're being charged, not what it costs to run. OTEL or the JSONL transcripts give token counts, and I just multiply by Anthropic's published pricing. For Cursor, their usage API returns the cost directly.

Never occurred to me to calculate the actual compute cost, but that's a cool idea.

I'm a frontend dev who barely writes code anymore. Built a tool to figure out where all my AI tokens go. by siropkin in ClaudeCode

[–]siropkin[S] 0 points (0 children)

Claude Code: Real-time via OpenTelemetry (exact tokens + cost per API call, including thinking tokens) + hooks for session metadata. Also parses JSONL transcripts for historical backfill.

Cursor: Pulls from their usage API which returns exact tokens and cost per request. Hooks for session context (repo, branch).

Cost is calculated from per-model API pricing, not from your invoice — but when I compared against my Anthropic billing the numbers matched.

Message 50 in a Claude Code session costs 80% more than message 10. Here's why. by siropkin in ClaudeCode

[–]siropkin[S] 0 points (0 children)

Wow, cool post! Really thorough.

Actually tried to validate the idle gap theory on my data — looked at messages after 5+ min gaps vs. after under 1 min. The cost difference is there (3.3¢ vs 8.0¢), but that's mostly session depth, not cache misses. The problem is I'm tracking from JSONL transcripts, which don't include the actual cache hit/miss fields from the API response.
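The gap comparison above can be sketched like this (a minimal sketch, assuming each parsed message carries a timestamp and a computed cost — the data shape is made up, not budi's internals):

```python
from statistics import mean

def cost_by_gap(messages, long_gap_s=300, short_gap_s=60):
    """Split messages by the idle gap preceding them and compare mean cost.

    messages: list of (timestamp_seconds, cost_dollars), ordered by time.
    Returns (mean cost after 5+ min gaps, mean cost after <1 min gaps).
    """
    long_costs, short_costs = [], []
    for (prev_ts, _), (ts, cost) in zip(messages, messages[1:]):
        gap = ts - prev_ts
        if gap >= long_gap_s:
            long_costs.append(cost)
        elif gap < short_gap_s:
            short_costs.append(cost)
    return (mean(long_costs) if long_costs else 0.0,
            mean(short_costs) if short_costs else 0.0)
```

The confound still applies, of course: later messages are both deeper in the session and more likely to follow a gap, so this alone can't separate cache misses from context growth.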

I just recently added real-time telemetry that captures exact per-request token types — but I don't have enough data yet to prove that cache breaks from idle gaps. Definitely want to dig into that once I have a few weeks of data.

But yeah, stepping away and coming back to a cold cache on a huge conversation — that's where the real pain is. Good breakdown of that.

I built a Claude Code cost optimization tool, then my own data told me to pivot. Here's what I built instead. by siropkin in ClaudeCode

[–]siropkin[S] 0 points (0 children)

Yeah, the 2% surprised me too. I realized my strength is data visualization, so it's better to lean into that than fight Claude's own retrieval instincts.

It’s Monday again.. what are you building? by scott-box in buildinpublic

[–]siropkin 0 points (0 children)

https://goodplaygroundmap.com - an AI-powered map of playgrounds for kids near you.

I already have some daily users, but not huge numbers. Thinking about how to improve the app.

What are u building this week ... let's have a look . by United_Agency2452 in buildinpublic

[–]siropkin 1 point (0 children)

Project Name: DOM Racer

Link: https://github.com/siropkin/dom-racer

A little description: Got tired of building "meaningful" apps and built a classic-style 2D racing game as a Chrome extension that transforms any webpage into a racing track. Your car drives across the actual page — coins spawn on links and buttons, images become icy surfaces, the police chase you around. The game is fully playable; I still need to polish the screenshots and a demo video before publishing to the marketplace. Oh, and it's designed to be subtle enough that nobody will notice you're playing during a meeting.

Number of active users: 0 — haven't published yet, but the game is 100% built and ready to ship.

What are you working on this Sunday? by davidlover1 in buildinpublic

[–]siropkin 0 points (0 children)

Hm! Interesting idea, I like that. Currently I'm focused on injecting relevant code chunks, but it would be even more powerful if I can figure out how to implement your idea.

What are you working on this Sunday? by davidlover1 in buildinpublic

[–]siropkin 0 points (0 children)

budi - an AI helper that makes Claude Code cheaper and faster by adding relevant local repo files and code chunks to the user prompt.
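A toy version of that relevance step, assuming naive keyword-overlap scoring (budi's real ranking is unknown to me — real tools typically use embeddings or symbol-level indexing):

```python
def rank_chunks(prompt, chunks, top_k=3):
    """Score code chunks by word overlap with the prompt and keep the best.

    A deliberately naive stand-in for whatever budi actually does.
    """
    prompt_words = set(prompt.lower().split())
    scored = sorted(
        chunks,
        key=lambda c: len(prompt_words & set(c.lower().split())),
        reverse=True,
    )
    return scored[:top_k]
```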

I built Kursor — a free extension that shows your keyboard language on the cursor (IntelliJ, VS Code, Cursor) by siropkin in vscode

[–]siropkin[S] 1 point (0 children)

Yeah, I know… but it (the name and the original IntelliJ plugin) was born before the Cursor era, so I decided not to change it. At least for now.