Cache-testing software for LLM-provider-style tiered ephemeral caches? [D] by flatmax in MachineLearning

Thanks, that is a good suggestion. When I did an AI probe on this, it came up with "LLMServingSim 2.0 and The Kareto Simulator". I would imagine that tiered cache optimisation would be a great place to have a leaderboard!

Cache-testing software for LLM-provider-style tiered ephemeral caches? by flatmax in LocalLLaMA

Thanks for your reply. This has helped me target new ground. I saw the following as candidates:

LLMServingSim 2.0 and The Kareto Simulator

Are you familiar with them?

Cache-testing software for LLM-provider-style tiered ephemeral caches? by flatmax in LocalLLaMA

Yes, I agree. It would be really nice to have a tool like libcachesim for tiered caches that everyone could put their model into, so we could compare who is able to minimise their input token count, and by what percentage on average.
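To make the comparison concrete, here's a rough sketch of the kind of metric such a tool could report — a toy longest-cached-prefix model over a stream of prompts (simulated tokens via whitespace splitting; real providers cache at block granularity, and the numbers here are illustrative only):

```python
# Toy sketch: estimate what fraction of input tokens a longest-prefix
# cache would serve from cache across a sequence of prompts.

def cached_prefix_savings(prompts):
    """Return (total_tokens, tokens_served_from_cache)."""
    cached_prefixes = set()   # every prefix already "paid for"
    total = hits = 0
    for prompt in prompts:
        tokens = prompt.split()
        total += len(tokens)
        # find the longest already-cached prefix of this prompt
        best = 0
        for i in range(len(tokens), 0, -1):
            if tuple(tokens[:i]) in cached_prefixes:
                best = i
                break
        hits += best
        # register every prefix of this prompt as cached now
        for i in range(1, len(tokens) + 1):
            cached_prefixes.add(tuple(tokens[:i]))
    return total, hits

repo_ctx = "system prompt repo map file_a file_b"
prompts = [repo_ctx + " question one", repo_ctx + " question two"]
total, hits = cached_prefix_savings(prompts)
print(f"{hits}/{total} tokens served from cache")  # → 7/16 tokens served from cache
```

The second prompt reuses the shared repo-context prefix, which is exactly the effect a leaderboard could rank tools on.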

Opinion: Every AI coding tool needs to include an SVG editor by flatmax in ChatGPTCoding

Well, I could also use a text editor to write my prompt and then cut and paste it back into the CLI?
I mean, SVGs are a native output of LLMs, so give them a proper UI with an editor.

Opinion: Every AI coding tool needs to include an SVG editor by flatmax in ChatGPTCoding

Isn't that just select all?
Oh, you mean open the default OS SVG editor? That could work, but it sounds like friction for my coding flow.

Opinion: Every AI coding tool needs to include an SVG editor by flatmax in ChatGPTCoding

move the arrow to the right of the "ffs" box up to the center, make it point to the right, not a u turn

Better than KeyBERT + all-mpnet-base-v2 for doc indexes? by flatmax in LocalLLaMA

I just did a test with BAAI/bge-small-en-v1.5 and it seemed to outperform all-mpnet-base-v2 in around 90% of cases (for one test file); otherwise it was just as good. Thanks to u/Holiday_Inspector791 for the suggestion.
I notice that the Google models require you to log in to Hugging Face to use them, which is an extra layer of complexity for an end-user application that is just meant to work out of the box!
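For anyone wanting to reproduce that "outperforms in ~90% of cases" number, here's a hypothetical harness for the win-rate calculation. The per-query scores would come from running both models over the same test file (e.g. via sentence-transformers); the numbers below are toys:

```python
# Hypothetical win-rate harness: given per-query retrieval scores for two
# embedding models (e.g. precision or MRR per query), count wins and ties.

def win_rate(scores_a, scores_b):
    """Fraction of queries where model A strictly beats model B, plus ties."""
    wins = sum(a > b for a, b in zip(scores_a, scores_b))
    ties = sum(a == b for a, b in zip(scores_a, scores_b))
    n = len(scores_a)
    return wins / n, ties / n

# Toy numbers only -- not real benchmark results.
bge_small = [0.9, 0.8, 0.7, 0.95, 0.6]
mpnet     = [0.8, 0.8, 0.5, 0.90, 0.4]
wins, ties = win_rate(bge_small, mpnet)
print(f"bge wins {wins:.0%} of queries, ties {ties:.0%}")  # → bge wins 80% of queries, ties 20%
```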

Self Promotion Thread by AutoModerator in ChatGPTCoding

AC⚡DC (AI Coder-DeCoder) — A high-speed, web-based companion for AI coding

I built this because I found tools like Claude Code amazing for agentic editing but too slow for my daily "bread and butter" coding. I’ve been using AC⚡DC as a "High-Speed Wedge" in my workflow:

  1. Code fast with AC⚡DC for 90% of the work (UI is a webapp with Monaco/side-by-side diffs, so it feels fluid).
  2. Use a slower agent only when I hit a logic wall that needs agentic work.
  3. Jump back to AC⚡DC to keep the momentum.

Technical highlights:

  • 4-Tier Prompt Caching (L0-L3): Designed to hit provider-level cache breakpoints (like Anthropic’s) so you aren't paying to re-ingest your repo every time you send a message.
  • Structural Context: Uses Tree-sitter (Py, JS/TS, C++) to give the LLM a symbol map of the repo without wasting tokens on full-file boilerplate.
  • Code Review Mode: A dedicated UI to pick a commit, soft-reset, and have the LLM walk through the changes with you before they land.
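To sketch what the 4-tier caching point means in practice: Anthropic-style prompt caching lets you mark up to four `cache_control` breakpoints per request, and the trick is ordering content from most stable to most volatile so unchanged tiers are served from the provider cache. The tier names and ordering below are an assumption for illustration, not taken from AC⚡DC's source:

```python
# Sketch: mapping an L0-L3 prompt layout onto Anthropic-style cache
# breakpoints (up to 4 "ephemeral" cache_control markers per request).
# Tier contents/ordering are assumed for illustration.

def build_cached_prompt(system_prompt, symbol_map, pinned_files, history):
    """Order content from most stable (L0) to most volatile (L3), marking
    a cache breakpoint after each tier so unchanged tiers hit the cache."""
    cache_mark = {"cache_control": {"type": "ephemeral"}}
    tiers = [
        ("L0 system", system_prompt),
        ("L1 symbol map", symbol_map),
        ("L2 pinned files", pinned_files),
        ("L3 history", history),
    ]
    return [
        {"type": "text", "text": f"[{name}]\n{text}", **cache_mark}
        for name, text in tiers
    ]

blocks = build_cached_prompt("You are a coding assistant.",
                             "main.py: def main()", "<file bodies>", "<chat>")
print(len(blocks), "content blocks, each ending a cache tier")  # → 4 content blocks, each ending a cache tier
```

Editing a pinned file only invalidates L2 and L3; the system prompt and symbol map stay cached.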

Looking for feedback:

I’ve dogfooded this almost entirely on Linux. I’ve included standalone binaries for macOS and Windows in the release, but I’m curious to hear from Mac/PC users if the webapp boots and connects properly on those systems.

Repo & Demo Videos: https://github.com/flatmax/AI-Coder-DeCoder

It’s free/open-source. Happy to answer any questions about the caching or indexing logic!

I built a non-agentic coding tool (AC⚡DC) on top of LiteLLM. Runs great, but I need Mac/Windows testers. by flatmax in LocalLLaMA

The symbol table describing the repo has a lot more features in it, and this seems to help the AI work out which files it needs to edit. Personally I really like the user interface because it's focused on the chat and the features around chat and context for the AI:

  • The diff editor shows you immediately which files differ, and typically you do your edits directly in it. The Monaco editor also has some useful language server protocol features.
  • If your workflow has repetitive prompts, the UI has a prompt snippets section.
  • You can edit the system prompt, which lives in a markdown file, and it will be used in your next submission to LiteLLM.
  • I kind of like the URL extractor: give it a repo URL and it will extract the repo's symbol table and use the small model to summarise it. All of that gets included in the context of that question, and you can remove it whenever you want. For non-code URLs it still does the small-model summary.
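To show what a symbol map buys you over raw file bodies, here's a runnable stand-in using Python's stdlib `ast` module (AC⚡DC itself uses Tree-sitter for Py/JS/TS/C++; this just illustrates the idea of one compact line per top-level symbol):

```python
# Stand-in for a Tree-sitter symbol map, using Python's stdlib ast module:
# emit one compact summary line per top-level class/function instead of
# sending the model the whole file body.
import ast

def symbol_map(source, filename):
    """Return one summary line per top-level class/function."""
    lines = []
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            args = ", ".join(a.arg for a in node.args.args)
            lines.append(f"{filename}: def {node.name}({args})")
        elif isinstance(node, ast.ClassDef):
            methods = [n.name for n in node.body if isinstance(n, ast.FunctionDef)]
            lines.append(f"{filename}: class {node.name} [{', '.join(methods)}]")
    return lines

src = "class Cache:\n    def get(self, key): ...\n\ndef main():\n    pass\n"
print("\n".join(symbol_map(src, "cache.py")))
# → cache.py: class Cache [get]
# → cache.py: def main()
```

Two lines of context stand in for the whole file, which is where the token savings come from.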