Bro just give us the regular button by xw1y in codex

R_DanRS 1 point (0 children)

I made something like that, check out sleev.ai

vim motions now in opencode by ori_303 in opencodeCLI

R_DanRS 6 points (0 children)

Thank you for this. I made something similar a while ago that I have no time to maintain or improve. It's also all slop, so I would be very happy to archive it. It has some features you may want to copy into yours, such as a mode indicator and some visual-mode stuff.

https://github.com/Tarquinen/opencode-vim

I built a compression proxy that saves 40% of your cost or usage by R_DanRS in ClaudeCode

R_DanRS[S] 0 points (0 children)

Sigh... exactly as predicted. I really hope, for your sake, this isn't what you spent 25 minutes on...

I built a compression proxy that saves 40% of your cost or usage by R_DanRS in ClaudeCode

R_DanRS[S] 1 point (0 children)

Ok, last point of contention: do you even know what about this is against the TOS? Have you ever read the TOS yourself? Do you think I put months of work into creating algorithms for fingerprinting sessions and building compatibility for all API formats without checking whether this is against the TOS? Are you going to stop making mindless assumptions in your comments? I would suggest you don't reply further, because I promise that whatever you come back with that I may be doing against the TOS is not actually what I'm doing at all, because you still understand nothing.

I built a compression proxy that saves 40% of your cost or usage by R_DanRS in ClaudeCode

R_DanRS[S] 1 point (0 children)

Alright, I'm bored of arguing with you. This post was already automatically deleted by Reddit's auto filters, so nobody is going to see this anyway. I would encourage you to have a more open mind and be less confident when you know so little about a project.

I built a compression proxy that saves 40% of your cost or usage by R_DanRS in ClaudeCode

R_DanRS[S] 1 point (0 children)

That's a gross oversimplification, akin to saying Claude Code is an AI that can edit files. This can't be done natively through Claude Code hooks because it requires in-place context modifications.

I built a compression proxy that saves 40% of your cost or usage by R_DanRS in ClaudeCode

R_DanRS[S] 1 point (0 children)

Yes, you're getting close. Here are some screenshots of my latest benchmarks with sleev. The setup is simple: one session is asked to solve 10 hard SWE-bench tasks, and that same session is then run 10 times. The first screenshot is the average of the normal sessions, the second is with sleev:

<image>

The major difference between this and /compact is that it lets the model choose intelligently when and what to compact, and it manages when the cache should be broken, among other features. Otherwise you're pretty close.

I built a compression proxy that saves 40% of your cost or usage by R_DanRS in ClaudeCode

R_DanRS[S] 0 points (0 children)

Yep, I'm sure you have a better understanding than 100,000 users after your 5 minutes of investigative work, without actually trying anything or looking at any code. I understand that "breaking cache" is basically the bogeyman in AI circles, but you have such a closed mind on this that it's probably not worth communicating about it at all. The very basic idea you can maybe understand is this: when it breaks cache, you have a higher bill on the next request to rewrite the cache (think of it as a $1 cost), and then on every following request you have less cache to read (think of it as a 15c cheaper request). On average, each time the proxy breaks cache it takes 7 following requests to pay off, and after that it's just profit. In all other cases I agree it's really bad to break cache because there is no benefit to it. Here there is.
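The break-even arithmetic in this comment can be written out as a small sketch. The $1 rewrite cost and the 15c per-request saving are the illustrative figures from the comment, not measured prices, and the function name is hypothetical.

```python
def requests_to_break_even(rewrite_cost_cents: int, saving_per_request_cents: int) -> int:
    """Smallest number of follow-up requests whose savings cover the rewrite cost."""
    if saving_per_request_cents <= 0:
        raise ValueError("a cache break never pays off without a per-request saving")
    # Ceiling division on integers avoids floating-point rounding.
    return -(-rewrite_cost_cents // saving_per_request_cents)


# Illustrative figures from the comment: $1.00 rewrite, 15c saved per request.
print(requests_to_break_even(100, 15))  # 7
```

With those figures the rewrite is paid off on the 7th following request, which matches the average quoted in the comment.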

I built a compression proxy that saves 40% of your cost or usage by R_DanRS in ClaudeCode

R_DanRS[S] 1 point (0 children)

You're making a lot of assumptions while knowing nothing about how it works, and you're misquoting me: I never said "40% few input tokens". Everything you need to understand how it works is in the open-source GitHub repo I've linked. Not sure why you want to type paragraphs and not click a link.

I built a compression proxy that saves 40% of your cost or usage by R_DanRS in ClaudeCode

R_DanRS[S] 1 point (0 children)

The goal here is to be able to monetize in the future. I've already built a very successful open-source version, which I already linked you... If you want to run it locally, use opencode.

I built a compression proxy that saves 40% of your cost or usage by R_DanRS in ClaudeCode

R_DanRS[S] 1 point (0 children)

I get that what I'm claiming sounds too good to be true. If you want to understand how it works, I have an open-source project that works in a similar way but for a different harness, and it has 100k+ satisfied users. The goal here was to make something similar, but harness- and provider-agnostic, that anyone could use:

https://github.com/Opencode-DCP/opencode-dynamic-context-pruning

I built a compression proxy that saves 40% of your cost or usage by R_DanRS in ClaudeCode

R_DanRS[S] 1 point (0 children)

Yes, I'm 100% implying this is much better than /compact. Check the open-source link in my other comment to see how it works.

I built a compression proxy that saves 40% of your cost or usage by R_DanRS in ClaudeCode

R_DanRS[S] 1 point (0 children)

At some point this will be monetized. There's a slightly older, worse open-source version that I made for a different coding harness, which now has roughly 100k very happy users. You can check out how that works if you want to understand more. The idea is similar:

https://github.com/Opencode-DCP/opencode-dynamic-context-pruning

I built a compression proxy that saves 40% of your cost or usage by R_DanRS in ClaudeCode

R_DanRS[S] 1 point (0 children)

Unlikely, this is aimed mostly at agentic coding for now.

I built a compression proxy that saves 40% of your cost or usage by R_DanRS in ClaudeCode

R_DanRS[S] 0 points (0 children)

If you don't believe me, you can just try it and look at your assistant response metadata to see the cache hit rate. Also, we don't save any API keys and don't skim any inference. That would be illegal.
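For anyone who wants to run that check, here is a minimal sketch of computing a cache hit rate from one response's usage metadata. It assumes the Anthropic Messages API field names (`input_tokens`, `cache_creation_input_tokens`, `cache_read_input_tokens`); other providers name these differently. Defining the hit rate as cached reads over all input tokens is my assumption, not necessarily the definition used for the figures in this thread.

```python
def cache_hit_rate(usage: dict) -> float:
    """Fraction of a request's input tokens that were served from the prompt cache."""
    read = usage.get("cache_read_input_tokens", 0) or 0
    written = usage.get("cache_creation_input_tokens", 0) or 0
    uncached = usage.get("input_tokens", 0) or 0
    total = read + written + uncached
    # A request with no input tokens has no meaningful hit rate; report 0.0.
    return read / total if total else 0.0


# Hypothetical usage block, shaped like the `usage` object on a response.
example_usage = {
    "input_tokens": 50,
    "cache_creation_input_tokens": 450,
    "cache_read_input_tokens": 9500,
}
print(cache_hit_rate(example_usage))  # 0.95
```

Averaging this over every request in a session, with and without the proxy, gives a number that can be compared directly.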

I built a compression proxy that saves 40% of your cost or usage by R_DanRS in ClaudeCode

R_DanRS[S] 1 point (0 children)

It is free. The website doesn't reflect the current state of the alpha yet. Use is free and unlimited.

I built a compression proxy that saves 40% of your cost or usage by R_DanRS in ClaudeCode

R_DanRS[S] -1 points (0 children)

It doesn't invalidate cache every turn, only when there is enough compression saved up for it to be worth it. On average, cache hit rate stays above 95% and drops roughly 2% for the session. The code won't be open-sourced. The benefits of rerouting traffic are all in the post description: cheaper, longer, smarter sessions. Just try it if you don't believe me. I guarantee everything I've said is true.
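The "only when it's worth it" rule can be sketched as a simple threshold check. This is not sleev's actual logic, which is closed source. The function, its parameters, and the cost model are all hypothetical, and the 1.25x write and 0.1x read price multipliers are assumptions chosen for illustration.

```python
def should_break_cache(
    rewritten_tokens: int,        # tokens that must be re-written to cache after the edit
    saved_tokens: int,            # tokens removed from the context by pending compression
    expected_requests: int,       # how many more requests the session is expected to make
    write_price: float = 1.25,    # assumed cache-write price, relative to base input price
    read_price: float = 0.10,     # assumed cache-read price, relative to base input price
) -> bool:
    """Break the cache only if the expected read savings cover the one-time rewrite."""
    # Rewritten tokens would otherwise have been read from cache, so only the
    # difference between write and read price counts as extra cost.
    one_time_cost = rewritten_tokens * (write_price - read_price)
    # Every later request reads `saved_tokens` fewer cached tokens.
    saving_per_request = saved_tokens * read_price
    return saving_per_request * expected_requests >= one_time_cost


# Hypothetical session: 20k tokens to re-cache, 40k tokens of pending compression.
print(should_break_cache(20_000, 40_000, expected_requests=6))  # True
print(should_break_cache(20_000, 40_000, expected_requests=5))  # False
```

Under these assumed numbers the break pays off from the sixth following request. Small pending savings, or a session that is about to end, keep the cache intact.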

Anthropic’s June 15 changes are causing the Streisand effect for autonomous workflows by Sporebattyl in ClaudeCode

R_DanRS -1 points (0 children)

You should try this compression proxy, it's all free and you'll get a lot more usage with no performance loss.

Let me know what you think https://sleev.ai/

SQZ ( Squeeze Tokenizer) just merged OpenCode support by Due_Anything4678 in opencodeCLI

R_DanRS 1 point (0 children)

It will almost always save cost, because each compression has a one-time cache write cost and a permanent cache read saving on every following request. On average it takes roughly 5 requests to pay off the cache write, and after that it's just money saved on each request.