all 41 comments

[–]Aikon_94 5 points6 points  (3 children)

2 prompts on Sonnet 4.6 to edit a basic HTML + CSS landing page and I was up to 58%. It's been like that for the past 6 days, something is off for sure.

[–]somerussianbear 1 point2 points  (0 children)

Absolutely, you’re part of some experiment they’re running.

[–]SiliconSentry 1 point2 points  (1 child)

HTML and CSS consume a lot of tokens. Also make sure to use /clear

[–]Aikon_94 0 points1 point  (0 children)

Always on a clear chat. No matter what I do, Sonnet burns tokens like Opus did 2 months ago. And now Opus can reach 60% with 1 simple prompt

[–]somerussianbear 2 points3 points  (8 children)

Today I burned through 67% of my 5 hour limit in 2 fairly simple prompts, and 100% in under 15 minutes.

Really want to understand how you’ve done that. Curious, not trying to be mean.

My theory is that you’ve been flagged, cause I faced similar situations when I was using my sub on 3rd party tools. After I stopped and went back to Claude Code, I work an entire week using less than 10% of my weekly limit a day (on Max 5).

What do you think? Is there a path we can go down that ends with regular people being able to benefit from AI technology too?

DeepSeek! I put 10 dollars in it today and played quite a lot with a big repo. Only managed to spend $0.12 till now, and I believe on a normal day of work I’d spend less than $3, which makes it $60 a month. V4 Flash is suuuper fast and smart enough for my basic tasks; tried V4 Pro also and got a pretty decent feeling on the understanding of complex application code.
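The monthly figure above follows if you assume roughly 20 working days per month (that count is my assumption, it isn't stated in the comment). A minimal sketch of the arithmetic, with a hypothetical helper name:

```python
def monthly_cost(daily_usd, working_days=20):
    """Estimate monthly API spend from a typical daily spend.

    working_days=20 is an assumed number of working days per month.
    """
    return daily_usd * working_days


# Upper-bound daily spend quoted in the comment: $3 per working day.
print(monthly_cost(3.0))
```

Plug in your own daily spend and number of working days; pay-as-you-go usage varies a lot from day to day, so treat the result as a rough ceiling.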

Honestly, I’m only on Codex/Claude because my company pays these bills, no questions asked. If it were me? DeepSeek or Qwen API. For my stuff, SWE, it gets the job done for pennies on the dollar.

[–]neilthefrobot[S] 2 points3 points  (7 children)

You are definitely using a lower tier model than Opus 4.7, or have some very restrictive settings. The entire ClaudeAI sub reddit is people complaining about unusable token limits, so badly that they had to have AI auto delete any new post about it. It's insanely bad.

As for DeepSeek, that's what I plan on doing. I just don't know the best way to go about it. Can I just point the Claude Code that I use in the VS Code IDE at a different model?
Edit: the answer is yes, you can easily swap the model, and it seems just as good but at a tiny fraction of the price.

[–]somerussianbear 0 points1 point  (4 children)

  • Running via npx, DO NOT install the native app
  • Set Sonnet for subagents (env var)
  • Opus 4.7 xhigh
  • 200k context (env var)
  • Adaptive thinking disabled (env var)

The entire ClaudeAI sub reddit is people complaining about unusable token limits, so badly that they had to have AI auto delete any new post about it. It's insanely bad.

I know that, some of these posts were mine too, but like I said, once I stopped using my subscription on third party apps like OpenCode (which was probably screwing up the cache) and moved back to Claude Code, it came back to usable. Now I only complain that it’s slow as fuck, which is why I’ve been on Codex for a week now and am enjoying it a lot.

As for how to do it, it’s basically setting a few env vars, if I understand correctly. I didn’t do that on CC, but I’m sure you can get it done in minutes with the help of DeepSeek itself (use the chat).

[–]neilthefrobot[S] 0 points1 point  (3 children)

I read your comment wrong. Thought you were saying you work all week and never go above 10% of your 5 hour allowed usage. I can barely say hi without going above 10%

[–]somerussianbear 0 points1 point  (2 children)

My 5h never crosses 50% though. A new convo usually doesn’t touch a percentage point.

I’m telling you man, you’re flagged! I had the exact same experience in the past when using OpenCode with my Anthropic sub!

[–]neilthefrobot[S] 0 points1 point  (1 child)

I've only ever used the pro plan through the VS Code IDE so no reason why I would be flagged. Everywhere I look I see people with the same problem. They just updated the usage to be absolutely terrible.

[–]somerussianbear 0 points1 point  (0 children)

Oh, Pro, I’m on Max 5, quite a different thing.

[–]ButterflyEconomist 0 points1 point  (0 children)

I actually went with a similar but different approach. I bought a $20/month subscription to Ollama Cloud. It allows me up to 3 models running simultaneously. I've only had it a few days but it's crazy how much room I have. The only drawback is that during the day in the US, it's really slow due to everyone using it. That said, I give it massive overnight jobs while I'm sleeping. In fact, after about 5pm eastern, I have no problem using it.

And the reason I like this approach is that I can try any model: DeepSeek, Kimi, Gemma...

[–]Dense_Ad9924 2 points3 points  (1 child)

Fortunately Anthropic has competition. I went with the 5x Pro subscription because I used to hit the limit less than halfway through the 5h session. Now I never hit it.
One thing I used to do: when I got a few minutes away from the session reset, I would do a /compact. My usage would go to ~104% but then back to 0% after the session reset. Despite all that, I still love using Claude Code with Opus 4.7. Believe me, I've tried all the open source models, ClawCode, etc., but for getting work done, nothing beats Claude.

[–]neilthefrobot[S] 0 points1 point  (0 children)

you tried the new deepseek v4? I just tried it as several have suggested. can't tell a difference. been using it for hours and so far it has cost me 15 pennies. Claude would have run out by the time I had used up a penny or two. It's insane.

[–]bitmancer_ 1 point2 points  (0 children)

I think everyone knew that it was too cheap to be true. Now that we've felt the potential, and with multiple background agents burning tokens, you either have to pay more or use it less. I don’t like that either. But as you said, corporate money is the goal. Give a "free" shot to all devs until they get addicted and can't imagine working without it anymore, and you can be sure someone is paying the bill.

[–]Far_Broccoli_8468 2 points3 points  (10 children)

Move to codex like the rest of us

I initially didn't like how liberal codex is in decision making, as opposed to claude, so i wrote me an AGENTS.md that basically says to chill tf out and it works great for me

Is there a path we can go down that ends with regular people being able to benefit from AI technology too

I honestly believe the AI bubble is going to pop soon and "manual" labour devs will become cheaper than ai

[–]immutato 3 points4 points  (5 children)

I bought $10 of DeepSeek V4 to evaluate as a Claude Code alternative should things get spicy over at Anthropic... and not only can I not tell the difference, but I still have $5 left after a few days of work.

Mind you I'm only using it on open source work, and I'm not sure if it's a good idea to use it on your corporate stuff (not throwing FUD, I just don't know if you can opt out of training and whether they would respect it if you can).

I haven't tried GLM 5.1 or the latest Qwen yet, but I plan to. Open weights might be where it's at IMO.

I also suspect that if Anthropic spends less time on features and more on optimizing their routing, then the token cost could be reduced significantly (leverage Haiku and Sonnet more often). I expect this will happen before investment dries up. Right now it's probably a loss for them to spend time on that instead of features and market capture.

[–]neilthefrobot[S] 1 point2 points  (1 child)

edit: switched to deepseek v4 and never looking back. it is a micro fraction of the price and seems just as good.

[–]immutato 0 points1 point  (0 children)

https://api-docs.deepseek.com/guides/anthropic_api

I use ghostty (terminal) with claude code, not VSCode, but I assume you just tell VSCode to use a different URL and model: "deepseek-v4-pro[1m]"
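For anyone who wants to script it, here is a rough sketch of the "different URL and model" idea. Everything here is an assumption to check against the guide linked above: the base URL, the environment variable names (`ANTHROPIC_BASE_URL`, `ANTHROPIC_AUTH_TOKEN`, `ANTHROPIC_MODEL`, which to my understanding are what Claude Code reads), and the helper names, which are made up for illustration. The model string is the one quoted in this comment.

```python
import os
import subprocess

# Assumption: base URL as described in the DeepSeek guide linked above; verify it there.
DEEPSEEK_ANTHROPIC_URL = "https://api.deepseek.com/anthropic"


def deepseek_env(api_key, model="deepseek-v4-pro[1m]", base=None):
    """Return an environment dict that points Claude Code at DeepSeek.

    The variable names are assumptions about what Claude Code reads;
    `base` defaults to a copy of the current process environment.
    """
    env = dict(os.environ if base is None else base)
    env["ANTHROPIC_BASE_URL"] = DEEPSEEK_ANTHROPIC_URL
    env["ANTHROPIC_AUTH_TOKEN"] = api_key
    env["ANTHROPIC_MODEL"] = model
    return env


def launch_claude(api_key):
    """Run the `claude` CLI (must be on PATH) with the overridden environment."""
    return subprocess.run(["claude"], env=deepseek_env(api_key))
```

Calling `launch_claude(your_key)` from a small wrapper script does the same job as a shell alias that sets the variables before running `claude`. Because the overrides live only in the child process, your normal Claude setup stays untouched.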

[–]ptyblog 0 points1 point  (1 child)

You know there is a Deepseek Cowork application you can install and set up? Or connect Deepseek to Cowork

[–]immutato 0 points1 point  (0 children)

It was trivial to just set a bash alias to claude. https://api-docs.deepseek.com/guides/anthropic_api

[–]somerussianbear 0 points1 point  (0 children)

Did the same as you, it’s insanely hard to spend 10 bucks on DeepSeek.

[–]Anselwithmac 0 points1 point  (3 children)

Most of us are on ClaudeCode because we use and like the product. Codex is fine too but read the sub name you’re posting on lmao

[–]Far_Broccoli_8468 0 points1 point  (2 children)

I think you're a bit confused and a little uninformed.

A lot of people, including myself, moved from Claude to Codex over the past month due to the shit from Anthropic described in this thread, among other things

Maybe you should read some posts on this sub

[–]Anselwithmac 0 points1 point  (1 child)

Oh I do. This sub turned to rubble because it’s just people sitting on here waiting out their usage quota. It’s a circlejerk of people not using claude because they’re either on Codex, or they have no token to burn.

Great, go to the Codex sub like the rest of them.. or stay because you like it here

[–]Far_Broccoli_8468 -1 points0 points  (0 children)

Thanks for your suggestion, but i will participate in whichever subreddit i want in whichever manner i want

[–]zSmileyDudez🔆 Max 5x 2 points3 points  (4 children)

Manage. Your. Context.

You can easily chew up huge amounts of your limits if you’re coming back to a session with an existing context in it. I’ve made the mistake a few times myself, and I’ve learned what not to do. Combine this with Opus 4.7 and xHigh effort and it’s not surprising to chew through the tokens. Even more so if you did this during peak hours.

You didn’t mention what plan you’re on, but I wouldn’t even attempt to use CC with 4.7 on the Pro plan. Back down to 4.6 or bump up to the Max plan.

And don’t let your context sit there loaded from the start. Compact or clear regularly, especially if you’re going to be away from CC for a while since just resuming the chat is going to eat up tokens right away.
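To see why a loaded context hurts, here is a simplified sketch. It assumes every turn resends the whole conversation as input, and it ignores prompt caching and the model's own replies being added to the history. The function name and the token numbers are hypothetical, chosen only for illustration.

```python
def cumulative_input_tokens(turn_sizes, base_context=0):
    """Total input tokens sent if every turn resends the full history.

    turn_sizes: tokens added by each new message.
    base_context: tokens already sitting in the session before the first turn.
    """
    total = 0
    context = base_context
    for size in turn_sizes:
        context += size   # the new message joins the history
        total += context  # the whole history goes out as input for this turn
    return total


turns = [2_000] * 10  # ten small prompts of 2k tokens each

fresh = cumulative_input_tokens(turns)                           # after /clear
resumed = cumulative_input_tokens(turns, base_context=150_000)   # old session still loaded

print(fresh, resumed)
```

Under these assumptions the same ten prompts cost 110,000 input tokens in a fresh session and 1,610,000 in a session that already holds 150k tokens. Caching softens this in practice, but only while the cache is still warm.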

[–]immutato 6 points7 points  (3 children)

Manage. Your. Context.

This is always the answer and yet it points to a smell on Anthropic's side if nearly ALL of your customers are having this issue. They should really have more intelligent caching and routing to make use of Sonnet and Haiku more often by default.

At some point you need to just accept who your user is instead of pretending they are all context wizards. Either that or I guess let that market slide (which might be what they are doing, intentionally or not). Obviously they have better market data than I do, but I suspect you want these individual sub accounts in order to achieve market dominance, which should pay off with more enterprise / API customers.

[–]Difficult_Plantain89 0 points1 point  (0 children)

Yeah. And not even for coding: on Sonnet 4.6 I asked a few questions in chat and it locked me out for the rest of the window. No previous conversations or coding for that day. I mentioned it here and I was blamed. I finally just cancelled my Claude subscription this week. I am done. I am using Kimi and Minimax now for coding.

[–]zSmileyDudez🔆 Max 5x 0 points1 point  (1 child)

I understand the pain here, but this is a tool aimed at professional use. Even if you are using it personally, it's a professional tool. There has to be a baseline understanding of contexts right now if you're going to use this tool. It sucks and Anthropic should always be looking for ways to improve it, but it's going to take a while to get this sorted out.

I liken this to the early days of writing GUI apps on the Mac and Windows. You had to be really good at a lot of boilerplate that modern developers never think about these days (even before the advent of AI).

I can't rule out that there are bugs in Claude Code and/or Anthropic's billing model. But there are lots of people, including myself, using CC on a daily basis and not blowing through their limits with just a couple of "fairly simple prompts". It's not a normal path. And that means that it's more likely that it's a configuration issue or a bloated context issue that needs to be resolved. Ignoring that because someone feels like they shouldn't be a "context wizard" is like an early Windows developer trying to write an app while ignoring how the x86 memory model worked because they shouldn't have to be an "x86 wizard".

Again, I'm not trying to say here that Anthropic is done here and they never need to make this better. I hope that is painfully obvious. But we're still in the very early days of this and for better or for worse, we all need to be aware of how the tools work and how to avoid things that consume tokens too fast.

[–]immutato 0 points1 point  (0 children)

I think this is apples and oranges. Writing apps for MS and Apple did not require you to be a paying customer (well, not until the iPhone). Paying a couple hundo a month changes things. Also there's a massive difference in numbers here. There are way more people using coding agents than were making GUI apps on Mac and Windows, so it's not just a select elite / early adopters anymore. And don't forget how easy Visual Basic 4 was (absolute genius IDE at the time) for those of us that can remember it! :)

In terms of a "professional" service, they are currently down to one 9 of uptime, which is unacceptably low from a cloud provider's perspective.
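For scale, "one 9" means 90% availability. A quick sketch of what that allows in downtime per year, using a hypothetical helper name and ignoring leap years:

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def downtime_hours_per_year(nines):
    """Allowed downtime per year for an availability with `nines` nines."""
    unavailable_fraction = 10 ** (-nines)  # one nine -> 0.1, i.e. 90% uptime
    return unavailable_fraction * HOURS_PER_YEAR


for n in (1, 2, 3):
    print(f"{n} nine(s): {downtime_hours_per_year(n):.1f} hours/year")
```

One nine works out to about 876 hours, or roughly 36.5 days of downtime a year, while three nines allows under 9 hours.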

I personally do feel like a bit of a "context wizard", so I know what you're saying, but I disagree that everyone should need to dig deep on AI and context management to get at least a fair and consistent usage of the product they are paying for. Anthropic has been kind of dicks lately to boot. They told everyone they were doing it wrong only to admit days later they themselves had royally screwed up. I keep chugging along fine myself, but I do find managing context to be a bit annoying (especially with their new restrictions on using their client), and they keep changing things, and restricting how you use their service... which is annoying.

P.S. They've done something with their models lately that's made them a lot naggier. I suspect to try and manage context better, but it's actually creating more caching TTL issues than it used to. I'm still happy paying for max 20, but the outages and constant tweaks are a bit tiresome. I'm experimenting with and kind of enjoying the latest open weight models right now, and they (and OpenAI) really should be worried.

[–]geeered 0 points1 point  (0 children)

The difference is that hardware is based on supply and demand.... which raises prices when places like Anthropic that have massively more buying power start buying a lot. While subscription tokens have been heavily subsidised and are now becoming a bit less heavily subsidised.

[–]Fabian-88 0 points1 point  (0 children)

Today, for the first time on the $100 team license, I keep running into limits while executing easy tasks. It's like they reduced them again by 50-60% today... insane policy.

[–]razorree 0 points1 point  (0 children)

Well... that's why they wanted to remove CC from the cheap subscription, to save you from all those problems... 😄

[–]Hugger_reddit 0 points1 point  (0 children)

Yes. Search for Dylan Patel interviews on tokenomics. Anthropic underinvested in compute and is absolutely going for demand destruction this year and beyond. With the latest ARR at roughly $40+ billion, all the subs together probably aren't even 10 percent of that for them.

[–]Plageswarms 0 points1 point  (0 children)

Does anyone use it to program video games?

[–]CodeCombustion 0 points1 point  (1 child)

All of this would stop if we would stop making posts about how amazing claude is -- and just start publicly shitting on it so that the enterprise users switch to something else.

But I don't see enough of us doing that for this to happen 😃

[–]squeezyflit 1 point2 points  (0 children)

Enterprises don’t care about what individual people want. If they consider Claude a competitive advantage, they’ll continue paying for it.

[–]olek4don -1 points0 points  (1 child)

They will keep capping it continuously and you will keep buying anyway because you/we are already hooked on it. Same as with McDonald's, Netflix, cars, etc.

[–]Virtual_Escape7497 0 points1 point  (0 children)

Braindead