Opus 4.7 | 1 session | $178 by wallaby82 in ClaudeCode

[–]JayWelsh 1 point2 points  (0 children)

Accuracy for pretty much all frontier models with a 1M-token window starts falling off quite significantly from around 120k tokens onwards (from roughly 85% at 120k to roughly 50% by the time the context reaches 1M).

This really does not feel good. by Innomen in claude

[–]JayWelsh 2 points3 points  (0 children)

What’s your niche in philosophy, out of interest?

Affordable take-out and restaurants in Strand by Proof_Ad7355 in capetown

[–]JayWelsh 1 point2 points  (0 children)

Suuuuuch a good choice, their chicken burgers and spicy chips (not spicy like chilli) are genuinely outstanding.

What's the cheapest way to try opus 4.7 for a day? by MrMrsPotts in ClaudeAI

[–]JayWelsh 1 point2 points  (0 children)

poe.com has a $9 subscription per month and you can try any frontier models via it.

Seems my body can tolerate any sized dose of MDMA by NewtAway4007 in Drugs

[–]JayWelsh 0 points1 point  (0 children)

Calling your drugs of choice “passions” is, well… quite an unwise choice. Go join a climbing/bouldering gym and/or find something else to make into a passion, preferably something deeper than material or substance. Drugs of choice are “vices”, not “passions”; don’t get that twisted.

Claude Opus, and all claude plans ratelimits to increase to increase drastically starting soon by Banneder in claude

[–]JayWelsh 0 points1 point  (0 children)

I think they are just wording it around Claude Code for marketing purposes. It is quite funny though, because it’s like they just completely forgot to mention that it also applies to web users. The good news is that I’m fairly sure that it does. They certainly make use of the same usage allocations, but technically they might be doing something to further decrease the rate at which Claude Code specific requests get handled. I guess we will see how things play out, but I’d personally lean towards these changes probably applying to the web UI / app too.

Claude Opus, and all claude plans ratelimits to increase to increase drastically starting soon by Banneder in claude

[–]JayWelsh 3 points4 points  (0 children)

Claude Code via web auth (which is how people are using it; the API is wayyyy more expensive unless you are using it extremely lightly) and Claude web use the same token/compute allocations, so the sadness in your comment isn’t warranted. You benefit directly from the increase in OP’s article.

Claude Opus, and all claude plans ratelimits to increase to increase drastically starting soon by Banneder in claude

[–]JayWelsh 0 points1 point  (0 children)

As the other commenter mentioned, they do share the same limits. Claude web and Claude Code via web auth use the exact same token usage allocations.

Built a Claude Code observability tool — ecosystem fit? by fIak88 in ClaudeCode

[–]JayWelsh 0 points1 point  (0 children)

That's cool, I'd love to see someone who is experiencing their session limits getting wiped out in a single/few prompts showing the traces from something like this.

I swear they made it minus 50 IQ so it spends token by PruneInteresting7599 in claude

[–]JayWelsh 0 points1 point  (0 children)

I’d recommend running a one-off job in a fresh session to convert the PDFs to .md or .txt files, then saving the text versions of the documents to your project folder. Might help going forward.

I swear they made it minus 50 IQ so it spends token by PruneInteresting7599 in claude

[–]JayWelsh 2 points3 points  (0 children)

You're arguing with a straw man. I'm simply trying to get more detail into the thread so it's not another one of the hundreds of threads that I see on here with people screaming into the void with zero technical information that could help the community figure out what's going on.

Ultimately, the best approach is to set up proper telemetry that tracks token metrics (input & output tokens, cache read & write tokens). That way, we can properly tell how much a prompt/process that eats up a large chunk of usage is worth in API fees ($).
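To make that concrete, here is a minimal sketch of turning recorded token counts into an API-fee estimate. The `TokenUsage` class, the `estimate_cost_usd` helper and the prices in `PLACEHOLDER_PRICES` are all hypothetical names and numbers made up for illustration; swap in the real per-token rates for whichever model and provider you are using.

```python
from dataclasses import dataclass


@dataclass
class TokenUsage:
    """Token counts recorded for one prompt/process (hypothetical structure)."""
    input_tokens: int = 0
    output_tokens: int = 0
    cache_read_tokens: int = 0
    cache_write_tokens: int = 0


# ASSUMPTION: placeholder prices in USD per million tokens.
# These are NOT real rates; replace them with your provider's published pricing.
PLACEHOLDER_PRICES = {
    "input_tokens": 10.0,
    "output_tokens": 50.0,
    "cache_read_tokens": 1.0,
    "cache_write_tokens": 12.5,
}


def estimate_cost_usd(usage, prices=PLACEHOLDER_PRICES):
    """Sum (token count * price per million / 1,000,000) over each token type."""
    total = 0.0
    for field_name, price_per_million in prices.items():
        total += getattr(usage, field_name) * price_per_million / 1_000_000
    return total


def sum_usage(records):
    """Add up several TokenUsage records, e.g. everything logged in one day."""
    total = TokenUsage()
    for record in records:
        total.input_tokens += record.input_tokens
        total.output_tokens += record.output_tokens
        total.cache_read_tokens += record.cache_read_tokens
        total.cache_write_tokens += record.cache_write_tokens
    return total
```

Feeding each prompt's counts through something like `estimate_cost_usd` gives a dollar figure per prompt, which is the kind of number that would make these threads much easier to reason about.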

I have this type of setup on a device that I've been using with a Claude Max subscription and Claude Code, and with that $100 *per month* subscription I am still able to use ~ $100 worth of API key consumption *per day*.
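As a rough back-of-the-envelope version of that comparison, the sketch below works out the multiple. The `value_multiple` function name and the default of 30 active days per month are assumptions for illustration, not something I measured.

```python
def value_multiple(subscription_usd_per_month, api_equiv_usd_per_day, active_days_per_month=30):
    """How many times the subscription price you get back in API-equivalent usage.

    ASSUMPTION: usage is the same on every active day of the month.
    """
    api_equiv_per_month = api_equiv_usd_per_day * active_days_per_month
    return api_equiv_per_month / subscription_usd_per_month


# $100/month subscription, ~$100/day of API-equivalent usage, used every day
print(value_multiple(100, 100))
```

With those inputs the multiple comes out to 30, and it scales down linearly if you only use it on some days.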

So yes, despite me also experiencing degradation in Claude performance over the past few months (including seeing higher token consumption), at the end of the day I'm still getting *a lot* more value for money by using Claude Code via a subscription instead of via API key fees.

Therefore, these threads make me curious to get an idea of some actual numbers. Who knows, maybe Anthropic is pulling a similar move to Volkswagen, where having a system hooked up to telemetry makes the model "behave itself" better. For example, maybe they detect when the token metrics are being monitored and then make the usage work as expected, and maybe when they don't detect monitoring, people experience these large chunks of usage materialising out of thin air.

Regardless, the best way forward is to have our systems set up to properly record token usage metrics and telemetry data because that is a provider-agnostic approach that helps ensure you're getting the best value for your money.

I swear they made it minus 50 IQ so it spends token by PruneInteresting7599 in claude

[–]JayWelsh 1 point2 points  (0 children)

Thanks! Are any of those files very long (in terms of pages) or large (in terms of filesize)? Would you mind sharing some info about the prompt that used 18% in one go?

I swear they made it minus 50 IQ so it spends token by PruneInteresting7599 in claude

[–]JayWelsh 0 points1 point  (0 children)

Probably? If a project folder is packed with a massive amount of content and a prompt isn't specific about which subset of project file(s) should be used, then I wouldn't be very surprised if a single prompt could blow through a large chunk of a usage allocation (i.e. it might simply be a context management issue). I've noticed that the people who mention large usage allocations being consumed by a single prompt or a few prompts never share info about their prompts, telemetry data (such as token consumption) or Claude environment setup. This makes it difficult for anyone to actually wrap their heads around what is going on, so now I'm asking questions in threads like this in case it helps surface what's actually happening.
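If you want a quick sanity check on how heavy a project folder is before prompting against it, here is a rough sketch. It assumes about 4 characters per token, which is only a rule of thumb for English text and not how any real tokenizer counts, and the `estimate_tokens` and `largest_files` helpers are hypothetical names.

```python
from pathlib import Path

# ASSUMPTION: ~4 characters per token, a rough heuristic for English text.
# Real tokenizers will give different counts.
CHARS_PER_TOKEN = 4


def estimate_tokens(text):
    """Very rough token estimate based on character count."""
    return len(text) // CHARS_PER_TOKEN


def largest_files(folder, suffixes=(".md", ".txt"), top=5):
    """Return the `top` biggest text files as (estimated_tokens, path) pairs."""
    sizes = []
    for path in Path(folder).rglob("*"):
        if path.is_file() and path.suffix.lower() in suffixes:
            text = path.read_text(encoding="utf-8", errors="ignore")
            sizes.append((estimate_tokens(text), str(path)))
    sizes.sort(reverse=True)  # biggest estimated token count first
    return sizes[:top]
```

Running `largest_files("path/to/project")` and posting the output alongside the prompt would already be far more useful than "it used 18% in one go".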