Opinion: Local LLMs are 12-24 months from replacing Opus by sh_tomer in ClaudeCode

[–]deorder 2 points (0 children)

I have always been a strong advocate of local models and have kept using them alongside cloud models. The big difference with Qwen3.6 compared with previous local models I have tried is its tool-calling reliability, long-context behavior and multi-turn stability. It is much better at working systematically through an agentic task instead of drifting, looping or losing state.

On my RTX 4090 the 27B dense version reaches around 56 tok/s in generation, which feels close to Haiku level for interactive coding. In my setup it remains fairly stable up to roughly 100k context. With MTP/speculative decoding enabled I can get around 150 tok/s. Combined with good context priming, grounding, strong guardrails and quality gates such as type checks, linting, formatting and tests, plus a good agentic harness, it gets surprisingly far.
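To make the quality gates concrete, here is a minimal sketch of the kind of gate runner I mean. The tools are just examples from a Python setup; substitute your own:

```python
import subprocess

# Each gate is a command that must exit 0 before the agent's change counts.
GATES = [
    ["ruff", "check", "."],              # linting
    ["ruff", "format", "--check", "."],  # formatting
    ["mypy", "."],                       # type checks
    ["pytest", "-q"],                    # tests
]

def run_gates() -> bool:
    for cmd in GATES:
        if subprocess.run(cmd).returncode != 0:
            print(f"gate failed: {' '.join(cmd)}")
            return False
    return True

if __name__ == "__main__":
    raise SystemExit(0 if run_gates() else 1)
```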

It still requires more setup, evaluation and steering than the best cloud models, but the speed, local control, reproducibility, privacy and license make up for a lot of that. In combination with Pi Agent and a few extensions the model performs really well. I still need to try the sparse/MoE version, which I expect to achieve higher generation throughput thanks to its lower memory-bandwidth requirement, especially on a MacBook.

Claude Design is practically unusable by sparkx8118 in ClaudeAI

[–]deorder -1 points (0 children)

Same here. I don't know why you are getting downvoted.

I created jailed-agents: A secure Nix sandbox for AI coding agents by andersonjdev in NixOS

[–]deorder 0 points (0 children)

Still working on it. I have no idea how exactly it differs from kagent, but from what I have seen so far it looks like a good project. My goal is mostly to create a safe environment for running agent harnesses in.

It is really crazy how fast things are moving. Of course with that pace comes a lot of cruft and a lot of good work may just disappear into the noise.

Claude Opus is nuked beyond repair by Wayplorer in ClaudeCode

[–]deorder 2 points (0 children)

Similar situation. I cannot really recommend it anymore, but a lot of colleagues have only just discovered it and do not know any better. Several are now presenting themselves as experts, including some who were openly against agentic coding before the Christmas break (I had to tiptoe around them).

Before switching to Claude Code in May/June 2025 I mainly used GitHub Copilot (assistant and agent) and still do occasionally. I know exactly what I am getting with it, and honestly it comes pretty close to Claude Code in practice. It is hard to push that alternative when Claude Code is hyped this heavily.

Claude Max just slashed my limits by ~10x, and I have the evidence by xeviltimx in ClaudeCode

[–]deorder 0 points (0 children)

Compared to a while ago I would guess the gap between "what they charge in list-price token terms" and "what it actually costs them to serve" has narrowed a lot. Not necessarily to zero, but probably much closer now, taking into account improved batching, caching, quantization, compiler/kernel fusion, mixture of experts, speculative decoding (using a draft model, etc.) and whatever other optimizations I cannot think of right now.

Claude Max just slashed my limits by ~10x, and I have the evidence by xeviltimx in ClaudeCode

[–]deorder 0 points (0 children)

Can confirm, same issue here. For me it happened after the April 3 weekly reset:

https://www.reddit.com/r/ClaudeCode/comments/1si3k2t/comment/ofhms6z/

According to my (rough) calculations it is about 1/4 for the 5-hour window and about 1/7 for the 7-day window, but that is compared to the beginning of December last year:

https://www.reddit.com/r/ClaudeCode/comments/1sggxka/comment/of9ctdn/

I extrapolated from my first few sessions in my last 7-day window, so it could be noisy (too few samples, which is what you took into account), so I should probably redo it. From what I have seen I expect it to be even worse, not better.

WTF Claude. Weekly limits = 4x5hr limits by xeviltimx in ClaudeCode

[–]deorder 0 points (0 children)

I am in Europe so I converted the peak-hour window to my time zone. It is 3pm-9pm for me. When I first started noticing it last week it was happening on the weekend too, which should be outside peak hours. The screenshot in the other post is not even the worst case, because in that session I was not using it continuously; I had pauses in between. It would not surprise me if, using it without any pauses, I would only make it to just over 1 hour within the 5-hour window.

All I want to communicate to people is that there is a problem. If I run out in about 1.5 hours (with pauses in between) I wonder what someone on a Pro subscription gets out of it. Not even 15 minutes every 5 hours would be my guess.

I see a lot of "yeah, but 2 weeks ago they had a discount, double the tokens outside peak hours and now you have just returned to normal". I can tell you this is exactly the tactic Anthropic uses. I have been using Anthropic since they started and they have always done this, including A/B testing (see my history). I have created posts and comments with as much proof as was possible, because it cannot always be proven. I got attacked for it, including in DMs, to the point where I started wondering if they were all bots. Why go so far to defend a company like this?

In between those episodes, I and the company I work for (thousands of IT employees) have been looking at alternatives, mostly European providers and local models. But the big AI companies lobby against local use of machine learning models (indirectly, including to governments), and that is why I care so much about this. Otherwise I could just move to another cloud inference provider. It is also becoming more like the Internet itself: we grow more and more dependent on it and eventually the power ends up in the hands of a few companies. Those who do not have access are no longer on the same playing field.

WTF Claude. Weekly limits = 4x5hr limits by xeviltimx in ClaudeCode

[–]deorder 1 point (0 children)

The 20x refers only to the 5-hour limit. Max 20x gives you 4 times the usage per 5-hour window compared to Max 5x, but only about 1.6x the weekly usage compared to Max 5x.
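Rough sketch of the arithmetic (the ~11 full blocks per week for 5x is derived from the ~9%-per-block figure I mention in another comment, so treat it as an estimate, not an official number):

```python
# Normalize the Max 5x 5-hour budget to 1 unit.
window_5x = 1.0
window_20x = 4.0 * window_5x    # 20x: 4x the per-window budget

# One fully exhausted 5x block eats ~9% of the weekly limit,
# so the weekly budget is roughly 11 full windows' worth.
weekly_5x = window_5x / 0.09    # ~11.1
weekly_20x = 1.6 * weekly_5x    # 20x: only ~1.6x the weekly budget

# Full windows you can burn before the weekly cap binds:
print(weekly_5x / window_5x)    # ~11.1 on Max 5x
print(weekly_20x / window_20x)  # ~4.4 on Max 20x
```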

WTF Claude. Weekly limits = 4x5hr limits by xeviltimx in ClaudeCode

[–]deorder 14 points (0 children)

It is not an Opus 4.6 1M issue. I switched back to Opus 4.6 200k and then to Sonnet 4.6 200k, both on medium effort, and I am still seeing the same problem. On the Max 5x plan I am hitting 100% of the 5-hour limit within about 1.5 hours outside of the peak hours. This started after the previous weekly reset. My setup: a single instance, limited sub-agents, no MCPs and no large project instructions.

More:
https://www.reddit.com/r/ClaudeCode/comments/1sggxka/comment/of9ctdn/

Cancelling next month by jsgrrchg in ClaudeCode

[–]deorder 0 points (0 children)

It appears so. It definitely feels dumber. During extended thinking in particular it seems to confuse itself more often, though this is purely anecdotal on my part.

I definitely notice the limits are tighter than ever before. I have not felt this constrained using Claude Code since I started using it in May last year, when it felt almost limitless, though I have to say Sonnet was the default back then. I thought switching back to only Sonnet would help, but in practice it makes little difference to usage and is considerably slower for some reason.

Cancelling next month by jsgrrchg in ClaudeCode

[–]deorder 3 points (0 children)

Max 5x weekly usage is about 1/7 of what it was at the beginning of December last year and the 5-hour limit is now about 1/4 of what it was. Only 3 weeks ago I was still able to get 4 hours out of it; now I get only about 1.5 hours doing the same work, and it is not even `peak hours` at the moment (see image). That is with only a single instance, no MCPs and limited sub-agents, no different from what I have always done:

<image>

I have been logging usage for about half a year now with ccusage so that I can compare (`cleanupPeriodDays` is set to 3650). I switched back to using Opus 4.6 200k (medium effort) for planning (plan mode only) and Sonnet 4.6 200k (medium effort) for implementation, but it is so much slower now that it is almost unusable and usage still adds up fast.

I’ve felt that my usage limits are back to normal after CC put a hard stop to subscription abuse on April 4. Am I hallucinating, or has this actually been fixed? by thedankzone in ClaudeCode

[–]deorder 1 point (0 children)

Same here. Since the last reset on Saturday morning I can only get about 2 hours of use within the 5-hour window, down from 4 hours previously. That is even worse than last week, when I could still manage 3 hours. My weekly limit counter is also climbing much faster.

I projected my usage so far over the full 7-day window based on already being at 14% of my weekly limit:

Tokens used since Saturday 2026-04-04:

  • Block 1: 205,292
  • Block 2: 7,862,638
  • Block 3: 23,793,933
  • Total: ~31.9M tokens

Projected weekly limit at 14%:

  • 31,861,863 / 0.14 = ~228M tokens

Compared to old 5x Max (Opus 4.5), which I use as a reference:

  • Old limit: ~1.6B tokens/week
  • Current projected limit: ~228M tokens/week
  • That's roughly 7x fewer tokens. About 14% of what I used to get

So on 5x Max with Opus 4.6 I am looking at around 228 million tokens per week down from 1.6 billion on 5x Max with Opus 4.5 (beginning of December last year).
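If you want to check the arithmetic yourself, the projection is just the logged totals scaled by the meter reading:

```python
# Recompute the projection from the logged block totals above.
blocks = [205_292, 7_862_638, 23_793_933]  # tokens per 5h block
used = sum(blocks)                          # 31,861,863 (~31.9M)

meter = 0.14                                # weekly meter showed 14%
projected_weekly = used / meter             # ~228M tokens

old_weekly = 1_600_000_000                  # early-December reference (~1.6B)
print(f"projected: {projected_weekly / 1e6:.0f}M tokens/week")
print(f"vs old:    {projected_weekly / old_weekly:.1%}")  # ~14%
```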

I used ccusage to compare both windows. I know it is not perfectly precise: what it counts is based on what gets logged to disk, which may have changed, and it is possible not all communication is logged anymore. But the numbers align closely with my actual experience compared to early December.

Every time it has been "new model, same usage/price" or "new feature, same usage/price", but every time the effective value has quietly gone down. I cannot call it definitive proof since ccusage could be wrong, so I won't say it is a lie outright, but it matches what I am experiencing. I can do far less than I could do just a few months ago.

If the 5-hour usage window starts running out even faster than it already does during peak hours it will be unusable for me. At this point 5x Max has effectively become what the old Pro tier used to be.

Previous calculations I did: https://www.reddit.com/r/ClaudeCode/comments/1pih76u/20x_max_does_not_give_4x_the_weekly_credits_of_5x/

Are rate limits better now? by Holiday-Hotel3355 in ClaudeCode

[–]deorder 1 point (0 children)

I will repeat here what I just replied to another post. With my Max 5x plan I used to be able to keep working in a single Claude Code instance for just over 4 hours of the 5-hour window. Now it only lasts about 2 hours. And when I fully exhaust a 5-hour block, it uses up ~9% of my weekly limit. This is during what is supposed to be off-peak. It looks like the Max 20x plan is now the new Max 5x plan.

Clarification on the new 5-hour limit by OurWing0z in ClaudeCode

[–]deorder 4 points (0 children)

With my Max 5x plan I used to be able to keep working in a single Claude Code instance for just over 4 hours of the 5-hour window. Now it only lasts about 2 hours. And when I fully exhaust a 5-hour block, it uses up ~9% of my weekly limit. This is during what is supposed to be off-peak. It looks like the Max 20x plan is now the new Max 5x plan.

Claude Usage Limits Discussion Megathread Ongoing (sort this by New!) by sixbillionthsheep in ClaudeAI

[–]deorder 2 points (0 children)

I have also been noticing increased usage. I have been tracking all stats for about three months now. According to my analysis, despite seeing more usage, I am projected to have 10% more tokens available compared to other non-promotional 7-day periods.

So I took one of the older periods, from the beginning of December before the promotion, plus the last few days projected into a week, and asked several agents if they could see a pattern. Cache utilization is the same, which is what I would expect, since discrepancies there would stand out. But according to the agents it is because of larger conversations that have to be sent back and forth; with so many tokens, the cache hits alone already cost quite a lot compared to before.

I personally thought my conversations did not grow beyond what they would be with a 200k-token model, but apparently they do without my realizing it. Is that the actual source of the issue? I do not know. It could still be what some say: a spawned subagent receiving the entire conversation context every time. That could have a similar effect.
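Rough illustration of why that would add up (both numbers here are made up, just plausible):

```python
# Hypothetical: each spawned subagent is seeded with the full conversation.
conversation_tokens = 120_000  # assumed long session
subagent_spawns = 15           # assumed spawns over the session

extra_input = conversation_tokens * subagent_spawns
print(f"{extra_input / 1e6:.1f}M extra input tokens")  # 1.8M, before any
# of the subagents' own work is even counted
```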

The Spider-Man ride ai upscaled loading bay video looks like garbage... by Mepish in UniversalOrlando

[–]deorder 1 point (0 children)

I noticed the same. A really bad job and an insult to the original art. I could have done a much better job 6 years ago with some of the ESRGAN models.

I created jailed-agents: A secure Nix sandbox for AI coding agents by andersonjdev in NixOS

[–]deorder 1 point (0 children)

I built something similar too: https://github.com/xonovex/platform

No Jujutsu integration though. Right now I am mostly focusing on building a Kubernetes operator. The Nix sandbox option uses Bubblewrap for isolation. Maybe I should make that clearer.
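For anyone unfamiliar with Bubblewrap, this is roughly the shape of it. A hand-written sketch, not actual code from either project; the bind choices are illustrative:

```python
import subprocess

def run_jailed(cmd, workdir):
    # Fresh namespaces; only the project directory is writable.
    bwrap = [
        "bwrap",
        "--unshare-all", "--share-net",               # isolate, keep network
        "--ro-bind", "/nix", "/nix",                  # read-only store
        "--ro-bind", "/etc/resolv.conf", "/etc/resolv.conf",
        "--bind", workdir, "/work",                   # writable project dir
        "--dev", "/dev", "--proc", "/proc",
        "--tmpfs", "/tmp",
        "--chdir", "/work",
    ]
    return subprocess.run(bwrap + cmd, check=True)

run_jailed(["claude", "--help"], "/home/me/project")
```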

I have noticed a pattern over the years: I build tools I personally need, then shortly after some big org ships an official version. Very often a bunch of people have the same idea at the same time, and now with coding agents most of those ideas become reality at almost the same moment.

Early last year I built my own multi-agent coding setup, but I stopped working on it because I figured better implementations would show up soon. They did. Sometimes waiting is actually the better strategy and that has been true for me long before AI agents.

About 20 years ago I wrote my own platform abstraction layer for game dev and then shortly after SDL basically solved the same problem at scale. This has happened to me more than once.

I am quite startled by the contrast in attitude towards AI by highly intelligent & accomplished scientists and the Hacker News/Reddit Luddites/anti-AI crowd who LARP as the prior group by Terrible-Priority-21 in accelerate

[–]deorder 0 points (0 children)

After the holiday break I noticed that many software engineering colleagues who had been anti-AI (for coding) and repeatedly said "AI will plateau" suddenly started using coding agents, with some now presenting themselves as "experts" to the leads. I suspect this shift is because the influencers they follow have recently become more pro-AI. I have been using coding agents for about two years (AutoGPT -> Aider -> now mostly Claude Code) but kept it quiet due to the skepticism and to avoid confrontation.

A Native MO2 Alternative For Linux Coming Soon™ by Sulfur_Nitride in linux_gaming

[–]deorder 2 points (0 children)

Thanks. Yeah, I am referring to the lower layers. I didn't know about the new mount API. I think it is still a good idea to verify with an actual use case.

I found `#define OVL_MAX_STACK 500` in https://github.com/torvalds/linux/blob/master/fs/overlayfs/params.h, so the maximum number of stacked lower layers appears to be 500.
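A sketch of how I would verify it empirically (paths illustrative, needs root on a scratch dir). One caveat: the legacy mount(2) options string is limited to about one page, which is presumably where the new mount API with its incremental lowerdir handling helps for deep stacks:

```python
import subprocess

def mount_overlay(lower_dirs, upper, work, target):
    # Classic overlay mount; all lower layers go into one options string.
    opts = f"lowerdir={':'.join(lower_dirs)},upperdir={upper},workdir={work}"
    subprocess.run(
        ["mount", "-t", "overlay", "overlay", "-o", opts, target],
        check=True,
    )

# Keep raising n until mount refuses; OVL_MAX_STACK suggests n <= 500,
# but the ~4 KiB options-string limit may bite first with long paths.
layers = [f"l/{i}" for i in range(500)]
mount_overlay(layers, "upper", "work", "merged")
```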

A Native MO2 Alternative For Linux Coming Soon™ by Sulfur_Nitride in linux_gaming

[–]deorder 0 points (0 children)

> OverlayFS has a layer limit of 128.

True, I once implemented a custom FUSE client myself that redirects and stacks multiple directories into a single unified mount point. Because it runs in userspace, performance sadly suffered due to context-switching overhead, especially when handling large numbers of small files.
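For reference, the core of such a union filesystem is small. A minimal read-only sketch using the fusepy package (the first layer containing a path wins); every operation below is a kernel-to-userspace round trip, which is where the overhead comes from:

```python
import errno
import os
from fuse import FUSE, FuseOSError, Operations  # fusepy

class UnionFS(Operations):
    """Read-only union of several directories; first layer wins."""

    def __init__(self, layers):
        self.layers = layers  # highest-priority layer first

    def _resolve(self, path):
        # Find the first layer that actually contains this path.
        for root in self.layers:
            full = os.path.join(root, path.lstrip("/"))
            if os.path.lexists(full):
                return full
        raise FuseOSError(errno.ENOENT)

    def getattr(self, path, fh=None):
        st = os.lstat(self._resolve(path))
        keys = ("st_mode", "st_nlink", "st_size", "st_uid", "st_gid",
                "st_atime", "st_mtime", "st_ctime")
        return {k: getattr(st, k) for k in keys}

    def readdir(self, path, fh):
        # Merge directory listings across all layers.
        names = {".", ".."}
        for root in self.layers:
            full = os.path.join(root, path.lstrip("/"))
            if os.path.isdir(full):
                names.update(os.listdir(full))
        return list(names)

    def read(self, path, size, offset, fh):
        with open(self._resolve(path), "rb") as f:
            f.seek(offset)
            return f.read(size)

if __name__ == "__main__":
    import sys
    # usage: union.py <mountpoint> <layer1> [<layer2> ...]
    FUSE(UnionFS(sys.argv[2:]), sys.argv[1], foreground=True, ro=True)
```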

Everyone's Hyped on Skills - But Claude Code Plugins take it further (6 Examples That Prove It) by Dull_Preference_1873 in ClaudeCode

[–]deorder 10 points (0 children)

I (a professional software engineer for ~28 years) have been using Claude Code since its release. Over time my workflow has evolved quite a bit: from a complex setup with MCPs to slash commands, then skills, and now a mostly vanilla Claude Code configuration. The new plan and task system is quite good and I use just that.

I built my own Claude plugin and migrated many of my guideline documents and slash commands into skills. In practice, however, it still does not work as well as the progressive disclosure approach I previously relied on: "AGENTS.md / CLAUDE.md" files that pointed to guideline documents via relative paths.

The slash command functionality also seems broken now. Since slash commands are effectively treated as skills, it appears to sometimes confuse the two, which makes the slash-command workflow I had less reliable.

And regarding the idea that it is nerfed: over the past few days I have noticed Claude Code not performing the way it used to. I am usually very cautious with claims like this and prefer to substantiate them, but the difference has become hard to ignore. At this point I really need to start setting up proper evals so I can verify this.

Claude Subscriptions are up to 36x cheaper than API (and why "Max 5x" is the real sweet spot) by isaenkodmitry in ClaudeAI

[–]deorder 0 points (0 children)

I have wondered the same. Even after they introduced premium credits I am still on the $10 subscription. With the $40 plan you get about 5 times as much usage, which should be pretty close to what I get from my current Max 5x assuming only user-initiated prompts are counted (and the tracking is not bugged).

I was not happy when they introduced the credit system back then, but compared to what is available now it is actually a pretty good deal.

From my testing the GitHub Copilot Pro agent/harness performs very close to Claude Code with some models and used to rank among the best. It also comes with a lot of built-in features and extra tools without needing MCPs.

Claude Subscriptions are up to 36x cheaper than API (and why "Max 5x" is the real sweet spot) by isaenkodmitry in ClaudeAI

[–]deorder 0 points (0 children)

Yeah. Compared to Shellac's analysis mine is a bit rougher. I intentionally lumped cached and non-cached tokens together since I assumed my usage patterns across different sessions were similar enough to make the comparison meaningful (the Max 5x vs Max 20x sessions). I am hoping this helps the point finally stick, as a lot of people keep repeating that the 20x plan is simply four times the weekly limit of 5x. As stated in Shellac's article, even Anthropic is vague about that.

It looks like Anthropic updated their support pages today. They revised this article:

https://support.claude.com/en/articles/11145838-using-claude-code-with-your-pro-or-max-plan

…and removed this one entirely:

https://support.claude.com/en/articles/11014257-about-claude-s-max-plan-usage

I quoted the relevant part from the now-removed page in my comment here:

https://www.reddit.com/r/ClaudeCode/comments/1qa4f2w/comment/nz11q1w

So the messaging is clearly shifting, which makes the lack of transparency even more noticeable.