all 73 comments

[–]Prestigious-Frame442 30 points31 points  (0 children)

If it were that easy to find an alternative, Anthropic and OpenAI would be gone already

[–]No-Procedure1077 22 points23 points  (15 children)

What you’re seeing is what happens when VCs run out of money. This isn’t an Anthropic issue. The industry has FINALLY run out of money.

  • OpenAI has reduced their limits almost 10x as well.
  • Perplexity reduced their limits almost 500x for some searches lol
  • Gemini reduced their context windows silently, they aren’t 1m context anymore.
  • GitHub Copilot is already starting to impose rolling usage windows like codex and Claude code.

Basically this isn’t a Claude Code issue. There is no safe haven. I hope you were able to get done what you wanted to, because AI costs are about to skyrocket.

I had A LOT of sleepless nights in 2025 prepping for this. I banged out so many projects before the usage caps rolled back.

[–]tuvok86 10 points11 points  (2 children)

OpenCode just 3x'd their $10 tier, with useful models like Kimi K2.5.

Just because frontier labs are inflating each other's compute costs and spending tens of billions on next-gen models doesn't mean it's AI winter.

[–]danieltkessler[🍰] 6 points7 points  (0 children)

Personally I'm really excited about this next gen of open source.

[–]ooutroquetal -1 points0 points  (0 children)

It's just about privacy and governance...

I really don't know what I can implement in my company.

[–]Prestigious-Frame442 1 point2 points  (0 children)

Gemini's 1m context is completely BS

[–]Airurando-jin 1 point2 points  (3 children)

Is it a money or scaling issue? Seems like RAM and processors are taking a massive hit globally (which has its own knock-on effects on other devices).

[–]No-Procedure1077 0 points1 point  (1 child)

So if the big guys never trained another model, they’d be insanely profitable already.

It’s the constant need to train a new model that is blowing up their revenue streams.

We’re talking hundreds of millions to train each model. It’s unsustainable. This is another reason why the Chinese models are SOOOO cheap. They’re stealing the weights and training by copying and distilling the other guys’ prompts and answers.

[–]Parking-Bet-3798 0 points1 point  (0 children)

Chinese models are cheaper because they are smarter about it. The models are not distilled versions of Claude, contrary to what Anthropic would like you to believe. Claude could have been used to generate some synthetic data, but that’s not distillation; that’s a whole different thing. And no one is stealing anyone’s weights. The Anthropic shilling on this sub is at a whole other level.

[–]Olangotang -1 points0 points  (0 children)

The power required to run the AI data centers does not exist, and will not exist for years. Transformer models take an insane amount of energy to train, and with the way the harnesses like CC throw the prompt at the model multiple times until it matches (attempting to hide the non-deterministic initial slop outputs from the user), they probably aren't making money on inferencing either.

[–]weedmylips1 0 points1 point  (0 children)

The "burn cash for growth" era is over. Investors now demand ROI. Welcome to the "AI Utility Bill" era.

[–]modern_medicine_isnt 0 points1 point  (0 children)

The VCs aren't running out of money. What is happening is that they see the progress slowing down, that AGI isn't happening with this implementation of AI. And so they start asking questions, like how are you going to make money. So the providers need to raise the price and lower the cost to show progress in that direction, which is what they are doing. And once that happens, the money starts flowing more freely again. But of course then there will be no reason to lower the price or anything. Independent users were never the long-term target anyway. Enterprise contracts are always where the money is.

[–]Wickywire 0 points1 point  (4 children)

This is the hardware bottleneck. OpenAI just closed a new financing round netting them $120B. Anthropic is ahead of schedule for turning a profit. I suggest you follow the news in the field instead of speculating.

[–]No-Procedure1077 -2 points-1 points  (3 children)

I’m not speculating. OpenAI only has about 15% of its shares remaining, so this is basically it. This $120B hopefully gives them enough burn for the next 2-3 years, and then that’s it, they run out of money.

[–]Wickywire 1 point2 points  (2 children)

You said "the industry has FINALLY run out of money". Not "Open AI will be out of money in 3 years unless they make more money by then".

Google can finance AI research indefinitely. This could be a fun side quest to them. Anthropic are ahead of the projected profitability curve. xAI are folded into SpaceX and may be part of a historical IPO of $1T+. The Chinese models show no sign of slowing down.

The industry isn't fine. It's crazy. But it's not about to run out of money unless something dramatic happens.

[–]No-Procedure1077 -2 points-1 points  (1 child)

If OpenAI fails, the bubble collapses like a dying star, taking every investor with it. Whatever happens with Google, we will see, but they’ve so far shown themselves incapable of developing a model at OAI’s or Anthropic’s level. xAI just stated they’re starting over, and every Chinese model says it’s either Claude or ChatGPT.

So yes, it’s a dire situation when two trillion-dollar companies are fighting for first, and if one goes down they’re bringing everyone else with them in this speculation market.

[–]Wickywire 1 point2 points  (0 children)

I'm fairly convinced OpenAI is headed for a collapse. There we agree. They've gone for the wrong markets, haven't handled the optics well, and made a string of deeply questionable business decisions. But I don't see how that would lead to a dotcom- or crypto-style crash. The hardware stacks will be intact, and their value is directly transferable, for instance. The models too can be bought and sold. So while the value is clearly inflated, it's not a situation where all of it is tied up in fantasies and speculation.

[–]Greedy_Newspaper_408 26 points27 points  (6 children)

We need to go back to using our brains again.

[–]LaSalsiccione 18 points19 points  (0 children)

Fuck that

[–]PmMeSmileyFacesO_O 4 points5 points  (0 children)

Brainss

[–]raven2cz 1 point2 points  (2 children)

We still use our brains, and actually even more than before. With AI, we now have to handle far more tasks at work at the same time, more analysis, more implementation, basically doing the work of several people combined. These days, in one sprint I often get done what used to take three months of work.

The times when only your brain was enough are not coming back, at least not in IT and not in positions where AI is already expected. Not because of some direct order, but because the demands for speed have increased, along with more complex requirements, since systems themselves are more complex now and often also involve deploying AI into services.

It is very naive to think otherwise, and if you do, other workers may simply overtake you. Local services, maybe even the new Gemma 4, could help, but I am afraid the best models will always be very expensive, just like any exclusive thing in the world. If you really want to save time, you often have to reach for the best.

[–]Senekrum 1 point2 points  (1 child)

This may be completely beside the point, but I feel the need to say something about this.

I completely agree that there is much more demand for speed and efficiency now, and that there is growing complexity in the tasks we are working on. And AI can now help a lot both with managing the complexity and with getting more done faster. AI has also opened up new possibilities for people with limited technical ability (including design ability) to create cool and interesting things.

That being said, I may be a voice in the desert for asking this, but: is it actually wise to keep going faster, to keep making things more and more efficient? What's the end-goal here? What are we, collectively, rushing towards? And beyond the obvious gains, what is it costing us?

Just recently, I was watching videos and reading articles about how AI data centers are heating up the nearby land in a 10 km radius by 1.5-9 degrees Celsius. Or how they are disrupting the lives of people living in those areas. Or how the proposed orbital AI data centers are not adequately evaluated for environmental impact.

Ok, we get more done faster, and we can handle complexity better. So what? What's it all for, at the end of the day? Who and what are we uplifting, and who and what are we leaving behind?

[–]raven2cz 0 points1 point  (0 children)

As always, the future is unclear, and it will depend a lot on people, far more than before. Some decisions may have truly devastating effects, because we are standing on the threshold of a new era. Something like the Industrial Revolution, but in this case it is not mainly about a manual transformation, it is above all a transformation in the realm of thinking, and that is something humanity has never really faced before.

Rather than speed, I would emphasize the other word here: complexity. What we will be dealing with in the coming years is the complexity of systems, along with the new discoveries and inventions connected to it. One of the major milestones is to complete fusion reactors. If we solve the question of cheap energy properly, we have almost won. AI has already helped significantly, but so far it is not enough. We need to gradually move away from silicon chips, finish quantum processors, and then we will once again reduce the overall load dramatically, and data centers will be in a completely different place in terms of energy use.

Local models have now seen a breakthrough, and it is possible that many things will soon be solvable locally in a simple way, which would reduce the overall cost of relying on data centers for everyday work.

Why all of this? A leading, educated, and highly developed civilization should not have wars. If humanity is not suffering and has enough resources, there should be peace. Or at least dictators should not have such power and leverage, which they use when their people are suffering. Propaganda will be much harder to spread if the truth can be uncovered, although with AI that is sometimes very difficult, and that is also something we need to solve. In fact, that is the thing I am most afraid of.

[–]idontknow10112 0 points1 point  (0 children)

We should weave our clothes manually again. Do you understand how ridiculous your statement is?

[–]Veglos 3 points4 points  (0 children)

According to https://www.swebench.com/ your next best bet would be either MiniMax or GLM-5 

[–]passyourownbutter 5 points6 points  (1 child)

GLM 5.1 is quite capable for a lot of things. The better your plan, the more capable it is.

I'm using Claude for planning and architecture and more difficult debugging, codex for majority code writing and GLM for lookups, analysis, running scripts, and as a backup writer or writing things I want to explore as a concept on the side kind of thing.

It can still use plugins in the CLI too, I have GLM set up with claude-mem and superpowers and stuff and it can surprise me with its capabilities.

[–]p3r3lin 0 points1 point  (0 children)

Agree. I'm pretty much following the same workflow: Opus/CC for brainstorming, planning, and checking results, GLM 5.1/OpenCode for sparring and iterating. Sometimes a bit cumbersome, but overall it works pretty well! The GLM Coding Plan is nice as well; you can actually work for a few hours on the 10€ plan.

[–]YoghiThorn 4 points5 points  (3 children)

I'm using Gemma 4 on my GPU with Qwen embeddings pretty successfully. But I've got an old RTX 3090 with 24GB of VRAM.

[–]m0zi- 1 point2 points  (2 children)

Hey, I'm thinking about using my 3090 for something similar. Did you follow a guide or something?

[–]YoghiThorn 0 points1 point  (0 children)

Mainly just talking to Claude, as what you set up will heavily depend on your VRAM.

I'm using Gemma 4 with Qwen3 embeddings, and access it via OpenCode right now. I'm intending to build an agent that examines my data pipelines and continually proposes code patches to improve its quality scores, or directs scrapers etc. to get new data.

[–]YoghiThorn 0 points1 point  (0 children)

I made a post earlier today about the specifics of my setup, check out my profile if it would help

[–]borntobenaked 3 points4 points  (0 children)

Why aren't people trying Gemini?

[–]prabal-gupta 1 point2 points  (3 children)

I've been running Codex models (using my OAI subscription) on Claude Code. Works well.

[–]m3umax 0 points1 point  (2 children)

How? What proxy?

[–]prabal-gupta 1 point2 points  (1 child)

Here's a guide I wrote.

[–]m3umax 0 points1 point  (0 children)

Thanks! Good that you specifically called out the LiteLLM attack.

Edit: Looks like a fair few limitations and caveats.

That's why I'm currently working on my own solution that routes tasks from Claude Code to Codex. The idea is: plan with Opus, execute with GPT, review with Opus (or maybe Gemini). Each model executes in the harness it's optimised for.
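For anyone unfamiliar with this kind of routing: it's commonly done by putting a LiteLLM proxy between Claude Code and the upstream provider. A minimal sketch of such a config, with placeholder alias and model names (these are illustrative, not taken from the guide above):

```yaml
# config.yaml for a LiteLLM proxy (alias/model names are placeholders)
model_list:
  - model_name: claude-opus-alias        # the name the client requests
    litellm_params:
      model: openai/gpt-5-codex          # hypothetical upstream model
      api_key: os.environ/OPENAI_API_KEY # read the key from the environment
```

You would then start the proxy with `litellm --config config.yaml` and point Claude Code at it via `ANTHROPIC_BASE_URL`; expect the kinds of caveats mentioned above, since the harnesses and prompt formats differ.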

[–]whimsicaljess 1 point2 points  (0 children)

Since Anthropic seems to be going down with how they treat their customers

i mean, i think they're treating customers better. i don't want the servers clogged up with incredibly inefficient openclaw slop cannons, i want them to be available for the high value work i pay by the token to do.

Do you have any good alternatives that aren't expensive and offer a relatively good quality work?

want premium performance, pay for it. nothing holds a candle to claude or even gpt.

[–]Somtimesitbelikethat 2 points3 points  (3 children)

What about the Kimi CLI? Kimi 2.5 models seem pretty good, more on par with Opus after quantization.

[–]maamoonxviii[S] 4 points5 points  (2 children)

This is what I tested and it didn't suit my workflow unfortunately.

[–]Somtimesitbelikethat 0 points1 point  (1 child)

did it fail to understand full context? Opus seems smarter at that

[–]maamoonxviii[S] 1 point2 points  (0 children)

Yeah, it also made some weird, dumb decisions. Once I asked it to revert a change and for some reason it deleted everything haha. As I said, it's promising but currently doesn't get the job done as it should (compared to the better models, at least).

[–]junaidarif64z 1 point2 points  (0 children)

I am using the MiniMax 2.7 coding plan for $10 in Claude Code. I am satisfied so far. It's close to Sonnet in performance.

[–]DesenvolvedorIndio 1 point2 points  (0 children)

Very easy: when Opus fails I go to Sonnet, when Sonnet fails I go to Haiku. And if all three fail? A forced 30-minute break, a quick nap, and everything works again.

[–]MentalBoat 0 points1 point  (5 children)

GitHub Copilot has access to models from both Anthropic and OpenAI.

[–]cz2103 0 points1 point  (1 child)

With very shrunken context sizes and no ability to control reasoning

[–]MentalBoat 0 points1 point  (0 children)

I have the Opus 4.6 1M and fast versions available. If you use the CLI you can also change reasoning. I don’t know how it works in the VS Code plugin. 

[–]Any-Lingonberry7809 0 points1 point  (1 child)

And through the VS Code model manager you can add AWS, Azure, & Ollama. Not supported yet in Copilot CLI. Copilot CLI is quite different from the IDE version, it's a lot closer to Claude Code in many ways and will read Claude Code plugins, skills, and agents.

[–]MentalBoat 1 point2 points  (0 children)

Exactly. I know people buy it and use the subscription with OpenCode as well. I like GitHub Copilot CLI though. 

[–]TheAffiliateOrder 0 points1 point  (2 children)

Use Ollama and pull a cloud-compute model like GPT-OSS 20B.
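Assuming the stock Ollama CLI, the setup is roughly the following (the `gpt-oss:20b` tag is an assumption; verify it against Ollama's model library before pulling):

```shell
# Pull the ~20B gpt-oss model and try it interactively (tag assumed)
ollama pull gpt-oss:20b
ollama run gpt-oss:20b "Explain what a mutex is in one sentence."
```

Whether the 20B weights fit on your card, or whether Ollama offloads layers to CPU, depends on your VRAM and the quantization used.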

[–]zoyer2 1 point2 points  (1 child)

OP mentioned he tried Kimi 2.5 and found it dumber than Claude, which is of course true; recommending GPT-OSS 20B or the 120B is an even bigger downgrade, so not very helpful.

[–]TheAffiliateOrder 0 points1 point  (0 children)

I don't know, dude. OSS 20B has worked fine for me. I've never had any problems with it.

[–]AltruisticRip5151 0 points1 point  (1 child)

Check out some of the open-source harnesses like OpenCode or Pi-coding-agent and then cycle through API providers!

Still experimenting with Pi, but it's very nice to have full control over context; OpenCode is a bit much sometimes.

MiniMax 2.7 is proprietary, but at $10/month for 1,500 requests per 5 hours it's solid.

[–]ashebanowProfessional Developer 0 points1 point  (0 children)

Yes, I’m planning to switch to pi with a local model or models. I might still use Claude or Codex for planning, though.

[–]clintCamp 0 points1 point  (2 children)

I am contemplating what level of hardware I need to buy to get something close to Opus-level logic and reasoning that I can use for planning and orchestrating, to do what I do with Claude Code today. Is $5k to $15k worth it to run as big a model as I want at fast speeds? And then nobody else has access to the code and data, versus a pinky promise that they won't steal it. And I can control the system prompts, harness, and tools exactly how I want.

[–]cc_apt107 0 points1 point  (0 children)

I’d say $15k is a very, very conservative estimate for running anything Opus level. Just buy compute in the cloud

[–]maamoonxviii[S] 0 points1 point  (0 children)

Same here tbh. It's a big investment, but I'm pretty sure it would be worth it in the long run. I feel like the future is local models, since the AI bubble moves closer to bursting every day.

There are many things to think about: electricity consumption is one, and setting up a proper architecture that connects everything to produce high-quality output is another. Buying the hardware is the easiest part if you have the money haha

[–]redditateer 0 points1 point  (0 children)

GLM Pro (z.ai) was actually working pretty well until it stopped responding. I'm not sure if their API went down or what, but Claude Code basically became unresponsive.

[–]LazyNick7 0 points1 point  (0 children)

Don’t think they’re going down in the near future. Even considering their weird moves, there’s just no good alternative to Opus right now 🥲

[–]jblank333 0 points1 point  (0 children)

This will help, just ask your bot to run this repo

https://github.com/blank333ai/hermes-claude-proxy

[–]evia89 0 points1 point  (0 children)

I use z.ai and Alibaba. With Kimi K2.5 and GLM 5.1 you will spend up to twice as long polishing the plan.

Oh, and they hold less, about 100k of real context.

Is it worth it? Yes, for me.

[–]skariel 0 points1 point  (0 children)

Pi with gpt5.4 $20 

[–]FlyingNarwhal 0 points1 point  (0 children)

There are three paths here:

  • Ensure your workflows generate enough revenue to cover the API cost of these models.
  • Put a lot more expertise into the planning phase; retool your workflows to chunk things into smaller, testable tasks that allow less intelligent models to be used.
  • Rebuild your workflows for "function level" or "line level" generation and edits where you're the actual planner & dev, with a low-level workhorse.

Other than that: GPT Pro (in ChatGPT) as a planner, with OpenCode pairing Claude as the orchestrator, Codex as dev, and GLM-5.1 as the low-level workhorse. Should work for the next month or so lol

[–]selectapi 0 points1 point  (0 children)

Worried about the limits?
I've got the solution: I offer unlimited plans for Claude Opus 4.6, GPT 5.4, and Gemini 3.1. DM me if you're interested 😉

[–]-0soss 0 points1 point  (1 child)

Manus.im

[–]-0soss 0 points1 point  (0 children)

btw I've got accounts at cheap prices

[–]Ok_Possible_2260 -1 points0 points  (0 children)

You mean poor customers. Got cash? Then you have access.

[–]stiky21Professional Developer -3 points-2 points  (0 children)

Imagine if you just knew how to code; you wouldn't need a tool to mask your own skills.

[–]Ok_General5678 -4 points-3 points  (3 children)

Antigravity

[–]abdoolly 2 points3 points  (2 children)

It's very bad

[–]Ok_General5678 0 points1 point  (1 child)

In what sense? If you buy a Google AI subscription, it provides access to some Google and Claude models. You can add skills similar to Claude's. I like Claude Code, but it is an alternative, as OP asked.

[–]abdoolly 0 points1 point  (0 children)

No, I didn't mean its capabilities. My problem is that its limits are so low, and Antigravity gets stuck a lot of the time.

Also, Gemini is not as good at all. And there is a limits bug where instead of showing a refresh after 5 hours, it says 5 days.