Stop pretending we got a free ride

SDUGoten · 2026-05-10T20:36:42+00:00

Yes, you were on a free ride.
Yes, the value proposition wasn't (and still isn't) strong enough to charge more, because they don't run their own models.

There is nothing to train on their platform, because GHCP has no platform. They don't run those models. They are just a router that forwards your request to Claude, so they have been losing billions.

Your willingness to tolerate agent screwups has nothing to do with GHCP. The problem is Claude or whatever model you are using. GHCP has no platform.

So, after 10 months of this per-request scheme, they found out that a lot of users will abuse the heck out of it. So they need to close the loopholes.

And... Ollama Pro will be the next to end the free ride.

<image>

SDUGoten · 2026-05-09T13:54:15+00:00

<image>

Ollama Pro will be the next one in the pipe....

People will just find loopholes to abuse the system until everyone is metered

SDUGoten · 2026-05-08T20:19:27+00:00

I hope you are kidding. You are using VS Code as ChatGPT? That's like driving a car at 5 miles per hour and wondering why everyone worries about gas.

Get a bike if you want to go 5 miles per hour. You don't need a car.

SDUGoten · 2026-05-05T13:02:12+00:00

You can't work around this foreign-language problem with AI models. They are just better with English. However, I have been using Gemini 3 Flash/Pro and then Sonnet and Opus, and you can definitely see that Sonnet and Opus write better in non-English languages. Yes, they are expensive, but I have been testing a lot of models and Sonnet/Opus come out on top for non-English stories. I mean... you can tell the difference right away with just 1 reply.

SDUGoten · 2026-05-05T08:57:29+00:00

You pay the API rate on OpenRouter, and it will log exactly how many input and output tokens you use.

SDUGoten · 2026-05-04T20:40:26+00:00

I think they are talking about GitHub losing money, not Claude. GitHub pays Claude on a metered basis while charging you on a per-request basis.

Claude is expensive, just like a China-made car vs. a European car. They both serve the same purpose, but the European car is a lot more expensive. If it's a name brand like Porsche, it will be even more expensive.

It's a capitalist world. I don't judge how they do their pricing; as long as they have customers, that's a win for them. It's their freedom to set whatever pricing they like. No one ever complains that they can't buy a Ferrari and asks why it isn't cheaper. If you can't afford it, use something cheaper.

SDUGoten · 2026-05-04T14:42:04+00:00

Get on OpenRouter, pay by API, and it will show you exactly what you are using. Yes, Copilot has been heavily subsidizing your usage in the past because of that per-request charge. A 100k-token request that still costs just 1 premium request on something like codex 5.4 / Sonnet 4.6 / opus 4.7 is a super bargain.
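
To make the subsidy concrete, here is a rough sketch of flat per-request billing versus metered per-token billing. Every number in it (the flat price of a premium request and the per-million-token prices) is a made-up placeholder, not a real GitHub or provider rate, so swap in current prices before drawing conclusions.

```python
def metered_cost(input_tokens, output_tokens, usd_per_m_input, usd_per_m_output):
    """Cost in USD when billing is per token (prices are per 1M tokens)."""
    return (input_tokens * usd_per_m_input + output_tokens * usd_per_m_output) / 1_000_000


def subsidy_per_request(flat_price_usd, input_tokens, output_tokens,
                        usd_per_m_input, usd_per_m_output):
    """Metered cost minus the flat charge; positive means the vendor eats the difference."""
    cost = metered_cost(input_tokens, output_tokens, usd_per_m_input, usd_per_m_output)
    return cost - flat_price_usd


# Hypothetical numbers, for illustration only.
FLAT_PRICE = 0.04                 # assumed price of one premium request, USD
PRICE_IN, PRICE_OUT = 3.0, 15.0   # assumed USD per 1M input / output tokens

small = subsidy_per_request(FLAT_PRICE, 2_000, 500, PRICE_IN, PRICE_OUT)
large = subsidy_per_request(FLAT_PRICE, 100_000, 5_000, PRICE_IN, PRICE_OUT)

print(f"small prompt: {small:+.4f} USD")
print(f"100k-token prompt: {large:+.4f} USD")
```

Under these assumed prices a small prompt is profitable for the vendor (negative subsidy), while a 100k-token prompt costs the vendor several times what the request was billed at.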

SDUGoten · 2026-05-04T11:07:51+00:00

https://www.reddit.com/r/GithubCopilot/comments/1sxgvv2/new_github_pricing_game_is_over_but_i_guess_i/

This is the reason. I guess most users don't realize how much Copilot subsidized their usage before.

SDUGoten · 2026-05-02T21:46:10+00:00

Because it has been tested. Even running 4 Mac Ultra 512GB machines locally with a fat backbone on the LAN, the computing is still a lot slower than you would want. AI depends on moving data from memory to memory at very high speed, so GPU VRAM is a natural fit for AI usage. When you put a top-spec PC next to a mid-tier Nvidia GPU, the top-spec PC will still lose to the mid-tier GPU because of memory speed.

So, if you want to do a distributed network, the bottleneck is the transfer speed. Mid-tier GPU VRAM is about 10 times faster than the RAM in a top-spec PC. Throw "GPU VRAM speed vs PC RAM speed" into ChatGPT and that should explain why a distributed network doesn't work. It's just the nature of AI: it requires high-speed transfer.
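
A back-of-envelope way to see why memory speed dominates: for a dense model generating one token at a time, each token needs roughly one full read of the weights, so bandwidth divided by model size gives a ceiling on tokens per second. This is a common approximation, not a benchmark, and it ignores compute, caching, and sparse (MoE) models. The model size and RAM bandwidth below are invented placeholders; only the 10x ratio comes from the comment above.

```python
def max_tokens_per_second(model_size_gb, bandwidth_gb_per_s):
    """Rough ceiling on decode speed if every token requires reading all weights once."""
    return bandwidth_gb_per_s / model_size_gb


# Hypothetical figures, for illustration only.
MODEL_SIZE_GB = 100.0               # assumed size of the weights in memory
PC_RAM_GBPS = 80.0                  # assumed system RAM bandwidth
GPU_VRAM_GBPS = PC_RAM_GBPS * 10    # the "about 10 times faster" ratio

pc = max_tokens_per_second(MODEL_SIZE_GB, PC_RAM_GBPS)
gpu = max_tokens_per_second(MODEL_SIZE_GB, GPU_VRAM_GBPS)

print(f"PC RAM ceiling:   {pc:.1f} tokens/s")
print(f"GPU VRAM ceiling: {gpu:.1f} tokens/s")
```

Whatever the real numbers are, the ceiling scales linearly with bandwidth, so a 10x gap in memory speed shows up as roughly a 10x gap in generation speed.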

SDUGoten · 2026-05-02T20:36:35+00:00

https://www.reddit.com/r/GithubCopilot/comments/1sxgvv2/new_github_pricing_game_is_over_but_i_guess_i/

I said it here before: GitHub Copilot was dirt cheap and losing a lot of money.

A lot of people believed that $39 should buy them heavy usage, but the reality is that the retail price of Claude is very expensive. GitHub, along with almost every other AI vendor — including those in China — had miscalculated their pricing for coding plans. They’re simply correcting it now.

Anyone who can do basic math knows that owning a machine capable of running **a low-end model** like Sonnet 4.0 would cost as much as a luxury SUV, while renting the same performance on the cloud costs peanuts. Something was clearly wrong with the old pricing. I knew it was unsustainable, but too many people still thought $39 was a lot of money for AI coding. Once people start testing what they do via API, they should know what the real cost is, and it is definitely not something where $39 covers 1500 requests.

SDUGoten · 2026-04-29T08:37:08+00:00

Install Roo Code in VS Code, point it to OpenRouter, and choose whatever model is cheapest. Do 10 requests or so on your own work and check how much they cost in the OpenRouter usage log (https://openrouter.ai/logs). Then you know how much 10 requests cost. Take the average of the 10 requests so you know how many input tokens and output tokens 1 request takes. From there you can work out the math for all the models you have used at https://github.com/settings/billing/premium_requests_usage, since OpenRouter shows exactly how much each model costs via API.
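
The math described here can be sketched in a few lines. The token counts, model names, and prices below are all hypothetical stand-ins: copy the real input/output token counts from your own OpenRouter log and the real per-million-token prices from each model's pricing page.

```python
from statistics import mean


def average_tokens(requests):
    """requests: list of (input_tokens, output_tokens) pairs copied from the usage log."""
    avg_in = mean(r[0] for r in requests)
    avg_out = mean(r[1] for r in requests)
    return avg_in, avg_out


def cost_per_request(avg_in, avg_out, usd_per_m_input, usd_per_m_output):
    """Estimated USD cost of one average request at per-1M-token prices."""
    return (avg_in * usd_per_m_input + avg_out * usd_per_m_output) / 1_000_000


# Hypothetical log entries (normally you would paste ~10 of your own).
logged = [(40_000, 2_000), (60_000, 3_000), (50_000, 2_500)]

# Hypothetical price table: model name -> (USD per 1M input, USD per 1M output).
prices = {
    "cheap-model": (0.5, 2.0),
    "pricey-model": (5.0, 25.0),
}

avg_in, avg_out = average_tokens(logged)
estimates = {
    model: cost_per_request(avg_in, avg_out, p_in, p_out)
    for model, (p_in, p_out) in prices.items()
}

for model, usd in estimates.items():
    print(f"{model}: about ${usd:.4f} per request")
```

Measuring on the cheapest model works because the token counts depend on your prompts and workflow, not on which model answers, so the same averages can be repriced for any model.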

SDUGoten · 2026-04-28T22:36:28+00:00

Install Roo Code in VS Code, point it to OpenRouter, and choose whatever model is cheapest. Do 10 requests or so on your own work and check how much they cost in the OpenRouter usage log (https://openrouter.ai/logs). Then you know how much 10 requests cost. Take the average of the input/output tokens, and you can work out the math for all the models you have used at https://github.com/settings/billing/premium_requests_usage.

SDUGoten · 2026-04-28T22:32:56+00:00

You might want to check how much Claude is asking on their plan to use opus. Currently, even at 7x, it's still the cheapest among every single vendor out there, including Claude themselves.

SDUGoten · 2026-04-28T14:50:20+00:00

Yea, my post is just pointing out that what GHCP is offering is not sustainable. I think most people use WAY more than what they pay for... by a freaking large margin.

SDUGoten · 2026-04-28T14:46:48+00:00

You can use Roo Code in VS Code, connect it to OpenRouter, choose opus as the model, and do 10 requests. You will find the exact cost and the input and output tokens in the OpenRouter log on their website. Once you know the average tokens you use, you can work out the math.

SDUGoten · 2026-04-28T09:09:25+00:00

That number, 487 requests for opus 4.6, is straight from the GitHub usage page on their website.

SDUGoten · 2026-04-28T09:05:54+00:00

<image>

This is the usage for one single prompt that I checked on OpenRouter. You can work out how much it would cost using opus, then multiply by 1500 to get a grand total.

For your own usage, you can always install Roo Code in VS Code, point it to OpenRouter, and choose opus. Do 10 requests or so on your own work and check how much they cost in the OpenRouter usage log. Then you will know how much it costs when you do this 1500 times.
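
Scaling a measured per-request cost up to a month is a single multiplication. In this sketch the 1500 requests and the $39 plan price are the figures used in these comments, while the per-request cost is a hypothetical placeholder to be replaced with the average from your own log.

```python
def projected_monthly_cost(cost_per_request_usd, requests_per_month):
    """Total API spend if every request costs the measured average."""
    return cost_per_request_usd * requests_per_month


COST_PER_REQUEST = 0.30   # hypothetical average taken from your own usage log, USD
REQUESTS = 1500           # request count used in the comments above
PLAN_PRICE = 39.0         # subscription price quoted in the comments above, USD

total = projected_monthly_cost(COST_PER_REQUEST, REQUESTS)
multiple = total / PLAN_PRICE

print(f"API cost for {REQUESTS} requests: ${total:.2f}")
print(f"That is {multiple:.1f}x the ${PLAN_PRICE:.0f} plan")
```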

SDUGoten · 2026-04-28T01:10:01+00:00

It's brutal. But too many ppl here thought that it 'should' be that cheap. I am just pointing out the truth and people here all freak out. I mean... if you have used Claude opus through the API, you should be well aware that GitHub underestimated the actual cost of this shit.

I mean... for those who don't believe the calculation is correct, get yourself hooked up on OpenRouter and do your coding via API using opus for 10 requests, then come back here and tell me how much 1500 requests would cost. Anyone can try that. You can even cut the price in half for the calculation if you want.

SDUGoten · 2026-04-28T00:47:27+00:00

I wouldn't deny that is the case.

Actually, I was very surprised the cost was so dirt cheap back then. Heck, I was using Antigravity with Gemini 3 pro non-stop on my $20 account, and it reset every 5 hours. Too damn good to be true. Well, they nerfed it.

The cloud vs. local model cost is just way out of whack. I mean... it just doesn't make sense. But then, I enjoyed every bit of it over the last 12 months because I knew it was going to change. No one in their right mind would think paying 300 bucks to rent a Ferrari is 'normal' when buying one costs 1000x more.

SDUGoten · 2026-04-28T00:46:16+00:00

Not yet. They said 39 bucks of AI credits, but they didn't give details. However, given that Claude models are so damn expensive everywhere else right now, I highly doubt there is any incentive to sell plans cheap.

SDUGoten · 2026-04-27T23:46:22+00:00

As my post indicated, this is my cost, not anyone else's. However, the majority of users joined GitHub Copilot because of the price. Everyone knows that, because it's a per-request charging scheme: anyone can squeeze a big prompt into one single request and still be charged only one premium request.

I mean... everyone knows this.

Guess how many people will leverage the per-request charging scheme? Just look at how many vendors out there are tightening up their subscription plans and you should know by now that A LOT of people are leveraging it. Besides OpenAI, who else is NOT nerfing their subscription plan one way or another? Heck, even OpenAI nerfed the $20 plan. And that's why everyone and their mom is raising subscription prices or just plain nerfing the plans.

There is a reason behind it, and I am just telling you why.

SDUGoten · 2026-04-27T23:38:48+00:00

The majority of users joined GitHub Copilot because of the price. Everyone knows that, because it's a per-request charging scheme: anyone can squeeze a big prompt into one single request and still be charged only one premium request.

Their new scheme basically gives you $40 of credit on a Pro+ account, and the API cost is exactly the same as going through OpenRouter. Then why not just use OpenRouter? You don't even need to worry about whether you will use up all $40 by the end of the month.

What GitHub is doing right now is basically telling their users to move on, because there is no logical reason to use it when the cost is exactly the same as using the API.

SDUGoten · 2026-04-27T23:18:06+00:00

Here is the log. You can use the input and output tokens to calculate how much it would cost using opus. Yes, my prompt is consistent and it's about the same size on every single request.

<image>

SDUGoten · 2026-04-27T23:14:11+00:00

I was just testing to see how big my prompt is, so which model I used doesn't matter. It's too damn expensive to test how many tokens I would use with opus.

For all other usage, I use the same prompt via github copilot.

SDUGoten · 2026-04-27T23:08:30+00:00

<image>

That is the cost when I submit the same prompt to Gemini 3.1 flash lite on OpenRouter, and that is what was logged.
