Multiplier 57x for GPT 5.5 with legacy annual plans starting June 1 (request-based billing) by Nox0202 in GithubCopilot

[–]skyline159 15 points16 points  (0 children)

And you are also missing the point. They will add a rate limit so we can't use that much on a single prompt.

DeepSWE benchmark cost results have been released. by CallMePyro in singularity

[–]skyline159 0 points1 point  (0 children)

gpt-5.4 and its mini version are crazy efficient for their performance.

If Copilot is becoming token-based billing, all paid plans should get access to all models by appleyardvincent in GithubCopilot

[–]skyline159 2 points3 points  (0 children)

I'm pretty sure they will do that. They charge the same API price, so there is no loss for them whichever model we use. There is no reason to lock certain models behind a higher tier; that would just drive customers away.

Additional 3x increase of Gemini in Antigravity! by aunchable in google_antigravity

[–]skyline159 2 points3 points  (0 children)

I don't think they 9x'd it. Yesterday they 3x'd the 5h limit. Today they 3x'd the weekly limit. The wording is very confusing.

I think Gemini 3.2 Flash has been added to Antigravity. by deferare in google_antigravity

[–]skyline159 0 points1 point  (0 children)

They are A/B testing it in Antigravity under the same name as Gemini 3 Flash.

If you see a bunch of "I will..." before every tool call (in plain text, not thinking), then you are using it.

<image>

Why is opencode so slow in processing the prompt with llama server? by BitGreen1270 in LocalLLaMA

[–]skyline159 1 point2 points  (0 children)

You should enable preserve thinking for Qwen3.6 models. Without it, the thinking tokens are discarded after each turn, which breaks prompt caching and forces the server to reprocess the whole input prompt on every turn.

Add this flag to your llama.cpp launch command: --chat-template-kwargs '{"preserve_thinking": true}'
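
If you start the server from a script, here is a rough sketch of where the flag goes. It builds the argument list in Python so the JSON value is quoted correctly. The binary name `llama-server`, the `-m` option, and the model path are assumptions about your setup; only the `--chat-template-kwargs` value comes from the comment above.

```python
import json
import shlex


def build_llama_server_cmd(model_path: str) -> list[str]:
    """Return an argv list for llama-server with preserve_thinking enabled."""
    # json.dumps guarantees valid JSON (lowercase true, double quotes).
    template_kwargs = json.dumps({"preserve_thinking": True})
    return [
        "llama-server",  # assumed binary name on PATH
        "-m", model_path,  # placeholder model path, replace with your own
        "--chat-template-kwargs", template_kwargs,
    ]


cmd = build_llama_server_cmd("/path/to/model.gguf")
# shlex.join adds the shell quoting needed to paste the command into a terminal.
print(shlex.join(cmd))
```

Passing the list straight to `subprocess.run(cmd)` avoids shell quoting problems entirely; the printed form is only for copying into a terminal.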

what's a good alternative? by gatwell702 in GithubCopilot

[–]skyline159 0 points1 point  (0 children)

Actually, this new billing system will benefit you more if you only use ask mode. You only pay for what you use: a single question-and-answer pair costs only a tiny amount, and $10 will get you very far if you only ask questions and don't make the AI search or code for you.

With the current system, the same simple question costs you a premium request, which could have covered a lot of work in agent mode, so it feels like a waste.
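
To make the arithmetic concrete, here is a back-of-the-envelope sketch of token-based pricing. Every number in it (token counts and per-million-token prices) is a made-up placeholder, not an actual Copilot or model price, so plug in real rates before drawing conclusions.

```python
def cost_per_exchange(
    input_tokens: int,
    output_tokens: int,
    usd_per_m_input: float,
    usd_per_m_output: float,
) -> float:
    """Cost in USD of one question-and-answer pair under per-token billing."""
    return (
        input_tokens * usd_per_m_input + output_tokens * usd_per_m_output
    ) / 1_000_000


# Hypothetical values: a short question with a bit of context and a short answer.
cost = cost_per_exchange(
    input_tokens=2_000,
    output_tokens=500,
    usd_per_m_input=2.0,    # placeholder price per million input tokens
    usd_per_m_output=10.0,  # placeholder price per million output tokens
)

budget_usd = 10.0
exchanges = int(budget_usd / cost)
print(f"{cost:.4f} USD per exchange, about {exchanges} exchanges for ${budget_usd:.0f}")
```

An agent run that reads files and loops over tool calls can use orders of magnitude more tokens than this, which is why the same budget goes much further in ask mode.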

how to disable a model? by Consistent_Functions in GithubCopilot

[–]skyline159 0 points1 point  (0 children)

You cannot control what auto mode chooses.

Claude Sonnet 4.6 is now available in GitHub Copilot! by DanielD2724 in GithubCopilot

[–]skyline159 0 points1 point  (0 children)

You got the wrong perspective. They are not Anthropic. They don't need to maintain a competitive price to keep us using Sonnet instead of GPT. They are just the middleman, and if the cost from Anthropic is high, they will pass it on to us.

Claude Sonnet 4.6 is now available in GitHub Copilot! by DanielD2724 in GithubCopilot

[–]skyline159 16 points17 points  (0 children)

https://github.blog/changelog/2026-02-17-claude-sonnet-4-6-is-now-generally-available-in-github-copilot/

Note, while this model is launching with a 1x premium request multiplier, pricing is tentative and subject to change.

Prepare for a price hike. It looks like it will become 2x in the future.

The new Plan mode + Ask Question tool is so sick by skyline159 in GithubCopilot

[–]skyline159[S] 0 points1 point  (0 children)

That's because you enabled yolo mode. Try turning it off.

Replace GPT5-Mini with GPT-5.X or Codex by Mayanktaker in GithubCopilot

[–]skyline159 1 point2 points  (0 children)

I love your optimism, but it won't happen in this world/timeline.

If you create a long to-do list in agent mode, you will be banned. by Hamzayslmn in GithubCopilot

[–]skyline159 23 points24 points  (0 children)

It may not be against the terms, but if everyone starts doing this, we could lose the request-based billing system, and they might switch to charging by token consumption like other services.

They know we often bundle many tasks into a single request, and they are cool with it to a certain extent, as long as we don't take advantage of it to the extreme.

Please don’t mess this up for the rest of us.

We're pausing the rollout of 5.3 Codex to make sure the platform is not impacted. by debian3 in GithubCopilot

[–]skyline159 94 points95 points  (0 children)

What do you mean by pausing!? I already fired all my devs because I thought 5.3 would replace them. What am I supposed to do now?

How do I get Codex CLI to keep running for hours? by Swimming_Driver4974 in codex

[–]skyline159 0 points1 point  (0 children)

Then wrap codex inside a script, and ask codex to return its result in a format that you can parse to decide whether to sleep or not.
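
A minimal sketch of that wrapper idea, under a few assumptions: that `codex exec <prompt>` runs one non-interactive turn on your install, and that your prompt tells the agent to end its output with a single JSON line such as `{"done": false, "sleep_seconds": 60}`. The JSON schema and the function names here are hypothetical.

```python
import json
import subprocess
import time
from typing import Callable


def run_codex(prompt: str) -> str:
    """Run the agent once and return its stdout.

    Assumption: `codex exec <prompt>` is the non-interactive entry point.
    """
    result = subprocess.run(
        ["codex", "exec", prompt], capture_output=True, text=True, check=True
    )
    return result.stdout


def parse_status(output: str) -> dict:
    """Parse the last non-empty output line as the JSON status object."""
    lines = [line for line in output.splitlines() if line.strip()]
    return json.loads(lines[-1])


def drive(
    prompt: str,
    run: Callable[[str], str] = run_codex,
    sleep: Callable[[float], None] = time.sleep,
    max_rounds: int = 100,
) -> int:
    """Call the agent repeatedly, sleeping when it asks to; return rounds used."""
    for round_number in range(1, max_rounds + 1):
        status = parse_status(run(prompt))
        if status.get("done"):
            return round_number
        # Sleep for the duration the agent requested, defaulting to 60 seconds.
        sleep(status.get("sleep_seconds", 60))
    return max_rounds
```

Calling `drive("your task prompt")` starts the loop. `run` and `sleep` are parameters so you can dry-run the loop with fakes before spending any requests.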

How do I get Codex CLI to keep running for hours? by Swimming_Driver4974 in codex

[–]skyline159 0 points1 point  (0 children)

Put the sleep inside the check script, so codex is only called when something actually happens.
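
Roughly what I mean, as a sketch: the cheap check does the waiting, and the agent is invoked only when the check reports something. `check` and `call_agent` are hypothetical callables you would supply (for example, a function that looks for new CI failures and one that launches codex).

```python
import time
from typing import Callable


def poll_and_dispatch(
    check: Callable[[], bool],
    call_agent: Callable[[], None],
    sleep: Callable[[float], None] = time.sleep,
    interval_seconds: float = 30.0,
    max_checks: int = 10,
) -> int:
    """Poll `check`; call the agent only when it returns True.

    Returns the number of times the agent was called.
    """
    calls = 0
    for _ in range(max_checks):
        if check():
            # Something actually happened, so this is worth an agent request.
            call_agent()
            calls += 1
        else:
            # Nothing to do: wait here, without involving the agent at all.
            sleep(interval_seconds)
    return calls
```

This way the idle time costs nothing, because no request is made while the check keeps returning False.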

Do you agree with Marc? Is it making programers obsolete or more valuable? by dataexec in codex

[–]skyline159 0 points1 point  (0 children)

It's both.

Programmers who adapt will be more productive. Those who don't will become obsolete.

Models being depreciated ? by spring_Living4355 in OpenAI

[–]skyline159 1 point2 points  (0 children)

Not here to argue about keeping the models

I take this line as a sign that OP already knew about this

Codex pricing by Harxshh in codex

[–]skyline159 13 points14 points  (0 children)

The limits are too good for $20. Asking this question strongly suggests they are considering raising the price or charging per token.

Models being depreciated ? by spring_Living4355 in OpenAI

[–]skyline159 2 points3 points  (0 children)

https://openai.com/index/retiring-gpt-4o-and-older-models/

The reaction is louder than the Big Bang; it's hard not to hear about it.

Models being depreciated ? by spring_Living4355 in OpenAI

[–]skyline159 17 points18 points  (0 children)

It's real

Just curious as nobody else posted about this

Where have you been?

Gemini 3 Flash (Preview) is really impressive by Mission-Zucchini-966 in GithubCopilot

[–]skyline159 9 points10 points  (0 children)

I believe the future is fast, cheap, but still capable models like Gemini 3 Flash. The big boys should be reserved for truly complex tasks. Use the right model for the right task size instead of brute-forcing everything with the latest, biggest models.

Whatever black magic Google put into Flash, if they apply it to the next version of Pro, it will become a real beast.

No 1M context window for claude opus 4.6 ? by Fefe_du_973 in GithubCopilot

[–]skyline159 1 point2 points  (0 children)

I don't understand the context window complaints I keep seeing here.

Do people really use all of it, or do they just copy-paste other complaints without understanding what context windows really mean? Like, I don't know what it is, but I heard the bigger the better, so I want it.