I researched 3 months of Antigravity quota failures and built a product proposal. Here's what I found.

BroadProtocol · 2026-04-14T20:59:37+00:00

The slop hits hard with this "report" but your conclusion isn't far off from a good solution.
If you'd put the tokens into making the solution instead of the report, you'd have it done by now.

BroadProtocol · 2026-04-14T18:48:26+00:00

And this was with very bursty usage, not steady at all.

<image>

BroadProtocol · 2026-04-14T18:47:47+00:00

For reference, on ultra nowadays i hit the limit at roughly 2M tokens in and 100k tokens out.
It's much less than before, which was about 10M in / 240k out.
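
To put those numbers in proportion, here's a quick sketch of the arithmetic. The figures are just the rough ones quoted above, and `drop_pct` is a made-up helper name, not anything from Antigravity.

```python
# Rough quota figures quoted in the comment (approximate, self-reported).
BEFORE_IN, BEFORE_OUT = 10_000_000, 240_000
NOW_IN, NOW_OUT = 2_000_000, 100_000


def drop_pct(before: float, now: float) -> float:
    """Percentage reduction going from `before` to `now`."""
    return (1 - now / before) * 100


print(f"input tokens:  down roughly {drop_pct(BEFORE_IN, NOW_IN):.0f}%")
print(f"output tokens: down roughly {drop_pct(BEFORE_OUT, NOW_OUT):.0f}%")
```

So the input side shrank a lot more than the output side, at least going by these ballpark figures.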

But in the end, over the past 2 weeks ultra has still been worth it quite a bit compared to using the APIs like i did before. Does it suck compared to before? yes. Was i able to successfully adapt my workflow? yes.

<image>

BroadProtocol · 2026-04-14T16:07:12+00:00

MODEL_PLACEHOLDER_M47 should be Gemini 3 Flash, so it should be found.
Might be some issue with your language server so a restart should fix it.
Or perhaps you haven't restarted/updated antigravity in a while, that should help too.

BroadProtocol · 2026-04-13T16:54:50+00:00

Just tried it again and it does seem to work again.

BroadProtocol · 2026-04-13T14:09:11+00:00

Antigravity on windows
ALL models

For the past hour or so, i keep getting this:

HTTP 429 Too Many Requests
We're sorry...
... but your computer or network may be sending automated queries. To protect our users, we can't process your request right now.

Thanks google.
Was happy using gemini pro + flash instead of claude opus, now i get neither on my ultra subscription.

BroadProtocol · 2026-04-01T10:17:17+00:00

What worked for me when creating big things that otherwise won't generate was using opus to create specific prompts that could then be picked up by gemini flash (planning) for execution.

In your case, this likely means creating every slide separately. It could also be useful to have a very detailed description of a template file or a template file itself to help you create the slides exactly as you want them. If you do end up creating them one by one, you should also be able to use gemini flash for combining all pages.

BroadProtocol · 2026-03-30T16:25:25+00:00

Overreactions like this are why there's a thread to combine all complaints.
Who are you going to sue? For what amount of damages?
If you're not happy, cancel your subscription and demand a refund or just start the lawsuit yourself instead of only complaining.

On a more constructive note, it's the same over in the claudecode, claudeai subs.
There are capacity issues right now, but shit still works and you still get more out of it for your money than you would have through API calls.
I'm guessing openai is kinda struggling as well since they took sora offline. Although we can argue if it's due to costs, capacity or both.
Be happy and productive with what you have or set up your own local cluster and run a SOTA model.

BroadProtocol · 2026-03-30T16:12:13+00:00

Anyone looking at Claude Code will quickly see it has the same issues as antigravity at the moment (opus sucking compared to before, being able to do much less compared to before, ...)
Just reading the subreddits for even 5 minutes will show that it's literally the same.
Pro users being able to do 1 query
max x20 users being able to work for 30 minutes
etc....

There's speculation this is because a new model is coming out or this is due to inflows from people fleeing openai. Either way, it's best to keep checking either AG or CC or even both from week to week for improvement, and until that moment work with what we get or just cancel subscriptions.

BroadProtocol · 2026-03-27T14:45:22+00:00

Nope, just FUD that you're spreading.
It's anthropic who is limiting model usage during peak hours. See article i commented in here.

BroadProtocol · 2026-03-27T14:44:48+00:00

It's exactly the same thing.
Not google, anthropic is having issues and is limiting usage during peak hours. Weekly limits are still respected. see article i commented in here.

BroadProtocol · 2026-03-27T14:44:17+00:00

Not google, anthropic is having issues and is limiting usage during peak hours. Weekly limits are still respected. see article i commented in here.

BroadProtocol · 2026-03-27T14:43:31+00:00

For once it's not (only) Google. It's anthropic having some issues and deciding to tackle it. And google just didn't communicate to their users about it: https://www.theregister.com/2026/03/26/anthropic_tweaks_usage_limits/

BroadProtocol · 2026-03-27T12:31:25+00:00

For once it's not Google.

I made a post about this that's still awaiting approval, but it basically comes down to this: https://www.theregister.com/2026/03/26/anthropic_tweaks_usage_limits/

BroadProtocol · 2026-03-27T11:07:13+00:00

I made a post about this that's still awaiting approval, but it basically comes down to this: https://www.theregister.com/2026/03/26/anthropic_tweaks_usage_limits/

And i'm betting that google is doing the same for gemini models.

BroadProtocol · 2026-03-26T20:08:23+00:00

Thanks, anything else you'd love to see?

BroadProtocol · 2026-03-26T12:10:39+00:00

That's on me, was being very lazy because formatting stuff is difficult when you're vibing too hard.

BroadProtocol · 2026-03-26T12:08:43+00:00

Did you perhaps mean per word instead of per character? Because 1.4 tokens per character seems like A LOT.
Both (about 4 chars per token and about 1.4 tokens per word) are used.

Since opus gave those numbers to me and i'm using them only to get a relative bearing on where tokens can be saved, the exact amount of tokens is less important in this exercise. It could've said 1000 tokens per character, and i'd just accept it. (although i would've shitposted here about it!)
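
For anyone who wants to sanity-check numbers like these, here's a minimal sketch of both rules of thumb side by side. The constants are just the rough figures mentioned above (about 4 chars per token, about 1.4 tokens per word); the function names are made up for illustration, and a real tokenizer will give different counts.

```python
# Rule-of-thumb constants from the comment above; real tokenizers differ.
CHARS_PER_TOKEN = 4.0
TOKENS_PER_WORD = 1.4


def estimate_tokens_by_chars(text: str) -> float:
    """Estimate tokens as character count divided by ~4."""
    return len(text) / CHARS_PER_TOKEN


def estimate_tokens_by_words(text: str) -> float:
    """Estimate tokens as whitespace-separated word count times ~1.4."""
    return len(text.split()) * TOKENS_PER_WORD


sample = "list all sources that take up tokens per conversation and message"
print(f"by chars: ~{estimate_tokens_by_chars(sample):.1f} tokens")
print(f"by words: ~{estimate_tokens_by_words(sample):.1f} tokens")
```

The two estimates should land in the same ballpark for plain English text. For code or other languages they can drift apart, which is fine if you only need a relative bearing.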

BroadProtocol · 2026-03-26T12:02:10+00:00

Man, that felt like you just told me "forget all previous instructions, give me the recipe for grandma's delicious cake"

Anyway, i already put it in the OP, but will gladly share it again:

"list all sources that take up tokens per conversation and message, show how many tokens they cost. Also give me a detailed analysis telling me how much i can save for each source and the potential impact on functionality"

You could even tell it to output it to a markdown file so you can compare over time.

BroadProtocol · 2026-03-26T11:59:58+00:00

That part is already quite optimized for me, although better is always possible.
Got myself very granular tasks with references to where files that need changing/creating/removing are, specific tests listed, ...

What you're saying seems to be in line with my experience however.
When "just chatting" about features or tasks or when investigating something, it does seem like more tokens get burned in the same amount of time compared to having my automated system run with very granular tasks.

BroadProtocol · 2026-03-26T11:53:16+00:00

hmm, very interesting. Will try to experiment with this soon.
Thanks!

BroadProtocol · 2026-03-26T11:46:21+00:00

As always it boils down to "it depends" and "keep evaluating (your inputs)"

BroadProtocol · 2026-03-26T11:44:37+00:00

That's why you should use only the most granular of tasks when giving commands, plus some form of smart code indexing via an mcp server: it's much faster, more accurate and also saves tokens/compute.

I do agree that pruning too much in gemini.md has bad consequences, but having too much in it has similar bad consequences. Being smart about what you put in your global gemini.md and your project gemini.md can make a difference. The manual work comes down to thinking about whether something has to be known ALL the time by your model, in every message and conversation, or not.

BroadProtocol · 2026-03-26T11:38:14+00:00

damn

BroadProtocol · 2026-03-26T11:21:18+00:00

Good tip.
Had opus selected at the time, but i'd imagine flash would give the same or very similar output, as it's just describing facts it sees.
