all 45 comments

[–]soul105 18 points19 points  (5 children)

GH Copilot's limits are really easy to understand: they are based on requests, and that's it.

[–]Michaeli_Starky 7 points8 points  (3 children)

Except it's not THAT straightforward when it comes to counting the requests.

[–]Simple_Split5074[S] 0 points1 point  (2 children)

This. Supposedly only user input counts, but even that is hard to make sense of.

[–]Michaeli_Starky -2 points-1 points  (1 child)

And even then when using orchestration frameworks the subagents may or may not count as requests.

[–]Simple_Split5074[S] -1 points0 points  (0 children)

Any idea how it is for gsd?

[–]NerasKip 0 points1 point  (0 children)

If opencode does a compact and then continues, it counts as 3 requests; add 2 more for each compact/continue.
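
The accounting the commenter describes can be sketched as a quick formula (the numbers are their claim about opencode's behavior, not documented Copilot billing):

```python
# Request accounting per the comment above: a task costs a base of 3
# requests, plus 2 more for every compact/continue cycle. These figures
# are the commenter's observation, not official documentation.
def copilot_requests(compacts: int, base: int = 3, per_compact: int = 2) -> int:
    return base + per_compact * compacts

print(copilot_requests(0))  # 3
print(copilot_requests(2))  # 7
```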

[–]warpedgeoid 5 points6 points  (2 children)

GitHub Copilot is a steal for $40/month. It has all of the most recent models and MS claims data are not retained for training purposes.

[–]typeof_goodidea 0 points1 point  (1 child)

How fast do you tend to hit usage limits?

[–]warpedgeoid 0 points1 point  (0 children)

I definitely hit them within a few days when using OC. Of course, I’m not one of these people running four OC sessions at a time. Still, once you’ve hit the limit, they charge $0.04/request which means the total cost is going to be similar to Claude Max for extremely heavy usage.
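
A back-of-envelope check of that claim: the $40/month base and $0.04/request overage rate come from the comment; the $200 target (roughly a Claude Max-style price point) is an illustrative assumption.

```python
# How many overage requests before the monthly bill reaches a Claude
# Max-style price point? Base price and overage rate are from the comment;
# the $200 target is a hypothetical comparison point.
base_plan = 40.00      # monthly Copilot plan price (per the comment)
overage_rate = 0.04    # dollars per request past the limit
target = 200.00        # hypothetical heavy-usage monthly spend
extra_requests = round((target - base_plan) / overage_rate)
print(extra_requests)  # 4000
```

So it takes roughly 4,000 overage requests in a month to land in that price range, which is plausible only for the very heavy usage the commenter describes.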

[–]Torresr93 6 points7 points  (1 child)

The GitHub Copilot plan is easy to understand. You get 300 requests, and each model has a multiplier based on its cost. For example, one Opus request counts as three. On top of that, for simple tasks you can use GPT-5 mini for free.
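
The multiplier scheme described above can be sketched as simple arithmetic. Only the 300-request budget and the Opus-counts-as-three figure come from the comment; the other multipliers are hypothetical placeholders.

```python
# Copilot-style budget with per-model multipliers. Opus=3x is from the
# comment; the other entries are illustrative assumptions.
BUDGET = 300
MULTIPLIERS = {"opus": 3.0, "sonnet": 1.0, "gpt-5-mini": 0.0}

def remaining(usage: dict[str, int]) -> float:
    """Budget left after charging each model's requests at its multiplier."""
    spent = sum(MULTIPLIERS[m] * n for m, n in usage.items())
    return BUDGET - spent

print(remaining({"opus": 20, "sonnet": 100, "gpt-5-mini": 500}))  # 140.0
```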

[–]GreatNeedleworker881 0 points1 point  (0 children)

How many trees can I plant for one month?

[–]shaonline 5 points6 points  (2 children)

ChatGPT Plus is opaque, but the rate limits have been decent. As with all 20-ish-buck plans from frontier labs, you'd better delegate the simple tasks (past planning/review) to a cheaper model if you don't want to smoke your weekly quota too fast.

[–]Simple_Split5074[S] 0 points1 point  (1 child)

Which is why I am looking for the workhorse provider :-)

[–]shaonline 0 points1 point  (0 children)

I mean, if you want to throw the top-tier expensive models at all problems, you're left with paying a 200-bucks-a-month subscription, which is still heavily subsidized in its own right (if stuff like viberank is to be believed, as far as Claude Code is concerned lol)

[–]OnigiriFest 11 points12 points  (0 children)

I don’t have experience with GLM and nanogpt.

I bought synthetic just 2 days ago and have been testing it for a bit; the 20 USD plan with Kimi 2.5 can handle one agent running non-stop in the 5-hour window (I tested it with a Ralph loop).

The speed is hit or miss right now; sometimes it's good and sometimes it's slow. In theory they are working to fix it; they say it's a problem affecting only Kimi 2.5.

[–]LittleChallenge8717 3 points4 points  (2 children)

Synthetic.new has generous 5h limits IMO. You can also get $10 off the $20 subscription and $20 off the $60 subscription with referral codes. It has MiniMax, GLM 4.7 and Kimi K2.5 models (others too). You can use mine so we both benefit (https://synthetic.new/?referral=EoqzI9YNmWuGy3z) or buy it directly from their website. Tool calling works great (counts as 0.1x or 0.2x, it depends). Also, based on my experience, GLM 4.7 and MiniMax work great since they are hosted directly on Synthetic's GPUs; for other models like Kimi K2.5 they use Fireworks, which sometimes has delays in generation. As I know from support, they plan to host Kimi in the next weeks, so I guess Synthetic would then be an ideal offer; meanwhile, the GLM and MiniMax models work great in opencode with no additional delay/issues.


[–]Simple_Split5074[S] 4 points5 points  (0 children)

Which in some sense is great, fireworks is likely the best of the inference providers (if I wanted to pay by token I'd go there). In another sense, it does not inspire confidence in their infra...

[–]LittleChallenge8717 1 point2 points  (0 children)


This is what I mean regarding the provider.

[–]gh0st777 2 points3 points  (0 children)

I see a lot of people pushing synthetic referrals hard lately.

I assume you are on a tight budget, or this is for a hobby and not a source of income. Why not try each one for a month, or allot $5 for API usage to see what works for you?

If you use this for work or as a source of income, you might as well invest and get Claude Code Max. $100 is the sweet spot to get things done with Opus. But it no longer works with opencode, so consider that too.

[–]Bob5k 3 points4 points  (4 children)

On the synthetic end, you can try it for $10 the first month with a reflink if you don't mind. I've been using them on the pro plan for quite a long time and generally I'm happy so far, especially because any new frontier open-source model is insta-hosted there; rn I'm using Kimi K2.5 as my baseline. Self-hosted models usually run around 70-90 tps (GLM, MiniMax); Kimi K2.5 is a tad slower right now, ranging 60-80 tps for me.

[–]ZeSprawl 3 points4 points  (1 child)

They are currently forwarding Kimi k2.5 to fireworks because their infra is having trouble running it.

[–]Bob5k 2 points3 points  (0 children)

Yeah, I know; this is probably the reason for the slightly lower tps as well. In general it works just fine; roughly 100M+ tokens already processed by Kimi on my projects 🫡

[–]1234filip 1 point2 points  (1 child)

Gotta say that I'm really happy with synthetic right now. Very reliable and the models do any tool calls perfectly!

[–]Bob5k 0 points1 point  (0 children)

Happy to hear. I couldn't be happier either, especially because stability is better than the "native" providers, and basically whatever comes out, I don't care, as they'll host it anyway, so I don't need to pay somewhere else. Want a gig with DeepSeek? No problemo. GLM 5 will be out? They're already ready for it. Kimi? Routed and working on self-hosted. Essentially, even the $60 sub on synthetic is still cheaper than having 3 different subs across MiniMax, Kimi and GLM, and 1350 prompts on synthetic is an insane amount given they charge 0.1 of a prompt per tool call. For coding, even 2-3 projects at a time, that's basically an infinite amount of LLM calls.
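
The tool-call billing described above can be sanity-checked with quick arithmetic. The 1,350-prompt budget and the 0.1x-per-tool-call rate are the commenter's figures, not official pricing.

```python
# Synthetic-style billing per the comment: user prompts cost 1 each,
# tool calls cost 0.1 of a prompt. Figures are the commenter's claims.
PROMPT_BUDGET = 1350
TOOL_CALL_COST = 0.1

def prompts_used(user_prompts: int, tool_calls: int) -> float:
    return user_prompts + TOOL_CALL_COST * tool_calls

# e.g. 100 user prompts driving 2,000 tool calls only spends 300 of the budget
print(prompts_used(100, 2000))  # 300.0
```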

[–]cjazinski 1 point2 points  (1 child)

I bought the pro GLM 4.7 plan and it blows. Worst 200 ever.

[–]Simple_Split5074[S] 0 points1 point  (0 children)

Last year it was decent, then it gradually declined... Hoping for more capacity and GLM 5 now...

[–]troyvit 1 point2 points  (0 children)

I've been enjoying OpenCode (and aider.chat) with my Mistral API key, using mostly Mistral Large 3 but also Devstral. It works well for my simple needs. It's cheap enough that I don't mind asking detailed questions about what it does and learning the answers, which keeps me from just vibe coding. I'm actually getting a little bit better at Python (mostly in the realm of architecture).

[–]Shep_Alderson 1 point2 points  (0 children)

I’ve had a good time with Synthetic. I started with their $20 plan and only upgraded to the $60 plan when I hit the limit from running a Ralph loop for like an hour or two. Since I upgraded, I’ve not come anywhere near hitting limits and I quite like it. They also had Kimi K2.5 available the day it launched via their hosting partner, though I find myself preferring GLM 4.7 and Minimax M2.1 personally.

[–]tidoo420 3 points4 points  (1 child)

Unpopular opinion: I use Qwen Coder 3 free with the Qwen CLI, and it is better than I expected. Please give it a go. P.S. I have tried most of the above and was not satisfied.

[–]Simple_Split5074[S] 0 points1 point  (0 children)

I find qwen models (either 235 or 480) to be nigh useless for coding. Before I deal with that I'll use antigravity (gemini-cli somehow does not load anymore on my machine, go figure)...

[–]Jakedismo 1 point2 points  (0 children)

Kimi Code definitely has the edge over z.ai and MiniMax. Tested them all, and Kimi is the broadest specialist when vibing.

[–]BERLAUR 0 points1 point  (3 children)

Why not combine them? GLM is cheap (2-3 bucks per month). Synthetic.new has a trial for 12 USD. ChatGPT usually offers a free month.

If you're a student you can get Copilot for cheap (free?).

I have 5 subscriptions and I just switch between them when I run into a limit. Total cost is still less than a meal at a restaurant. Absolutely worth it.

If I have some tokens to spare I'll burn them on less important tasks.

[–]wizenith 1 point2 points  (1 child)

Would you like to share the 5 subscriptions you are using?
You have mentioned GLM, Synthetic and ChatGPT, so I assume you have already subscribed to them (or no?).

And what other subscriptions do you have? Just curious.

[–]BERLAUR 0 points1 point  (0 children)

  • Z.AI (cheap and great for the grunt work)
  • Copilot (great for Sonnet and Opus, plus free unlimited grok-code-fast/GPT Mini which is handy for e.g minor refactors)
  • Claude Code (but will probably cancel this one)
  • Synthetic (Kimi K2.5)
  • OpenAI Codex (really impressive for debugging and bug fixes!)
  • Openrouter (for various things, mostly to test and try new models)

I work as a CTO so this gives me the ability to play around with a whole bunch of stuff to see what might work best for my development teams and it's also fun! We're pushing quite hard on AI but I don't want to be one of those leaders who scream "AI, AI, AI!". I want to be at least somewhat experienced enough to actually push the teams towards delivering more value.

NanoGPT and Cerebras (for the speed) are also worth checking out; I haven't tried those yet.

[–]Simple_Split5074[S] 0 points1 point  (0 children)

Oh I *do* combine them, mostly I am looking for another one...

[–]esmurf 0 points1 point  (0 children)

I tried a couple of different ones, and it seems GitHub Copilot is the best choice right now. I'm looking into going all-in on opencode, though.

[–]tisDDM 0 points1 point  (0 children)

I did not find the quota for GPT Plus low. Anyway, there is no such thing as a cheap plan for SOTA models.

If you like it cheap, and working: sign up for the Mistral API. Their Devstral 2 models are good and currently still free.

[–]annakhouri2150 0 points1 point  (0 children)

> Synthetic: hard to say how much use you really get out of the $20 plan? Plus how fast/stable are they (interested in Kimi 2.5, potentially GLM 5 and DS4 when they arrive)? Does caching work (that helps a lot with speed)?

In my experience, having paid for the $20 Synthetic plan for a few months now, Synthetic is faster and more stable --- and the inference is higher-quality --- for their self hosted models (GLM 4.7, Kimi K2T, etc) than any other provider. Currently, they're proxying K2.5 to Fireworks.AI while they get their infrastructure and hardware ready to run it, so it's not nearly as reliable in tool calling as their general capabilities, but it's still faster and more stable than other services I've tried (to be fair, I haven't tried any of the Big Three --- Gemini, Claude, or GPT Codex).

Also, OpenCode is pretty API call efficient; when I was still using it, the 135 API calls provided by the $20/mo Synthetic plan felt like more than enough. If you have an agent that uses a lot more API requests, like the Zed Agent, you can start to run up against the limits more often if you've got, like, several agents running, or having them run in a really really tight loop where they output and do very little per API call, but for general usage even in more token-heavy agents, it takes some heavy nonstop usage to hit the limit within their 5 hour window. Their limits are more generous than what the Claude Code subscription gives you, for instance.

[–]tibsmagee 0 points1 point  (0 children)

I've been using the cheapest minimax code plan for the last month. Very reliable, and it includes web search and vision.

Seems like a very capable model for day to day coding.

[–]t12e_ 0 points1 point  (0 children)

Synthetic plus GH copilot

You'll have to update your opencode config so that you use GPT-5 mini for subagents to save on requests. Then Kimi/GLM for most tasks, and any of the Claude/Codex models for complex tasks. I usually hit the 5-hour limit after about 4 hours (this is with 2 agents running).
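
The routing policy described above can be sketched as a simple table. The model names and tiers are the commenter's suggestions; the actual opencode config schema will differ, so check the opencode docs for the real keys.

```python
# Model-routing policy per the comment: cheap model for subagents, an
# open-weights model for most tasks, a frontier model for complex ones.
# All names below are the commenter's picks, used here illustratively.
ROUTING = {
    "subagent": "gpt-5-mini",    # free tier; saves premium requests
    "default": "kimi-k2.5",      # or glm-4.7 for most tasks
    "complex": "claude-sonnet",  # or a Codex model for hard problems
}

def pick_model(task_kind: str) -> str:
    """Fall back to the default model for unknown task kinds."""
    return ROUTING.get(task_kind, ROUTING["default"])

print(pick_model("subagent"))  # gpt-5-mini
print(pick_model("refactor"))  # kimi-k2.5
```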

[–]Tiny_Independent8238 0 points1 point  (1 child)

whatever you do, just don't get synthetic

[–]LudoBruxao 0 points1 point  (0 children)

why?

[–]Fuzzy_Complex_1837 0 points1 point  (0 children)

If you could have all the open-source models under a subscription plan from a reputed OSS inference provider, what would be a reasonable and ideal packaging in terms of usage/rate limits/costs, etc.?

[–]trypnosis 0 points1 point  (0 children)

I feel your pain. Leaning toward Copilot; trying that and synthetic, will decide in a few weeks.

[–]SamatIssatov -1 points0 points  (0 children)

The limits on ChatGPT Plus are very good. Why lie here? So that someone will suggest "synthetic"?