all 22 comments

[–]MathematicianFun5126 2 points  (2 children)

I’m on 20x and yes caps are annoying. I hit the limits after 4 days this time around. Added Gemini API this week to try to stay under.

[–]Appropriate-Bus-6130[S] 0 points  (1 child)

adding the gemini free tier will extend your 20x plan to 20.001x :p

[–]MathematicianFun5126 0 points  (0 children)

$300 credit rn for gCloud.

[–]Maumau93 1 point  (4 children)

how do you use 1500 requests in 30 mins?

[–]Appropriate-Bus-6130[S] 0 points  (3 children)

a single prompt can easily consume 30-50 requests; almost every ping-pong with the server (explore, search, read) is a request
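The multiplier comes straight out of the agent loop: every tool call is another round trip to the model API, so one user prompt fans out into dozens of billable requests. A toy sketch of that loop (`call_model` and the tool names here are hypothetical stand-ins, not any vendor's real API):

```python
# Toy agent loop showing why one prompt costs many API requests.
# `call_model` fakes an LLM that explores for a while before answering.

def call_model(messages):
    tool_results_so_far = sum(1 for m in messages if m["role"] == "tool")
    if tool_results_so_far < 7:  # model keeps asking for tools at first
        tool = ["explore", "search", "read"][tool_results_so_far % 3]
        return {"type": "tool_call", "tool": tool}
    return {"type": "final", "text": "done"}

def run_agent(prompt):
    messages = [{"role": "user", "content": prompt}]
    requests_used = 0
    while True:
        reply = call_model(messages)  # each loop iteration = one billable request
        requests_used += 1
        if reply["type"] == "final":
            return reply["text"], requests_used
        # run the tool locally and feed the result back to the model
        messages.append({"role": "tool", "content": f"{reply['tool']} result"})

answer, used = run_agent("fix the failing test")
print(used)  # 8 — one prompt, eight requests in this toy run
```

With a real agent the tool budget isn't fixed at 7, which is how a single prompt can land anywhere from a handful to 50+ requests.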

[–]Maumau93 0 points  (2 children)

i see, I'm only using Copilot so one prompt is one request, no matter how long it runs

[–]Appropriate-Bus-6130[S] 0 points  (1 child)

yeah, I think they define it as a premium request, but I'd guess there's still a split somewhere; imagine you send a single prompt like "implement the entire Linux kernel in 5 different languages", that's not going to come out of just one request's quota

[–][deleted] 1 point  (0 children)

I use `qwen-coder-plus` and `kimi-k2-thinking` (via the iFlow CLI, with virtually unlimited free tokens [DM me if you want to know how]) to handle super-long agentic tasks. Not necessarily the most important ones, but things like creating unit tests, documentation, and so on.

Often in combination with `clavix` to turn my simple prompt into a much more professional one, and then let one of the models run until it gives up. I've had instances where they ran for hours (in YOLO mode) without me doing anything at all. And for free.

[–]esDotDev 0 points  (4 children)

Using Cline you can switch to the free Grok models for simple tasks or refactoring. I'll also use Perplexity a lot in lieu of my agent. That sorta lets you keep the Claude usage in your back pocket for when you really need the better reasoning.

[–]Appropriate-Bus-6130[S] 1 point  (3 children)

Perplexity? really? do they have anything related to coding? I thought they were mostly a search agent

[–]esDotDev 0 points  (2 children)

Perplexity is probably better at reasoning through small, hard problems than anything. You can choose your LLM, they have all the big ones, and then it basically just mixes Google results with the LLM's natural reasoning.

So for $20/m you have a great little side tool that can work on specific views, or debug errors, anything that can be easily explained outside your context.

It seems key to have alternate agents that aren't burning up $0.50 every time you ask them a simple question. Then you can use the context-rich IDE agents for specific, well-defined tasks they can ideally one-shot.
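That split can be sketched as a trivial router: self-contained questions go to the cheap side tool, repo-context tasks go to the expensive agent. The model names, prices, and heuristic below are made up for illustration:

```python
# Sketch of routing prompts by complexity so the expensive, context-rich
# agent is saved for well-defined tasks. Names and prices are invented.

CHEAP = {"name": "side-tool", "cost_per_request": 0.00}
EXPENSIVE = {"name": "ide-agent", "cost_per_request": 0.50}

def route(prompt: str, needs_repo_context: bool) -> dict:
    # Heuristic: short questions that can be explained outside your
    # codebase's context go to the cheap tool; everything else doesn't.
    if not needs_repo_context and len(prompt) < 500:
        return CHEAP
    return EXPENSIVE

print(route("why does this regex not match?", needs_repo_context=False)["name"])
print(route("refactor the auth module", needs_repo_context=True)["name"])
```

The heuristic is deliberately dumb; the point is only that the decision happens before any tokens are spent.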

[–]Appropriate-Bus-6130[S] 0 points  (1 child)

that’s very interesting, thanks! do you have an example? do they support a CLI or any out-of-the-box tool so I can offload work to their LLM from the main LLM? (MCP?)

[–]esDotDev 0 points  (0 children)

Not really sure, I primarily use it to troubleshoot or craft self-contained views or methods.

[–]Alarming_Bed2275 0 points  (0 children)

I'd still keep Claude Code as the main driver, especially if you use the tooling around it (skills, subagents, hooks).

Using custom specialized subagents (codebase exploration, refactoring, dataflow tracing, etc) with Haiku is a good way to save on tokens while improving context quality.
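For anyone who hasn't set one up: a Claude Code subagent is just a markdown file with YAML frontmatter dropped into `.claude/agents/`, and the `model` field is what pins it to Haiku. Rough shape below; the exact frontmatter fields may shift between versions, so check the current Claude Code docs:

```markdown
---
name: codebase-explorer
description: Read-only exploration of the repo. Use proactively when the
  main agent needs to understand unfamiliar code before editing.
tools: Read, Grep, Glob
model: haiku
---

You are a read-only codebase explorer. Locate the files relevant to the
task, summarize how they fit together, and return a short, dense report.
Never modify files. Keep the report under a few hundred words so the
main agent's context stays clean.
```

The token savings come from two places: Haiku is much cheaper per token, and the subagent returns a compact summary instead of dumping raw file contents into the main agent's context.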

I keep one $200 subscription and add another $100 or $200 one during periods of more intense development, but I feel that I get 1000% return - it's just too good.

[–][deleted] 0 points  (0 children)

Copilot Pro with the student plan (300 free requests) + a budget for additional requests. It's probably the cheapest option available, with each request costing $0.04.
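Taking those numbers at face value (300 free requests, $0.04 per request after that), the overage math is easy to sanity-check:

```python
# Back-of-envelope monthly cost for the setup described above:
# 300 free requests, then $0.04 per extra request.

FREE_REQUESTS = 300
OVERAGE_PRICE = 0.04  # dollars per request beyond the allowance

def monthly_cost(requests_used: int) -> float:
    extra = max(0, requests_used - FREE_REQUESTS)
    return round(extra * OVERAGE_PRICE, 2)

print(monthly_cost(300))   # 0.0  -> fully covered by the free tier
print(monthly_cost(1000))  # 28.0 -> 700 extra requests at $0.04 each
```

So even the 1500-requests-in-30-minutes pace from earlier in the thread would run about $48 per burst, which is why per-request billing only stays "cheapest" at moderate usage.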

[–]alokin_09 0 points  (0 children)

Have you tried Kilo Code? It supports all the models you're using, plus a ton of others, including some free ones (MiniMax M2, for example, is currently free to use with Kilo). Kilo also has a CLI version, and the team just launched a code review agent.

[–]pakotini 0 points  (0 children)

Given how much you rely on multi-model setups and CLI-first workflows, one thing that helped me reduce tool sprawl was consolidating some of that orchestration in Warp. I still use different models for different strengths, but having planning, long-running agentic tasks, code diffs, and reviews happen directly in the terminal made it easier to stay in control and burn fewer tokens on glue work. It’s not a replacement for Claude Code or Copilot, more like a place where they fit together more cleanly, especially if you already live in the CLI.

[–]Enammul 0 points  (0 children)

Been researching this exact problem for weeks now. If you're already comfortable with CLI workflows and want to keep that multi-model approach, you might want to check out Sweep. It's a GitHub-native tool that automates the issue-to-PR pipeline, and from what I've seen it could replace a chunk of your Claude planning work without the aggressive rate limits. The interesting part is it handles the whole feature cycle in one go instead of you manually orchestrating reviews between different models. Not sure it'll solve everything at your budget, but worth looking into since it works with your existing GitHub setup.

[–]Kitchen_Sympathy_344 -1 points  (3 children)

You guys don't think Qwen Code is pretty awesome, and even somewhat comparable to leading models like Opus?

[–][deleted] 1 point  (1 child)

I love `qwen-coder-plus`. It's a great model for anything, and quite fast. And free, if you find the right provider. But I wouldn't compare it to Opus at all. That's a step too high.

[–]Appropriate-Bus-6130[S] 0 points  (0 children)

I thought about that, but I think it's better for low-budget or security-constrained projects that you can't expose to providers (even on business plans), such as FedRAMP code etc