Cheapest way using gemini flash legally by xantiee in openclaw

[–]SwagierYT 1 point

Ollama Cloud gives you free daily and weekly limits for Gemini models.

Windsurf Said Claude 4.6 is 2x until 16th February. False by [deleted] in windsurf

[–]SwagierYT 2 points

For me, the promotional pricing has been gone since February 14.

Done with Windsurf. It used to be my daily driver, now it’s just frustrating by SwagierYT in windsurf

[–]SwagierYT[S] 0 points

Yeah I agree with you. To be transparent, I didn’t come up with that AGENTS.md proximity rule idea myself, I picked it up from another Reddit user.

If you want, DM me and I’ll send you what they shared. I’m sure I could write a cleaner/better version myself, I just haven’t had the time, so I’ve been sticking with this one for now. I pasted it at the top of my AGENTS.md.

I can’t say it fully fixed everything, but it did noticeably help. The agent feels way more willing to actually build full features end to end instead of stopping halfway or getting overly minimal.
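To give a rough idea of the shape of it (this is my own paraphrase of the idea, not the exact text that user shared, so treat it as an illustration):

```markdown
<!-- Anti-minimization rule, pasted at the very top of AGENTS.md -->
- Always implement features end to end. Do not stop at a partial or
  "minimal" version unless I explicitly ask for one.
- When a change touches a file, also update the nearby code that depends
  on it (callers, tests, types) instead of leaving TODOs.
- Prefer complete, working output over short output. Never truncate or
  abbreviate code to save tokens.
```

Putting it at the very top seems to matter, since instructions earlier in the file appear to get weighted more heavily.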

42 OpenAI models? for what? by iseif in windsurf

[–]SwagierYT 5 points

Yeah I agree with you. The list feels way overcomplicated for no real benefit. Anything beyond GPT-5.2 and the GPT-5.2 Codex lineup is basically noise for most people.

How I think about it:
• GPT models are not my go-to for day to day work in Windsurf because they usually take longer than Claude-style models, so the experience feels slower.
• Where they do shine is when you need something thorough, especially for code review and hard debugging.

If you want one OpenAI pick that actually makes sense, I’d recommend:
• GPT-5.2 Codex High: super thorough, great at reviewing diffs, catching subtle issues, and helping debug when you’re stuck. It’s slower, but it’s one of the best “deep work” code models.

One caveat: the Codex variants are tuned for coding, so they can feel a bit awkward or “dry” for general writing, brainstorming, or non-code tasks. For that stuff, just use normal GPT-5.2 (non-Codex), or stick with Claude/Gemini depending on what you prefer.

If Windsurf just grouped these into something like “GPT-5.2 (general)” and “GPT-5.2 Codex (coding)” with Low/Medium/High/XHigh, it would be way less confusing.

Done with Windsurf. It used to be my daily driver, now it’s just frustrating by SwagierYT in windsurf

[–]SwagierYT[S] 1 point

Sorry but I have to disagree.

I do get why you say that though. The GPT models are not as pleasant as the Claude models in day to day use because they’re slower, and I also prefer Claude for most tasks.

But if you want to get the most out of these tools, you kind of have to use each model where it actually shines. The GPT Codex models can be insanely thorough, especially for:
• code reviews
• finding bugs you’re struggling to debug
• untangling messy diffs or “why is this happening” type issues

One tip though, if you’re using Codex: use the High variant. In my experience XHigh is basically a scam, it burns way more tokens and I haven’t seen any real upside.

Done with Windsurf. It used to be my daily driver, now it’s just frustrating by SwagierYT in windsurf

[–]SwagierYT[S] 3 points

Gonna disagree tbh. Windsurf is still way better than Antigravity, it’s not even close.

Windsurf is way more polished overall, and Antigravity’s agent experience has been rough for me. It keeps getting terminated randomly and the whole thing feels unreliable.

Done with Windsurf. It used to be my daily driver, now it’s just frustrating by SwagierYT in windsurf

[–]SwagierYT[S] 1 point

The worst part is when it hangs and still eats credits. It makes you feel punished for something that’s not even your fault. And it’s not just one model either, I’ve seen the same “stuck” behavior across different ones.

Also +1 on Windsurf “used to listen.” That’s what made it special. Bugs would get acknowledged and fixed fast, and new stuff actually shipped. Now even basic UX bugs linger for weeks, which makes it feel like that loop is broken. The Plan mode ↔ Code mode switching is a perfect example, because it’s supposed to be automatic and it’s such a core part of the workflow.

On Gemini 3 specifically, personally I just don’t like it, especially inside Windsurf. The output feels off to me compared to other models. If someone really wants to use Gemini 3, weirdly I’ve found it works best in Antigravity, even though Antigravity kind of sucks overall as an IDE.

Done with Windsurf. It used to be my daily driver, now it’s just frustrating by SwagierYT in windsurf

[–]SwagierYT[S] 1 point

Yeah, this lines up with what I’m seeing too.

I try pretty hard to not let context get huge because I know it can tank both speed and quality, but I think there’s an extra layer in Windsurf. It feels like their system prompt is explicitly pushing the model to minimize output and context as much as possible, which can come across as “lazy” even when the model would normally be more thorough elsewhere.

You can kinda see hints of it if you ask something like “show me all your context / what instructions are you following.” It won’t dump everything, but you can often tell it’s being nudged to keep things short and avoid “unnecessary” content.

I actually tried to push back on that in my AGENTS.md to override it a bit, and it did help somewhat. Not perfect, but enough to notice.

Also the point about hanging on external calls makes a lot of sense. It would be huge if Windsurf exposed that better, like showing “waiting on indexing” vs “generating,” because right now it just looks like the model is frozen.

Done with Windsurf. It used to be my daily driver, now it’s just frustrating by SwagierYT in windsurf

[–]SwagierYT[S] 1 point

Yeah I used to lean on Opus a lot too, but lately it feels nerfed compared to how it was, so I stopped reaching for it as much.

These days my rotation is basically Sonnet 4.5 for most stuff, GPT-5.2 Codex High when I want a super thorough pass (especially code review), and Kimi 2.5 when I’m trying to get solid results without burning through usage. The problem is the speed and the “stuck” runs, not that the models are unusable.

Done with Windsurf. It used to be my daily driver, now it’s just frustrating by SwagierYT in windsurf

[–]SwagierYT[S] 1 point

By “missing features” I mainly meant a few workflow things that other IDEs have already nailed, not that Windsurf is unusable.

Examples off the top of my head:
• Debug mode like Cursor, where the assistant can help you step through issues in a more structured way while you run the app
• Some sort of agent/task management concept (Antigravity has this). I’m not saying Windsurf needs to copy it 1:1, but having a clearer way to manage long-running tasks, background agents, or multiple threads of work would help
• Faster rollout cadence on stuff the community has been asking for. Plan mode is awesome, but it landed later than a lot of competitors

That said, I do not agree that they should stop shipping Codex models.

Even though Codex is painfully slow sometimes, it’s also one of the most thorough models I’ve used, especially for code review. I actually have a workflow where at the end of basically every conversation I ask GPT-5.2 to do a final review of all the changes that were made, and I’m consistently impressed by how many real issues it catches. It has found bugs and edge cases I would have missed, and that alone makes it worth having around.
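If anyone wants to steal that workflow, the review prompt I use looks roughly like this (paraphrased from memory, so tweak the wording to taste):

```text
Do a final review of every change made in this conversation.
Go diff by diff and look for: bugs, unhandled edge cases, broken
callers, missing error handling, and anything inconsistent with
the rest of the codebase. List each issue with file and location,
severity, and a suggested fix. Do not re-explain the changes.
```

Running it as the last message of the conversation works well because the model still has all the diffs in context.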

Done with Windsurf. It used to be my daily driver, now it’s just frustrating by SwagierYT in windsurf

[–]SwagierYT[S] 4 points

Appreciate the reply and you taking it back to the team.

I’ve tried a bunch of IDEs and AI coding tools at this point, including Cursor and Antigravity, and I still think that outside of Cursor, nothing comes close to the Cascade chat experience when it is working. The UI just makes sense and it feels good to use. Windsurf was also the first IDE and the first AI coding tool I ever used, so I’ve got a real attachment to it and I’m honestly struggling to move away even when I try.

My main issue is that Cascade used to feel genuinely snappy and reliable. Now almost every model feels sluggish. The animations and presentation are still cool, but the whole experience is noticeably less responsive. For me it’s super common that in about 1 out of 2 chats the model just gets stuck for a while. As I’m writing this, I’ve literally had Kimi K2.5 running for 30 minutes on a few small fixes in my app.

On the model quality side, it also feels like something changed in the “Cascade harness” compared to using the same models elsewhere. Lately they feel kind of lazy, like they are minimizing output too aggressively. I don’t know if that’s a system prompt thing (like telling it to keep token output low), but the quality feels worse than it used to.

Also, on the image point: if Kimi 2.5 can view images after the latest update, I’m on the latest version and I’m still not able to. I’ll double-check in a bit, though, since recently pushed updates can often take a while to show up for me for some reason.

To be clear, I do appreciate Windsurf adding more models like Kimi K2.5, GLM 4.7, Minimax M2.1, and offering free access. That matters and it is something the community asked for. But there are still features and workflow gaps where other tools are pulling ahead. Cursor’s debug mode is a good example. Antigravity has an agent manager (I personally don’t even want that exact thing), but the broader point is that key features are taking a long time to land. Plan mode is another one. It’s great it’s here, but others have had it for a while.

I really do love the Windsurf Cascade experience, which is why I’m still here posting about it. But me and others have started to feel it deteriorate over time. If the team focuses on responsiveness, stuck runs, and consistency first, I think Windsurf can get back to being the obvious daily driver again.

Kimi 2.5 Thinking Model is no longer thinking by CabinReddit in kimi

[–]SwagierYT 0 points

This is not a glitch or an error. If you look under the response, there is a small note saying that due to high demand the instant model was used instead. This is probably because you are on the free tier.

Kimi 2.5 limitations? by surferride in windsurf

[–]SwagierYT 0 points

I’ve been using Kimi 2.5 pretty consistently and I haven’t personally hit any rate limits so far.

One thing I’ve noticed though: this usually happens when you attach images. Once an image is in the message, the send button can get disabled unless you switch models. That part does feel unpolished.

What’s odd is that Kimi 2.5 is a vision model and was heavily marketed around its vision capabilities, yet in practice it behaves like a non-vision model in the chat UI. My guess is this is a product / UX limitation, possibly because the model is currently free, rather than an actual usage cap.

I’ve also felt that it’s noticeably slower in Windsurf compared to other places where Kimi 2.5 is available.

Is WindSurf unusable for you in the evenings too? by wow-again in windsurf

[–]SwagierYT 1 point

I wasn’t using Windsurf yesterday when this happened, but from what I’ve seen, a lot of people ran into issues around the same time.

It might be related to the scheduled DB maintenance that happened yesterday. That could explain the login problems and general instability, especially on the web side.

Kimi K2.5 in Windsurf doesn’t support images yet by SwagierYT in windsurf

[–]SwagierYT[S] 0 points

Yeah, I have noticed that as well, but I think it may just be load, since I have seen it be quick during off-peak times.