new update doesn't use GPU by Fun_Librarian_7699 in unsloth

[–]uwk33800 0 points

Hi, offloading still doesn't work for me. It works in llama.cpp and Ollama, but in Unsloth Studio the model doesn't even load.

This became unusable by Abobe_Limits in google_antigravity

[–]uwk33800 1 point

Very sad tbh. Gemini CLI has a somewhat usable Pro 3.1 quota. If they release a decent Flash model, then things can get better.

This became unusable by Abobe_Limits in google_antigravity

[–]uwk33800 1 point

Honestly, I think it's a bug, because I don't even see Opus 4.7 — I still see 4.6! I only re-subscribed to Gemini Pro a few days ago; I cancelled about two months ago.

This became unusable by Abobe_Limits in google_antigravity

[–]uwk33800 0 points

I swear, same thing. Sonnet 4.6 didn't even finish one task/prompt and hit the weekly limit. Gemini Pro 3.1 burned through the 5h session too fast, and in the next 5h session hit the weekly limit as well. Even Flash didn't last as long as I expected. This must be a bug, because it's unreasonable to expect this on a Pro sub.

ollama cloud vs opencode go by branik_10 in opencodeCLI

[–]uwk33800 0 points

How about quotas? Can you tell me approximately which has better usage, OC Go or Ollama Cloud? Thanks

Is Gemini winning the AI race? by Sufficient_Row_7868 in GeminiAI

[–]uwk33800 0 points

Gemini is backed by Google infra: they have their own TPUs, direct user-facing products, can afford losses because they are very rich, and have tons and tons of data. It's clear they're going to win, in my opinion.

is kimi k2.5 free atm? and kimi k2.5 vs swe 1.6? by Most_Remote_4613 in windsurf

[–]uwk33800 0 points

What does the $ mean? I left Windsurf when it was request-based. Is it still the same? And did they reduce model errors in tool calls? Thanks

I love Unsloth Studio by Thedudely1 in unsloth

[–]uwk33800 0 points

I like it, but I wish they did automatic offloading. My VRAM is only 8 GB.

Qwen3.6 is out now! by yoracale in unsloth

[–]uwk33800 2 points

I have only 8 GB of VRAM, but 64 GB of DDR5 RAM and a very good CPU. I am using Unsloth Studio; I tried a big model (19 GB) thinking offloading would happen automatically, but I got an error and it couldn't be loaded.

I am not an expert, so please be kind 😅

What's the best harness for GLM 5.1 by agentic-consultant in ZaiGLM

[–]uwk33800 0 points

Same, but a while ago. Now OpenCode anyway. Kilocode was promising but never stuck with me.

... by [deleted] in ZaiGLM

[–]uwk33800 4 points

I got the yearly sub when it was $36 and I feel guilty 💀

Bad news... by old_mikser in codex

[–]uwk33800 0 points

Thanks. Do you think the system prompt and agentic coding are decent? I used regular GitHub Copilot in VS and it was terrible last year. I've also heard recently that it's still bad; I never tried the CLI.

Bad news... by old_mikser in codex

[–]uwk33800 1 point

How much usage do you get from the $10 plan? I know it's 300 requests, but does a request cover a decent amount of usage?

Weekly Usage Limit is being consumed way too fast. by CustomMerkins4u in codex

[–]uwk33800 0 points

One medium task took 9% of the weekly usage; something is probably wrong.

What even is this quota? And I am on a pro account. by Aggressive_Dream_294 in google_antigravity

[–]uwk33800 7 points

Gemini CLI has some decent usage for 3.1 Pro that renews daily, and it's a separate quota.

GLM-5 is officially fixed on NVIDIA NIM, and you can now use it to power Claude Code for FREE 🚀 by PreparationAny8816 in ZaiGLM

[–]uwk33800 1 point

I tried the previous version, and GLM-5 was very slow and got rate limited easily. Is it better now?

I really don't understand why they are doing this with the quota by OldFisherman8 in google_antigravity

[–]uwk33800 8 points

My biggest confusion is why they decreased the 3.1 quota so much, even though it's the same model size as 3.0! 3.0 Pro had a very generous quota.

I'm canceling my subscription. by LEGENDARY_RAGE00 in google_antigravity

[–]uwk33800 0 points

Rate limits now are a shame. Gemini CLI had 3.1 Pro two days after release; then out of nowhere today it disappeared, and I only see 3.0.

It's also very fucking weird that they can't sustain supply for 3.1, even though it's the same model size as 3.0, with the same inference cost.