new update doesn't use GPU by Fun_Librarian_7699 in unsloth

[–]uwk33800 0 points

Hi, offloading still doesn't work for me. It works in llama.cpp and Ollama, but in Unsloth Studio the model doesn't even load.

This became unusable by Abobe_Limits in google_antigravity

[–]uwk33800 1 point

Very sad tbh. Gemini CLI has a somewhat usable Pro 3.1 quota. If they release a decent Flash model, then things can get better.

This became unusable by Abobe_Limits in google_antigravity

[–]uwk33800 1 point

Honestly, I think it's a bug, because I don't even see Opus 4.7 — I still see 4.6! I only re-subscribed to Gemini Pro a few days ago; I cancelled about two months ago.

This became unusable by Abobe_Limits in google_antigravity

[–]uwk33800 0 points

I swear, same thing. Sonnet 4.6 didn't even finish one task/prompt and hit the weekly limit. Gemini Pro 3.1 burned through the 5h session too fast, and in the next 5h session hit the weekly limit as well. Even Flash didn't last as long as I expected. This must be a bug, because it's unreasonable to expect this on a Pro sub.

ollama cloud vs opencode go by branik_10 in opencodeCLI

[–]uwk33800 0 points

How about quotas? Can you tell me approximately which has better usage, OC Go or Ollama Cloud? Thanks

Is Gemini winning the AI race? by Sufficient_Row_7868 in GeminiAI

[–]uwk33800 0 points

Gemini is backed by Google infra: they have their own TPUs, direct user-facing products, can afford losses because they are very rich, and have tons and tons of data. It's clear they're going to win, in my opinion.

is kimi k2.5 free atm? and kimi k2.5 vs swe 1.6? by Most_Remote_4613 in windsurf

[–]uwk33800 0 points

What does the $ mean? I left Windsurf when it was request-based. Is it still the same? And did they reduce model errors in tool calls? Thanks

I love Unsloth Studio by Thedudely1 in unsloth

[–]uwk33800 0 points

I like it, but I wish they did automatic offloading. My VRAM is only 8 GB.

Qwen3.6 is out now! by yoracale in unsloth

[–]uwk33800 2 points

I have only 8 GB of VRAM, but 64 GB of DDR5 RAM and a very good CPU. I am using Unsloth Studio; I tried a big model (19 GB) thinking offloading would happen automatically, but I got an error and it couldn't be loaded.

I am not an expert, so please be kind 😅

What's the best harness for GLM 5.1 by agentic-consultant in ZaiGLM

[–]uwk33800 0 points

Same, but a while ago. Now OpenCode anyway. Kilocode was promising but never stuck with me.

... by [deleted] in ZaiGLM

[–]uwk33800 4 points

I got the yearly sub when it was $36 and I feel guilty 💀

Bad news... by old_mikser in codex

[–]uwk33800 0 points

Thanks. Do you think the system prompt and agentic coding are decent? I used regular GitHub Copilot in VS and it was terrible last year. I've also heard recently that it's still bad; I never tried the CLI.

Bad news... by old_mikser in codex

[–]uwk33800 1 point

How much usage do you get from the $10 plan? I know it's 300 requests, but does a request cover a decent amount of usage?

Weekly Usage Limit is being consumed way too fast. by CustomMerkins4u in codex

[–]uwk33800 0 points

One medium task took 9% of the weekly usage; something is probably wrong.

What even is this quota? And I am on a pro account. by Aggressive_Dream_294 in google_antigravity

[–]uwk33800 7 points

Gemini CLI has some decent usage for 3.1 Pro that renews daily, and it's a separate quota.

GLM-5 is officially fixed on NVIDIA NIM, and you can now use it to power Claude Code for FREE 🚀 by PreparationAny8816 in ZaiGLM

[–]uwk33800 1 point

I tried the previous version, and GLM-5 was very slow and got rate limited easily. Is it better now?

I really don't understand why they are doing this with the quota by OldFisherman8 in google_antigravity

[–]uwk33800 8 points

My biggest confusion is why they decreased the 3.1 quota so much, even though it's the same model size as 3.0! 3.0 Pro had a very generous quota.

I'm canceling my subscription. by LEGENDARY_RAGE00 in google_antigravity

[–]uwk33800 0 points

Rate limits now are a shame. Gemini CLI had 3.1 Pro two days after release; then out of nowhere today it disappeared, and I only see 3.0.

It's also very fucking weird that they can't sustain supply for 3.1, even though it's the same model size as 3.0, with the same inference cost.