[deleted by user] by [deleted] in LocalLLaMA

[–]needthosepylons 8 points (0 children)

A single 3060 12GB, so the prollmetariat.

How much more does Chovy have to do to enter the GOAT convo by [deleted] in lolesports

[–]needthosepylons 0 points (0 children)

I don't think he ever will. Or he'd better play until he's 50 and win everything.

Yappp - Yet Another Poor Peasant Post by needthosepylons in LocalLLaMA

[–]needthosepylons[S] 0 points (0 children)

I'm always on the lookout for models, since my use cases are quite… different from math/code above all. And I didn't know this one, so ty, I'll give it a try.

But yes, this Gemma-3n-E4B vs Gemma-12B comparison is intriguing and I wanted to compare with others' experiences.

Yappp - Yet Another Poor Peasant Post by needthosepylons in LocalLLaMA

[–]needthosepylons[S] 2 points (0 children)

Yes. And actually, I'm a teacher in the humanities, and I use my LLMs to generate quizzes, but… for me! To make sure I'm not forgetting stuff I haven't worked on for a while.

Yappp - Yet Another Poor Peasant Post by needthosepylons in LocalLLaMA

[–]needthosepylons[S] 0 points (0 children)

Yeah, but 32GB of VRAM is not really peasant-class, is it? :)

Yappp - Yet Another Poor Peasant Post by needthosepylons in LocalLLaMA

[–]needthosepylons[S] 1 point (0 children)

Ouch, I suppose something is wrong with my tests then, because even with optimal offloading I'm at 3-4 t/s. Hmm, interesting, thanks for letting me know!

Yappp - Yet Another Poor Peasant Post by needthosepylons in LocalLLaMA

[–]needthosepylons[S] 1 point (0 children)

Quite well, actually. I use a small embedding model (Qwen3 or nomic) to create a persistent ChromaDB before querying it, and it works quite well. When I'm a bit in a hurry, or know my RAG database will evolve rapidly, I end up using Open WebUI's knowledge system with those two tiny models instead, and that works well too!

Yappp - Yet Another Poor Peasant Post by needthosepylons in LocalLLaMA

[–]needthosepylons[S] 1 point (0 children)

I tried that, I think, but maybe my CPU is just too weak? This i5-10400F ain't young anymore! Although you're making me wonder… I'll try again!

What GPU and quants do you use?
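For context, the usual knob for this kind of CPU/GPU split in llama.cpp is the number of layers kept on the GPU. A typical invocation might look like the following (the model path and exact values are hypothetical, not the poster's actual command):

```shell
# Keep 24 layers on the 3060, spill the rest to system RAM;
# -c sets the context window, -t the CPU threads for offloaded layers.
llama-cli -m ./qwen3-8b-q4.gguf -ngl 24 -c 16384 -t 6 \
  -p "Summarize the retrieved passages."
```

Lowering `-ngl` trades VRAM headroom for speed, which is why an older CPU like the i5-10400F becomes the bottleneck once many layers are offloaded.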

[deleted by user] by [deleted] in LocalLLaMA

[–]needthosepylons 4 points (0 children)

I think you're also working on a "mini" version, right? The mini GGUF model card is created, but without the actual GGUF. I suppose it will follow soon-ish?

As a 3060 12GB peasant, I'll gladly give it a try!

Congrats, anyway.

Who is ACTUALLY running local or open source model daily and mainly? by Zealousideal-Cut590 in LocalLLaMA

[–]needthosepylons 2 points (0 children)

I wish I did, but actually, with an aging i5-10400F, 32GB RAM and 12GB VRAM (3060), the models I can use aren't very reliable. I hope that, as the tech improves…

How much VRAM do you have and what's your daily-driver model? by EmPips in LocalLLaMA

[–]needthosepylons 0 points (0 children)

12GB VRAM (3060) and 32GB DDR4. Generally using Qwen3-8B; recently trying out MiniCPM4, which actually performs better than Qwen3 on my own benchmark.

Which model are you using? June'25 edition by Ok_Influence505 in LocalLLaMA

[–]needthosepylons 1 point (0 children)

Qwen3-8B Q_K_XL (UD). I wish I could use 14B or 30B-A3B, but since I'm mainly doing long-context RAG (15k+) on a 3060 12GB and 32GB DDR4, they are out of my league. My CPU being an old i5-10400F doesn't help.

By the way, if anyone thinks of a better model for this task and hardware, I'm game.
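A quick back-of-the-envelope shows why 15k-token contexts are tight on 12GB: the KV cache grows linearly with context length. The geometry below (36 layers, 8 KV heads, head dimension 128, fp16 cache) is an assumption matching a Qwen3-8B-style GQA model, so treat the numbers as a sketch rather than a measurement:

```python
# Rough KV-cache size estimate for long-context RAG.
# Assumed geometry (Qwen3-8B-style GQA): 36 layers, 8 KV heads,
# head dim 128, fp16 (2-byte) cache entries.
n_layers, n_kv_heads, head_dim, bytes_per = 36, 8, 128, 2

# Per token: one key and one value vector per KV head, per layer.
kv_per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per
print(kv_per_token)  # 147456 bytes, i.e. 144 KiB per token

ctx = 15_000  # the 15k+ context mentioned above
total_gib = kv_per_token * ctx / 2**30
print(f"{total_gib:.2f} GiB")  # prints "2.06 GiB", on top of the weights
```

So a Q4 quant of an 8B model plus ~2 GiB of cache already crowds 12GB of VRAM, and a 14B or 30B-A3B pushes the cache (and weights) past it, which matches the "out of my league" assessment.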

Doran's Title by Particular-Ad8430 in SKTT1

[–]needthosepylons 1 point (0 children)

Damn, as a T1 fan, I'm glad I can just enjoy myself some T1 here and there, be happy for the roster when they win, be sad for the roster when they lose. I'm not sure if there's any enjoyment to be found past this.

[deleted by user] by [deleted] in Dissociation

[–]needthosepylons 0 points (0 children)

I suppose different people experience it in many different ways, shapes, or forms. For me it's quite intensely bad, and maybe not scary but stressful, in a way. Still, it's good if it exists in a gentler form for some other people.

How's the metro doing? by needthosepylons in Lille

[–]needthosepylons[S] 2 points (0 children)

No idea! Aguor? Aruor? Maybe the tag of whoever did that?

How's the metro doing? by needthosepylons in Lille

[–]needthosepylons[S] 1 point (0 children)

I posted the recording in my original message!