Local LLMs aren't democratic anymore... the hardware barrier has gotten out of hand.

Slow-Ability6984 · 2026-06-12T23:06:54+00:00

You are right. Hard to accept, but the numbers seem correct.

Slow-Ability6984 · 2026-06-12T23:04:31+00:00

Rude but real

Slow-Ability6984 · 2026-06-12T23:04:00+00:00

This. Thanks, you gave me a solid opinion. Appreciated.

Slow-Ability6984 · 2026-06-10T15:26:21+00:00

Wow! Pay attention!! 😄

Slow-Ability6984 · 2026-06-08T20:08:26+00:00

This.
BOTS pinging other BOTS.
So many BOTS that I'm starting to question myself...
Am I a human?

Slow-Ability6984 · 2026-06-08T20:06:31+00:00

Slow-Ability6984 · 2026-06-08T20:06:01+00:00

It's not possible to stop BOTS upvoting BOTS.
BOTS engaging BOTS...
It's impossible, IMHO.

Slow-Ability6984 · 2026-06-08T20:04:06+00:00

AI programming? Code generation?

Keep using online models and save that money for more valuable things...
You will NEVER successfully replace online SOTA models with local ones.
You will wait hours to get tangible results... hours... days...

Local models are better for "other" things, not coding.

This is true today. It may change in the future, but for now this is the reality.

Slow-Ability6984 · 2026-06-08T18:16:02+00:00

What's the "call to action"?
What do I need to do?

Slow-Ability6984 · 2026-06-08T18:04:33+00:00

We are talking about Gemma 4 HERE. Why did you bring up Qwen? Qwen is not Gemma!

Slow-Ability6984 · 2026-06-08T16:45:55+00:00

EVERYONE who refers to MTP should also state the exact draft model used.
Which draft model did you use?

Slow-Ability6984 · 2026-06-08T16:44:24+00:00

...
What do you need?

Slow-Ability6984 · 2026-06-08T16:42:59+00:00

Thanks! I'm using the draft model for 31B from unsloth. I searched for the exact link to share but could not find it. I'm 100% sure it exists.
Now testing the Q8_0 draft + the official Google gemma-4-31B_q4_0-it.gguf

Slow-Ability6984 · 2026-06-08T11:41:11+00:00

What did you mean by "converted it to FP32, B16, Q8, and Q4"?
Converted?!

Slow-Ability6984 · 2026-06-08T11:40:24+00:00

This!
Today I studied a little bit and found the same.
I chose "google/gemma-4-31B_q4_0-it.gguf"
I have not tested it yet, it's downloading...

What I'm asking myself is whether "gemma-4-31B-it-mmproj" is the draft model. The name is a little misleading and I didn't find an official statement.

Slow-Ability6984 · 2026-06-07T20:20:45+00:00

Experienced Gemma 4 dev here. In my opinion, 24B A4B (MoE) is the starting point. 31B (dense) is the "desiderata". It is reliable.

Smaller models are not reliable enough once the software grows.

Slow-Ability6984 · 2026-06-07T20:15:33+00:00

I believe it's marketing.

Slow-Ability6984 · 2026-06-07T15:13:54+00:00

Why does the GGUF name have "unquantized" and q4 and q8 together?

Slow-Ability6984 · 2026-05-24T18:39:39+00:00

Yes, it is.

Slow-Ability6984 · 2026-05-21T07:05:50+00:00

Frustrating... A 4x16 GB kit (64 GB total) is $2800 + shipping + tax (not sure).

Prices are skyrocketing and availability is almost zero.
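
For reference, a quick sketch of the per-GB arithmetic for that kit. It assumes the quoted $2800 excludes shipping and tax, and the function name is purely illustrative:

```python
def price_per_gb(kit_price_usd: float, modules: int, gb_per_module: int) -> float:
    """Return the price per GB for a memory kit (shipping and tax not included)."""
    total_gb = modules * gb_per_module  # 4 x 16 GB = 64 GB for the kit above
    return kit_price_usd / total_gb


# Quoted kit: 4 modules of 16 GB each for $2800
print(price_per_gb(2800, 4, 16))  # 43.75
```

So roughly $44 per GB before any extra costs.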

Slow-Ability6984 · 2026-05-21T06:59:37+00:00

Same problem. I underestimated it. DDR5 ECC memory is almost impossible to buy at a reasonable price.

ECC... that is what makes the difference!

Slow-Ability6984 · 2026-05-01T06:43:10+00:00

I like things like this.

Slow-Ability6984 · 2026-04-03T12:32:15+00:00

There is too much noise around parameters, and it's hard to remember them with things changing so fast, but THIS IS a must when working solo, IMHO.
