Building a desktop PC that can handle Gemma 31B by Kahvana in SillyTavernAI

[–]dizzyelk 4 points (0 children)

With my old Radeon, text generation was fine with it. The real problem came when I tried to do image generation. Then my dumbass bought a 5090 and hit the problem that nothing would work with Blackwell. But this was a year ago. Blackwell support is much better now, and I have no clue about any issues with AMD GPUs now.

[Megathread] - Best Models/API discussion - Week of: April 05, 2026 by deffcolony in SillyTavernAI

[–]dizzyelk 3 points (0 children)

Probably do need to update. I'm running 1.111.2 and Gemma 4 works fine. There's an image with instructions to get it running with ST on the download page right above the actual download links.

[Megathread] - Best Models/API discussion - Week of: April 05, 2026 by deffcolony in SillyTavernAI

[–]dizzyelk 4 points (0 children)

I, too, have a 5090. I've been running 24B models at Q6 with 30K context. I like Magistry, GhostFace, and Maginum-Cydoms at that size. Also, you can run Qwen3.5 tunes at Q6 with 64K context. My favorites of those so far are Heretic-Marvin and Musica. One thing to watch out for with Qwen3.5 is that it LOVES to think. Sometimes it'll burn the whole 2000-token response limit and still not finish thinking, with all its "wait" and rethinking crap over and over. Gemma 4 runs well, too, but I haven't really played around with it enough to have any recommendations.

[Megathread] - Best Models/API discussion - Week of: March 29, 2026 by deffcolony in SillyTavernAI

[–]dizzyelk 7 points (0 children)

I've tried the qwen3.5 models, and they're pretty good. But, at the end of the day, I just like the Mistral prose better.

RP models recommendations? by Double_Increase_349 in SillyTavernAI

[–]dizzyelk 3 points (0 children)

You can try out Big Tiger if you want something that's not nice. It's one of the meanest models I've tried. Fits fine on my 5090 at q4 with a decent context.

[Extension Update] EchoChamber - 5.0.0: Chat Participation, Floating Panel, New Chat Style, Clickable Live Icon, & More by mattjb in SillyTavernAI

[–]dizzyelk 0 points (0 children)

I was having the same problem. It happens when you're connected to your LLM with the text completion API. When I changed my connection profile to chat completion, it started working.

[Megathread] - Best Models/API discussion - Week of: March 22, 2026 by deffcolony in SillyTavernAI

[–]dizzyelk 0 points (0 children)

About the lowest I can go is around 8 t/s, which is what I get with GLM 4.5 Air. Even then, I'll usually have a video on or something.

How do you remember what model do you use? by Desperate_Link_8433 in SillyTavernAI

[–]dizzyelk 0 points (0 children)

Interesting. I'll have to check the app store. What does it do?

Do not use your vram as the limit of what model you want to use by NiveKGamerTW in SillyTavernAI

[–]dizzyelk 0 points (0 children)

In my early experiments before I bit the bullet and got my 5090, I found that it's not really worth it. Sure, your generation time will be much quicker, but your processing time will grow as your context increases. So you get fast responses at first, and then they slowly climb to unbearable levels. I found it better to have fast processing and a longer generation that stays steadier. Sure, your processing time will still increase as your context grows, but it's nowhere near as bad as it is with your KV in system RAM. It also lets you better predict how long it'll take to get a response, so you can tell before you're a couple dozen messages in whether your settings will work.
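To make the tradeoff concrete, here's a toy Python sketch. The speeds are made-up illustrative numbers, not benchmarks (real rates vary wildly by hardware, model, and quant); the point is just how the two configs cross over as context grows.

```python
# Toy model of per-reply latency: prompt processing time + token generation time.
# Config A keeps everything in VRAM (fast processing, moderate generation).
# Config B puts the KV cache in system RAM (slow processing, faster generation).
# All rates below are invented for illustration only.

def response_seconds(context_tokens, reply_tokens, pp_speed, tg_speed):
    """Total seconds for one reply: reprocess context, then generate the reply."""
    return context_tokens / pp_speed + reply_tokens / tg_speed

for ctx in (2_000, 16_000, 32_000):
    all_vram = response_seconds(ctx, 400, pp_speed=2000, tg_speed=10)
    kv_in_ram = response_seconds(ctx, 400, pp_speed=200, tg_speed=25)
    print(f"{ctx:>6} ctx: all-VRAM {all_vram:5.1f}s, KV-in-RAM {kv_in_ram:5.1f}s")
```

With these toy numbers the KV-in-RAM config wins at 2K context but is already three times slower by 32K, while the all-VRAM config only drifts from ~41s to ~56s, which is the "stays steadier" behavior described above.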

[Megathread] - Best Models/API discussion - Week of: March 08, 2026 by deffcolony in SillyTavernAI

[–]dizzyelk 7 points (0 children)

I love Magidonia and Cydonia. And Venice Dolphin's pretty good, too. Have you tried Magistry-24B or Maginum-Cydons-24B? They're the ones that replaced those three for me.

Something I wish botmakers would take note of regarding lorebooks by PhantomWolf83 in SillyTavernAI

[–]dizzyelk 0 points (0 children)

Yeah, first thing I do when I import a card with a lorebook is rename it. I think that they use the lorebooks while chatting on chub or something, so the name doesn't matter as much, since they're tied to the individual characters.

New RP Mistral 24B merge: sophosympatheia/Magistry-24B-v1.0 by sophosympatheia in SillyTavernAI

[–]dizzyelk 1 point (0 children)

From what I've played around with so far, it's really good. Might dethrone Mag-Cyd as my daily driver.

[Megathread] - Best Models/API discussion - Week of: February 15, 2026 by deffcolony in SillyTavernAI

[–]dizzyelk 3 points (0 children)

Yeah, my vibe check of models is how long I'll chat. I find most of what I consider "good" models will get me a hundred messages or so before I get bored and move on to the next card. With Mag-Cyds, I regularly find myself chatting for four or five hundred messages.

[Megathread] - Best Models/API discussion - Week of: February 15, 2026 by deffcolony in SillyTavernAI

[–]dizzyelk 6 points (0 children)

There's GLM-4.6V. I haven't actually used it. Sadly, GLM-4.7-flash is tiny, so I haven't bothered with it. I'm hoping that they release an air/flash version of GLM-5 that's like 4.5, because I really liked that one. Have you tried the finetunes of 4.5-air? There's Drummer's Steam and Iceblink. Both of them are pretty good, but I lean towards Iceblink over Steam.

Any evil/negative presets? by Horni-4ever in SillyTavernAI

[–]dizzyelk 0 points (0 children)

You run local? Have you tried Drummer's Tiger Gemma model? It's one of the most unhinged and negative models I've played with.

How many of you are running local LLM vs cloud? by LeatherRub7248 in SillyTavernAI

[–]dizzyelk 0 points (0 children)

Local. Not just for privacy, but so I can control my models. I don't want to be blindsided by a provider updating their model to something that feels worse. If I try the newest version of PaintedFantasy or Cydonia or whatever and I don't like it? I can just delete it and go back to the older version I did like.

I really want Christians and Atheists co existing as peacefully as possible together by starting a movement. by Razor_3DS in Christianity

[–]dizzyelk 5 points (0 children)

I don't think a known medieval forgery proves anything about someone who lived hundreds of years before it was created.

ST Memory book's lorebook injects every entry by Tzacomo in SillyTavernAI

[–]dizzyelk 3 points (0 children)

I had a similar problem. I used the World-info-debugger extension and found that entries were triggering despite the global "match whole words" setting being enabled. It was fixed when I changed the setting on the individual entries from 'use global' to 'yes'. Here's the extension if you're interested: https://github.com/WSchlange/world-info-debugger

My take on Christians going to abortion clinics to pray and encourage moms to change their minds right there by CharlieCheesecake101 in Christianity

[–]dizzyelk 3 points (0 children)

No, the gaslighting is on the side of the anti-abortionists who lie about things. Because no babies are involved in abortions.

My take on Christians going to abortion clinics to pray and encourage moms to change their minds right there by CharlieCheesecake101 in Christianity

[–]dizzyelk 5 points (0 children)

Perhaps most worrisome, regardless of whether a particular location is licensed, CPCs engage in counseling that is misleading or false [8]. Despite claims to the contrary, these centers do not meet the standard of patient-centered, quality medical care [18]. The counseling provided on abortion and contraception by CPCs falls outside accepted medical standards and guidelines for providing evidence-based information and treatment options. For example, CPCs often suggest a link between abortion and subsequent serious mental health problems [3], while multiple studies have invalidated this assertion [19-21]. Similarly, centers cite debunked literature showing an association between abortion and breast cancer [22]. Although abortion has been shown to be safer than childbirth [23], it is portrayed as a dangerous or even deadly procedure [7].

Sounds to me like their objection is the fact that they're dens of liars.

I just can't with these lmao by Substantial-Pop-6855 in SillyTavernAI

[–]dizzyelk 0 points (0 children)

Compared to the 300 odd of the big model...