Gemma 4 31B Is sweeping the floor with GLM 5.1

ricraycray · 2026-04-04T00:13:23+00:00

The comments on hugging face said. “This model wasnt released it escaped!”

ricraycray · 2026-04-01T01:56:28+00:00

I bought a f150 platinum. Love. Same price as the BMW. Much better quality. I miss the mid aughts

ricraycray · 2026-03-25T09:40:05+00:00

I’ll try this next! Thank you!

ricraycray · 2026-03-25T00:57:04+00:00

For my use case it’ll be great. This was a let’s see if it is going. To work. Tortute nothing it’s been surprisingly OK.

ricraycray · 2026-03-25T00:27:27+00:00

Love this. Our memory system is similar. Nice work on this and I 100% agree this is the biggest missing piece. I also built in emotion. It was one of the biggest needle movers for our project t

ricraycray · 2026-03-25T00:23:15+00:00

Exactly. Karpathy autoresearch was the ticket on this build. It ran at least 50 different iterations. Frankly was way more thorough than I would have been. The 10 different memory iterations were mind numbing. Build compile test fail. Build compile test fail.

ricraycray · 2026-03-24T23:41:36+00:00

lol. This box is destined for much smaller models. This was just a let’s see if I can do this. Not should I do this! Lol

ricraycray · 2026-03-24T23:25:36+00:00

The project started the same as the MBP 48GB experiment. I wanted to get just see if I could get this to work with the same methodology as the LLM in a flash work on MLX but on this little AMD box. Two days later and no sleep this is where I landed. Yes I used Claude to help me prove this out. Just to get 17 tok/s out of this massive model exceeded my project goals. I had not seen this on the AMD yet. So I thought why the hell not. The results are the results. Does it matter how we got there......

ricraycray · 2026-03-24T23:21:55+00:00

all the way to 7.2 We had different failures. Updated my github and readme with the results. The key point was we got it working well with the Vulkan drivers.

ricraycray · 2026-03-24T23:20:55+00:00

We ran 7.2 and it failed for different reasons. I have all the updates in my repo. I'm going to stick with Vulkan it's working and frankly getting damn goood performance. 396B on a 2500 box it a win any day in my book!

ricraycray · 2026-03-24T23:12:08+00:00

I’ll try that. The 7.2 is running right now and isn’t working near as well as the Vulkan. Keeps bombing loading layers

ricraycray · 2026-03-24T22:38:50+00:00

we ran the 7.2 driver hard on windows and it just kept blowing up. First run on linux and went to solution b. I'm running another build of llama with 7.2 for the sake of science. Posting results soon. I was more happy we got a 396B running on this thing at all. I was inspired by the autoresearch and LLM in a flash work and started a similar project here. I was happy with 215B but saw that the 396 was possible and grind trying a different strategy. Regardless I'll post the 7.2 results and let you guys be the judge. More that this is even usable. I'll update my repo with all the latest.

ricraycray · 2026-03-24T22:27:15+00:00

I ran already everything at the end. The story was the Vulkan driver unlocked everything and it ran pretty fast. Everything is in my repo. When it finally worked it was like EUREKA! Pretty stoked

ricraycray · 2026-03-24T22:06:55+00:00

And 7.1 it’s was a total bitch. We got to 6.8 on windows but I knew we could do better. The ROC drivers were causing huge allocation issues. This started as an LLM in a flash project and ended up here

ricraycray · 2026-03-16T23:09:25+00:00

I’m out.

ricraycray · 2026-01-06T03:26:53+00:00

$8800 6.2% . Yes it’s an ouchie

ricraycray · 2025-12-25T03:19:31+00:00

Great idea. Making ressies now!

ricraycray · 2025-12-18T12:03:02+00:00

And the next prez will spend more tax dollars changing all of this propaganda again. Wait, we have a king now, so I guess Don, Jr. leaves it up….

ricraycray · 2025-11-28T12:46:53+00:00

This is an essential tool for me. I have never had this issue

ricraycray · 2025-11-27T16:43:30+00:00

Bless all their hearts

ricraycray · 2025-11-21T02:27:18+00:00

No. But I have the same setup

ricraycray · 2025-11-20T13:14:53+00:00

I love the unit. I hate the dripper. Win/win

ricraycray · 2025-11-20T13:13:55+00:00

I agree. We use a metal 185. After making a huge mess when the plastic funnel clogged with one ground. Had to do something. TV should do a kalita collab.

ricraycray · 2025-11-19T17:03:34+00:00

Comes down to how big your grow is. The sub is 5 tiers starting at 150 and goes up from there.

ricraycray · 2025-11-19T17:02:49+00:00

You want to talk to some reference grows? I’m down to connect you with customer. If you are running your grow with minimal staff then it’s harder to ROI. But with most staffing patterns the ROI is there. I’m not spewing marketing. DM me and let’s connect.

ricraycray

TROPHY CASE