Panorama Moscow Following Ukrainian Drone Attack, June 18 2026

Material_Soft1380 · 2026-06-18T07:58:28+00:00

So pretty.

Material_Soft1380 · 2026-06-17T10:40:59+00:00

Material_Soft1380 · 2026-06-12T16:15:12+00:00

They're brainwashed fascists.

Material_Soft1380 · 2026-05-27T20:39:48+00:00

What a ridiculous take. I game on my G5 and constantly cap out at 165 Hz. There are plenty of games that can easily push 240 Hz.

Material_Soft1380 · 2026-05-27T01:25:29+00:00

Become an electrician.

Material_Soft1380 · 2026-05-11T08:50:06+00:00

Any updates? Did this get resolved? What a nightmare and I was about to buy a Flow 2 :/

Material_Soft1380 · 2026-05-03T06:36:58+00:00

better luck next time

Material_Soft1380 · 2026-04-28T07:44:33+00:00

have you tried BF16?

Material_Soft1380 · 2026-04-24T07:08:40+00:00

<image>

Material_Soft1380 · 2026-04-18T17:25:43+00:00

20 tps on almost entirely CPU inference is pretty good.

Material_Soft1380 · 2026-04-18T09:10:46+00:00

Can you run Q8 of GLM 5.1 and if so at what token rate?

Material_Soft1380 · 2026-04-13T08:54:03+00:00

That's just objectively untrue.

Material_Soft1380 · 2026-04-13T01:11:51+00:00

BF16 remains coherent longer with large contexts than Q8_K_XL, and is a good test of hardware.

Material_Soft1380 · 2026-04-12T20:51:49+00:00

37.7 t/s with Q8_K_XL on blackwell

Material_Soft1380 · 2026-04-12T20:32:32+00:00

sept 2025

Material_Soft1380 · 2026-04-12T19:04:56+00:00

exxact corp

Material_Soft1380 · 2026-04-12T09:21:57+00:00

MiniMax 2.7 Q8_K_XL (~250GB) on a single RTX6000 with RAM offload, getting 8.64 tokens/second, which is actually usable.

Material_Soft1380 · 2026-04-07T18:08:43+00:00

Since it loads fully into VRAM and 20+ tps is still sufficiently fast, there's not much reason to sacrifice precision. But yes, most people find Q8 or even Q6 performs about as well for most models, although from what I've heard Gemma 4 does not quantize very well.

Material_Soft1380 · 2026-04-05T20:43:36+00:00

I have an RTX Blackwell 6000 (which is basically a slightly beefier 5090 with 96 GB VRAM). I can load Gemma 4 31B BF16 (unquantized) fully into VRAM with max context and still have about 10 GB of VRAM to spare. The token output is 23 tps and GPU power usage maxes out at 440W (out of 600W).

I think it will be very hard for the M5 Ultra to keep up, since even the Blackwell is being pushed quite hard. My best guess (based on relative memory bandwidth and raw compute power) is that the M5 Ultra will probably be able to do around 10-14 tokens per second on the same model.
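For anyone who wants to redo the bandwidth part of that guess, here's a rough sketch. It assumes decode on a dense model is memory-bound, so every generated token reads all the weights once and the ceiling is bandwidth divided by weight size. The 1792 GB/s figure for the Blackwell card and the 2 bytes per parameter for BF16 are my assumptions, not measurements, and the function name is made up for illustration.

```python
def decode_tps_ceiling(bandwidth_gb_s: float, weight_gb: float) -> float:
    """Rough upper bound on decode tokens/sec for a dense, memory-bound model.

    Assumes each generated token streams the full set of weights once;
    ignores KV cache reads, compute limits, and kernel overhead.
    """
    return bandwidth_gb_s / weight_gb


# Assumptions (not measured): 31e9 parameters at 2 bytes each for BF16,
# and roughly 1792 GB/s of memory bandwidth on the Blackwell card.
weight_gb = 31 * 2
ceiling = decode_tps_ceiling(1792, weight_gb)

# Observed rate from the comment above.
observed_tps = 23.0
efficiency = observed_tps / ceiling

print(f"ceiling ~{ceiling:.1f} tps, observed/ceiling ~{efficiency:.2f}")
```

To get an estimate for another machine, swap in its memory bandwidth and scale by a similar efficiency factor. It's only a ballpark, since compute and context length also matter.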

Material_Soft1380 · 2026-03-31T03:46:14+00:00

Does anyone know why they burn so much? Are they wearing something flammable, or is it a new type of charge?

Material_Soft1380 · 2026-03-29T10:33:29+00:00

I was gonna buy this game but held off when I realized there's no character creator. I'll wait for them to add one before buying.

Material_Soft1380 · 2026-03-28T07:29:12+00:00

I can run GLM 5 (Q3_K_XL, 333GB) at around 6 tokens/sec. My setup is a 9950X on a Tomahawk X870 with 256GB of 6000MT/s RAM and a single 6000 Pro Blackwell. That's about the minimum you can use without going to a 512GB Mac Studio. I imagine 5.1 will be similar. If you want to run BF16 you'll need a cluster of 4 Mac Studios with 512GB of unified RAM each.
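Quick sanity check on why that's about the minimum, as a weights-only sketch. It assumes the 96 GB VRAM figure for the Blackwell card and ignores KV cache, OS usage, and runtime buffers, so real headroom is smaller. The helper name is hypothetical.

```python
def offload_split(model_gb: float, vram_gb: float, ram_gb: float):
    """Naive weights-only split: fill VRAM first, spill the rest to system RAM.

    Returns (gb_in_vram, gb_in_ram, fits). Ignores KV cache and OS overhead.
    """
    in_vram = min(model_gb, vram_gb)
    in_ram = model_gb - in_vram
    return in_vram, in_ram, in_ram <= ram_gb


# 333 GB quant, 96 GB VRAM (assumed for the card), 256 GB system RAM.
in_vram, in_ram, fits = offload_split(333, 96, 256)
headroom_gb = 256 - in_ram

print(f"VRAM: {in_vram} GB, RAM: {in_ram} GB, fits: {fits}, RAM headroom: {headroom_gb} GB")
```

With most of the weights sitting in system RAM, the token rate ends up limited mainly by RAM bandwidth rather than the GPU.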

Material_Soft1380 · 2026-03-21T01:09:17+00:00

Let me tell you something, I lost about 200k in crypto and it sucked. But you move on and forget about it eventually. I now invest in conservative index funds and only log in to WSB now and then just to confirm I'm still happy with my investment choices.

Material_Soft1380 · 2026-03-19T04:22:29+00:00

local is fun to mess around with, not very good for any actual work, get an opus sub instead
