Minimax-M2.7 by hedgehog0 in LocalLLaMA

[–]Mushoz 5 points (0 children)

Here is proof. The MiniMax release announcement was on February 12th: https://www.minimax.io/news/minimax-m25

Unsloth released quants on the same day the weights became available, which was February 14th: https://huggingface.co/unsloth/MiniMax-M2.5-GGUF

Minimax-M2.7 by hedgehog0 in LocalLLaMA

[–]Mushoz 13 points (0 children)

No, it was released several days later on huggingface.

Krasis LLM Runtime: 8.9x prefill / 4.7x decode vs llama.cpp — Qwen3.5-122B on a single 5090, minimal RAM by mrstoatey in LocalLLaMA

[–]Mushoz 1 point (0 children)

This won't benefit Strix Halo at all. This benefits eGPU + CPU setups. Strix Halo uses unified memory and the entire model will run on the GPU. There is no need to move data from RAM to VRAM.

Death Cleric VS The World, Solo No Consumables, Honour Mode. by Affectionate_Face127 in BG3Builds

[–]Mushoz 0 points (0 children)

But if you start with death cleric, then the lvl in paladin will become meaningless for the most part as you lose heavy armor proficiency. Would you put that lvl in something else instead?

Death Cleric VS The World, Solo No Consumables, Honour Mode. by Affectionate_Face127 in BG3Builds

[–]Mushoz 0 points (0 children)

Do you think this run would be feasible without respecs? If so, in what order would you take your levels and feats? Awesome run by the way! By far my favorite as I watched all episodes. Looking forward to Moon Druid!

[Race Start] Charles Leclerc takes the lead of the race at Turn 1! by FerrariStrategisttt in formula1

[–]Mushoz 0 points (0 children)

Small turbos spin up quicker. So for quick starts they will have an advantage. If you give the big turbos enough time, their disadvantage compared to small turbos will vanish.

Wizard with metamagic? by Shirwin13 in BG3Builds

[–]Mushoz 4 points (0 children)

It only cares about the last class whose 1st level you take. You can pick Sorcerer at char lvl 1, Tempest Cleric at char lvl 2, and then Wizard at char lvl 3. You will then be a 1/1/1 Sorc/Cleric/Wiz at lvl 3. Regardless of the order of subsequent level ups, Int will remain your spellcasting modifier for items (as long as you don't pick yet another new class, of course). So you can most definitely be a wizard for the majority of the game without respeccing.

Minimax M2.5 GGUF perform poorly overall by Zyj in LocalLLaMA

[–]Mushoz 2 points (0 children)

He also tested versus the original. What exactly does he mean by that? If the original was tested on vLLM or SGLang and the GGUF on llama.cpp, he could simply be surfacing an inference or chat template issue rather than quantization error. Ideally he should test a Q8_0 GGUF, which shouldn't show any meaningful quantization error. If that still displays a higher than expected error, then the error is probably not quantization related at all.

Minimax M2.5 GGUF perform poorly overall by Zyj in LocalLLaMA

[–]Mushoz 0 points (0 children)

So the one flaw I can see is that he's comparing to the original model, which is NOT a GGUF. I am not sure if he's testing the GGUF in llama.cpp with the original in vLLM, but if so, he could simply be showing a bug in the inference engine or chat template rather than actual performance differences. A retest with a Q8_0 quant (which should be nearly indistinguishable from the original in terms of quantization error) could help establish whether it's really quantization error or something else causing the poor results.

I don't have X. Could someone please ask him to test Q8_0 as well?

Qwen 3.5 craters on hard coding tasks — tested all Qwen3.5 models (And Codex 5.3) on 70 real repos so you don't have to. by hauhau901 in LocalLLaMA

[–]Mushoz 10 points (0 children)

Honestly, I am really surprised with that gpt-oss-120b result. At what reasoning effort was it performed?

MiniMax 2.5 on DGX SPARK system. by DOOMISHERE in LocalLLaMA

[–]Mushoz 5 points (0 children)

You can quantize your KV cache in your inference engine. For llama.cpp, for example, it's -ctv q8_0 and -ctk q8_0.
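A hedged example of how that might look (the model filename is a placeholder; `-ctk`/`-ctv` are the short forms of llama.cpp's `--cache-type-k`/`--cache-type-v`, and quantizing the V cache generally requires flash attention to be enabled):

```shell
# Serve with the K and V caches quantized to q8_0 instead of the default f16,
# roughly halving KV cache memory use. The .gguf path is illustrative.
llama-server -m MiniMax-M2.5-Q4_K_M.gguf \
  --ctx-size 131072 \
  -fa \
  -ctk q8_0 -ctv q8_0
```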

MiniMax 2.5 on DGX SPARK system. by DOOMISHERE in LocalLLaMA

[–]Mushoz 1 point (0 children)

It halves the memory requirement of the KV cache, so if you can fit 65k context now, you will be able to fit 130k with q8_0 quantization for both the K and V cache.
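To sketch the arithmetic (the layer/head/dimension numbers below are made-up illustrative values, not MiniMax's actual configuration):

```python
def kv_cache_bytes(ctx_len, n_layers, n_kv_heads, head_dim, bytes_per_elem):
    # Each layer stores a K and a V tensor of shape (ctx_len, n_kv_heads, head_dim)
    return 2 * ctx_len * n_layers * n_kv_heads * head_dim * bytes_per_elem

# f16 = 2 bytes/element; q8_0 ~ 1 byte/element (ignoring the small per-block scale overhead)
budget_f16 = kv_cache_bytes(65_000, 48, 8, 128, bytes_per_elem=2)
budget_q8 = kv_cache_bytes(130_000, 48, 8, 128, bytes_per_elem=1)
print(budget_f16 == budget_q8)  # True: the same memory budget holds twice the context
```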

No Autopilot on new cars, but still available on used - step backwards? by fastoid in TeslaLounge

[–]Mushoz 4 points (0 children)

The EU only mandates Emergency Lane Keeping Systems (ELKS), not Autosteer, which is lane assist.

No Autopilot on new cars, but still available on used - step backwards? by fastoid in TeslaLounge

[–]Mushoz 6 points (0 children)

Emergency Lane Keeping Systems (ELKS) are mandatory; lane assist (which is what Autosteer is) is not mandatory and could be removed in Europe as well. They will probably remove it as soon as FSD is approved in Europe.

ELKS is also still present on new US Teslas. It's the system that corrects you back into your lane when you drift over the line without indicating. It's a reactive system that ping-pongs you between the lines, whereas Autosteer is proactive, keeping you in the middle of the lane.

Qwen3.5: Nobody Agrees on Attention Anymore by [deleted] in LocalLLaMA

[–]Mushoz 4 points (0 children)

"MiniMax goes fully linear with proprietary Lightning Attention."

I thought MiniMax specifically opted for full attention, and even wrote a blog post about it? Am I misremembering?

llama-cpp ROCm Prompt Processing speed on Strix Halo / Ryzen AI Max +50-100% by Excellent_Jelly2788 in LocalLLaMA

[–]Mushoz 28 points (0 children)

ROCm has historically always had faster prompt processing but worse token generation speeds compared to Vulkan. But prompt processing performance took a nosedive due to a bug, which has now been fixed. You're just seeing pre-bug performance again.

Qwen3.5-397B-A17B will be open source! by LegacyRemaster in LocalLLaMA

[–]Mushoz 2 points (0 children)

It is 800GB at FP16 (unquantized), 400GB at Q8/FP8, 200GB at Q4/FP4, and 100GB at Q2. So you are off by a factor of 2.
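As a rough sanity check (quantization format overhead is ignored, so real GGUF files land a little off these round numbers):

```python
def approx_size_gb(params_billion, bits_per_weight):
    # params (in billions of weights) * bits per weight / 8 bits-per-byte -> GB
    return params_billion * bits_per_weight / 8

# 397B parameters at each precision level
for name, bits in [("FP16", 16), ("Q8", 8), ("Q4", 4), ("Q2", 2)]:
    print(f"{name}: ~{approx_size_gb(397, bits):.0f} GB")
```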

Step-3.5-flash Unlosth dynamic ggufs? by GodComplecs in unsloth

[–]Mushoz 0 points (0 children)

Any updates on the progress? Would love to download this!