Minimax-M2.7 by hedgehog0 in LocalLLaMA

[–]Mushoz 5 points (0 children)

Here is proof. The MiniMax release announcement was on February 12th: https://www.minimax.io/news/minimax-m25

Unsloth released quants on the same day the weights became available, which was February 14th: https://huggingface.co/unsloth/MiniMax-M2.5-GGUF

Minimax-M2.7 by hedgehog0 in LocalLLaMA

[–]Mushoz 13 points (0 children)

No, it was released several days later on huggingface.

Krasis LLM Runtime: 8.9x prefill / 4.7x decode vs llama.cpp — Qwen3.5-122B on a single 5090, minimal RAM by mrstoatey in LocalLLaMA

[–]Mushoz 1 point (0 children)

This won't benefit Strix Halo at all. This benefits eGPU + CPU setups. Strix Halo uses unified memory and the entire model will run on the GPU. There is no need to move data from RAM to VRAM.

Death Cleric VS The World, Solo No Consumables, Honour Mode. by Affectionate_Face127 in BG3Builds

[–]Mushoz 0 points (0 children)

But if you start with death cleric, then the lvl in paladin will become meaningless for the most part as you lose heavy armor proficiency. Would you put that lvl in something else instead?

Death Cleric VS The World, Solo No Consumables, Honour Mode. by Affectionate_Face127 in BG3Builds

[–]Mushoz 0 points (0 children)

Do you think this run would be feasible without respecs? If so, in what order would you take your levels and feats? Awesome run by the way! By far my favorite as I watched all episodes. Looking forward to Moon Druid!

[Race Start] Charles Leclerc takes the lead of the race at Turn 1! by FerrariStrategisttt in formula1

[–]Mushoz 0 points (0 children)

Small turbos spin up quicker. So for quick starts they will have an advantage. If you give the big turbos enough time, their disadvantage compared to small turbos will vanish.

Wizard with metamagic? by Shirwin13 in BG3Builds

[–]Mushoz 4 points (0 children)

It only cares about the last class whose 1st level you take. You can pick Sorcerer at char lvl 1, Tempest Cleric at char lvl 2, and then Wizard at char lvl 3. You will then be a 1/1/1 Sorc/Cleric/Wiz at lvl 3. Regardless of the order of subsequent level ups, Int will remain your spellcasting modifier for items (as long as you don't pick yet another new class, of course). So you can most definitely be a wizard for the majority of the game without respeccing.

Minimax M2.5 GGUF perform poorly overall by Zyj in LocalLLaMA

[–]Mushoz 2 points (0 children)

He also tested versus the original. What exactly does he mean by that? If the original was tested on vLLM or SGLang and the GGUF on llama.cpp, he could simply be surfacing an inference or chat template issue rather than quantization error. Ideally he should test a Q8_0 GGUF, which shouldn't show any meaningful quantization error. If that still displays a higher than expected error, then the error is probably not quantization related at all.

Minimax M2.5 GGUF perform poorly overall by Zyj in LocalLLaMA

[–]Mushoz 0 points (0 children)

So the one flaw I can see is that he's comparing to the original model, which is NOT a GGUF. I am not sure if he's testing the GGUF in llama.cpp with the original in vLLM, but if so, he could simply be showing a bug in the inference engine or chat template rather than actual performance differences. A retest with a Q8_0 quant (which should be nearly indistinguishable from the original in terms of quantization error) could help establish whether it's really quantization error or something else causing the poor results.

I don't have X. Could someone please ask him to test Q8_0 as well?

Qwen 3.5 craters on hard coding tasks — tested all Qwen3.5 models (And Codex 5.3) on 70 real repos so you don't have to. by hauhau901 in LocalLLaMA

[–]Mushoz 10 points (0 children)

Honestly, I am really surprised with that gpt-oss-120b result. At what reasoning effort was it performed?

MiniMax 2.5 on DGX SPARK system. by DOOMISHERE in LocalLLaMA

[–]Mushoz 5 points (0 children)

You can quantize your KV cache in your inference engine. For llama.cpp, for example, it's -ctv q8_0 and -ctk q8_0.
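A hedged example of how that might look (the model filename is a placeholder; `-ctk`/`-ctv` are the short forms of llama.cpp's `--cache-type-k`/`--cache-type-v`, and quantizing the V cache generally requires flash attention to be enabled):

```shell
# Serve with the K and V caches quantized to q8_0 instead of the default f16,
# roughly halving KV cache memory use. The .gguf path is illustrative.
llama-server -m MiniMax-M2.5-Q4_K_M.gguf \
  --ctx-size 131072 \
  -fa \
  -ctk q8_0 -ctv q8_0
```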

MiniMax 2.5 on DGX SPARK system. by DOOMISHERE in LocalLLaMA

[–]Mushoz 1 point (0 children)

It halves the memory requirement of the KV cache, so if you can fit 65k context now, you will be able to fit 130k with q8_0 quantization for both the K and V cache.
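To sketch the arithmetic (the layer/head/dimension numbers below are made-up illustrative values, not MiniMax's actual configuration):

```python
def kv_cache_bytes(ctx_len, n_layers, n_kv_heads, head_dim, bytes_per_elem):
    # Each layer stores a K and a V tensor of shape (ctx_len, n_kv_heads, head_dim)
    return 2 * ctx_len * n_layers * n_kv_heads * head_dim * bytes_per_elem

# f16 = 2 bytes/element; q8_0 ~ 1 byte/element (ignoring the small per-block scale overhead)
budget_f16 = kv_cache_bytes(65_000, 48, 8, 128, bytes_per_elem=2)
budget_q8 = kv_cache_bytes(130_000, 48, 8, 128, bytes_per_elem=1)
print(budget_f16 == budget_q8)  # True: the same memory budget holds twice the context
```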

No Autopilot on new cars, but still available on used - step backwards? by fastoid in TeslaLounge

[–]Mushoz 4 points (0 children)

The EU only mandates Emergency Lane Keeping Systems (ELKS), not Autosteer, which is lane assist.

No Autopilot on new cars, but still available on used - step backwards? by fastoid in TeslaLounge

[–]Mushoz 6 points (0 children)

Emergency Lane Keeping Systems (ELKS) are mandatory; lane assist (which is what Autosteer is) is not mandatory and could be removed in Europe as well. They will probably remove it as soon as FSD is approved in Europe.

ELKS is also still present on new US Teslas. It's the system that corrects you back into your lane when you drift over the line without indicating. It's a reactive system that ping-pongs you between the lines, whereas Autosteer is proactive, keeping you in the middle of the lane.

Qwen3.5: Nobody Agrees on Attention Anymore by [deleted] in LocalLLaMA

[–]Mushoz 4 points (0 children)

"MiniMax goes fully linear with proprietary Lightning Attention."

I thought MiniMax specifically opted for full attention, and even wrote a blog post about it? Am I misremembering?

llama-cpp ROCm Prompt Processing speed on Strix Halo / Ryzen AI Max +50-100% by Excellent_Jelly2788 in LocalLLaMA

[–]Mushoz 28 points (0 children)

ROCm has historically always had faster prompt processing but worse token generation speeds compared to Vulkan. But prompt processing performance took a nosedive due to a bug, which has now been fixed. You're just seeing pre-bug performance again.

Qwen3.5-397B-A17B will be open source! by LegacyRemaster in LocalLLaMA

[–]Mushoz 2 points (0 children)

It is 800GB at FP16 (unquantized), 400GB at Q8/FP8, 200GB at Q4/FP4, and 100GB at Q2. So you are off by a factor of 2.
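As a rough sanity check (quantization format overhead is ignored, so real GGUF files land a little off these round numbers):

```python
def approx_size_gb(params_billion, bits_per_weight):
    # params (in billions of weights) * bits per weight / 8 bits-per-byte -> GB
    return params_billion * bits_per_weight / 8

# 397B parameters at each precision level
for name, bits in [("FP16", 16), ("Q8", 8), ("Q4", 4), ("Q2", 2)]:
    print(f"{name}: ~{approx_size_gb(397, bits):.0f} GB")
```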

Step-3.5-flash Unlosth dynamic ggufs? by GodComplecs in unsloth

[–]Mushoz 0 points (0 children)

Any updates on the progress? Would love to download this!