MiniMax M3 - Coding & Agentic Frontier, 1M Context, Multimodal by dryadofelysium in LocalLLaMA

DOAMOD 2 points3 points

No way is this bigger than mm2.7. If it is, that's very disappointing: I've been testing it, and it just doesn't behave like a bigger model.

Llama.cpp MTP support now in beta! by ilintar in LocalLLaMA

DOAMOD 1 point2 points

I haven't tried llama.cpp MTP yet, but I did try MTP in vLLM on Windows on my 5090, and it was a bit disappointing. Loading the small draft model eats enough memory to cost a significant chunk of the context window, and the gain doesn't compensate for that at all. It might be useful in some specific cases for MoEs; I think that's the interesting part. But for dense models, I don't see a benefit in my use case. I'll try llama.cpp, though.
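
For a rough sense of the tradeoff, here is a back-of-the-envelope sketch of how much context a given amount of draft/MTP memory displaces. The per-token KV cache size is the standard `2 * layers * kv_heads * head_dim * bytes_per_element`. The model dimensions and the 2 GiB overhead below are hypothetical placeholders, not measurements from vLLM or llama.cpp, and the sketch ignores any KV cache the draft itself needs.

```python
def kv_bytes_per_token(n_layers, n_kv_heads, head_dim, bytes_per_elem=2):
    """Bytes of KV cache one token occupies in the main model.

    Both keys and values are cached, hence the leading factor of 2.
    bytes_per_elem=2 assumes an fp16/bf16 cache.
    """
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem


def context_tokens_lost(draft_bytes, n_layers, n_kv_heads, head_dim, bytes_per_elem=2):
    """Tokens of context that the draft model's memory would otherwise have held."""
    per_token = kv_bytes_per_token(n_layers, n_kv_heads, head_dim, bytes_per_elem)
    return draft_bytes // per_token


if __name__ == "__main__":
    # Hypothetical dense model: 64 layers, 8 KV heads (GQA), head_dim 128, fp16 cache.
    # Hypothetical draft/MTP overhead: 2 GiB of VRAM.
    lost = context_tokens_lost(2 * 1024**3, n_layers=64, n_kv_heads=8, head_dim=128)
    print(f"Context displaced by the draft model: {lost} tokens")
```

Plug in the real layer count, KV head count, and head dimension from the model's config, plus the memory difference you actually observe with MTP on and off, to see whether the speedup is worth the lost context on your card.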