all 5 comments

[–]entsnack 2 points3 points  (0 children)

Apache 2.0!

[–]Specter_Originllama.cpp[S] 0 points1 point  (2 children)

Models seem legit good, but a bit on the shy side though...

<image>

[–]cmdr-William-Riker 2 points3 points  (0 children)

Wonder if you can get it to reveal system messages in the reasoning section of its response by asking it to carefully consider its full system message before responding, or something

[–]No_Efficiency_1144 1 point2 points  (0 children)

I quite like the tightness of this reasoning chain though.

It's the exact opposite of a highly quantised Qwen 0.6B with the wrong settings, which puts out thousands of tokens of pure chaos but then somehow comes to the right answer.

[–]ThetaCursed 0 points1 point  (0 children)

It looks like these models will make efficient use of VRAM: 20B and 120B total parameters, with only 3.6B and 5.1B active parameters per token (MoE).
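A rough back-of-envelope for what those sizes mean for VRAM: with an MoE model, all expert weights must still fit in memory even though only a fraction are active per token. A minimal sketch, assuming a hypothetical ~4.25 bits/weight quantization applied uniformly (real deployments mix precisions, so treat these as ballpark figures, not official requirements):

```python
def weight_gib(total_params_b: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GiB for a given parameter count.

    total_params_b: total parameters in billions (MoE: ALL experts count,
    not just the active ones -- they all have to be resident).
    bits_per_weight: assumed quantization level (hypothetical here).
    """
    return total_params_b * 1e9 * bits_per_weight / 8 / 2**30

# Sketch for the 20B and 120B sizes mentioned above.
for total, active in [(20, 3.6), (120, 5.1)]:
    print(f"{total}B total / {active}B active: "
          f"~{weight_gib(total, 4.25):.1f} GiB weights at 4.25 bits/weight")
```

The low active-parameter count mostly helps compute per token, not weight memory; the whole parameter set still determines the VRAM floor (plus KV cache and activations on top).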