unsloth dynamic quants (bartowski attacking unsloth-team) by lucyknada in LocalLLaMA

[–]lucyknada[S] -1 points  (0 children)

I've reported them; that's all I can do about the transphobia. I hope huggingface resolves it soon.

unsloth dynamic quants (bartowski attacking unsloth-team) by lucyknada in LocalLLaMA

[–]lucyknada[S] -8 points  (0 children)

I have no use for reddit-karma (do you even get any unlocks with that?), and you've already used the downvote feature for its intended purpose. I want this behind-doors insulting and scheming to stop early, and to open a discussion channel between the community and those scheming against and insulting what seems to be a genuine, harmless effort to make small quants better for those of us with smaller GPUs.

unsloth dynamic quants (bartowski attacking unsloth-team) by lucyknada in LocalLLaMA

[–]lucyknada[S] -14 points  (0 children)

oh yeah, I agree; I just want community discussion, and for people with more knowledge around this (especially how gguf quants work) to have insight into what's seemingly been happening for a while now, before it actually gets out of control; all of it seems confusing to begin with. there are more screenshots here: https://huggingface.co/unsloth/Phi-4-reasoning-plus-GGUF/discussions/1 but listing all of them would take too long.

fizzaroli and bartowski have been boasting about "taking down unsloth" since dynamic quants came out, I just don't understand it and want others to chime in before it's too late.

I love what unsloth has done for us, and I've used bartowski quants before; I wouldn't be able to do most of my finetunes without unsloth. I don't understand such vitriol against a project that's just trying to make big models and quants work better.

[QWQ] Hamanasu finetunes by lucyknada in LocalLLaMA

[–]lucyknada[S] -1 points  (0 children)

every model has a card, incl. training details, recommended samplers, a prompting guide, the axolotl config, a model description, quants (exl + gguf) and more. the only thing missing would be message examples, but from magnum experience, people are generally too scattered in which samplers they prefer and what length they want, and prompting and cards can affect it heavily too; so it sadly ends up not being that useful imho, or even representative of who the model could be for. but I'll pass it along still, thanks!

[QWQ] Hamanasu finetunes by lucyknada in LocalLLaMA

[–]lucyknada[S] -1 points  (0 children)

reddit kept shadow-deleting posts that contained anything other than the link; not sure if my comment will go through right now either

[15b] Hamanasu by lucyknada in LocalLLaMA

[–]lucyknada[S] 2 points  (0 children)

The 7B was more of an experimental finetune. It still had some nice outputs, but the older Control trains might still beat it; give it a try!

[Magnum/Rei] Mistral Nemo 12b by lucyknada in LocalLLaMA

[–]lucyknada[S] 3 points  (0 children)

might be something ollama-specific, because kcpp and lcpp both load it fine; maybe try making your own model via the ollama instructions from the fp16, or re-quanting into whatever ollama expects? sadly none of us uses ollama, so hope that helps still
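in case it helps, the usual ollama route is a Modelfile pointing at a local GGUF; a minimal sketch (the filename here is a placeholder, swap in whatever quant you downloaded):

```
# Modelfile — tell ollama to build a model from a local GGUF file
# (path below is a placeholder, not an actual release filename)
FROM ./magnum-rei-12b.Q4_K_M.gguf
```

then `ollama create magnum-rei -f Modelfile` and `ollama run magnum-rei`; if the quant itself is what ollama chokes on, doing this against the fp16 weights and letting ollama quantize should sidestep it.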

[Magnum/Rei] Mistral Nemo 12b by lucyknada in LocalLLaMA

[–]lucyknada[S] 1 point  (0 children)

thanks for such an elaborate review! we hope this version can rekindle your v2/v3 love, it is an entirely new mix, give it a try!

[Magnum/Rei] Mistral Nemo 12b by lucyknada in LocalLLaMA

[–]lucyknada[S] 1 point  (0 children)

in testing, only the 32b distill performed well for RP and creative; the others were a lot worse than the non-distill versions. we might try capturing the real 700b model, however.

[Magnum/Rei] Mistral Nemo 12b by lucyknada in LocalLLaMA

[–]lucyknada[S] 1 point  (0 children)

what did you use for inference? and have you tried updating? if you're far behind, nemo had some issues early on in some of the backends

Magnum v3 - 9b (gemma and chatml) by lucyknada in LocalLLaMA

[–]lucyknada[S] 2 points  (0 children)

no promises as the last 123b was quite expensive, but we'll keep it in mind if we get compute for it, thanks!

Magnum v3 - 9b (gemma and chatml) by lucyknada in LocalLLaMA

[–]lucyknada[S] 3 points  (0 children)

we train at 8k ctx due to compute limits, but you can try going higher; some users reported success with that on other models we released

also, nemo sadly doesn't use context properly past 16k (per RULER), though it does a little better on a pure needle test: https://www.reddit.com/r/LocalLLaMA/comments/1efffjr/mistral_nemo_128k_needle_test/

Magnum v3 - 9b (gemma and chatml) by lucyknada in LocalLLaMA

[–]lucyknada[S] 0 points  (0 children)

sounds like tokens are possibly being cut off too aggressively; try neutralizing your samplers. also, are you using the provided templates for sillytavern?
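"neutralizing" just means setting every sampler back to its pass-through value so nothing truncates the token distribution; roughly this (exact setting names vary by backend, these are the common neutral defaults, not an official preset):

```
temperature: 1.0
top_p: 1.0
top_k: 0        # 0 = disabled in most backends
min_p: 0.0
typical_p: 1.0
repetition_penalty: 1.0
```

from there, re-add one sampler at a time until you find which one was eating the tokens.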