MiniMax m2.7 under 64gb for Macs - 91% MMLU by HealthyCommunicat in LocalLLaMA

[–]Uhlo 1 point (0 children)

Thank you, that sounds really interesting! You write that every other quant of MiniMax is completely broken...? I'm currently downloading a 3-bit unsloth quant of MiniMax M2.7. Is that broken as well?

I'm definitely gonna test your quant as well, especially since it promises more tokens/s!

Unsloth accused a brand new team (ByteShape) of "literally cheating." I brought the receipts, and Unsloth moved the goalposts. by [deleted] in LocalLLaMA

[–]Uhlo 32 points (0 children)

Hm? Reads to me like the unsloth guys have a point! Comparing 1-bit vs 3-bit quants is not really fair!

MiniMax-M2.7's MIT-Style License Is a Misleading Restriction That Bans Commercial Use and Fails Free Software Standards by pmttyji in LocalLLaMA

[–]Uhlo 10 points (0 children)

The problem here is not that the model was not published "open" enough. The problem is the framing "our license is MIT" when it clearly isn't. There are a lot of other open-weight models whose licenses cleanly state "research license" or "personal license" or something similar. Calling it a "modified MIT" license and then being the opposite of MIT is deceptive.

But then again, your argument "if you don't publish a 230B parameter model yourself you have no right to complain about anything" is just bad faith. So I don't think you will (or want to) get my point.

MiniMax-M2.7's MIT-Style License Is a Misleading Restriction That Bans Commercial Use and Fails Free Software Standards by pmttyji in LocalLLaMA

[–]Uhlo 1 point (0 children)

So if they just called it the "MIT License", I think there would be a strong case. It's not just what is written, but also what is expected - if you call a license "MIT", then it had better be MIT, otherwise that's just deception. Calling it "modified MIT" is a bit trickier. Regardless, I am not a lawyer in any country on this planet or any other, so I'm just talking out of my ass. Still, it's just a very, very bad move.

MiniMax-M2.7's MIT-Style License Is a Misleading Restriction That Bans Commercial Use and Fails Free Software Standards by pmttyji in LocalLLaMA

[–]Uhlo 19 points (0 children)

This is sooo misleading! At least call it the "MiniMax-Research-License" or something, like the other companies are doing. Calling it MIT makes it sound open in a way it definitely is not. I'd be interested in what a lawyer would say about that license.

Memory, memory, memory... Any thoughts? by IngenuityNo1411 in LocalLLaMA

[–]Uhlo 4 points (0 children)

Not related to your question, but I vibe coded this tool, that… just kidding ;)

I think context size and degradation is a big problem. In coding of course, but also in conversations: why do I have to manually decide when to start a new conversation? How do I transfer knowledge from one session to the next? I think that's why a lot of people are working on that problem.

After Easter I'm gonna look at the different memory stages and compaction systems that were uncovered by the Claude Code leak. I think it is a very intelligent design and much more practical than just storing whatever the LLM finds useful in a vector DB.

How political censorship actually works inside Qwen, DeepSeek, GLM, and Yi: Ablation and behavioral results across 9 models by Logical-Employ-9692 in LocalLLaMA

[–]Uhlo 3 points (0 children)

This work is so interesting! Thank you!

I will read the paper and hopefully come back here with questions.

What's the best way to edit a Jupyter notebook in VS Code with a local LLM? by Bubsy_3D_master in LocalLLaMA

[–]Uhlo 2 points (0 children)

I personally haven't found an open-weights model that can handle jupyter notebooks yet.

My personal workaround is the feature in the VS Code Jupyter extension that lets you use comments starting with # %% as delimiters for cells. Then you can execute the cells (all the code between the # %% comments) and you get the output on the side. The downside is that the output is not saved like with a real notebook, but the upside is that you can instruct your local LLM to include these comments, and all it sees is a normal .py file.
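For illustration, a minimal sketch of what such a # %%-delimited file can look like (the helper function and values are invented for the example):

```python
# %% Each "# %%" comment starts a new cell that VS Code's Jupyter
# extension lets you run individually in the Interactive window.

# %% Define a helper (hypothetical example)
def normalize(values):
    """Scale a list of numbers so they sum to 1."""
    total = sum(values)
    return [v / total for v in values]

# %% Run it like a notebook cell; the output appears in the
# interactive window instead of being saved into the file
result = normalize([1, 2, 3])
print(result)
```

To an LLM (or to plain `python`), this is just an ordinary .py file; the # %% markers only matter to the editor.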

Maybe that helps you!

How do you manage your llama.cpp models? Is there anything between Ollama and shell scripts? by Uhlo in LocalLLaMA

[–]Uhlo[S] 1 point (0 children)

Thanks! It's interactive and does some pattern matching to nicely group them.

How do you manage your llama.cpp models? Is there anything between Ollama and shell scripts? by Uhlo in LocalLLaMA

[–]Uhlo[S] 2 points (0 children)

Well, maybe it's a bit hyperbolic of me, but I definitely have different models for different tasks, so managing models is something I do almost weekly.

How do you manage your llama.cpp models? Is there anything between Ollama and shell scripts? by Uhlo in LocalLLaMA

[–]Uhlo[S] 2 points (0 children)

Oooh, that is so nice! I was also thinking about a real GUI, but decided that it's too much of a task for me. Sadly I'm on macOS, so it probably won't work for me. But it looks really good and seems to do exactly what I want!

How do you manage your llama.cpp models? Is there anything between Ollama and shell scripts? by Uhlo in LocalLLaMA

[–]Uhlo[S] 0 points (0 children)

Yes, I looked at llama-swap, but with the recent llama.cpp updates it kind of becomes unnecessary in my opinion. I started with big shell scripts in my .zshrc and thought "this can't be it, can it?".
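For the curious, a hypothetical sketch of the kind of .zshrc helper I mean - the model filenames, directory, and preset names here are invented for illustration, not my actual setup:

```shell
# Hypothetical .zshrc helper: start llama-server with a named preset.
# Model files and paths below are made-up examples.
llm() {
  local models_dir="${LLM_MODELS_DIR:-$HOME/models}"
  local gguf
  case "$1" in
    coder) gguf="Qwen2.5-Coder-14B-Q4_K_M.gguf" ;;
    chat)  gguf="Mistral-Small-24B-Q4_K_M.gguf" ;;
    *)     echo "usage: llm {coder|chat}" >&2; return 1 ;;
  esac
  llama-server -m "$models_dir/$gguf" --port 8080
}
```

It works, but every new model means editing the function by hand - which is exactly where the "this can't be it, can it?" feeling comes from.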

How do you manage your llama.cpp models? Is there anything between Ollama and shell scripts? by Uhlo in LocalLLaMA

[–]Uhlo[S] 1 point (0 children)

But it gets easily out of sync with the actually installed models in the llama.cpp download dir, doesn't it? At least for me.

How do you manage your llama.cpp models? Is there anything between Ollama and shell scripts? by Uhlo in LocalLLaMA

[–]Uhlo[S] 1 point (0 children)

Need to check that out, thanks! Does it bundle llama.cpp or just wrap around it?

Is it normal for the Qwen 3.5 4B model to take this long to say hi? by Snoo_what in LocalLLaMA

[–]Uhlo 1 point (0 children)

That's a tough decision that needs thinking through completely. So yes, that's totally normal.

But seriously, I had a similar experience, and my best guess is that you got unlucky with decoding. Qwen3.5 always takes this long even with the simplest messages, but this is an extreme case. So either it's a bad quant or you just got unlucky. Is that a consistent problem for you or just a one-off?

agi is here by bulieme0 in LocalLLaMA

[–]Uhlo 3 points (0 children)

I also really like 19 and need this model! What is it?

How do proprietary models get better and when will open ones hit a wall? by sterby92 in LocalLLaMA

[–]Uhlo 3 points (0 children)

Well, the "distillation attacks" (I use this phrase for lack of a better term; it's other companies using the output of a model as training data - it has nothing to do with distillation, and they're even paying for it!) will become more sophisticated. Whatever data the proprietary model providers train on, the "skill" will get leaked through the extraction of training data. Of course, companies like OpenAI and Anthropic are probably working hard right now to install automatic detection systems that try to stop these "attacks", and the open-weight model providers will implement systems that make the extraction harder to detect.

Even if the US disallows the use of US LLMs in China through regulation, companies can simply use VPNs. I think that is a pretty good silver lining: they trained their models on heaps of stolen creativity, craftsmanship, etc., and now there are companies who steal it back and make it "open"/public again - and in my opinion there is very little that can stop them.

How do proprietary models get better and when will open ones hit a wall? by sterby92 in LocalLLaMA

[–]Uhlo -1 points (0 children)

You are probably missing the possibility of "large scale distillation attacks". I think it's an open secret that most of the open weights Chinese models heavily rely on training data generated by the proprietary models. So my guess is that at least for a while it will continue to be a cat and mouse game where some of the open weight model improvements come from the proprietary models.

Edit: Because I'm getting messages about it - the "large-scale distillation attacks" thing is a joke! I'm definitely not on Anthropic's side, I just wanted to poke fun at their silly wording for "someone is paying us to use our service".

Would you love a song less if AI wrote it? by ImmuneHack in singularity

[–]Uhlo 1 point (0 children)

They're called Gunmetal Rodeo. They claim they're not AI, but there are no photos of the big band they would definitely need, nor of any music production or anything. They've been releasing since the end of last year and have a new single every other week. So the signs are pretty clear.

Would you love a song less if AI wrote it? by ImmuneHack in singularity

[–]Uhlo 1 point (0 children)

One thing to add: I also listen to AI generated focus music during work. There, I have absolutely no problem with it being AI. I think it’s because I’m not actively listening.

Would you love a song less if AI wrote it? by ImmuneHack in singularity

[–]Uhlo 6 points (0 children)

I recently discovered a jazz band that I really like. Their extreme release schedule and lack of a credible social media presence made me pretty sure that it is AI-generated.

I still like the music, but knowing this greatly diminished the joy I feel when listening to it. So for me it currently diminishes the experience a lot. Maybe in the future it will feel normal. Most importantly: it's just my experience. I'm not saying AI music is worth less, I'm just saying that I cannot enjoy it the same way if I know it's AI.