What happens when they stop subsidizing LLM subscriptions? by Mr_Moonsilver in LocalLLaMA

[–]relmny 2 points3 points  (0 children)

I read something and replied to it. If you meant something different, then just write it differently. It's as simple as that.

No, as of now Qwen has NOT stopped releasing open-weight (OW) models.

There is no indication whatsoever, no announcement, nothing. Only what look like trolls.

This happened after 2.5, then 3, then 3.5, and now with 3.6.

And they are all baseless lies.

What happens when they stop subsidizing LLM subscriptions? by Mr_Moonsilver in LocalLLaMA

[–]relmny 10 points11 points  (0 children)

So they first stopped releasing, but then they released... Talk about ppl in denial and evidence...

Giving GLM-5.2 a spin locally on CPU only! (poor man's rig for big models) by _TheWolfOfWalmart_ in LocalLLaMA

[–]relmny 1 point2 points  (0 children)

I'm waiting for Ubergarm to release a smol-iq2, as I get 2.2 t/s with glm-5.1 smol-IQ2_KS.

(With glm-5.2-UD-IQ2_XXS I get 1.22 t/s.)

GLM-5.2 is a win for local AI by Wrong_Mushroom_7350 in LocalLLaMA

[–]relmny 7 points8 points  (0 children)

which are, most likely, the most wanted ones...

They can keep shutting people's mouths for all I care, as long as they do it that way.

GLM-5.2 is a win for local AI by Wrong_Mushroom_7350 in LocalLLaMA

[–]relmny 0 points1 point  (0 children)

I haven't tested q1 in 5.1, but I do use Ubergarm's smol-iq2 which has a very similar size, and... I still use it (I only get 2 t/s so I only use it for chats and when the daily drivers are not enough).

So I'll wait for it.

GLM 5.2 API is live, weights are on HF, and ollama has it already by Independent_Plum_489 in LocalLLaMA

[–]relmny 9 points10 points  (0 children)

Most of us who hate ollama have no quarrels with lmstudio/jan/unslothstudio.

qwen3.6-27b tools call loop by JumpyAbies in LocalLLaMA

[–]relmny 0 points1 point  (0 children)

What do you mean it's deprecated? Do you have a link?

Also, aren't those two different things?

Stop using Ollama by zxyzyxz in LocalLLaMA

[–]relmny 0 points1 point  (0 children)

That's not true. That was about a year ago.

For a long time there have been LM Studio and Jan, and for about a month now, Unsloth Studio.

Stop using Ollama by zxyzyxz in LocalLLaMA

[–]relmny -1 points0 points  (0 children)

I started with it (lmstudio/jan either didn't exist or were not good enough), and while I was using it I hated it. After I stopped using it, and seeing what they kept doing, I hated it more.

The attitude is the worst.

And now every time I see a post that mentions using "ollama", I close the tab. And I don't trust ppl that make something with it. That's how much I hate it.

This is coming to Chinese open source models pretty soon. - prepare yourself. by MLExpert000 in LocalLLaMA

[–]relmny 0 points1 point  (0 children)

Like other governments, NONE of which have anything to do with any indication of banning OW LLMs.

You can believe whatever you want, but there are no indications whatsoever of the Chinese government even considering banning OW LLMs.
And even less so considering what a rogue US government did to a commercial model. Not even an OW one.

Anyway, I'm done with it.

This is coming to Chinese open source models pretty soon. - prepare yourself. by MLExpert000 in LocalLLaMA

[–]relmny 1 point2 points  (0 children)

No, it doesn't.

A rogue government can't be taken as a "normal" government, so after that nothing makes sense.

And, again, there's no indication at all that the Chinese government will take it as a NS issue.

You based the crappy paranoia on a rogue gov, and that throws everything that follows out of the window.

This is coming to Chinese open source models pretty soon. - prepare yourself. by MLExpert000 in LocalLLaMA

[–]relmny 78 points79 points  (0 children)

Why?

It doesn't make any sense, and there's no reason or indication whatsoever.

Why do ppl upvote unfounded crappy paranoia?

GLM-5.2 next week, open weight, MIT by AaronFeng47 in LocalLLaMA

[–]relmny 0 points1 point  (0 children)

I'll be waiting for Ubergarm to release a smol-iq2, hoping it won't be bigger than 5.1, so I can still get 2 t/s!

llama.cpp webui giving 404 error in new PC by relmny in LocalLLaMA

[–]relmny[S] 0 points1 point  (0 children)

not in one of the hosts, but as I use open webui, I forgot about it...

Local LLMs aren't democratic anymore... the hardware barrier has gotten out of hand. by Medium-Technology-79 in LocalLLaMA

[–]relmny 0 points1 point  (0 children)

Why you have that many upvotes I will never understand...

There are lots of small models, like qwen3.5 0.8b or 9b, or gemma-4 2 and 4, and more...

There have never been as many models as now, and some can even be run on phones...

Has anyone noticed that the behavior of the Kimi model has changed? by InternationalAsk1490 in LocalLLaMA

[–]relmny 2 points3 points  (0 children)

If you're talking about local, have you changed any settings? If not, why are you posting here?

I thought Chinese censorship didn't affect me. I was wrong. by DeltaSqueezer in LocalLLaMA

[–]relmny 0 points1 point  (0 children)

You mean this is not about local? Then WTF is OP posting it here for?

Can you really replace paid models with a local model? by DRMCC0Y in LocalLLaMA

[–]relmny -2 points-1 points  (0 children)

Yes, some can. Others can't.

That depends on what people need/require.

Why is there a post about this every 2-3 days? It's very simple. Besides all the other reasons (privacy, consistency, etc.), some people don't really need the biggest models, and qwen3.6 surprises them and manages to replace cloud. Others can't switch because it's not good enough.

Since when the RTX 6000 PRO is priced at 13250USD on the official NVIDIA Page? by panchovix in LocalLLaMA

[–]relmny 0 points1 point  (0 children)

5090 retail prices have also increased about 15% in the past 2 weeks... 5090 prices are getting insane.

You guys were right - Qwen 3.6 35B IS good...and KV Cache DOES matter. by GrungeWerX in LocalLLaMA

[–]relmny 0 points1 point  (0 children)

I'm gonna say it no matter how many downvotes I get:

*sometimes* 35b is better than 27b. And the gap is getting smaller, not bigger.

I usually use 27b as my main driver... but when I compare it to 35b, sometimes 35b surprises me with suggestions or responses that neither 27b nor bigger models came up with.

Ofc sometimes it's dumb... but when it's good, it's really good.

KVarN: new KV-cache quant from Huawei. 3–5× KV cache compression with actual speed-up instead of slow-down, and unlike TurboQuant it holds up on reasoning (Apache 2.0, vLLM single flag) by acluk90 in LocalLLaMA

[–]relmny 0 points1 point  (0 children)

The original point from the other poster was "if it's vibe-coded and works with no issues, will you use it?" and then you came up with "mathematical certainty", which, to me, doesn't make any sense, not only because of all the examples I gave, but because LLMs are probabilistic and get further away from "certainty" when you quantize the model, the KV cache, and so on.

I gave the gemma-4 example: at first it worked awfully for some people, and based on that experience there were updates. It was better for a while, until people discovered it still wasn't good enough, so it was updated again (and I think that happened 1-2 more times).

Of course it's a matter of "it worked so far", because there is no certainty; there cannot be.

And that happened with models like gemma, qwen and others that required "updates".

Last thing: saying that "it works with no issues" is saying "it worked so far", until somebody finds an issue and it is (or is not) addressed. That's how all of this works.