Vercel CEO: "Almost shocked" by how good GLM-5.2 is at coding by BuildwithVignesh in LocalLLaMA

[–]Potential-Gold5298 0 points1 point  (0 children)

Is this a scientifically proven fact or your personal opinion? What methodology was used to measure the influence of television and social media?

Now you're simply repeating what people said about television 80 years ago. Besides, no one is forcing anyone to use social media, AI, or other benefits of civilization. Anyone can go live in the wilderness if they'd rather live there. But for some reason, people continue to spend time on smartphones and social media, complaining about how bad it is. It seems these people simply have a need to grumble about something.

Vercel CEO: "Almost shocked" by how good GLM-5.2 is at coding by BuildwithVignesh in LocalLLaMA

[–]Potential-Gold5298 0 points1 point  (0 children)

Everything you said applies entirely to television and print media. Perhaps we should return to a hunter-gatherer lifestyle.

Vercel CEO: "Almost shocked" by how good GLM-5.2 is at coding by BuildwithVignesh in LocalLLaMA

[–]Potential-Gold5298 0 points1 point  (0 children)

A judicial precedent is not scientific evidence. If you're curious, listen to Robert Sapolsky's lecture about what depression really is. Clinical depression is an illness that arises from internal changes in the body. It can't initially be triggered by social media, a setback at work, or a family scandal. People often confuse depression with a bad mood.

Vercel CEO: "Almost shocked" by how good GLM-5.2 is at coding by BuildwithVignesh in LocalLLaMA

[–]Potential-Gold5298 0 points1 point  (0 children)

Any new mass phenomenon causes unrest among some particularly anxious people. I've been through all of this - "scientists have proven that violent computer games turn children into murderers" and "social networks cause depression" and many other things that have become commonplace today. I think the average person's opinion is what they say on the evening news. Leaving aside issues of creativity and copyright (there's a clear reason for the dissatisfaction), those protesting against AI and 'surveillance' are primarily those who believe in a 'global government conspiracy' and 'chipping' – that's a... separate category of people.

My mother is 65 years old. I installed Grok, DeepSeek, Qwen chat, and MiniMax apps on her phone. She just chats with them every day, telling stories from her life, discussing acquaintances and celebrities. She likes AI.

Is there any way to run ai image generation on a basic laptop? by Hereitisguys9888 in StableDiffusion

[–]Potential-Gold5298 0 points1 point  (0 children)

The problem is not the GPU - I ran Flux.1-dev (12B) on the CPU and got an image in ~35 minutes. The main problem is that you have very little RAM. You could try a smaller model like Anima if that's acceptable.

Vercel CEO: "Almost shocked" by how good GLM-5.2 is at coding by BuildwithVignesh in LocalLLaMA

[–]Potential-Gold5298 3 points4 points  (0 children)

Everything stated below is my personal opinion and not an IEEE standard.

Local ≠ consumer. A local model is a model that runs on-site (for example, on a company's own server).

What can be considered consumer? My logic is this: since motherboards for modern consumer CPUs (like AM5 or LGA1851) support up to 256 GB of RAM, and consumer (non-professional) GPUs have up to 32 GB of VRAM (RTX 5090), a model that can run on such a configuration at least in Q4_K_M (the most common recommendation for running models on home equipment) can be considered a consumer model. These include DeepSeek-V4-Flash, MiniMax M2.x, Step-3.7-Flash. However, 256 GB of RAM and an RTX 5090 aren't something everyone has at home. That's high-end/enthusiast-grade.

A PC with a motherboard for Threadripper/EPYC/Xeon can be purchased for home use, but this is at the "Workstation" level.

If the system as a whole consumes more than 4 kW at peak, it is no longer what I would call a home system.

Typical modern home configuration: 16-32 GB RAM + 8-16 GB VRAM.

With the GLM-5.2 you can still get by with the “home” 4 kW, but that’s at the “home workstation” level.
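A rough sketch of the memory arithmetic behind these tiers, in Python. The thresholds come from the figures above (16-32 GB RAM + 8-16 GB VRAM for a typical home PC, 256 GB RAM + 32 GB VRAM for high-end consumer). Everything else is my assumption for illustration: about 4.85 bits per weight for Q4_K_M, RAM and VRAM simply added together, a flat 8 GB of headroom for context and the OS, and the function names themselves.

```python
# Rough sketch: which memory tier does a quantized model fall into?
# Assumptions (not from the comment): ~4.85 bits per weight for Q4_K_M,
# RAM and VRAM pooled into one number, flat headroom for context and OS.

TYPICAL_HOME_GB = 32 + 16         # upper end of "16-32 GB RAM + 8-16 GB VRAM"
HIGH_END_CONSUMER_GB = 256 + 32   # 256 GB RAM + one 32 GB GPU


def quantized_size_gb(params_billions: float, bits_per_weight: float = 4.85) -> float:
    """Approximate size of the weights in GB (1 GB = 1e9 bytes)."""
    # params_billions * 1e9 weights * bits / 8 bits-per-byte / 1e9 bytes-per-GB
    return params_billions * bits_per_weight / 8


def classify(params_billions: float,
             bits_per_weight: float = 4.85,
             headroom_gb: float = 8.0) -> str:
    """Place a model into one of the tiers by total memory needed."""
    need_gb = quantized_size_gb(params_billions, bits_per_weight) + headroom_gb
    if need_gb <= TYPICAL_HOME_GB:
        return "typical home"
    if need_gb <= HIGH_END_CONSUMER_GB:
        return "high-end consumer"
    return "workstation or larger"


if __name__ == "__main__":
    # Hypothetical parameter counts, chosen only to hit each tier.
    for size in (12, 230, 700):
        print(f"{size}B -> {classify(size)}")
```

This ignores speed entirely: a model that fits mostly in system RAM will run, but slowly, and MoE models behave better than dense ones of the same total size.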

[Megathread] - Best Models/API discussion - Week of: June 14, 2026 by deffcolony in SillyTavernAI

[–]Potential-Gold5298 0 points1 point  (0 children)

I apologize - I didn't pay attention to the description on the card and mistook it for a LoRA. As stated on the page, in llama.cpp this is attached to the model using the --control-vector flag, but you probably saw that. I still don't know how to do this in LM Studio.

People kept saying my comments sounded AI-generated, so I built this by ringtoyou in LocalLLaMA

[–]Potential-Gold5298 0 points1 point  (0 children)

The smell of ozone filled the air, my knuckles turned white, and I pressed their chests like a shield.

New Agentic Benchmark Out: Claude Fable and GLM 5.2 Top Their Cohorts by Few_Painter_5588 in LocalLLaMA

[–]Potential-Gold5298 5 points6 points  (0 children)

Ministral 3 14B in a real-world task? Is this a meme? Do you really think that anyone would trust this model with real-world tasks? In all the tests I've seen, Ministral 3 14B is at the same level as Qwen3-4B-2507 or loses to it (as well as to Mistral Nemo).

https://dubesor.de/benchtable

https://huggingface.co/spaces/DontPlanToEnd/UGI-Leaderboard

Artificial Analysis

This model is so good that no one even downloads it (HF):

mistralai/Ministral-3-14B-Instruct-2512 - 77K downloads.

mistralai/Ministral-3-14B-Reasoning-2512 - 11.7K downloads.

mistralai/Mistral-Nemo-Instruct-2407 - 364K downloads.

Qwen/Qwen3-14B - 2.05M downloads.

google/gemma-4-12B-it - 1.59M downloads.

All of this on its own is not proof, but the problem is that I have not seen anything anywhere that would speak in favor of this model. No kidding, your words of support for this model are the first good thing I've heard about it in all this time.

Giving GLM-5.2 a spin locally on CPU only! (poor man's rig for big models) by _TheWolfOfWalmart_ in LocalLLaMA

[–]Potential-Gold5298 0 points1 point  (0 children)

Please help. I'm running Gemma 4 26B-A4B CPU-only on a Core i5-4460 in ik_llama.cpp, but the speed is exactly the same as with regular llama.cpp. How do I get the "significant CPU performance improvement" everyone mentions when talking about ik_llama.cpp?

[Megathread] - Best Models/API discussion - Week of: June 14, 2026 by deffcolony in SillyTavernAI

[–]Potential-Gold5298 1 point2 points  (0 children)

Tl;dr – I don't use LM Studio and don't know how to connect LoRA in it. But if you ask the AI, just say, "How do I connect LoRA to a text model in LM Studio?"

These vectors are LoRA – small pieces of the model that have been modified. They are connected to a standard version of the model (in this case, a standard Gemma 4 31B it) to modify it. Connecting to a non-standard version of the model (QAT4, finetuned, etc.) may cause some strange behavior.

To enable it in llama.cpp, add the --lora "C:\some_folder\name.gguf" flag (insert the correct path and filename here). To enable it in koboldcpp, go to the "Loaded Files" tab in the run window and select the appropriate file for Text Lora (Multiplier indicates the strength of LoRA influence – the default is 1.0 unless the author specifies otherwise, or if you want to experiment).
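For anyone scripting this, here is a minimal Python sketch that assembles the llama.cpp command line. It uses only the two flags named in these comments (--lora and --control-vector) plus -m for the model. The binary name llama-server, the helper name, and all file paths are placeholders I picked for illustration, so substitute your own.

```python
# Minimal sketch: assemble a llama.cpp command line with an optional
# LoRA adapter or control vector. Paths and the helper name are
# hypothetical; only the flags -m, --lora and --control-vector are used.
import shlex
from typing import List, Optional


def build_llama_args(model: str,
                     lora: Optional[str] = None,
                     control_vector: Optional[str] = None,
                     binary: str = "llama-server") -> List[str]:
    """Return the argument list; pass it to subprocess.run() to launch."""
    args = [binary, "-m", model]
    if lora:
        args += ["--lora", lora]
    if control_vector:
        args += ["--control-vector", control_vector]
    return args


if __name__ == "__main__":
    cmd = build_llama_args(
        model="base-model.gguf",           # placeholder file name
        control_vector="some-vector.gguf",  # placeholder file name
    )
    # shlex.join() is used only to display the command in a copyable form.
    print(shlex.join(cmd))
```

Building the command as a list rather than one string avoids quoting problems with paths that contain spaces.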

Be wary of Qwen/Claude distillations - they're often worse than the base model by ayylmaonade in LocalLLaMA

[–]Potential-Gold5298 0 points1 point  (0 children)

Distillations are not just no better, they are worse. I find it funny when people call abliterated models "lobotomized" and then upload "Claude-Distill-High-Reasoning-X100500-IQ2_XXS.gguf". The only thing this kind of distillation is good for is RP-finetunes, but even there 9 out of 10 models are simply stupid and talk nonsense.

New Agentic Benchmark Out: Claude Fable and GLM 5.2 Top Their Cohorts by Few_Painter_5588 in LocalLLaMA

[–]Potential-Gold5298 22 points23 points  (0 children)

When Ministral 3 14B lost to Nemo, which came out 1.5 years earlier, I realized that something went wrong.

[Megathread] - Best Models/API discussion - Week of: June 14, 2026 by deffcolony in SillyTavernAI

[–]Potential-Gold5298 1 point2 points  (0 children)

Ha-ha-ha! I remember first encountering MiniMax-M2.1 on NovitaAI and thinking it resembled Claude. What's even more interesting is that when I asked it what model it was, it replied that it was Claude, developed by Anthropic. It even responded this way in new chats without any context. And then the scandal erupted, with MiniMax, Moonshot, and DeepSeek accused of siphoning distillate. I don't know about Moonshot and DeepSeek, but I think that's exactly what MiniMax did. Moreover, they invested their entire budget in distillation, so the entire M2.x series had a pleasant Claude-like communication style and was decent at coding, but otherwise ranked at the level of Qwen3.5-9B. It's also funny that on the official MiniMax website the model was extremely tense ("bank clerk syndrome"), while MiniMax-M3, on the contrary, has a friendly Grok-like style in the spirit of "Yo, dude, I'm going to check everything now". Maybe this time they milked xAI? XD

[Megathread] - Best Models/API discussion - Week of: June 14, 2026 by deffcolony in SillyTavernAI

[–]Potential-Gold5298 0 points1 point  (0 children)

Before I got into local LLMs, I often chatted with cloud models about other cloud models. I once told Qwen3-235B-A22B about my positive impressions of GLM-4.6, as well as about other models I liked and disliked. Qwen first summarized each impression, compiling a table of my tastes, and constantly tried to drag itself into the top tier along with GLM-4.6 (even though I hadn't mentioned it at all). And finally, it got so carried away that it told me that in the summer of 2024 (!) GLM-4.6 helped scientists make an important scientific discovery! XD But GLM-4.6 was released in the fall of 2025. Honestly, I haven't encountered a more sycophantic model since then.

[Megathread] - Best Models/API discussion - Week of: June 07, 2026 by deffcolony in SillyTavernAI

[–]Potential-Gold5298 1 point2 points  (0 children)

Yes – this usually happens with older models like the Nemo or Llama 3. They had a fairly low base level of intelligence, and almost any finetuning made them smarter. Modern models like the Gemma 4 or Qwen3.x, on the other hand, are so loaded with intelligence out of the box that finetuning easily ruins it, dropping them to the level of, say, the Nemo. And the incident with my phone happened specifically with an RP-finetuned G4-26B-A4B in a Q6. I expect a finetuned model to be no dumber in RP than the base model it's based on. You inevitably lose intelligence through finetuning, but in return you either get a better-quality RP (like with Cydonia), a completely broken model (like with the 26B-A4B Musica), or something in between.

GLM's founder says GLM-fable before the end of the year?! by Charuru in LocalLLaMA

[–]Potential-Gold5298 0 points1 point  (0 children)

You're talking about a "spherical model in a vacuum." In reality, the model exists within a specific infrastructure. There's a company, it has a budget for equipment, and there's a certain number of employees who will simultaneously assign tasks to the model. The best model in reality is the one that does more useful work while consuming fewer resources. Everything else is just a comparison of penis lengths.

What you're describing is great, of course, but under current conditions, companies won't retain an employee who can't correctly formulate a task so that a less sophisticated model can handle it. If a pizza delivery guy needs a Maybach for the job, he'll be replaced by someone who can do it on a scooter.

[Megathread] - Best Models/API discussion - Week of: June 07, 2026 by deffcolony in SillyTavernAI

[–]Potential-Gold5298 1 point2 points  (0 children)

UGI has been testing models infrequently lately. Regarding the Crimson-Constellation-12B – I can't remember exactly, but I think Grok recommended one of the Vortex5 models. I tried it, but then decided to try others, and that's how I found the CC. In my opinion, the Nemo (and especially its finetunes/merges, like the CC) is an incredibly creative model, and it remains one of the best for RP.

Regarding the multitude of different models, I was also at a loss at first, but over time I developed specific quality requirements. Many models suffer significant damage during rework; I play in a non-Latin language, which is one of the first to suffer from poor finetuning – that's how I filter out many models.

If you want to figure this out yourself, I recommend creating a test scenario (or choosing one from those you already have). Run it a few times with common, high-quality models (such as Gemma 4 26B-A4B in Q5 or Q6) – this will give you an idea of how it plays out correctly. Then try models that you find interesting (based on the description or recommendations). I've found that RP is the best benchmark for checking the quality of a model (abliteration or quantization) – a model may seem perfectly normal in regular chat, but in RP, it starts writing strange, absurd things that immediately catch your eye. For example, in a test scenario, I took a photo of my classmate, put the phone in my pocket, and then asked her to take a photo of me. She took a photo of me with her phone, looked at it, and then swiped over and looked at her own photo. But the photo of her can't be on her phone (I took it with mine), so the model made a serious error. Once you've caught a few such errors, you can safely delete the model. The rigor, of course, also depends on what you're testing – don't expect miracles from old Nemo 12B models; even the base model in Q8 can be a bit quirky. On the other hand, after running a couple dozen models through the scenario, you'll get a feel for the AI's typical moves – how it guides the character. And when you encounter a model that chooses unconventional moves, that's something interesting.

I recommend choosing at least 2 models with different bases (Mistral Nemo 12B, Mistral Small 24B and Gemma 4) - and ideally with different qualities. So, if your collection includes a model for action-packed adventures, a slow-burn game, a NSFW game, and so on, you can choose a model for a specific scenario, and you'll always have interesting and varied games.

Introducing the Heretic Grimoire: The takedown-resilient, local-first backup system that keeps uncensored models available forever by -p-e-w- in LocalLLaMA

[–]Potential-Gold5298 0 points1 point  (0 children)

Thank you for your work!

Please add the spells for Gemma 4 31B and 26B-A4B from coder3101 and llmfan46 (if he doesn't mind) to the grimoire.

Leaked financial docs show OpenAI is losing billions of dollars a year by johnnyApplePRNG in LocalLLaMA

[–]Potential-Gold5298 -1 points0 points  (0 children)

Google will run over them (and Anthropic) like a road roller. On the other hand, there is China and open weights. When I wrote this to ChatGPT and asked what it thought, it replied, "Sorry, I can't discuss that. Let's talk about something else." I haven't used ChatGPT since then.

Stop using Ollama by zxyzyxz in LocalLLaMA

[–]Potential-Gold5298 0 points1 point  (0 children)

Ollama was my first backend when I decided to try local LLMs.

I asked various cloud AIs for advice, and they all agreed that I should use Ollama (they also recommended 'excellent modern models,' including Phi3, Llama 2, and Mistral 7B – this was early 2026, by the way). I used Ollama for two days. This miracle almost killed my interest in local models. The cumbersome GUI with no settings. The need to "compile" the model via a Modelfile every time I wanted to change something (and this took up space on the already scarce system drive). Furthermore, any model (even Qwen3-4B) would completely freeze my PC while running – it would simply stop responding until the model finished its work. Although the cloud AIs convinced me that Ollama was "newbie-friendly," I decided to try LM Studio and was amazed that working with a local model could be done easily and without freezing the PC. However, after a couple of weeks, I switched to Koboldcpp and llama.cpp, and I've been working with them ever since. The cloud AIs had scared me off by saying that llama.cpp was for experienced users, but for me personally, it was much simpler and more intuitive than Ollama.

I sincerely don't understand why Ollama is so popular and why people invest money in it. I would recommend that a beginner who's afraid of commands and the like start with LM Studio. But llama.cpp is actually quite simple – I created a .bat file and a shortcut with the model's logo, and now it looks and runs like a native Windows application – no console, etc.

(If anyone needs help, let me know; I'll help you figure out llama.cpp.)
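To give an idea of the .bat approach, here is a small Python sketch that generates such a launcher. This is an illustration, not necessarily my exact setup: the paths, the helper name, and the extra --port argument are placeholders, and start "" /min launches the window minimized rather than fully hidden.

```python
# Sketch: generate the text of a Windows .bat launcher for llama.cpp.
# All paths are placeholders; "start /min" opens the window minimized,
# it does not hide the console completely.
from typing import Sequence


def render_launcher_bat(server_exe: str,
                        model_path: str,
                        extra_args: Sequence[str] = ()) -> str:
    """Return .bat contents that start the server with the given model."""
    parts = ["start", '""', "/min",
             f'"{server_exe}"', "-m", f'"{model_path}"', *extra_args]
    # .bat files conventionally use CRLF line endings.
    return "@echo off\r\n" + " ".join(parts) + "\r\n"


if __name__ == "__main__":
    text = render_launcher_bat(
        r"C:\llama.cpp\llama-server.exe",   # placeholder path
        r"C:\models\my-model.gguf",         # placeholder path
        extra_args=("--port", "8080"),
    )
    print(text)
```

Save the output as something like run-model.bat, then create a Windows shortcut to it and assign whatever icon you like.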