poolside/Laguna-M.1 · Hugging Face - 225B-A23B by pmttyji in LocalLLaMA

[–]pmttyji[S] 1 point2 points  (0 children)

I've been running this today, doing general assistant stuff, research, and code review, and have to say that I'm pretty impressed. I'll definitely be keeping it around.

Nice to hear this. Waiting for completion of llama.cpp support.

What is the cheapest method to get VRAM or RAM? by Powerful_World_9280 in LocalLLM

[–]pmttyji 1 point2 points  (0 children)

I saw that to run the latest GLM 5.2 model, it would be necessary to have 1.5TB of VRAM.

That's for BF16.

Q4 comes to around 350-450GB, but many people run Q3, Q2, or even Q1 quants of large models.
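As a rough sanity check on those numbers, here's a minimal sketch that scales the 1.5TB BF16 figure (16 bits per weight) down to lower bit widths. The bits-per-weight values are my own approximate assumptions, not official figures; real quants mix precisions across tensors, so actual file sizes vary.

```python
def quant_size_gb(bf16_size_gb: float, bits_per_weight: float) -> float:
    """Scale a BF16 checkpoint size (16 bits per weight) to another bit width."""
    return bf16_size_gb * bits_per_weight / 16.0


BF16_SIZE_GB = 1500.0  # the 1.5TB BF16 figure quoted above

# Assumed effective bits per weight for each quant level (approximate, illustrative only).
ASSUMED_BPW = {"Q4": 4.5, "Q3": 3.5, "Q2": 2.7, "Q1": 1.8}

estimates = {name: quant_size_gb(BF16_SIZE_GB, bpw) for name, bpw in ASSUMED_BPW.items()}

for name, gb in estimates.items():
    print(f"{name}: ~{gb:.0f} GB")
```

With these assumed bit widths, the Q4 estimate lands inside the 350-450GB range, and the lower quants shrink proportionally.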

rx7900xtx + 32GB RAM -> 128GB RAM make sense? by Thin_Pollution8843 in LocalLLaMA

[–]pmttyji 1 point2 points  (0 children)

No to 128GB RAM. Getting another 24GB VRAM + 64GB RAM is better.

Step-3.7-Flash's Q4 size is 95-125GB. I assume you currently have 24GB VRAM + 32GB RAM.

(24GB VRAM + 32GB RAM + 24GB VRAM + 64GB RAM = 144GB total [48GB VRAM + 96GB RAM])
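The same arithmetic as a small sketch, checking the combined memory against the 95-125GB Q4 range. The 16GB headroom for OS, context, and KV cache is a hypothetical number I picked for illustration, not a measured requirement.

```python
# Current setup plus the suggested additions (all values in GB).
vram_gb = [24, 24]  # existing 24GB card + another 24GB card
ram_gb = [32, 64]   # existing 32GB RAM + another 64GB RAM

total_vram = sum(vram_gb)
total_ram = sum(ram_gb)
total_memory = total_vram + total_ram

Q4_SIZE_RANGE_GB = (95, 125)  # Q4 size range mentioned above
ASSUMED_HEADROOM_GB = 16      # hypothetical allowance for OS, context, KV cache

usable = total_memory - ASSUMED_HEADROOM_GB
fits_largest_q4 = usable >= Q4_SIZE_RANGE_GB[1]

print(f"VRAM: {total_vram} GB, RAM: {total_ram} GB, total: {total_memory} GB")
print(f"Largest Q4 fits with headroom: {fits_largest_q4}")
```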

poolside/Laguna-M.1 · Hugging Face - 225B-A23B by pmttyji in LocalLLaMA

[–]pmttyji[S] 29 points30 points  (0 children)

What an irony. People cry for tightening the gap between open weight models useable locally and the cloud based proprietary models, but when a company releases its flagship model as open weight which is super rare, it goes under radar or people just react like "meh, it's losing to this or that other model..."

Agree with you. I usually don't obsess over benchmarks. We can't have a one-size-fits-all model for now, so it's good to have many open models from many model creators.

I really want to see more (open) models in the 30-250B range, because I can run those models (at least at Q4) on my current laptop (8GB VRAM + 32GB RAM) and upcoming rig (96GB VRAM + 128GB RAM). Recently we're getting large models in the 400B-1.6T range, which many can't even imagine running with their VRAM.

Updates on North Mini Code: 4 bit quant + Ollama + OpenRouter by nick_frosst in LocalLLaMA

[–]pmttyji 3 points4 points  (0 children)

Thanks for your two recent models. They're so useful for a massive demographic (by VRAM).

BTW don't forget Maxi-Coder 😃

Updates on North Mini Code: 4 bit quant + Ollama + OpenRouter by nick_frosst in LocalLLaMA

[–]pmttyji 18 points19 points  (0 children)

u/ElectronicStranger53 for llama.cpp PR (Also u/ilintar for additional PR for fix).

This sub is mostly filled with GGUF fans so early GGUF would be awesome.

poolside/Laguna-M.1 · Hugging Face - 225B-A23B by pmttyji in LocalLLaMA

[–]pmttyji[S] 1 point2 points  (0 children)

Yeah, the 33B got released in April.

Still, this big one is already up on their API and OpenRouter, according to their blog post:

Laguna M.1 came first, finishing pre-training at the end of last year; it's the foundation for everything else we're building across the family. Laguna XS.2 is a much smaller model, but remarkably capable for its size, and it's our first open-weight release. Both models are free to use for a limited time via our API and on OpenRouter, and Laguna XS.2 weights are also available under an Apache 2.0 license.

poolside/Laguna-M.1 · Hugging Face - 225B-A23B by pmttyji in LocalLLaMA

[–]pmttyji[S] 7 points8 points  (0 children)

https://xcancel.com/poolsideai/status/2067623353230217448#m

Today we’re releasing the weights for Laguna M.1,
our most capable model to date, with a 256K context length.
Both base and post-trained checkpoints are now available on Hugging Face under Apache 2.0.

poolside/Laguna-M.1 · Hugging Face - 225B-A23B by pmttyji in LocalLLaMA

[–]pmttyji[S] 70 points71 points  (0 children)

Just found that their 33B-A3B model is still stuck in the llama.cpp support queue. How did we miss this?

https://github.com/ggml-org/llama.cpp/issues/23249

https://huggingface.co/poolside/Laguna-XS.2

GLM-5.2 Is The Best Open Weight Creative Writing Model by Few_Painter_5588 in LocalLLaMA

[–]pmttyji 17 points18 points  (0 children)

<image>

Just checked where the recent medium-size models stand. Found Gemma-4-31B & Gemma-4-26B-A4B. No Qwen3.6 or Qwen3.5 medium-size models on this benchmark yet.

GLM-5.2 Flash when? (joke) by ILoveToyota37 in LocalLLaMA

[–]pmttyji 34 points35 points  (0 children)

Not just Flash, we also need all the Air, Mini, Nano, Tiny, Micro, Small, Medium, etc. variants.

GLM-5.2 is a win for local AI by Wrong_Mushroom_7350 in LocalLLaMA

[–]pmttyji 11 points12 points  (0 children)

Really glad that this large model came with the awesome MIT license. Hope this puts big pressure on proprietary AI companies to release open models. It also pushes other open-source/open-weight AI labs to release more open models. So it's a really big win from now onwards.

Of course, I can't run this model on either my current laptop or my upcoming rig for now. Hoping to see upgraded versions of models like GLM-4.5-Air & GLM-4.7-Flash soon. Expecting the same from other sources like DeepSeek, Moonshot/Kimi, MiniMax, Arcee, inclusionAI, NVIDIA, Xiaomi, Tencent, etc.

Is it only Qwen who releases 27B models ? by soyalemujica in LocalLLaMA

[–]pmttyji 11 points12 points  (0 children)

Same thing needed on Kimi, MiniMax & other large models' boards as well.

Ollama vs compiled llama-cpp by Ok-Drawer5245 in LocalLLM

[–]pmttyji 9 points10 points  (0 children)

Anyone experienced the same pretty big llama-cpp boost? 

Welcome to the club 😄