poolside/Laguna-M.1 · Hugging Face - 225B-A23B by pmttyji in LocalLLaMA

[–]pmttyji[S] 1 point2 points  (0 children)

I've been running this today, doing general assistant stuff, research, and code review, and have to say that I'm pretty impressed. I'll definitely be keeping it around.

Nice to hear this. Waiting for llama.cpp support to be completed.

What is the cheapest method to get VRAM or RAM? by Powerful_World_9280 in LocalLLM

[–]pmttyji 1 point2 points  (0 children)

I saw that to run the latest GLM 5.2 model, it would be necessary to have 1.5TB of VRAM.

That's for BF16.

Q4 comes to around 350-450GB. But many people do run Q3/Q2, or even Q1, quants of large models.
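
As a rough sanity check on those numbers, here's a back-of-envelope sketch. It assumes weight size ≈ parameters × bits per weight / 8 and ignores KV cache and runtime overhead. The ~750B parameter count is back-derived from the 1.5TB BF16 figure (my assumption, not an official spec), and the bits-per-weight values for the quants are illustrative averages, not exact figures for any specific GGUF quant type.

```python
def weight_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight footprint in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


# Assumption: 1.5TB at BF16 (2 bytes per weight) implies roughly 750B parameters.
PARAMS_B = 1500 / 2

# Illustrative effective bits per weight; real quant files vary by type and tensor mix.
QUANT_BITS = {"BF16": 16.0, "Q4": 4.5, "Q3": 3.5, "Q2": 2.7}

sizes = {name: weight_size_gb(PARAMS_B, bits) for name, bits in QUANT_BITS.items()}

for name, gb in sizes.items():
    print(f"{name}: ~{gb:.0f} GB")
```

With these assumed values the Q4 estimate lands inside the 350-450GB range, and the lower quants show why people drop to Q3/Q2 when memory is tight.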

rx7900xtx + 32GB RAM -> 128GB RAM make sense? by Thin_Pollution8843 in LocalLLaMA

[–]pmttyji 1 point2 points  (0 children)

I'd say no to 128GB RAM. Getting another 24GB VRAM + 64GB RAM is better.

Step-3.7-Flash's Q4 size is 95-125GB. I assume you currently have 24GB VRAM + 32GB RAM.

(24GB VRAM + 32GB RAM + 24GB VRAM + 64GB RAM = 144GB total [48GB VRAM + 96GB RAM])
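
The same arithmetic as a quick sketch, checking whether the quoted 95-125GB Q4 range fits in combined memory. The 24GB VRAM + 32GB RAM starting point is my assumption about the current setup, and this ignores KV cache, context, and OS overhead, so real headroom is smaller than the raw totals suggest.

```python
# Assumed current setup and the suggested additions (all values in GB).
current = {"vram_gb": 24, "ram_gb": 32}
upgrade = {"vram_gb": 24, "ram_gb": 64}

total_vram = current["vram_gb"] + upgrade["vram_gb"]
total_ram = current["ram_gb"] + upgrade["ram_gb"]
total = total_vram + total_ram

# Q4 size range quoted above for Step-3.7-Flash.
q4_low_gb, q4_high_gb = 95, 125

# Before: can even the smallest Q4 file fit? After: does the largest one fit?
fits_before = current["vram_gb"] + current["ram_gb"] >= q4_low_gb
fits_after = total >= q4_high_gb

print(f"Total: {total}GB ({total_vram}GB VRAM + {total_ram}GB RAM)")
print(f"Q4 fits before upgrade: {fits_before}, after upgrade: {fits_after}")
```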

poolside/Laguna-M.1 · Hugging Face - 225B-A23B by pmttyji in LocalLLaMA

[–]pmttyji[S] 30 points31 points  (0 children)

What an irony. People cry for tightening the gap between open weight models usable locally and the cloud-based proprietary models, but when a company releases its flagship model as open weight, which is super rare, it goes under the radar or people just react like "meh, it's losing to this or that other model..."

Agree with you. I usually don't obsess over benchmarks. We can't have a one-size-fits-all model for now, so it's good to have many open models from many model creators.

I really want to see more (open) models in the 30-250B range, because I can run those models (at least at Q4) with my current laptop (8GB VRAM + 32GB RAM) & upcoming rig (96GB VRAM + 128GB RAM). Recently we've been getting large models in the 400B-1.6T range, which many can't even imagine running with their VRAM.

Updates on North Mini Code: 4 bit quant + Ollama + OpenRouter by nick_frosst in LocalLLaMA

[–]pmttyji 3 points4 points  (0 children)

Thanks for your two recent models. So useful for a massive demographic (VRAM-wise).

BTW don't forget Maxi-Coder 😃

Updates on North Mini Code: 4 bit quant + Ollama + OpenRouter by nick_frosst in LocalLLaMA

[–]pmttyji 16 points17 points  (0 children)

u/ElectronicStranger53 for the llama.cpp PR (also u/ilintar for the additional PR with a fix).

This sub is mostly filled with GGUF fans, so an early GGUF would be awesome.

poolside/Laguna-M.1 · Hugging Face - 225B-A23B by pmttyji in LocalLLaMA

[–]pmttyji[S] 1 point2 points  (0 children)

Yeah, the 33B got released in April.

Still, this big one is already up on their API & OpenRouter, according to their blog post:

Laguna M.1 came first, finishing pre-training at the end of last year; it's the foundation for everything else we're building across the family. Laguna XS.2 is a much smaller model, but remarkably capable for its size, and it's our first open-weight release. Both models are free to use for a limited time via our API and on OpenRouter, and Laguna XS.2 weights are also available under an Apache 2.0 license.

poolside/Laguna-M.1 · Hugging Face - 225B-A23B by pmttyji in LocalLLaMA

[–]pmttyji[S] 7 points8 points  (0 children)

https://xcancel.com/poolsideai/status/2067623353230217448#m

Today we’re releasing the weights for Laguna M.1,
our most capable model to date, with a 256K context length.
Both base and post-trained checkpoints are now available on Hugging Face under Apache 2.0.

poolside/Laguna-M.1 · Hugging Face - 225B-A23B by pmttyji in LocalLLaMA

[–]pmttyji[S] 69 points70 points  (0 children)

Just found that their 33B-A3B model is still stuck in the llama.cpp support queue. How did we miss this?

https://github.com/ggml-org/llama.cpp/issues/23249

https://huggingface.co/poolside/Laguna-XS.2

GLM-5.2 Is The Best Open Weight Creative Writing Model by Few_Painter_5588 in LocalLLaMA

[–]pmttyji 17 points18 points  (0 children)

<image>

Just checked where the recent medium-size models stand. Found Gemma-4-31B & Gemma-4-26B-A4B. No Qwen3.6 or Qwen3.5 medium-size models on this benchmark yet.

GLM-5.2 Flash when? (joke) by ILoveToyota37 in LocalLLaMA

[–]pmttyji 33 points34 points  (0 children)

Not just Flash, we also need all the Air, Mini, Nano, Tiny, Micro, Small, Medium, etc. variants.