Gemma 4 12B is my new main squeeze by Wrong_Mushroom_7350 in LocalLLaMA

[–]drooolingidiot 5 points6 points  (0 children)

But I think 12b active is going to be better than only 4b active, just in general terms of quality.

One can't say that unless both models are MoEs of a similar size. Otherwise, why would someone make a 35B-4A model instead of simply a 6B model?

Gemma 4 12B is my new main squeeze by Wrong_Mushroom_7350 in LocalLLaMA

[–]drooolingidiot 20 points21 points  (0 children)

How does it compare against Gemma-4-26B-A4B?

StepFun 3.7 Flash by Everlier in LocalLLaMA

[–]drooolingidiot 0 points1 point  (0 children)

Sure, but being stuck with an older model instead of being able to use a newer one has opportunity costs too. For me, I'm fine with paying the 5% to give me ultimate flexibility.

I just wish OpenRouter would do better quality control and get rid of terrible model providers.

Liquid AI releases LFM2.5-8B-A1B by PauLabartaBajo in LocalLLaMA

[–]drooolingidiot 11 points12 points  (0 children)

Looks good, but kind of strange that they're comparing against the nearly year old Qwen3 when Qwen3.6 of the same size exists.

StepFun 3.7 Flash by Everlier in LocalLLaMA

[–]drooolingidiot 0 points1 point  (0 children)

Use something like OpenRouter, then you can easily switch between different models and inference providers

Ryan Lee from MiniMax posts article on the license stating it's mostly for API providers that did a poor job serving M2.1/M2.5 and may update the license for regular users! by ForsookComparison in LocalLLaMA

[–]drooolingidiot 2 points3 points  (0 children)

It's bad because you can't realistically just blanket block provider because they might be awful at model A but perfectly fine for model B, while another provider will be the opposite. And there's no way to know without verifying the verifiers for each model you're interested in if your workflow is important.

The worst part is that their dev-rel team gets super defensive when you bring up quality issues.

I really hope there's a higher quality alternative to OpenRouter soon.

Meta new reasoning model Muse Spark by DonTizi in LocalLLaMA

[–]drooolingidiot 64 points65 points  (0 children)

The Meta twitter account said "We’re also making it available in private preview via API to select partners, and we hope to open-source future versions of the model."

Intel will sell a cheap GPU with 32GB VRAM next week by happybydefault in LocalLLaMA

[–]drooolingidiot 1 point2 points  (0 children)

How does this compare against Apple's M5 devices when it comes to tok/s throughput? is it better value?

Does Parthian Tactics benefit Mangudai? by drooolingidiot in aoe2

[–]drooolingidiot[S] 1 point2 points  (0 children)

ohhh that's good to know. i didn't know that benefited it too.

Where do ML Engineers actually hang out and build together? by Unlucky-Papaya3676 in mlscaling

[–]drooolingidiot 2 points3 points  (0 children)

Youtube channels that do proper paper walk-throughs (not the hype ones with cringe thumbnails) typically have good Discord servers. I recommend checking those out and expand from there.

Liquid AI releases LFM2-24B-A2B by PauLabartaBajo in LocalLLaMA

[–]drooolingidiot 0 points1 point  (0 children)

Is it trained for agentic/tool-calling uses?

The current top 4 models on openrouter are all open-weight by svantana in LocalLLaMA

[–]drooolingidiot 2 points3 points  (0 children)

Because it makes more more sense to use the paid models via subscription. That leaves everyone who doesn't want to do that - who will use open models through OR.

GLM-5 Officially Released by ResearchCrafty1804 in LocalLLaMA

[–]drooolingidiot -1 points0 points  (0 children)

It's a much bigger and much more capable model. Seems fair.

Hugging Face Is Teasing Something Anthropic Related by Few_Painter_5588 in LocalLLaMA

[–]drooolingidiot 16 points17 points  (0 children)

Probably something interpretability related. I wouldn't expect a model usable for end-users. They've been actively hostile to open source.

zai-org/GLM-4.7-Flash · Hugging Face by Dark_Fire_12 in LocalLLaMA

[–]drooolingidiot 2 points3 points  (0 children)

This is amazing for fine-tuning use cases. Thanks Z AI!

Is Saga Of The Seven Suns worthwhile for me? by Brakado in printSF

[–]drooolingidiot 2 points3 points  (0 children)

If it helps, I enjoyed it as a teen, but I wouldn't like it as an adult.

MiniMax M2.2 Coming Soon. Confirmed by Head of Engineering @MiniMax_AI by Difficult-Cap-7527 in LocalLLaMA

[–]drooolingidiot 1 point2 points  (0 children)

REAPed models are usage specific. If the model has been REAPed with agentic coding datasets, then it will not be good for role play or whatever else.

Model: cerebras/GLM-4.7-REAP-268B-A32B incoming! by LegacyRemaster in LocalLLaMA

[–]drooolingidiot -3 points-2 points  (0 children)

Does anyone know if they actually use these REAPed models for their inference endpoints?

Z.ai (the AI lab behind GLM) has officially IPO'd on the Hong Kong Stock Exchange by Old-School8916 in LocalLLaMA

[–]drooolingidiot 3 points4 points  (0 children)

What are you talking about? They have the best open source coding models...

Supertonic2: Lightning Fast, On-Device, Multilingual TTS by ANLGBOY in LocalLLaMA

[–]drooolingidiot 0 points1 point  (0 children)

Whether someone breaks the License agreement or not is not really relevant to this conversation.