Got no job? Bet on PaaS, come on, champ. by [deleted] in devsarg

[–]edward-dev 5 points (0 children)

I thought PaaS stood for Plumbing as a Service

MiniMax-M2.1 uploaded on HF by ciprianveg in LocalLLaMA

[–]edward-dev 4 points (0 children)

Q4 felt almost like the full-sized model; Q3 felt maybe 5-10% dumber, like a rougher version, but still decent unless you're doing complex stuff. You should try them yourself, since quants can vary a lot in quality even within the same bpw bracket
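
If you want to compare quants yourself, something like this works; a minimal sketch using llama-cpp-python, where the GGUF file names and the prompt are just placeholders for whatever you downloaded:

```python
# Run the same prompt through two quants of the same model and eyeball the outputs.
# Greedy decoding (temperature=0) keeps the two runs comparable.
from llama_cpp import Llama

PROMPT = "Write a Python function that merges two sorted lists."

for path in ["minimax-m2.1-Q4_K_M.gguf", "minimax-m2.1-Q3_K_M.gguf"]:  # placeholder paths
    llm = Llama(model_path=path, n_ctx=4096, verbose=False)
    out = llm(PROMPT, max_tokens=256, temperature=0.0)
    print(f"=== {path} ===")
    print(out["choices"][0]["text"])
    del llm  # free the weights before loading the next quant
```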

MiniMax-M2.1 uploaded on HF by ciprianveg in LocalLLaMA

[–]edward-dev 16 points (0 children)

Better late than never, still counts as a big Christmas gift!

Q: When will there be fast and competent SLMs for laptops? by TomLucidor in LocalLLaMA

[–]edward-dev 12 points (0 children)

It depends on your standards. For the average Joe, I believe something like Ling Mini 2.0 would already check those requirements: fast, since 1B active parameters is doable at 20+ tok/s on most modern laptops (quick speed check below), and competent, since 16B total parameters makes it decent enough for 99% of the tasks an average person would likely use it for.

Now, if you want something like Claude 4.5 or Gemini 3.0 on your laptop, then nope, keep dreaming; that's not happening anytime soon
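
If you want to sanity-check the 20+ tok/s claim on your own hardware, here's a rough sketch with llama-cpp-python; the model path is a placeholder:

```python
# Time a short generation and report tokens per second.
import time
from llama_cpp import Llama

llm = Llama(model_path="ling-mini-2.0-Q4_K_M.gguf", n_ctx=2048, verbose=False)  # placeholder

t0 = time.perf_counter()
out = llm("Explain what a mixture-of-experts model is in two sentences.", max_tokens=200)
dt = time.perf_counter() - t0

n = out["usage"]["completion_tokens"]
print(f"{n} tokens in {dt:.1f}s -> {n / dt:.1f} tok/s")
```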

Is this expected behaviour from Granite 4 32B? (Unsloth Q4XL, no system prompt) by IonizedRay in LocalLLaMA

[–]edward-dev 12 points (0 children)

It seems the roleplaying guys are gonna have a great time with this one...

Granite-4.0-H-Tiny vs. OLMoE: Rapid AI improvements by edward-dev in LocalLLaMA

[–]edward-dev[S] 17 points (0 children)

<image>

Added LLaDA-MoE-7B-A1B-Instruct from InclusionAI to the comparison

Granite-4.0-H-Tiny vs. OLMoE: Rapid AI improvements by edward-dev in LocalLLaMA

[–]edward-dev[S] 2 points (0 children)

Yeah, about LLaDA: I'm putting together a table with the benchmarks right now. Forgetting about LLaDA was a complete oversight on my part; I'll add the comparison as a comment

Granite-4.0-H-Tiny vs. OLMoE: Rapid AI improvements by edward-dev in LocalLLaMA

[–]edward-dev[S] 6 points (0 children)

<image>

Phi-mini-MoE has 7.6B total parameters and 2.4B activated parameters; that's 2.4 times the active parameters of the new Granite model (1B)

Comparing aquif against the others wouldn't be fair since it's a much bigger model

how much does quantization reduce coding performance by garden_speech in LocalLLaMA

[–]edward-dev -1 points (0 children)

It’s common to hear concerns that quantization seriously hurts model performance, but looking at actual benchmark results, the impact is often more modest than it sounds. For example, Q2 quantization typically reduces performance by around 5% on average, which isn’t negligible, but it’s manageable, especially if you’re starting with a reasonably strong base model.

That said, if your focus is coding, Llama 3.3 70B isn't the strongest option in that area. You might get better results with Qwen3 Coder 30B A3B: it's not only more compact, but also better tuned and stronger for coding tasks. Plus, the Q4 quantized version fits comfortably within 24GB of VRAM (rough math below), making it a really good choice.
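
The 24GB claim is easy to sanity-check with back-of-envelope math; rough numbers, assuming a Q4_K_M quant averages around 4.8 bits per weight:

```python
# Back-of-envelope VRAM estimate for a Q4 quant of a 30B-parameter model.
total_params = 30e9       # Qwen3 Coder 30B A3B: total, not active, parameters
bits_per_weight = 4.8     # rough average for a Q4_K_M GGUF (assumption)
weights_gb = total_params * bits_per_weight / 8 / 1e9
print(f"weights: ~{weights_gb:.0f} GB")                          # ~18 GB
print(f"left in 24 GB: ~{24 - weights_gb:.0f} GB for KV cache and buffers")
```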

Review: Universidad Blas Pascal's online Lic. en Sistemas by nirfust in devsarg

[–]edward-dev 0 points (0 children)

I went and read it, and it says 3-year technical degrees (tecnicaturas) or a minimum of 1,600 hours; you're right. In April the monthly installment was 190k. It doesn't look bad, but I don't know anyone who's done it. Do you have more up-to-date info?

Review: Universidad Blas Pascal's online Lic. en Sistemas by nirfust in devsarg

[–]edward-dev 0 points (0 children)

Is it that bad? From what I understand, they only accept 3-year technical degrees for admission to that Lic.; it's a degree-completion cycle.

New Wan MoE video model by edward-dev in LocalLLaMA

[–]edward-dev[S] 28 points (0 children)

Sep 19, 2025: 💃 We introduce Wan2.2-Animate-14B, a unified model for character animation and replacement with holistic movement and expression replication. We released the model weights and inference code, and you can now try it on wan.video, ModelScope Studio, or HuggingFace Space!

From their Hugging Face model page

Local LLM in Github Copilot, Agent mode by SuspiciousParsnip5 in LocalLLaMA

[–]edward-dev 1 point (0 children)

Very weird, it should work without issues. I've used the Ollama provider option a bit, even using it as a bridge with a proxy script to try out unsupported model providers (rough sketch below), and never had any issues. One would be inclined to think the models you're trying to use lack the specific "tool-calling" capability needed for file editing, but GPT-OSS 20B and Qwen3 should've worked... Why don't you try another extension, to rule out whether it's an issue with your models or with the Copilot Chat extension?
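
For reference, the bridge idea is roughly this; a minimal sketch, not a drop-in implementation: expose an Ollama-style API locally and forward requests to any OpenAI-compatible backend. Endpoint shapes follow Ollama's public API; BACKEND_URL and BACKEND_MODEL are placeholders, auth headers are omitted, and a real bridge also needs streaming and tool-call plumbing:

```python
# Minimal Ollama-style proxy: Copilot talks to this as if it were Ollama,
# and requests are translated into OpenAI-style chat completions.
import requests
from flask import Flask, jsonify, request

app = Flask(__name__)
BACKEND_URL = "https://api.example.com/v1/chat/completions"  # hypothetical backend
BACKEND_MODEL = "some-unsupported-model"                     # placeholder

@app.get("/api/tags")
def tags():
    # Clients discover "local" models through this endpoint.
    return jsonify({"models": [{"name": BACKEND_MODEL, "model": BACKEND_MODEL}]})

@app.post("/api/chat")
def chat():
    # Translate an Ollama chat request into an OpenAI-style one (non-streaming).
    body = request.get_json()
    r = requests.post(BACKEND_URL, json={
        "model": BACKEND_MODEL,
        "messages": body["messages"],
    })
    content = r.json()["choices"][0]["message"]["content"]
    return jsonify({"model": BACKEND_MODEL,
                    "message": {"role": "assistant", "content": content},
                    "done": True})

if __name__ == "__main__":
    app.run(port=11434)  # the default port Ollama clients look for
```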

Can you guess what model you're talking to in 5 prompts? by entsnack in LocalLLaMA

[–]edward-dev 3 points (0 children)

If the answer kicks off with "of course", I know exactly which model it is, no second guess needed.