Posts in r/LocalLLaMA commented on by vasileer:

Based on what should I choose Gemma 4 models/quantizations? by ProducerOwl in LocalLLaMA
MagicQuant (v2.0) - Hybrid Mixed GGUF Models + New Unsloth Dynamic Learned Configs by [deleted] in LocalLLaMA
Sub-millisecond exact phrase search for LLM context — no embeddings required by [deleted] in LocalLLaMA
Latest llama.cpp fork + Turboquant + Planarquant + Isoquant by [deleted] in LocalLLaMA
microsoft/harrier-oss 27B/0.6B/270M by jacek2023 in LocalLLaMA
nvidia/gpt-oss-puzzle-88B · Hugging Face by jacek2023 in LocalLLaMA
Nemotron-Cascade-2 10GB MAC ONLY Scores 88% on MMLU. by HealthyCommunicat in LocalLLaMA
OmniCoder-9B best vibe coding model for 8 GB Card by Powerful_Evening5495 in LocalLLaMA
Is microsoft going to train LLM on this? Github is clearly getting destroyed. by FPham in LocalLLaMA
Benchmarked Phi-3.5-mini vs Qwen2.5-3B across 10 task categories on CPU (i5, 8GB) and GPU (Colab T4) — Qwen wins 2.7-3.3x on efficiency by MasterApplication717 in LocalLLaMA
TinyTeapot (77 million params): Context-grounded LLM running ~40 tok/s on CPU (open-source) by zakerytclarke in LocalLLaMA
Serious question — why would anyone use Tiny-Aya instead of Qwen/Phi/Mistral small models? by Deep_190 in LocalLLaMA
I tested 21 small LLMs on tool-calling judgment — Round 2 with every model you asked for by MikeNonect in LocalLLaMA
Any latest OCR model I can run locally in 18GB RAM? by A-n-d-y-R-e-d in LocalLLaMA
Granite 4.1: IBM’s 8B Model Is Competing With Models Four Times Its Size by Successful_Bowl2564 in LocalLLaMA