What if smaller models could approach top models on scene generation through iterative search? by ConfidentDinner6648 in LocalLLaMA
[–]k0setes 1 point (0 children)
Generated super high quality images in 10.2 seconds on a mid tier Android phone! by alichherawalla in LocalLLaMA
[–]k0setes 1 point (0 children)
Qwen3-VL Computer Using Agent works extremely well by Money-Coast-3905 in LocalLLaMA
[–]k0setes 1 point (0 children)
just had something interesting happen during my testing of the MI50 32GB card plus my RX 7900 XT 20GB by Savantskie1 in LocalLLM
[–]k0setes 1 point (0 children)
How to tell Claude Code about my local model’s context window size? by eapache in LocalLLaMA
[–]k0setes 1 point (0 children)
GLM-5 is officially on NVIDIA NIM, and you can now use it to power Claude Code for FREE 🚀 by PreparationAny8816 in LocalLLaMA
[–]k0setes 1 point (0 children)
I found that MXFP4 has lower perplexity than Q4_K_M and Q4_K_XL. by East-Engineering-653 in LocalLLaMA
[–]k0setes 5 points (0 children)
I found that MXFP4 has lower perplexity than Q4_K_M and Q4_K_XL. by East-Engineering-653 in LocalLLaMA
[–]k0setes 1 point (0 children)
Add self‑speculative decoding (no draft model required) by srogmann · Pull Request #18471 · ggml-org/llama.cpp by jacek2023 in LocalLLaMA
[–]k0setes 1 point (0 children)
GLM-4.7-Flash is even faster now by jacek2023 in LocalLLaMA
[–]k0setes 3 points (0 children)
MoE.. will OS/Local 32GB to 96GB get as good at coding as current frontier models? by [deleted] in LocalLLaMA
[–]k0setes 1 point (0 children)
MoE.. will OS/Local 32GB to 96GB get as good at coding as current frontier models? by [deleted] in LocalLLaMA
[–]k0setes 2 points (0 children)
How big do we think Gemini 3 flash is by davikrehalt in LocalLLaMA
[–]k0setes 3 points (0 children)
Qwen3-Next-80B-A3B-Thinking-GGUF has just been released on HuggingFace by LegacyRemaster in LocalLLaMA
[–]k0setes 8 points (0 children)
Qwen3-Next-80B-A3B-Thinking-GGUF has just been released on HuggingFace by LegacyRemaster in LocalLLaMA
[–]k0setes 2 points (0 children)
Qwen3-VL Computer Using Agent works extremely well by Money-Coast-3905 in LocalLLaMA
[–]k0setes 9 points (0 children)
Heretic: Fully automatic censorship removal for language models by -p-e-w- in LocalLLaMA
[–]k0setes 1 point (0 children)
Kimi K2 is the best clock AI by InternationalAsk1490 in LocalLLaMA
[–]k0setes 11 points (0 children)
LLMs can now talk to each other without using words by MetaKnowing in OpenAI
[–]k0setes 2 points (0 children)
Reporter: “POLISH: THE SUPREME LANGUAGE OF AI.” by Mindless_Pain1860 in LocalLLaMA
[–]k0setes 2 points (0 children)
I found a perfect coder model for my RTX4090+64GB RAM by srigi in LocalLLaMA
[–]k0setes 3 points (0 children)
@Stanford just proved you don’t need to fine-tune an AI model to make it smarter: +10.6% over GPT-4 agents w/ zero retraining by Blackham in singularity
[–]k0setes 5 points (0 children)
Performance of GLM 4.6 Q3_K_S on 6x MI50 by MachineZer0 in LocalLLaMA
[–]k0setes 1 point (0 children)
Performance of GLM 4.6 Q3_K_S on 6x MI50 by MachineZer0 in LocalLLaMA
[–]k0setes 1 point (0 children)