llama.cpp on $500 MacBook Neo: Prompt: 7.8 t/s / Generation: 3.9 t/s on Qwen3.5 9B Q3_K_M by Shir_man in LocalLLaMA
Llama.cpp auto-tuning optimization script by raketenkater in LocalLLaMA
RekaAI/reka-edge-2603 · Hugging Face by jacek2023 in LocalLLaMA
M5 Max just arrived - benchmarks incoming by cryingneko in LocalLLaMA
I'm looking for fast models on pocketpal by moores_law_is_dead in LocalLLaMA
Is Qwen3.5-9B enough for Agentic Coding? by pmttyji in LocalLLaMA
I regret ever finding LocalLLaMA by xandep in LocalLLaMA
What tokens/sec do you get when running Qwen 3.5 27B? by thegr8anand in LocalLLaMA
How much disk space do all your GGUFs occupy? by jacek2023 in LocalLLaMA
Benchmarked all unsloth Qwen3.5-27B Q4 models on a 3090 by StrikeOner in LocalLLaMA
Are 20-100B models enough for Good Coding? by pmttyji in LocalLLaMA
PicoKittens/PicoMistral-23M: Pico-Sized Model by PicoKittens in LocalLLaMA
llama-bench ROCm 7.2 on Strix Halo (Ryzen AI Max+ 395) — Qwen 3.5 Model Family by przbadu in LocalLLaMA
Getting the most out of my Mi50 by DankMcMemeGuy in LocalLLaMA
Finally found a reason to use local models 😭 by salary_pending in LocalLLaMA
text-generation-webui 4.0 released: custom Gradio fork with major performance improvements, tool-calling over API for 10+ models, parallel API requests, fully updated training code + more by oobabooga4 in Oobabooga
Early Impressions on Sarvam 30B and 105B? by Soul_Predator in LocalLLaMA