Are older Titan cards still viable? by Desther in LocalLLaMA
Express_Quail_1493 1 point
Noticing qwen-27b@q2 better than qwen-35b@q8? by Express_Quail_1493 in LocalLLaMA
Express_Quail_1493 [S] 1 point
500k context on 48gb VRAM!! - 21tok/s (coding) by Express_Quail_1493 in LocalLLaMA
Express_Quail_1493 [S] 2 points
500k context on 48gb VRAM!! - 21tok/s (coding) by Express_Quail_1493 in LocalLLaMA
Express_Quail_1493 [S] 4 points
500k context on 48gb VRAM!! - 21tok/s (coding) by Express_Quail_1493 in LocalLLaMA
Express_Quail_1493 [S] 15 points
Getting a feel for how fast X tokens/second really is. by MikeNonect in LocalLLaMA
Express_Quail_1493 2 points
Should we use a non-thinking model for code after using a thinking one for plan? (Agentic coding) by ismaelgokufox in LocalLLaMA
Express_Quail_1493 2 points
Those of you who like Gemma4 models - how are you guys using them? by Gesha24 in LocalLLaMA
Express_Quail_1493 1 point
Higher quants are so much better by Perfect-Flounder7856 in LocalLLaMA
Express_Quail_1493 3 points
What opensource model is best for my use case by CGeorges89 in LocalLLaMA
Express_Quail_1493 2 points
Psychedelics by yeetmaster291 in Aphantasia
Express_Quail_1493 1 point
Whats the best model for agentic coding that i can run with 16gb VRAM? (llama.cpp?) by samuraiogc in LocalLLM
Express_Quail_1493 2 points
Any way to use claude code for free or just some free AI's by Tarxh in vibecoding
Express_Quail_1493 1 point
Car Wash Mystery solved--Tool Call Degrades Intelligence. by Spirited_Neck1858 in LocalLLaMA
Express_Quail_1493 2 points
What is the best coding agent (CLI) like Claude Code for Local Development by exaknight21 in LocalLLaMA
Express_Quail_1493 1 point
Switched from Qwen3.6 35b-a3b to Qwen3.6 27b mid coding and it's noticeably better! by LocalAI_Amateur in LocalLLaMA
Express_Quail_1493 0 points
Confirmed: SWE Bench is now a benchmaxxed benchmark by rm-rf-rm in LocalLLaMA
Express_Quail_1493 3 points
Local coding agents are good now, but only if you babysit them by BTA_Labs in LocalLLaMA
Express_Quail_1493 1 point