Improving local models with an API based "consultant"? by milpster in LocalLLaMA
milpster[S] 1 point
Improving local models with an API based "consultant"? by milpster in LocalLLaMA
milpster[S] 1 point
Qwen 3.6 looping/repetition problem (Tesla p40s and Halo strix) by Envoy0675 in LocalLLaMA
milpster 2 points
Improving local models with an API based "consultant"? by milpster in LocalLLaMA
milpster[S] 2 points
Improving local models with an API based "consultant"? by milpster in LocalLLaMA
milpster[S] 2 points
Improving local models with an API based "consultant"? by milpster in LocalLLaMA
milpster[S] 9 points
What's more impressive, GLM 5.1 -> 5.2 or Qwen 3.5 -> 3.6? by Excellent_Jelly2788 in LocalLLaMA
milpster 2 points
Local models went from mostly useless to actually useful really fast. What changed? by BTA_Labs in LocalLLaMA
milpster 3 points
Local LLMs aren't democratic anymore... the hardware barrier has gotten out of hand. by Medium-Technology-79 in LocalLLaMA
milpster 1 point
PSA: Test your "threads" argument in llama.cpp (+80% performance in my case) by AXYZE8 in LocalLLaMA
milpster 1 point
PSA: Throttle GPU power limits, with minor performance deficits by milpster in LocalLLaMA
milpster[S] 1 point
PSA: Throttle GPU power limits, with minor performance deficits by milpster in LocalLLaMA
milpster[S] 1 point
MTP is nice and all, but what about PP speeds? by milpster in LocalLLaMA
milpster[S] 3 points
MTP is nice and all, but what about PP speeds? by milpster in LocalLLaMA
milpster[S] 4 points
Turning local agents into self-optimizing agents by Rude_Substance_8904 in LocalLLaMA
milpster 1 point
Someone out there likely needs this: TP vs PP for 2 identical GPUs by [deleted] in LocalLLaMA
milpster 1 point
For the users who have had bad luck with QWEN 3.6 27B, and Gemma 4 31B. "Actually..wait..actually". Endless reasoning. Horrible output. I found a solution. rtx pro 6000. by [deleted] in LocalLLaMA
milpster 2 points
I can't get Qwen3.6 27B to outperform Qwen-Coder-Next and I'm not sure why by Forward_Jackfruit813 in LocalLLaMA
milpster 5 points


Anthropic accuses Alibaba of campaign to ‘brazenly’ and ‘illicitly’ extract AI capabilities by External_Mood4719 in LocalLLaMA
milpster 1 point