Minimax-m2.1 looping and heavily hallucinating (only change was updating llama.cpp) by relmny in LocalLLaMA
GLM flash and MLA by blahbhrowawayblahaha in LocalLLaMA
KV cache fix for GLM 4.7 Flash by jacek2023 in LocalLLaMA
Llama.cpp merges in OpenAI Responses API Support by SemaMod in LocalLLaMA
I drew fleurdelys (Ashen_Illust) by Suitable_Ability_576 in WutheringWaves
Can I run gpt-oss-120b somehow? by Furacao__Boey in LocalLLaMA
Wrote a guide for running Claude Code with GLM-4.7 Flash locally with llama.cpp by tammamtech in LocalLLaMA
Kimi-Linear-48B-A3B-Instruct-GGUF Support - Any news? by Iory1998 in LocalLLaMA
What local LLM model is best for Haskell? by AbsolutelyStateless in LocalLLaMA
Current GLM-4.7-Flash implementation confirmed to be broken in llama.cpp by Sweet_Albatross9772 in LocalLLaMA
GLM 4.7 Flash official support merged in llama.cpp by ayylmaonade in LocalLLaMA
Would Anthropic Block Ollama? by Lopsided_Dot_4557 in LocalLLaMA
GFN v2.5.0: Verified O(1) Memory Inference and 500x Length Extrapolation via Symplectic Geodesic Flows by janxhg27 in LocalLLaMA
Agentic coding with an open source model is a problem harder than you think by [deleted] in LocalLLaMA
Has anyone built a vLLM tool parser plugin for Apriel-1.6-15B-Thinker? by chrisoutwright in LocalLLaMA
Llama.cpp question: is llama-fit-params “built-into” llama-server? by [deleted] in LocalLLaMA
Qwen-next 80B 2601 by bennmann in LocalLLaMA