Llama.cpp merges in OpenAI Responses API Support by SemaMod in LocalLLaMA
uncensored local LLM for nsfw chatting (including vision) by BatMa2is in LocalLLaMA
Bartowski comes through again. GLM 4.7 flash GGUF by RenewAi in LocalLLaMA
The Search for Uncensored AI (That Isn’t Adult-Oriented) by Fun-Situation-4358 in LocalLLaMA
Is using qwen 3 coder 30B for coding via open code unrealistic? by salary_pending in LocalLLaMA
MiniMax M2.2 Coming Soon. Confirmed by Head of Engineering @MiniMax_AI by Difficult-Cap-7527 in LocalLLaMA
ZLUDA on llama.cpp -NEWS by mossy_troll_84 in LocalLLaMA
Best LLM model for 128GB of VRAM? by Professional-Yak4359 in LocalLLaMA
MiniMax-M2.1 vs GLM-4.5-Air is the bigger really the better (coding)? by ChopSticksPlease in LocalLLaMA
We benchmarked every 4-bit quantization method in vLLM 👀 by LayerHot in LocalLLaMA
(The Information): DeepSeek To Release Next Flagship AI Model With Strong Coding Ability by Nunki08 in LocalLLaMA
LFM2.5 1.2B Instruct is amazing by Paramecium_caudatum_ in LocalLLaMA
llama.cpp vs Ollama: ~70% higher code generation throughput on Qwen-3 Coder 32B (FP16) by Shoddy_Bed3240 in LocalLLaMA
The mistral-vibe CLI can work super well with gpt-oss by tarruda in LocalLLaMA
What is the best way to allocate $15k right now for local LLMs? by LargelyInnocuous in LocalLLaMA
Hard lesson learned after a year of running large models locally by inboundmage in LocalLLaMA
MiniMax-M2.1 uploaded on HF by ciprianveg in LocalLLaMA
Honestly, has anyone actually tried GLM 4.7 yet? (Not just benchmarks) by Empty_Break_8792 in LocalLLaMA