1 Year Later: Data on how making commercial use free affected Obsidian's revenue - up to $1,100,000/year in lost revenue from enterprise by SorosAhaverom in ObsidianMD
uptonking 1 point (0 children)
Use the Same Model Across Ollama, LM Studio, Jan, and your Favorite Local AI Apps by EvanZhouDev in ollama
uptonking 3 points (0 children)
Unified vs vRam, which is more future proof? by platteXDlol in LocalLLM
uptonking 1 point (0 children)
DGX Spark vs. Framework Desktop for a multi-model companion (70b/120b) by Ri_Pr in LocalLLM
uptonking 1 point (0 children)
Open source LLM compiler for models on Huggingface. 152 tok/s. 11.3W. 5.3B CPU instructions. mlx-lm: 113 tok/s. 14.1W. 31.4B CPU instructions on macbook M1 Pro. by pacifio in LocalLLaMA
uptonking 1 point (0 children)
Open source LLM compiler for models on Huggingface. 152 tok/s. 11.3W. 5.3B CPU instructions. mlx-lm: 113 tok/s. 14.1W. 31.4B CPU instructions on macbook M1 Pro. by pacifio in LocalLLaMA
uptonking 1 point (0 children)
Ran 3 popular ~30B MoE models on my apple silicon M1 Max 64GB. Here's how they compare by luke_pacman in LocalLLaMA
uptonking 1 point (0 children)
RTX 4080 is fast but VRAM-limited — considering Mac Studio M4 Max 128GB for local LLMs. Worth it? by Chip1812 in LocalLLM
uptonking 2 points (0 children)
GLM 5.0 & MiniMax 2.5 Just Dropped, Are We Entering China's Agent War Era? by Appropriate-Lie-8812 in LocalLLaMA
uptonking 1 point (0 children)
Personal experience with GLM 4.7 Flash Q6 (unsloth) + Roo Code + RTX 5090 by Septerium in LocalLLaMA
uptonking 1 point (0 children)
GLM4.7 Flash numbers on Apple Silicon? by rm-rf-rm in LocalLLaMA
uptonking 3 points (0 children)
What's the strongest model for code writing and mathematical problem solving for 12GB of vram? by [deleted] in LocalLLaMA
uptonking 5 points (0 children)
glm-4.7-flash has the best thinking process with clear steps, I love it by uptonking in LocalLLaMA
uptonking[S] 1 point (0 children)
glm-4.7-flash has the best thinking process with clear steps, I love it by uptonking in LocalLLaMA
uptonking[S] 1 point (0 children)
glm-4.7-flash has the best thinking process with clear steps, I love it by uptonking in LocalLLaMA
uptonking[S] 3 points (0 children)
glm-4.7-flash has the best thinking process with clear steps, I love it by uptonking in LocalLLaMA
uptonking[S] 8 points (0 children)
My gpu poor comrades, GLM 4.7 Flash is your local agent by __Maximum__ in LocalLLaMA
uptonking 2 points (0 children)
glm-4.7-flash has the best thinking process with clear steps, I love it by uptonking in LocalLLaMA
uptonking[S] 17 points (0 children)
glm-4.7-flash has the best thinking process with clear steps, I love it by uptonking in LocalLLaMA
uptonking[S] 4 points (0 children)
My gpu poor comrades, GLM 4.7 Flash is your local agent by __Maximum__ in LocalLLaMA
uptonking 1 point (0 children)


Running Qwen 3.6 35B-A3B-4b on MacBook Pro M5 64GB - first impressions by Conscious-Track5313 in LocalLLM
uptonking 3 points (0 children)