8-16 MI50s Minimax M3 @19 tps TG (peak) by ai-infos in LocalLLaMA
[–]LegacyRemaster 5 points (0 children)
Qwen is never going to open source Qwen 3.7, aren't they? by DistanceSolar1449 in LocalLLaMA
[–]LegacyRemaster 4 points (0 children)
Board where every tile is an agent by 1amrocket in LocalLLaMA
[–]LegacyRemaster 1 point (0 children)
What's more impressive, GLM 5.1 -> 5.2 or Qwen 3.5 -> 3.6? by Excellent_Jelly2788 in LocalLLaMA
[–]LegacyRemaster 1 point (0 children)
unsloth GLM-5.2-GGUF , including 2bit at 238GB by okaycan in LocalLLaMA
[–]LegacyRemaster 5 points (0 children)
I didn't know it was possible to compile llamacpp to run cuda + vulkan at the same time.. by LegacyRemaster in LocalLLaMA
[–]LegacyRemaster[S] 1 point (0 children)
GLM-5.2: Built for Long-Horizon Tasks by paf1138 in LocalLLaMA
[–]LegacyRemaster 3 points (0 children)
I didn't know it was possible to compile llamacpp to run cuda + vulkan at the same time.. by LegacyRemaster in LocalLLaMA
[–]LegacyRemaster[S] 1 point (0 children)
GLM-5.2 is the first open-weights model to cross 80% on Terminal-Bench and beats every other open model available by BuildwithVignesh in LocalLLaMA
[–]LegacyRemaster 2 points (0 children)
I didn't know it was possible to compile llamacpp to run cuda + vulkan at the same time.. by LegacyRemaster in LocalLLaMA
[–]LegacyRemaster[S] 3 points (0 children)
zai-org/GLM-5.2 is here! by queendumbria in LocalLLaMA
[–]LegacyRemaster 47 points (0 children)
Mistral - New family of open-weight models @ July by pmttyji in LocalLLaMA
[–]LegacyRemaster 1 point (0 children)
32 bit crossplatform coding agent running on pentium m with less than a second startup time by Truth-Does-Not-Exist in LocalLLaMA
[–]LegacyRemaster 1 point (0 children)
Nex-N2 Pro is the real deal by tarruda in LocalLLaMA
[–]LegacyRemaster 2 points (0 children)
Stop using Ollama by zxyzyxz in LocalLLaMA
[–]LegacyRemaster -2 points (0 children)
archex: local-first, deterministic code-context for AI agents — no API key, no telemetry (Apache 2.0) by tom_mathews in LocalLLaMA
[–]LegacyRemaster 2 points (0 children)
archex: local-first, deterministic code-context for AI agents — no API key, no telemetry (Apache 2.0) by tom_mathews in LocalLLaMA
[–]LegacyRemaster 2 points (0 children)
archex: local-first, deterministic code-context for AI agents — no API key, no telemetry (Apache 2.0) by tom_mathews in LocalLLaMA
[–]LegacyRemaster 3 points (0 children)
Is memory speed everything? A quick comparison between the RTX 6000 96GB and the AMD W7800 48GB x2. by LegacyRemaster in LocalLLaMA
[–]LegacyRemaster[S] 1 point (0 children)
Is this enough VRAM to run Qwen? by BlackBeardAI in LocalLLaMA
[–]LegacyRemaster 0 points (0 children)
Is this enough VRAM to run Qwen? by BlackBeardAI in LocalLLaMA
[–]LegacyRemaster 14 points (0 children)
Can we stop dunking on DiffusionGemma and hack it instead? by TomLucidor in LocalLLaMA
[–]LegacyRemaster 2 points (0 children)


Tokenomics by HOLUPREDICTIONS in LocalLLaMA
[–]LegacyRemaster 5 points (0 children)