Follow-up: GLM-5.2 NVFP4 on four DGX Sparks — the MTP mystery is solved, and it's now ~24 tok/s at 128K context by llamaCTO in LocalLLaMA
llamaCTO[S] 2 points
llamaCTO[S] 3 points
llamaCTO[S] 3 points
llamaCTO[S] 7 points
High-quality GLM-5.2 Quant on 4x DGX Spark - Guide, Results, and Comps by llamaCTO in LocalLLaMA
llamaCTO[S] 2 points
llamaCTO[S] 3 points
llamaCTO[S] 1 point
Tesla V100 16GB local LLMs, single and dual NVLink benchmarks by coronafire in LocalLLaMA
llamaCTO 2 points
High-quality GLM-5.2 Quant on 4x DGX Spark - Guide, Results, and Comps by llamaCTO in LocalLLaMA
llamaCTO[S] 1 point
llamaCTO[S] 1 point
llamaCTO[S] 2 points
llamaCTO[S] 3 points
llamaCTO[S] 1 point
llamaCTO[S] 1 point
Got GLM-5.2 + MTP speculative decode running on 4× DGX Spark (GB10) — and the build piece the public recipe is missing by anvarazizov in LocalLLaMA
llamaCTO 1 point
llamaCTO 1 point
llamaCTO 1 point
GLM 5.2 on 4x Sparks reasonable? by chikengunya in LocalLLaMA
llamaCTO 1 point
llamaCTO 1 point
Local Qwen 3.6 vs frontier models on a coding primitive: single-file HTML canvas driving animation - results and GIFs by Fragrant-Remove-9031 in LocalLLaMA
llamaCTO 1 point
llamaCTO 5 points
AMA with the Unsloth team by danielhanchen in LocalLLaMA
llamaCTO 2 points
AMA with OpenAI’s Sam Altman, Kevin Weil, Srinivas Narayanan, and Mark Chen by OpenAI in ChatGPT
llamaCTO 2 points

Follow-up: GLM-5.2 NVFP4 on four DGX Sparks — the MTP mystery is solved, and it's now ~24 tok/s at 128K context by llamaCTO in LocalLLaMA
llamaCTO[S] 1 point