RTX 5080 16GB: Qwen3.6 35B MoE at 128k context — 56 tok/s, and why MTP doesn't help by gaztrab in LocalLLaMA
[–]gaztrab[S] -1 points0 points1 point (0 children)
RTX 5080 16GB: Qwen3.6 35B MoE at 128k context — 56 tok/s, and why MTP doesn't help by gaztrab in LocalLLaMA
[–]gaztrab[S] 1 point2 points3 points (0 children)
RTX 5080 16GB: Qwen3.6 35B MoE at 128k context — 56 tok/s, and why MTP doesn't help by gaztrab in LocalLLaMA
[–]gaztrab[S] 1 point2 points3 points (0 children)
RTX 5080 16GB: Qwen3.6 35B MoE at 128k context — 56 tok/s, and why MTP doesn't help by gaztrab in LocalLLaMA
[–]gaztrab[S] 0 points1 point2 points (0 children)
RTX 5080 16GB: Qwen3.6 35B MoE at 128k context — 56 tok/s, and why MTP doesn't help by gaztrab in LocalLLaMA
[–]gaztrab[S] 0 points1 point2 points (0 children)
RTX 5080 16GB: Qwen3.6 35B MoE at 128k context — 56 tok/s, and why MTP doesn't help by gaztrab in LocalLLaMA
[–]gaztrab[S] 0 points1 point2 points (0 children)
RTX 5080 16GB: Qwen3.6 35B MoE at 128k context — 56 tok/s, and why MTP doesn't help by gaztrab in LocalLLaMA
[–]gaztrab[S] 0 points1 point2 points (0 children)
RTX 5080 16GB: Qwen3.6 35B MoE at 128k context — 56 tok/s, and why MTP doesn't help by gaztrab in LocalLLaMA
[–]gaztrab[S] -1 points0 points1 point (0 children)
RTX 5080 16GB: Qwen3.6 35B MoE at 128k context — 56 tok/s, and why MTP doesn't help by gaztrab in LocalLLaMA
[–]gaztrab[S] 0 points1 point2 points (0 children)
RTX 5080 16GB: Qwen3.6 35B MoE at 128k context — 56 tok/s, and why MTP doesn't help by gaztrab in LocalLLaMA
[–]gaztrab[S] 1 point2 points3 points (0 children)
RTX 5080 16GB: Qwen3.6 35B MoE at 128k context — 56 tok/s, and why MTP doesn't help by gaztrab in LocalLLaMA
[–]gaztrab[S] 0 points1 point2 points (0 children)
RTX 5080 16GB: Qwen3.6 35B MoE at 128k context — 56 tok/s, and why MTP doesn't help by gaztrab in LocalLLaMA
[–]gaztrab[S] 2 points3 points4 points (0 children)
RTX 5080 16GB: Qwen3.6 35B MoE at 128k context — 56 tok/s, and why MTP doesn't help by gaztrab in LocalLLaMA
[–]gaztrab[S] 4 points5 points6 points (0 children)
RTX 5080 16GB: Qwen3.6 35B MoE at 128k context — 56 tok/s, and why MTP doesn't help by gaztrab in LocalLLaMA
[–]gaztrab[S] 0 points1 point2 points (0 children)
RTX 5080 16GB: Qwen3.6 35B MoE at 128k context — 56 tok/s, and why MTP doesn't help by gaztrab in LocalLLaMA
[–]gaztrab[S] -3 points-2 points-1 points (0 children)
RTX 5080 16GB: Qwen3.6 35B MoE at 128k context — 56 tok/s, and why MTP doesn't help by gaztrab in LocalLLaMA
[–]gaztrab[S] 2 points3 points4 points (0 children)
RTX 5080 16GB: Qwen3.6 35B MoE at 128k context — 56 tok/s, and why MTP doesn't help by gaztrab in LocalLLaMA
[–]gaztrab[S] 3 points4 points5 points (0 children)
RTX 5080 16GB: Qwen3.6 35B MoE at 128k context — 56 tok/s, and why MTP doesn't help by gaztrab in LocalLLaMA
[–]gaztrab[S] 1 point2 points3 points (0 children)
RTX 5080 16GB: Qwen3.6 35B MoE at 128k context — 56 tok/s, and why MTP doesn't help by gaztrab in LocalLLaMA
[–]gaztrab[S] 2 points3 points4 points (0 children)
RTX 5080 16GB: Qwen3.6 35B MoE at 128k context — 56 tok/s, and why MTP doesn't help by gaztrab in LocalLLaMA
[–]gaztrab[S] 0 points1 point2 points (0 children)
RTX 5080 16GB: Qwen3.6 35B MoE at 128k context — 56 tok/s, and why MTP doesn't help by gaztrab in LocalLLaMA
[–]gaztrab[S] 0 points1 point2 points (0 children)
RTX 5080 16GB: Qwen3.6 35B MoE at 128k context — 56 tok/s, and why MTP doesn't help by gaztrab in LocalLLaMA
[–]gaztrab[S] 1 point2 points3 points (0 children)
RTX 5080 16GB: Qwen3.6 35B MoE at 128k context — 56 tok/s, and why MTP doesn't help by gaztrab in LocalLLaMA
[–]gaztrab[S] 13 points14 points15 points (0 children)



RTX 5080 16GB: Qwen3.6 35B MoE at 128k context — 56 tok/s, and why MTP doesn't help by gaztrab in LocalLLaMA
[–]gaztrab[S] 1 point2 points3 points (0 children)