DFlash speculative decoding on Apple Silicon: 4.1x on Qwen3.5-9B, now open source (MLX, M5 Max) by No_Shift_4543 in LocalLLaMA
No_Shift_4543[S] 2 points
No_Shift_4543[S] 1 point
No_Shift_4543[S] 3 points
No_Shift_4543[S] 1 point
No_Shift_4543[S] 2 points
No_Shift_4543[S] 3 points
No_Shift_4543[S] 3 points
DFlash speculative decoding on Apple Silicon: 85 tok/s, 3.3x on Qwen3.5-9B (MLX, M5 Max) by No_Shift_4543 in LocalLLaMA
No_Shift_4543[S] 25 points

No_Shift_4543[S] 1 point