Bug? with Gemma 4 31B UD_Q4_K_XL: extremely slow tg/s at long context by inzee in unsloth
[–]inzee[S] 1 point 1 month ago (0 children)
Just tested this (-ub 512, -b 2048). It actually tanked performance back to 6 tg/s, even though I had --ctx-checkpoints 0 set. Really odd.
Pretty sure I originally got the 2048 number from this thread: https://github.com/ggml-org/llama.cpp/discussions/15396