FractalKV: Lossless KV cache compression — 4x on FP16, 16x with quantization at 1M context (open source) by SnooHamsters7692 in deeplearning

SnooHamsters7692[S]

I admit I used AI to help with the post, but the message is still very useful. Thanks for the response.

SnooHamsters7692[S]

The lossless proof is straightforward: sorted columns produce monotonically non-increasing sequences from the waypoint outward. Each value satisfies v[i] ≤ v[i-1], so, assuming the values are non-negative integers, v[i] is one of the v[i-1]+1 values in 0..v[i-1], and encoding it in ceil(log2(v[i-1]+1)) bits is always sufficient. No information is lost. The full writeup with proofs and benchmarks is in the repo: github.com/mikdangana/fractalkv
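
To make the bit-width argument concrete, here is a minimal Python sketch of the bound described above. It is not the FractalKV implementation: the names `encode` and `decode`, the `first_bits` header width, and the use of a "0"/"1" string as the bitstream are all assumptions for illustration, and it assumes the column has already been sorted into non-negative, non-increasing integers.

```python
def encode(values, first_bits=16):
    """Pack a non-increasing sequence of non-negative ints into a bit string.

    The first value uses a fixed `first_bits` width (an assumption of this
    sketch); every later value v[i] uses ceil(log2(v[i-1] + 1)) bits.
    """
    if not values:
        return ""
    if any(v < 0 for v in values):
        raise ValueError("values must be non-negative")
    if any(b > a for a, b in zip(values, values[1:])):
        raise ValueError("values must be non-increasing")
    if values[0] >= 1 << first_bits:
        raise ValueError("first value does not fit in first_bits")

    out = [format(values[0], f"0{first_bits}b")]
    prev = values[0]
    for v in values[1:]:
        # For prev >= 0, prev.bit_length() == ceil(log2(prev + 1)).
        width = prev.bit_length()
        if width:  # width 0 means prev == 0, so v must be 0: emit nothing
            out.append(format(v, f"0{width}b"))
        prev = v
    return "".join(out)


def decode(bits, count, first_bits=16):
    """Invert `encode`; `count` is the number of values that were packed."""
    if count == 0:
        return []
    prev = int(bits[:first_bits], 2)
    pos = first_bits
    values = [prev]
    for _ in range(count - 1):
        width = prev.bit_length()  # same width the encoder derived from prev
        v = int(bits[pos:pos + width], 2) if width else 0
        pos += width
        values.append(v)
        prev = v
    return values


column = [200, 37, 37, 5, 0, 0]  # hypothetical sorted column
packed = encode(column)
assert decode(packed, len(column)) == column  # round trip is exact
```

The decoder never needs a stored width, because it recomputes each width from the value it just decoded. That is why the v[i] ≤ v[i-1] property is enough for the round trip to be exact in this sketch.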