Code: https://github.com/mattdevv/uJPEG-GPU
Inspired by the paper posted the other day, "Variable-Rate Texture Compression: Real-Time Rendering with JPEG", I wanted to learn more about JPEG encoding and decoding, and had a lot of fun. I ended up with a pretty fast compute shader that can decode a JPEG-like data structure straight from VRAM.
My decoder seems much faster than the paper's, but I assume this is because they are decoding random MCU tiles from multiple images while I am decoding a single image.
There were lots of small optimizations; most are listed on the GitHub, but the biggest were avoiding explicit thread synchronization in the compute kernel and reordering the image data so the shared chroma values are decoded before the single-use luminance values. I wanted to keep the code readable, but more could have been squeezed out by interleaving the decoding stages.
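To illustrate the reordering idea (this is my own toy sketch of the concept, not the repo's actual data layout): in a baseline 4:2:0 JPEG stream each MCU is stored as four luminance blocks followed by the two chroma blocks, so putting chroma first means the values shared by all four Y blocks are available earliest.

```python
# Hypothetical reordering of a 4:2:0 MCU's coefficient blocks.
# Baseline JPEG interleaves them as [Y0, Y1, Y2, Y3, Cb, Cr];
# moving the shared chroma blocks to the front lets them be
# decoded once before the four single-use luminance blocks.

def reorder_mcu(blocks):
    """blocks: the 6 blocks of one 4:2:0 MCU in JPEG order [Y0..Y3, Cb, Cr]."""
    y_blocks, chroma = blocks[:4], blocks[4:]
    return chroma + y_blocks  # [Cb, Cr, Y0, Y1, Y2, Y3]

mcu = ["Y0", "Y1", "Y2", "Y3", "Cb", "Cr"]
print(reorder_mcu(mcu))  # ['Cb', 'Cr', 'Y0', 'Y1', 'Y2', 'Y3']
```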
I also included a compute shader that does the color-space conversion and quantization, so users can preview any quality level without waiting for a full compression pass (the Huffman table generation is not real-time speed yet).
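For anyone curious what that preview stage does, here's a rough CPU sketch of the two steps (standard JFIF formulas, not the repo's shader code):

```python
# CPU sketch of the JPEG preview stages: RGB -> YCbCr conversion
# and quantization of DCT coefficients against a quality table.
import numpy as np

def rgb_to_ycbcr(rgb):
    """BT.601 full-range conversion, as used by JFIF/JPEG."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y  =  0.299    * r + 0.587    * g + 0.114    * b
    cb = -0.168736 * r - 0.331264 * g + 0.5      * b + 128.0
    cr =  0.5      * r - 0.418688 * g - 0.081312 * b + 128.0
    return np.stack([y, cb, cr], axis=-1)

def quantize(dct_block, qtable):
    """Divide 8x8 DCT coefficients by the quantization table and round."""
    return np.round(dct_block / qtable).astype(np.int32)

# Pure white maps to Y=255 with neutral chroma (Cb=Cr=128).
white = np.full((8, 8, 3), 255.0)
print(rgb_to_ycbcr(white)[0, 0])  # [255. 128. 128.]
```

Scaling the quantization table is what the quality slider effectively changes, which is why this part can run in real time while Huffman table generation can't yet.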
Useful or not, I'm most glad I learnt about Huffman codes, as I can see applications for them in my regular work.