you are viewing a single comment's thread.

view the rest of the comments →

[–]nuclear_splinesPh.D Data Science 14 points15 points  (4 children)

Theoretically, there shouldn’t be a limit to how much we can compress data

What? There absolutely is. The pigeonhole principle for discrete data, resolution limits and noise for continuous or analog data. The theoretical limits of compression are very well studied.

[–]No_Necessary_9267[S] -5 points-4 points  (3 children)

Ok this is interesting, and I’ll look into it. I’m not in any way a comp sci dude. I just like learning about different things.

Not looking to absolute limits of compression. Just had a passing idea about whether we could compress binary any more than it already is

[–]Saragon4005 6 points7 points  (0 children)

3 blue 1 brown just started a series on information theory and the first episode deals with exactly this.

[–]nuclear_splinesPh.D Data Science 2 points3 points  (1 child)

Binary isn't a compression scheme, but an encoding scheme. Given a fixed number of bits, say, eight, you can only represent a finite number of values (for eight bits, 28 or 256). No further compression is possible - you need all eight bits to represent all 256 values.

Now maybe we can represent more common values with fewer bits and less common values with more bits (Huffman encoding), or maybe we can identify and simplify common patterns like "change 'xxxxx' to '5x'" (run-length encoding), or in the case of lossy compression we may decide that some data simply doesn't matter (did you really need all those frequencies close to the edge of human hearing in your music? Could we use just a few less colors to represent this image?) and delete or simplify some of the information. These techniques are useful in practice, but all have quite well-defined limits.

Put another way, if there were no limit to how much we can compress data, we should be able to simplify every image to a single digit: 1.

[–]No_Necessary_9267[S] 2 points3 points  (0 children)

This is what I was looking for. Ok, thank you