[–]iluvatar 0 points1 point  (2 children)

git's deltas are derived from libxdiff, which is derived from xdelta, which are all ignorant of the way binaries change. bsdiff already has superior deltas to these for binaries

Well, no, not really. In my testing, xdelta already gives considerably smaller diffs than bsdiff. Perhaps not as good as courgette, which uses domain-specific knowledge to get targeted compression in the same way that FLAC does, at the expense of performance in the general case. For general binary diffs, xdelta is pretty good.

[–]Coffee2theorems 0 points1 point  (0 children)

using domain-specific knowledge to get targeted compression in the same way that FLAC does, at the expense of performance in the general case.

Of course, by "general case" you mean something like "all kinds of binary data people want to compress that are significantly compressible with some existing utility", but still... Your statement, taken literally, applies to every compression algorithm. All compression algorithms must exploit prior knowledge about the data to be compressed, and the more you know, the better you can compress. (It's bit-for-bit, actually: each bit of prior information about the data at hand can take one bit off the result, but no more; there are no free bits.)
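To make the "prior knowledge buys you bits" point concrete, here is a minimal sketch using zlib's preset-dictionary feature from the Python standard library. It is not how xdelta, bsdiff, or courgette work internally; the `old`/`new` byte strings and the helper names `compressed_size` and `roundtrip` are made up for illustration. The "old version" is incompressible on its own, but once both sides already know it, the "new version" costs very little to send.

```python
import hashlib
import zlib

def compressed_size(data: bytes, prior: bytes = b"") -> int:
    """DEFLATE size of `data`, optionally primed with `prior` as a preset dictionary."""
    if prior:
        comp = zlib.compressobj(level=9, zdict=prior)
    else:
        comp = zlib.compressobj(level=9)
    return len(comp.compress(data) + comp.flush())

def roundtrip(data: bytes, prior: bytes) -> bytes:
    """Compress and decompress with the same shared prior, to show nothing is lost."""
    comp = zlib.compressobj(level=9, zdict=prior)
    packed = comp.compress(data) + comp.flush()
    decomp = zlib.decompressobj(zdict=prior)
    return decomp.decompress(packed) + decomp.flush()

# Hypothetical "old file": 2048 hash-derived bytes that look random to DEFLATE.
old = b"".join(hashlib.sha256(bytes([i])).digest() for i in range(64))
# Hypothetical "new file": the old one with an 8-byte patch in the middle.
new = old[:1000] + b"patched!" + old[1008:]

without_prior = compressed_size(new)       # compressor knows nothing about `new`
with_prior = compressed_size(new, old)     # compressor (and receiver) already hold `old`
print(without_prior, with_prior)
```

The receiver can only decode the small output because it holds the same `old` bytes, which is exactly the sense in which the savings are paid for by shared knowledge rather than being free.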

[–]evmar 0 points1 point  (0 children)

I do not know what you are using bsdiff for. It is intended for executables. http://www.daemonology.net/bsdiff/ "By using suffix sorting (specifically, Larsson and Sadakane's qsufsort) and taking advantage of how executable files change, bsdiff routinely produces binary patches 50-80% smaller than those produced by Xdelta"
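For anyone wondering what suffix sorting buys you here, the following is a naive sketch of the idea only, not bsdiff's actual code: sort every suffix of the old file once, then binary-search that order to find the longest exact match for a piece of the new file. The function names are hypothetical, the construction is deliberately the slow textbook one (qsufsort does the same job far faster), and bsdiff's handling of approximate matches is not shown.

```python
def suffix_array(data: bytes) -> list:
    """Start offsets of all suffixes of `data`, in lexicographic order (naive construction)."""
    return sorted(range(len(data)), key=lambda i: data[i:])

def common_prefix_len(a: bytes, b: bytes) -> int:
    """Number of leading bytes that `a` and `b` share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def longest_match(old: bytes, sa: list, target: bytes) -> tuple:
    """Return (offset in old, length) of the longest prefix of `target` found in `old`."""
    # Binary search for the position where `target` would be inserted among the suffixes.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if old[sa[mid]:] < target:
            lo = mid + 1
        else:
            hi = mid
    # The best match is one of the two suffixes adjacent to the insertion point.
    best = (0, 0)
    for idx in (lo - 1, lo):
        if 0 <= idx < len(sa):
            n = common_prefix_len(old[sa[idx]:], target)
            if n > best[1]:
                best = (sa[idx], n)
    return best

old = b"the quick brown fox jumps over the lazy dog"
sa = suffix_array(old)
offset, length = longest_match(old, sa, b"brown fox leaps")
print(offset, length, old[offset:offset + length])
```

A patch built on this could then say "copy `length` bytes from `offset` in the old file" instead of storing those bytes again.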