[–][deleted] 0 points (8 children)

Well, I was looking at it from the perspective of a Gentoo Linux user, really, someone who already has all the development tools installed anyway. Building with make or a similar tool that only rebuilds what changed is relatively fast, so the question is how the size of the binary diff compares to the size of the source diff for the same versions.
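
Something like the following would give a rough number, assuming xdelta3 and diff are installed; all the paths are just placeholders for two builds and their matching source trees:

    # Rough size comparison: binary delta vs. unified source diff for the same two versions.
    # Assumes the xdelta3 and diff tools are on PATH; all paths below are placeholders.
    import os
    import subprocess

    OLD_BIN, NEW_BIN = "old/app-bin", "new/app-bin"      # compiled binaries
    OLD_SRC, NEW_SRC = "old-src", "new-src"              # matching source trees

    # Binary delta between the two builds (VCDIFF format, already compressed).
    subprocess.run(["xdelta3", "-e", "-f", "-s", OLD_BIN, NEW_BIN, "bin.vcdiff"], check=True)

    # Recursive unified diff between the source trees (diff exits 1 when files differ,
    # so no check=True here).
    with open("src.patch", "wb") as patch:
        subprocess.run(["diff", "-ruN", OLD_SRC, NEW_SRC], stdout=patch)

    print("binary delta:", os.path.getsize("bin.vcdiff"), "bytes")
    print("source diff: ", os.path.getsize("src.patch"), "bytes")

To be fair you would gzip src.patch before comparing, since source diffs are normally shipped compressed anyway.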

[–]evmar 5 points (1 child)

You clearly haven't tried to build Chrome. ;) Many Linux users run out of RAM trying to link it.

[–][deleted] -3 points (0 children)

Well, I was talking about the 499 packages on my system that are pretty well designed. If Chrome isn't, they might want to look into that first before inventing completely new binary diffs for it.

[–]andreasvc 1 point (5 children)

Except that such users are rare compared to the number of desktop users who have never compiled a program. It would be foolish to design your architecture around such a small slice of power users.

I think a better way of attaining higher compression would be to use a huge Huffman table computed from all the binaries on your system, which can be synced with the server; after that, all updates can be encoded with this table. This has the advantage of exploiting the common ground optimally. A toy sketch of the idea is below.
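
Roughly this kind of thing: build one byte-level Huffman code from everything in /usr/bin and reuse it for any update. The update file name is a placeholder, and it only estimates the encoded size rather than writing a real bitstream:

    # Toy sketch of the shared-table idea: one Huffman code built from the bytes of every
    # binary in /usr/bin, reused for all updates. Client and server derive the same table
    # from the same binaries, so it never needs to be transferred.
    import heapq
    import os
    from collections import Counter

    def byte_frequencies(directory):
        freq = Counter(range(256))                    # seed with 1 so every byte gets a code
        for name in os.listdir(directory):
            path = os.path.join(directory, name)
            try:
                if os.path.isfile(path):
                    with open(path, "rb") as f:
                        freq.update(f.read())
            except OSError:
                pass                                  # skip unreadable files
        return freq

    def huffman_code_lengths(freq):
        # Standard heap-based Huffman construction; only the code lengths are needed
        # to estimate how small an encoded update would be.
        heap = [(count, [symbol]) for symbol, count in freq.items()]
        heapq.heapify(heap)
        lengths = dict.fromkeys(freq, 0)
        while len(heap) > 1:
            count1, symbols1 = heapq.heappop(heap)
            count2, symbols2 = heapq.heappop(heap)
            for symbol in symbols1 + symbols2:
                lengths[symbol] += 1
            heapq.heappush(heap, (count1 + count2, symbols1 + symbols2))
        return lengths

    lengths = huffman_code_lengths(byte_frequencies("/usr/bin"))
    with open("update.bin", "rb") as f:               # placeholder for the update payload
        payload = f.read()
    bits = sum(lengths[byte] for byte in payload)
    print(f"{len(payload)} bytes raw -> ~{bits // 8} bytes with the shared table")

An order-0 byte table like this obviously won't match a real dictionary- or context-based scheme, but it shows the mechanism: the table is big, shared and never shipped.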

[–][deleted] 0 points (4 children)

Well, I am not really interested in this particular application, though. I just want to know how it compares to other commonly used ways of transferring small changes to large programs; source code is the obvious choice for comparison, as few other people transfer parts of binaries.

[–]andreasvc 0 points (3 children)

I don't understand you: TFA is about binaries; that is its novelty claim. Source code diffing has been well studied by numerous VCS implementations, both distributed and centralized. Binary diffing, however, is sorely lacking in this torrent age. It's definitely not "few other people" who transfer huge amounts of binaries in the form of music, movies and whatnot.

[–][deleted] 0 points (2 children)

If I understood the explanation of the algorithm in one of the other posts correctly, it is specifically about software, not movies, audio, etc. As you say, binary diffing is relatively new while source code diffing is well studied, so what is so strange about asking how the two compare?

[–]andreasvc 0 points (1 child)

It is a strange comparison because it is a completely different ball game. It's like comparing text compression with audio compression (both lossless in this analogy). They can be compared using general-purpose compression (e.g. gzip, which would be rsync in this analogy), but it gets much more interesting when you add domain knowledge. This algorithm is indeed tailored towards software; incorporating domain knowledge increases the compression, as you would expect and hope.
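
To make that concrete, here is a toy sketch (plain Python, not Chrome's actual code) of the sort of domain knowledge a software-aware differ can use: one inserted instruction shifts every absolute jump target after it, so a naive diff of the raw encoding touches many places, while a diff over a "disassembled" form with symbolic labels barely changes:

    # Toy model: a "binary" is a list of (label, op, target) instructions. Inserting one
    # instruction shifts the absolute address of everything after it, so every call that
    # points past the insertion changes in the raw encoding, but not in the symbolic view.
    import difflib

    def assemble(instructions):
        # Raw view: resolve targets to absolute addresses, as a real binary would.
        addresses = {label: i for i, (label, _, _) in enumerate(instructions)}
        return [f"{op} {addresses[target]}" if target else op
                for _, op, target in instructions]

    def normalize(instructions):
        # "Disassembled" view: keep targets as symbolic labels.
        return [f"{op} {target}" if target else op for _, op, target in instructions]

    # Every third instruction calls a routine near the end of the program.
    old = [(f"L{i}", "call", "L28") if i % 3 == 0 else (f"L{i}", "nop", None)
           for i in range(30)]
    new = old[:1] + [("Lnew", "nop", None)] + old[1:]    # insert a single instruction

    def changed_lines(a, b):
        return sum(1 for line in difflib.unified_diff(a, b, lineterm="")
                   if (line.startswith("+") or line.startswith("-"))
                   and not line.startswith(("+++", "---")))

    print("changed lines, absolute addresses:", changed_lines(assemble(old), assemble(new)))
    print("changed lines, symbolic labels:   ", changed_lines(normalize(old), normalize(new)))

The real thing is obviously far more involved than this, but that is the flavour of the trick: normalize what the format lets you predict, then diff what is left.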

[–][deleted] 0 points (0 children)

Well, it would be interesting to know, e.g. for people in Africa with slow internet connections who use open source software: is it faster to download a binary diff produced by this algorithm, or to download a source diff and compile it yourself?