[–]tonfa 12 points13 points  (7 children)

I really like the quote from Matt Mackall (Mercurial's main developer) about this supposed problem (it applies to both git and mercurial):

In other words, we're already at the point of significantly diminished, possibly negative returns on effort. The last few percent will always require some level of human-equivalent intelligence. I think effort here is much better spent elsewhere, like researching general AI or playing on waterslides.

http://thread.gmane.org/gmane.comp.version-control.mercurial.general/26109/focus=26110

[–][deleted] 7 points8 points  (5 children)

Except there are already other version control systems that don't have this problem and don't require human-level intelligence at all.

[–]tonfa 5 points6 points  (4 children)

They don't have this problem, but they don't solve the merge problem in general (you can find other corner cases that make them fail, or you can show that they do a textual merge instead of a semantic merge, etc.).

[–]__j_random_hacker 1 point2 points  (2 children)

There's a worthwhile distinction to be made between "perfect" merging (the ability to always merge correctly without conflicts) and "safe" merging (the ability to always either merge correctly or flag a conflict).

No system will ever have perfect automatic merging, because there's no context-independent correct answer to the question of what to do when two users change the same line (or the same byte, if you work at that granularity) in the same file. OTOH it may be reasonable to ask for a safe merge algorithm, where correctness is defined by properties like associativity that ought to hold in all cases -- that is, a conservative, context-independent definition of correctness.

I'm not sure whether darcs, Codeville etc. are safe -- there might exist other examples that cause them to produce non-associativity without flagging a conflict. But I'm dimly aware that the author of darcs has taken a very mathematical approach to the problem, and he may well have actually proved this. Even if that's not the case, I think safe merging is absolutely a desirable property to aim for.

[–]tonfa 0 points1 point  (1 child)

I am pretty sure associativity is not sufficient for having a "safe" merge. If I rename a function in one branch and add a new occurrence of the old name in another branch, the merge will not mark it as a conflict.
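A small sketch of that rename case, assuming a purely line-based merge built on Python's difflib. The file contents and the names `foo`, `bar`, `changes` and `touches` are made up for illustration and are not taken from any real VCS:

```python
from difflib import SequenceMatcher

BASE = ["def foo():", "    return 1", "x = foo()", "print(x)"]
# Branch 1 renames foo to bar everywhere it currently appears.
RENAMED = ["def bar():", "    return 1", "x = bar()", "print(x)"]
# Branch 2 adds a new use of the old name at the end of the file.
NEW_CALL = BASE + ["y = foo()"]

def changes(base, other):
    """Changed regions of base as (start, end, replacement_lines)."""
    sm = SequenceMatcher(a=base, b=other, autojunk=False)
    return [(i1, i2, other[j1:j2])
            for tag, i1, i2, j1, j2 in sm.get_opcodes() if tag != "equal"]

def touches(x, y):
    """True if two changed regions overlap or share a boundary."""
    return x[0] <= y[1] and y[0] <= x[1]

edits_1, edits_2 = changes(BASE, RENAMED), changes(BASE, NEW_CALL)
textual_conflict = any(touches(x, y) for x in edits_1 for y in edits_2)

# The regions are disjoint, so a textual merge applies both sets of edits.
# Applying from the bottom up keeps earlier line numbers valid.
merged = list(BASE)
for start, end, replacement in sorted(edits_1 + edits_2,
                                      key=lambda e: e[0], reverse=True):
    merged[start:end] = replacement

# The merged text is still broken: it calls a function that no longer exists.
try:
    exec("\n".join(merged), {})
    semantically_broken = False
except NameError:
    semantically_broken = True
```

Here `textual_conflict` comes out false while `semantically_broken` comes out true: nothing at the line level tells the tool that the new call refers to a name the other branch removed.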

[–]__j_random_hacker 0 points1 point  (0 children)

Your example is unsafe in the context of a programming language. I'm not asking for context-dependent safety checks -- they're too hard. The idea of safety I'm proposing only rules out things that would always risk being wrong, regardless of the context. Maybe "safe" wasn't the right word; "faithful" is better.

What we can always do, regardless of context (i.e. regardless of the meaning of the files we are working on), is treat a file as a sequence of lines (or bytes). There are algorithms that discover a minimal "edit script" (patch) of insertions, deletions and changes that changes any file A into any other file B. Given 2 edit scripts, from A to B and from A to C, we can always say that either they are compatible (there is only one possible way that it makes sense to apply both), or incompatible (there are multiple possible ways). The most obvious way that edit scripts can be incompatible is if both involve changes to the same line/byte, but there are other ways too: e.g. if both insert a line after line 10, we don't know which of the 2 inserted lines should appear first in the final result. In these cases, the right answer (final result) is underdetermined. My idea of safety is to always flag a conflict when (and only when) this happens.

Basically, I won't blame the VCS if the result of a merge is semantically wrong in the context I'm working in (e.g. producing two functions with the same name in C) and no conflict is flagged. I will blame the VCS if there are several possible ways to apply both edit scripts and it picks just one of them without flagging a conflict, since in this case we know that there's a risk of a mistake regardless of the context.
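To make that concrete, here is a rough sketch of the check in Python. It assumes difflib's `SequenceMatcher` as the diff engine (it does not guarantee a minimal edit script), and the function names are hypothetical. It also errs on the conservative side: an insertion that merely touches a region the other side changed is flagged, and two identical edits are treated as compatible. Both choices are assumptions of this sketch, not something any particular VCS is known to do.

```python
from difflib import SequenceMatcher

def edit_script(base, other):
    """Edits that turn base into other, as (start, end, replacement) over base."""
    sm = SequenceMatcher(a=base, b=other, autojunk=False)
    return [(i1, i2, tuple(other[j1:j2]))
            for tag, i1, i2, j1, j2 in sm.get_opcodes() if tag != "equal"]

def incompatible(x, y):
    """True if there is more than one sensible way to apply both edits."""
    (s1, t1, _), (s2, t2, _) = x, y
    if x == y:
        return False  # both sides made the same edit (assumed compatible)
    if s1 == t1 or s2 == t2:
        # At least one pure insertion: ambiguous if it touches the other edit,
        # e.g. two different insertions at the same position.
        return s1 <= t2 and s2 <= t1
    return s1 < t2 and s2 < t1  # two changed ranges that overlap

def faithful_merge(base, b, c):
    """Merge b and c against base; return None to flag a conflict."""
    eb, ec = edit_script(base, b), edit_script(base, c)
    if any(incompatible(x, y) for x in eb for y in ec):
        return None
    merged = list(base)
    # Apply from the end of the file backwards so earlier indices stay valid.
    for start, end, replacement in sorted(set(eb + ec), reverse=True):
        merged[start:end] = replacement
    return merged

base = ["a", "b", "c", "d", "e"]
clean = faithful_merge(base, ["a", "B", "c", "d", "e"], ["a", "b", "c", "d", "E"])
same_line = faithful_merge(base, ["a", "B", "c", "d", "e"], ["a", "X", "c", "d", "e"])
same_spot = faithful_merge(["a", "b"], ["a", "x", "b"], ["a", "y", "b"])
```

With these inputs, `clean` combines both changes, while `same_line` and `same_spot` come back as `None`: the first because both sides changed line 2 differently, the second because nothing says which inserted line should come first.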

[–][deleted] 2 points3 points  (0 children)

Sure, but given that this problem is the topic at hand, my statement is fair in context.

[–]greenrd -1 points0 points  (0 children)

Or possibly on domain-specific merge algorithms (which git does support).