3514.4.42
by John Arbash Meinel
Cleanups to the algorithm documentation. |
1 |
================ |
2 |
LCA Tree Merging |
|
3 |
================ |
|
3514.4.30
by John Arbash Meinel
Several updates. |
4 |
|
5 |
There are 2 ways that you get LCA merge resolution in bzr. First, if you use |
|
3514.4.42
by John Arbash Meinel
Cleanups to the algorithm documentation. |
6 |
``bzr merge --lca``, the *content* of files will be resolved using a Least Common |
7 |
Ancestors algorithm. That is described in <lca-merge.html> not here. |
|
3514.4.30
by John Arbash Meinel
Several updates. |
8 |
|
3514.4.42
by John Arbash Meinel
Cleanups to the algorithm documentation. |
9 |
This document describes how we handle merging tree-shape when there is not |
10 |
a single unique ancestor (criss-cross merge). With a single LCA, we use |
|
11 |
simple 3-way-merge logic. |
|
3514.4.30
by John Arbash Meinel
Several updates. |
12 |
|
13 |
When there are multiple possible LCAs, we use a different algorithm for |
|
14 |
handling tree-shape merging. Described here. |
|
15 |
||
16 |
As a simple example, here is a revision graph which we will refer to often:: |
|
17 |
||
18 |
. BASE |
|
19 |
. / \ |
|
20 |
. LCA1 LCA2 |
|
21 |
. | \ / | |
|
22 |
. | X | |
|
23 |
. | / \ | |
|
24 |
. THIS OTHER |
|
25 |
||
26 |
In this graph, ``THIS`` and ``OTHER`` both have ``LCA1`` and ``LCA2`` in their |
|
27 |
ancestry but neither is an ancestor of the other, so we have 2 least common |
|
28 |
ancestors. The unique common ancestor is ``BASE``. (It should be noted that in |
|
29 |
this text we will talk directly about ``LCA1`` and ``LCA2``, but the algorithms |
|
30 |
are designed to cope with more than 2 LCAs.) |
|
31 |
||
32 |
||
33 |
Scalars |
|
34 |
======= |
|
35 |
||
36 |
Definition |
|
37 |
---------- |
|
38 |
||
39 |
I'm defining scalar values as ones that cannot be 'merged' on their own. For |
|
40 |
example, the name of a file is "scalar". If one person changes "foo.txt" to |
|
41 |
"foo.c" and someone else changes "foo.txt" to "bar.txt" we don't merge the |
|
42 |
changes to be "bar.c", we simply conflict and expect the user to sort it out. |
|
43 |
||
44 |
We use a slightly different algorithm for scalars. |
|
45 |
||
46 |
||
47 |
Resolution Algorithm |
|
48 |
-------------------- |
|
49 |
||
50 |
(This can be seen as ``bzrlib.merge.Merge3Merger._lca_multi_way``` |
|
51 |
||
52 |
1. If ``THIS`` and ``OTHER`` have the same value, use it. There is no need to |
|
53 |
inspect any other values in this case. Either nothing was changed (all |
|
54 |
interesting nodes would have the same value), or we have "accidental |
|
55 |
convergence" (both sides made the same change.). |
|
56 |
||
57 |
2. Find the values from ``LCA1`` and ``LCA2`` which are not the same as |
|
58 |
``BASE``. The idea here is to provide a rudimentary "heads" comparison. |
|
59 |
Often, the whole tree graph will have a criss-cross, but the per-file |
|
3514.4.42
by John Arbash Meinel
Cleanups to the algorithm documentation. |
60 |
(per-scalar) graph would be linear, and the value in one LCA strictly |
3514.4.30
by John Arbash Meinel
Several updates. |
61 |
dominates the other. It is possible to construct a scenario where one side |
62 |
dominates the other, but the dominated value is not ``BASE``, but a second |
|
63 |
intermediate value. Most scalars are rarely changed, so this is unlikely to |
|
3514.4.42
by John Arbash Meinel
Cleanups to the algorithm documentation. |
64 |
be an issue. The trade-off is having to generate and inspect the |
3514.4.30
by John Arbash Meinel
Several updates. |
65 |
per-scalar graph. |
66 |
||
67 |
If there are no LCA values that are different from ``BASE``, we use a simple |
|
68 |
3-way merge with ``BASE`` as the base value. |
|
69 |
||
70 |
3. Find the unique set of LCA values that do not include the ``BASE`` value. |
|
71 |
If there is only one unique LCA value, we again use three-way merge logic |
|
72 |
using that unique value as the base. |
|
73 |
||
3514.4.42
by John Arbash Meinel
Cleanups to the algorithm documentation. |
74 |
4. At this point, we have determined that we have at least 2 unique values in |
3514.4.30
by John Arbash Meinel
Several updates. |
75 |
our LCAs which means that ``THIS`` and ``OTHER`` would both have to resolve |
76 |
the conflict. If they resolved it in the same way, we would have caught that |
|
77 |
in step 1. So they either both picked a different LCA value, or one (or |
|
78 |
both) chose a new value to use. |
|
79 |
||
3514.4.42
by John Arbash Meinel
Cleanups to the algorithm documentation. |
80 |
If ``OTHER`` and ``THIS`` both picked a different LCA value, we conflict. |
3514.4.30
by John Arbash Meinel
Several updates. |
81 |
|
82 |
If ``OTHER`` and ``THIS`` both have values that are not LCA values, we also |
|
83 |
conflict. (Same as 3-way, both sides modified a value in different ways.) |
|
84 |
||
3514.4.42
by John Arbash Meinel
Cleanups to the algorithm documentation. |
85 |
5. (optional) The only tricky part is this: if ``OTHER`` has a LCA value, but |
3514.4.30
by John Arbash Meinel
Several updates. |
86 |
``THIS`` does not, then we go with ``THIS``, and conversely if ``THIS`` has |
87 |
an LCA value, but ``OTHER`` does not, then we go with ``OTHER``. The idea is |
|
88 |
that ``THIS`` and ``OTHER`` may have resolved things in the same way, and |
|
89 |
then later changed the value to something newer. (They could have also |
|
90 |
resolved it differently, and then one side updated again.) |
|
91 |
||
92 |
||
93 |
``InventoryEntry.revision`` |
|
94 |
--------------------------- |
|
95 |
||
96 |
The last-modified revision for an entry gets treated differently. This is |
|
97 |
because how it is generated allows us to infer more information. Specifically, |
|
98 |
any time there is a change to an entry (rename, or content change) the last |
|
99 |
modified revision is updated. Further, if we are merging, and both sides |
|
100 |
updated the entry, then we update the last-modified revision at the merge |
|
101 |
point. |
|
102 |
||
103 |
For a picture example:: |
|
104 |
||
105 |
. A |
|
106 |
. / \ |
|
107 |
. B C |
|
108 |
. \ / |
|
109 |
. D |
|
110 |
||
111 |
||
112 |
For a single entry, the last modified revision in ``D`` is: |
|
113 |
||
114 |
1) ``A`` if neither ``B`` or ``C`` modified it |
|
115 |
2) ``B`` if ``B`` modified and ``C`` did not |
|
116 |
3) ``C`` if ``C`` modified and ``B`` did not |
|
117 |
4) ``D`` if ``B`` and ``C`` modified it |
|
118 |
||
119 |
This means that if the last modified revision is the same, there have been no |
|
3514.4.42
by John Arbash Meinel
Cleanups to the algorithm documentation. |
120 |
changes in the intermediate time. If ``OTHER`` also has the same last modified |
3514.4.30
by John Arbash Meinel
Several updates. |
121 |
revision as *any* LCA, then we know that all other LCAs' last-modified |
122 |
revisions are in the ancestry of that value. (Otherwise, when ``OTHER`` would |
|
123 |
need to create a new last modified revision as part of the merge.) |
|
124 |
||
125 |
.. |
|
126 |
vim: ft=rst tw=74 ai |