.. 6 by mbp at sourcefrog: import all docs from arch

***************
Shared branches
***************

How well can we simulate the single-branch model of svn/cvs?  GNU arch
does reasonably well, but at the cost of making disconnected/connected
operation leave ugly merge marks in the project history.

This does seem to be a drawback of the otherwise-good branch/tree
fusion.

`Greg Hudson`__ makes the very good point that the Linux kernel is
quite unusual among free software projects in relying on human
integrators, rather than having a shared branch between the
committers.  The other case is very small projects (distcc?) where
there is a single person with commit access.

__ http://web.mit.edu/ghudson/thoughts/bitkeeper.whynot

I think decentralized systems can still be very good, but they need to
support this model very well: a team of committers, working on a
single trunk branch, but merging from other branches as well.  The
Samba team is more typical of a demanding open-source project than the
kernel is.

People fundamentally want to be able to share a tree and all commit
into it.  It's fine to support other models (like sending patches to
an owner/integrator/maintainer), but it is not reasonable to require
that there be exactly one such person.

arch-pqm is a clever kludge, but it shouldn't be required.  Setting up
email on all clients, and scripts to automatically process email on
the server, is certainly possible.  But for some people it may not be
easy; it is never trivial to debug; and particularly for Windows users
it may be very hard.  If you don't control your own email server or
cannot easily run scripts it will be even more messy.

The model of Bitkeeper, and the default model of Bazaar-NG, is that
each branch has one owner and exists in one place.  People submit
their patches to the owner of the branch.  This is a pretty good
model, but has some limitations.

It is a tough transition from CVS; it requires that people learn a
whole new way of working as well as a new tool.

If the person who owns the branch is away, should all integration
work stop?

I'm not totally confident in Darcs; it seems like the shared branch
can reach a state that is not exactly the same as any of the
checkouts.  Maybe not?  In other respects it is a good model.

Possible solution: bound branches
---------------------------------

You can make a branch which is a write-through mirror of another
branch.  They work like CVS checkouts in that you must be connected
and fully up-to-date to commit.  These can hold as much history as you
like from the main branch, so you can view history, annotate, and make
new branches when offline.  If you are offline for a long time, make a
new branch and integrate it back later.

The effect is similar to systems which have separate working copies
and storage: we have one branch used primarily as a workarea, and
another used as storage.  I think that making them both branches
is possibly cleaner: it expresses the way some state may be held
locally, allows offline branching from the bound branch, etc.

::

  $ bzr get http://foo.net/foo ./foo-main
  $ cd foo-main
  (make some changes)
  $ bzr commit -m 'Add my neat feature'
  bzr: error: this branch is not-up-to-date with master
    http://foo.net/foo
    - run 'bzr update'
    - or detach this branch from its parent to work independently
  $ bzr update
  - setting aside local changes
  - pulling changes from parent
  - putting back local changes
  $ bzr commit -m 'Add my neat feature'

The ``get`` command is very similar to ``branch``, but means that
commits will be written back to the branch it was fetched from.

This means that the ``pull`` or ``update`` command must be able to
work even when there are local changes, by setting them aside.  This
in turn means that local changes may conflict with remote changes, and
that has to be resolved -- that is no worse than in pulling in changes
from elsewhere.

A commit to a master branch should fail if the slave's patch history
is not exactly equal to that of the master.  If the slave history is a
prefix of the master history then the slave is merely out of date, and
an update will fix it.  Otherwise, the two have diverged and the slave
gets detached.

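The rule above can be sketched as a three-way comparison.  This is
only an illustration, not code from Bazaar-NG: it assumes each history
is a plain Python list of revision ids, oldest first, and the names
``Relation`` and ``compare_histories`` are made up for the example.

```python
from enum import Enum


class Relation(Enum):
    """Hypothetical outcomes of comparing a bound branch to its master."""
    UP_TO_DATE = "up-to-date"    # identical histories: the commit may proceed
    OUT_OF_DATE = "out-of-date"  # bound history is a strict prefix: update first
    DIVERGED = "diverged"        # anything else: the bound branch gets detached


def compare_histories(bound, master):
    """Classify a bound branch's revision list against its master's.

    Both arguments are assumed to be lists of revision ids, oldest first.
    """
    if bound == master:
        return Relation.UP_TO_DATE
    # A strict prefix means the master only has extra revisions on top.
    if len(bound) < len(master) and master[:len(bound)] == bound:
        return Relation.OUT_OF_DATE
    return Relation.DIVERGED
```

The same prefix test would presumably also serve as the precondition
for binding an existing branch to another one.
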
We can have shared server-side hooks, even for access control, at
least in principle.

Perhaps ``bind`` is a nicer word than ``slave``.

It is conceptually possible to take an existing branch and bind it to
another, as long as the history of the first is a prefix of that of
the second.  One could also unbind a branch.  These should at most be
advanced commands, because in general it is simpler just to make a new
branch with the desired properties.

Difficulties
~~~~~~~~~~~~

It may be a bit hard to get a remote commit exactly right,
particularly if we want to keep a working copy there.  If working
copies are optional, then turning the remote one off will keep things
simple but not totally avoid the problem.

I think this is just an example of a general problem though: there is
no totally satisfactory way to atomically update a working copy.
(What happens to cvs or svn if you interrupt while it's doing this?
Generally you get a broken wc.)  What we can do is create a lock while
the update is taking place, which can be used to at least detect the
problem.

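A minimal sketch of such a lock, assuming a marker file inside the
tree is good enough.  The file name ``.update-lock`` and the names
``update_lock`` and ``TreeLockedError`` are invented for this example
and are not part of any existing tool.

```python
import os
from contextlib import contextmanager


class TreeLockedError(Exception):
    """A lock from an earlier (or concurrent) update is still present."""


@contextmanager
def update_lock(tree_dir, name=".update-lock"):
    """Hold a marker file in tree_dir for the duration of an update."""
    path = os.path.join(tree_dir, name)
    try:
        # O_EXCL makes creation fail if the lock file already exists.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise TreeLockedError(path) from None
    os.close(fd)
    yield path
    # Reached only when the update finished without raising.  An
    # interrupted update leaves the lock behind, so the next run can
    # at least detect that the working copy may be broken.
    os.remove(path)
```

The update itself would run inside ``with update_lock(tree_dir):``.
This does not make the update atomic; it only leaves evidence when it
was cut short.
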
If changes are allowed to the working copy of the master branch then
they might conflict with what is committed by the slave.  Should the
committed changes be merged into the working copy of the master?  If
so, they might conflict.

(darcs handles this case by inserting conflict markers in the remote
file, which seems unsatisfactory to me.)

So I am inclined to say that at the moment of pushing to a master
branch, the destination should be clean.  If we have explicit edits
then probably there should be no editable files there.  (We could
perhaps make all the files read-only for the duration of the update.)