.. Source annotation: revision 6 by mbp at sourcefrog, "import all docs from arch"

*****************************************
Opportunities for improvement on GNU Arch
*****************************************


Bazaar-NG is based on the GNU Arch system, and inherits a lot of its
design from Arch. However, there are several things we will change in
Baz to (we hope) improve the user experience.

The core design of Arch is good, brilliant even. It can scale from
small projects to large ones, and is a good foundation for building
tools on top. However, the design is far too complex, both in
concepts and execution. So the plan is to cut out as many things as
we can, add a few other good concepts from other systems, and try to
make it into a whole that is consistent and understandable.


Good bits to keep
-----------------

* Roll-up changesets

  No other system is able to express this valuable idea: "I merged all
  these changes from other people; here is the result."

  However, it should *also* be possible to bring in perfect-fit
  patches without creating a new commit.

* Star-merge

  Find a common ancestor on diverged and cross-merged branches.

* Apply isolated changesets.

  We should extend this by having a good way to send changesets by
  email, preferably readable even by people who are not using Arch.

* GPG signing of commits.

  Open source hackers almost all have GPG keys already, and GPG deals
  with a lot of PKI functions to do with propagating, signing and
  revoking keys.

  Signed commits are interesting in many ways, not least for
  detecting intrusions into code servers.

* Anonymous downloads can be done without an active server.

  Good for security; also very good for people who do not have a
  permanently-connected machine on which they can install their own
  software, or whose machine is very tightly secured.

  It's neat that you can upload over only sftp/ftp, but I'm not sure
  it's really worth the hassle; getting properly atomic operations
  over remote-file protocols is hard.

* Clean and transparent storage format.

  This is a neat hack, and gives people assurance that they can get
  their data back out again even if the tool disappears. Very nice.
  (Bazaar-NG won't keep the exact same format, but the ideas will be
  similar.)

* Relatively easily parseable/scriptable shell interface. Good for
  people writing web/emacs/editor/IDE interfaces, or scripts based on it.

* Automatically build (and hardlink) revision libraries, with
  consistency checks.

  I don't know how many people want *every* revision in a library, but
  it can be handy to have a few key ones.

  In general, making use of hardlinks when they are available and safe
  is nice.

* Rely on ssh for remote access, authentication, and confidentiality.

* Patch headers separate from patch bodies. (Sometimes you only want
  one.)

* Autogeneration of Changelogs -- but should be in GNU format, at
  least optionally. I'm not convinced auto-updating them in the tree
  is worthwhile; it makes merges weird.

  .. Source annotation: revision 254 by Martin Pool, "Doc cleanups from Magnus Therning"

* Sealing branches.

  It seems useful to prevent accidental commits to things that are
  meant to be stable. However, the set-once nature of sealing is
  undesirable, because people can make mistakes or want to seal more
  than once.

  One possibility is to have a voluntary write-protect flag set on
  branches that should not normally be updated. One can remove the
  flag if it turns out it was set wrongly.

* ``resolved`` command in Bazaar-1.1

  Good for preventing accidental breakage.

* Multi-level undo -- though could perhaps be more understandable,
  perhaps through ``undo-history``.


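As a rough illustration of the "hardlinks when they are available and
safe" point in the list above, here is a minimal Python sketch. The
function name ``link_or_copy`` is made up for this example and is not
part of Arch or Bazaar-NG; it simply tries a hardlink and falls back
to a plain copy when the platform or filesystem refuses.

```python
import os
import shutil


def link_or_copy(src, dst):
    """Hardlink src to dst when possible, otherwise copy it.

    Returns "linked" or "copied" so the caller can report what happened.
    Hypothetical helper, for illustration only.
    """
    try:
        os.link(src, dst)
        return "linked"
    except (OSError, NotImplementedError, AttributeError):
        # Cross-device link, unsupported filesystem, existing dst, or a
        # platform without hardlinks: fall back to an ordinary copy.
        shutil.copy2(src, dst)
        return "copied"
```

Note that hardlinked files share their contents, so this is only "safe"
if nothing edits the linked copies in place; a revision library would
presumably be kept read-only for that reason.
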
Bits to cut out
---------------

One lesson from usability design is that it does not always work to
have a complex model and then try to hide complexity in the user
interface. If you want something to be a joy to use, that must be
designed in from the bottom up.

(Some developers may react to tla by thinking "eww, how gross" on
particular points. As much as possible we might like to fix these.)

* General impression that the tool is telling you how to run your life.

* Non-standard terminology

  Arch uses terms like "version" and "category" in ways that are
  confusing to people accustomed to other version control systems.
  This is not helpful.

  Therefore: development proceeds on a *branch*, which is a series of
  *revisions*. Simple and obvious.

* Too many commands.

* Command-line options are weirdly inconsistent with other systems,
  with each other, and with what people would like to do.
  For example, I would think the obvious usage is ``bzr diff [FILE]``,
  but ``tla diff`` does not let you specify a file at all.

  Most commands should take filenames as their argument: log, diff,
  add, commit, etc.

* Despite having too many commands, there are massive and glaring
  gaps, such as reverting a single file or a tree.

* Commands are too different from what people are used to in CVS, and
  often not for a good reason.

* Identifiers are too long. In part this is because Arch tries to
  have identifiers which are both human-assigned and universally unique.

* Archive names are probably unnecessary.

* Part of the reason for complexity in archives is that the Arch
  design wants to be able to go and find patches on other branches at
  a later time. (This is not really implemented or used at the
  moment.)

  I think the complexity is unjustified: changesets and revisions have
  universally unique names so they can simply be archived, either on
  the machine of the person who wants them or on a central site like
  supermirror.

* The tool is *unforgiving*; if people create a branch with the wrong
  name it will be around forever.

* Branches are heavyweight; a record always persists in the archive.
  Sometimes it is good to create micro-branches, try something out,
  and then discard them. If nobody wants the changes, there is no
  reason for the tool to keep them.

* Working offline requires creating a new branch and merging back and
  forth. This is both more work than it should be, and also pollutes
  the "story" told by branching.

  As much as possible, the *accidental* difference of the location of
  the repository should not affect the *semantics* of branches.

  (However, some merging may obviously be necessary when there is
  divergence.)

* Archive registration. This causes confusion and is unnecessary.

  Proposed solutions such as archive aliases or an additional command
  to register-and-get make it worse.

* Weird file names (``++`` and ``,,``), which persist in user
  directories and cause breakage of many tools. Gives a bad
  impression, and it's even worse when people have to interact with
  them.

* Overly-long identifiers. (One advantage of pointing to branches
  using filenames or URLs is that the length of the path depends on
  how close it is to the user's location, and they can more easily use

* Too slow by default.

  Arch can be made fast, but in the hands of a nonexpert user it is
  often slow. For most users, disk is cheaper than CPU time, which is
  cheaper than network roundtrips. The performance model should be
  transparent -- users should not be surprised that something is slow.

* Tagging onto branches.

  Unifying tags and commits is interesting, but the result is hard to
  mentally model; even Arch maintainers can't say exactly how it is
  supposed to work in some cases.

* Reinventing the world from scratch in libhackerlab/frob/pika/xl.

  Those are all fine projects and may be useful in the future, but
  they are totally unnecessary for writing a great version control
  system. It is not an enormous project; it is not CPU-cycle
  critical; something like Python will be fine.

* Lack (for the moment) of an active server.

  Given that network traffic is the most expensive thing, we can
  possibly get a better solution by having intelligence on both sides
  of the link. Suppose we want to get just one file from a previous
  revision...

* Poor Windows/Mac support.

  Even though many developers only work on Linux, this still holds a
  tool back. The reason is this: at least some projects have some
  developers on Windows some of the time. Those projects can't switch
  to Arch. Most people want to only learn one tool deeply, so it
  won't be Arch.

  Don't make any overly Unixy assumptions. Avoid too-cute filesystem
  dependencies.

  Being in Python should help with portability: people do need to
  install it, but many developers will already have it and the total
  burden is possibly less than that of installing prerequisite C
  libraries.

* Quirky filename support.

  Files with non-ASCII names, or names containing whitespace, tend to
  be handled poorly, perhaps partly because of Arch's shell heritage.

  By swallowing XML we do at least get automatic quoting of weird
  strings, and we will always use UTF-8 for internal storage.

* Complex file-id-tagging

  Nobody should be expected to understand this. There are two basic
  cases: people want to auto-add everything, or want to add by hand.
  Both can be reasonably accommodated in a simpler system.

* Complex naming-convention regexps in ``.arch-inventory`` and
  ``{arch}/id-tagging-method``. (The fact that there are two
  overlapping mechanisms with very different names is also bad.)

  All this complexity basically just comes down to versioned, ignored,
  unknown, the same as in every other system. So we might as well
  just have that.

  There are relatively few cases where regexps help more than globs,
  and people do find them more complex. Even experienced users can
  forget to escape ``\.``. We can have a bit of flexibility with
  (say) zsh-style extended globs like ``*.(pyo|pyc)``.

* Some files inside ``{arch}`` are meant to be edited by the user, and
  some are not. This is a flaw common to other systems, including
  Bitkeeper. The user should be clear on whether they should touch
  things in a directory or not.

* Source-librarian function works poorly.

  It is not the place of a tool to force people to stay organized; it
  should just facilitate it. In any case, a library without
  descriptive text is of little use. So bazaar-ng does not force
  three-level naming but rather lets people arrange their own trees,
  and put on their own descriptions (either within the tree, or by
  e.g. having a wiki page listing branches, descriptions and URLs).

* Whining about inode mismatches on pristines/revlibs.

  It's fine that there is validation, but the tool should not show off
  its limitations. Just do the right thing.

* More generally, not quite enough consistency/safety checking.

* Unclear which commands work on subdirs and which work on the whole
  tree.

* Hard to share work on a single branch -- though still not really too
  bad.

* Lack of partial commits of added/deleted files.

* Separate id tags for each file; simple implementation but probably
  costs too much disk space.

* Way too many deeply-nested directories; should be just one.

* ``.listing`` files are ugly and a point of failure. They can cause
  trouble on some servers which limit access to dot files.

  Isn't it possible to have the top-level file be predictable and find
  everything else needed from there?

* Summary separate from log message.

  Simpler to just have one message, and let people extract the first
  line/sentence if they wish.

  Rather than 'keywords', let arbitrary properties be attached to the
  revision at the time of commit.


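The "versioned, ignored, unknown" split from the list above could be
sketched roughly as follows. This is only an illustration under some
assumptions: the names ``classify``, ``versioned`` and ``ignore_globs``
are invented here, and Python's standard ``fnmatch`` module does not
understand zsh-style alternation such as ``*.(pyo|pyc)``, so each
alternative is listed as its own plain glob.

```python
from fnmatch import fnmatchcase


def classify(path, versioned, ignore_globs):
    """Return 'versioned', 'ignored' or 'unknown' for a tree-relative path.

    versioned    -- set of paths already under version control (hypothetical)
    ignore_globs -- shell-style patterns matched against the basename
    """
    if path in versioned:
        return "versioned"
    basename = path.rsplit("/", 1)[-1]
    if any(fnmatchcase(basename, pattern) for pattern in ignore_globs):
        return "ignored"
    return "unknown"


# fnmatch has no (a|b) alternation, so spell out each alternative.
ignore_globs = ["*.pyo", "*.pyc", "*~"]
versioned = {"README", "src/main.py"}

for path in ["README", "src/main.pyc", "notes.txt"]:
    print(path, classify(path, versioned, ignore_globs))
```

There is no ``\.`` to forget here: a dot in a glob is just a dot, which
is much of the argument for globs over regexps.
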
Simpler disconnected operation
------------------------------

A basic requirement of a distributed VCS is to make it easy to work on
an offline laptop. Arch can do this in a few ways, but none of them is
really simple.

http://wiki.gnuarch.org/moin.cgi/mini_5fTravellingOftenWithArch

Yaron Minsky writes (2005-01-18):

  I was wondering what people considered to be a good setup for using
  Arch on a laptop. Here's the basic situation. I have a few projects
  that reside in arch repositories on my desktop computer. Basically,
  I'd like to be able to do commits from my laptop, and have those
  commits eventually migrate up to the main repository. I understand
  that the right way of doing this is to set up archives on the laptop.
  But what's the cleanest way of doing this? And is there some way of
  making the commits I do on the laptop show up cleanly and individually
  on the desktop once they are merged in?


Tagging-method
--------------

The baz default is much less strict.

Much of tla depends on being able to categorize files. Some categories
are hangovers from larch: for example, precious and backup are
essentially the same, and junk is never deleted today.

Automatic version control with 'untagged-source source'. But this is
deprecated for baz?

People are annoyed by:

- defaults
- having the feature at all
- complex way to define it

Default of 166 lines.

Remove id-tagging-method command or at most make it read-only. If
people really want to use deprecated methods they can just edit the
file.

So we can ship a default id-tagging which works the same as CVS/Svn:
give warnings for files that are not known to be junk. This is the
default in baz right now.

Also we have .arch-inventory, which is per-directory.



Why not have 'baz ignore FILENAME'? To remove ignores, perhaps you
have to edit the .arch-inventory. Print "FILTER added to
PATH/.arch-inventory"; create and baz-add this file if it doesn't
exist.

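A minimal sketch of what such an ignore command might do, in Python.
Everything here is an assumption made for illustration: the function
name ``ignore``, the ``tree_root`` parameter, and above all the rule
syntax written to the file (one literal filename per line), which is
a placeholder and not the real ``.arch-inventory`` format. Adding a
newly created inventory file to version control is left to the caller.

```python
import os


def ignore(filename, tree_root="."):
    """Append an ignore rule for filename to its directory's .arch-inventory.

    Returns True if the inventory file had to be created, so the caller
    knows it still needs to be added to version control.
    """
    directory, basename = os.path.split(filename)
    inventory = os.path.join(tree_root, directory, ".arch-inventory")
    created = not os.path.exists(inventory)
    # Placeholder rule syntax: one literal filename per line.
    with open(inventory, "a", encoding="utf-8") as f:
        f.write(basename + "\n")
    print(f"{basename} added to {inventory}")
    return created
```

Removing a rule is deliberately not covered; as suggested above, that
could stay a matter of editing the file by hand.
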
Docs should perhaps emphasize .arch-inventory as the basic method and
only mention =tagging-method as an advanced topic.



Should this really be regexps, or just file globs?