~bzr-pqm/bzr/bzr.dev

Viewing changes to doc/todo-from-arch.txt

Committer: mbp at sourcefrog
Date: 2005-03-09 04:08:15 UTC
Revision ID: mbp@sourcefrog.net-20050309040815-13242001617e4a06

import from baz patch-364

files added:
bzrlib/tests.py

files removed:
.bzrignore

.rsyncexclude

HACKING

Makefile

NEWS

TODO

build-api

bzr-man.py

bzrlib/add.py

bzrlib/atomicfile.py

bzrlib/builtins.py

bzrlib/changeset.py

bzrlib/commit.py

bzrlib/delta.py

bzrlib/externalcommand.py

bzrlib/fetch.py

bzrlib/graph.py

bzrlib/hashcache.py

bzrlib/help.py

bzrlib/info.py

bzrlib/intset.py

bzrlib/lock.py

bzrlib/log.py

bzrlib/mdiff.py

bzrlib/merge.py

bzrlib/merge3.py

bzrlib/merge_core.py

bzrlib/meta_store.py

bzrlib/missing.py

bzrlib/msgeditor.py

bzrlib/newinventory.py

bzrlib/patch.py

bzrlib/plugin.py

bzrlib/plugins

bzrlib/plugins/__init__.py

bzrlib/progress.py

bzrlib/remotebranch.py

bzrlib/revfile.py

bzrlib/revisionspec.py

bzrlib/selftest

bzrlib/selftest/HTTPTestUtil.py

bzrlib/selftest/TestUtil.py

bzrlib/selftest/__init__.py

bzrlib/selftest/blackbox.py

bzrlib/selftest/plugins.py

bzrlib/selftest/test_bad_files.py

bzrlib/selftest/test_merge_core.py

bzrlib/selftest/test_parent.py

bzrlib/selftest/test_revision_info.py

bzrlib/selftest/test_smart_add.py

bzrlib/selftest/test_xml.py

bzrlib/selftest/testbranch.py

bzrlib/selftest/testdiff.py

bzrlib/selftest/testfetch.py

bzrlib/selftest/testgraph.py

bzrlib/selftest/testhashcache.py

bzrlib/selftest/testinv.py

bzrlib/selftest/testlog.py

bzrlib/selftest/testmerge.py

bzrlib/selftest/testmerge3.py

bzrlib/selftest/testremotebranch.py

bzrlib/selftest/testrevision.py

bzrlib/selftest/testrevisionnamespaces.py

bzrlib/selftest/teststatus.py

bzrlib/selftest/teststore.py

bzrlib/selftest/versioning.py

bzrlib/selftest/whitebox.py

bzrlib/shellcomplete.py

bzrlib/status.py

bzrlib/textinv.py

bzrlib/ui.py

bzrlib/upgrade.py

bzrlib/util

bzrlib/util/__init__.py

bzrlib/util/effbot

bzrlib/util/effbot/__init__.py

bzrlib/util/effbot/org

bzrlib/util/effbot/org/__init__.py

bzrlib/util/effbot/org/gzip_consumer.py

bzrlib/util/effbot/org/http_client.py

bzrlib/util/effbot/org/http_manager.py

bzrlib/util/elementtree

bzrlib/util/elementtree/ElementTree.py

bzrlib/util/elementtree/__init__.py

bzrlib/util/urlgrabber

bzrlib/util/urlgrabber/__init__.py

bzrlib/util/urlgrabber/byterange.py

bzrlib/util/urlgrabber/grabber.py

bzrlib/util/urlgrabber/keepalive.py

bzrlib/util/urlgrabber/mirror.py

bzrlib/util/urlgrabber/progress.py

bzrlib/weave.py

bzrlib/weavefile.py

bzrlib/workingtree.py

contrib

contrib/add-bzr-to-baz

contrib/bash

contrib/bash/bzr

contrib/bash/bzr.simple

contrib/create_bzr_rollup.py

contrib/emacs

contrib/emacs/bzr-mode.el

contrib/fortune

contrib/pwclient.full

contrib/pwk

contrib/upload-bzr.dev

contrib/zsh

contrib/zsh/_bzr

doc/Makefile

doc/adoption.txt

doc/bitkeeper.txt

doc/changelogs.txt

doc/cherry-picking.txt

doc/cmdref.txt

doc/common-format.txt

doc/compared-aegis.txt

doc/compared-codeville.txt

doc/compared-cvsnt.txt

doc/compared-opencm.txt

doc/compared-prcs.txt

doc/compared-teamware.txt

doc/compression.txt

doc/config-specs.txt

doc/conflicts.txt

doc/costs.txt

doc/darcs.txt

doc/deadly-sins.txt

doc/default.css

doc/design.txt

doc/extra-commands.txt

doc/formats.txt

doc/hashes.txt

doc/ignore.txt

doc/index.txt

doc/interrupted.txt

doc/intro.txt

doc/inventory.txt

doc/join-branches.txt

doc/kill-version.txt

doc/layers.txt

doc/library-interface.txt

doc/merge.txt

doc/mirroring.txt

doc/monotone.txt

doc/news.txt

doc/optional-edit.txt

doc/partial-commit.txt

doc/pool.txt

doc/purpose.txt

doc/python.txt

doc/quilt.txt

doc/quotes.txt

doc/random.txt

doc/requirements.txt

doc/revfile-annotation.txt

doc/revfile.txt

doc/revision-syntax.txt

doc/rollup.txt

doc/scalability.txt

doc/security.txt

doc/shared-branches.txt

doc/short-demo.txt

doc/split-join-files.txt

doc/supportability.txt

doc/svk.txt

doc/switch-in-branch.txt

doc/tagging.txt

doc/taxonomy.txt

doc/thanks.txt

doc/todo-from-arch.txt

doc/unchanged.txt

doc/unrelated-merge.txt

doc/usability.txt

doc/use-cases.txt

doc/web-interface.txt

doc/workflow.txt

doc/yaml.txt

notes

notes/inventory-v2-sample.xml

notes/inventory-v2.rnc

notes/new-inventory-sample.xml

notes/performance.txt

notes/revfile.txt

notes/schemas.xml

patches

patches/annotate3.patch

patches/annotate4.patch

patches/cache-remote-revisions.diff

patches/find-touching-from-seq.diff

patches/meta-data-in-inventory.patch

patches/ndiff.patch

patches/pending-merge.patch

patches/plugins-no-plugins.patch

patches/progress.diff

patches/symlink-support.patch

setup.py

testbzr

testsweet.py

tools

tools/convertfile.py

tools/convertinv.py

tools/history2revfiles.py

tools/history2weaves.py

tools/http_client.py

tools/testweave.py

tools/weavebench.py

tools/weavemerge.sh

tutorial.txt

files renamed:
bzrlib/commands.py => bzr.py

files modified:
README

bzrlib/__init__.py

bzrlib/branch.py

bzrlib/check.py

bzrlib/diff.py

bzrlib/errors.py

bzrlib/inventory.py

bzrlib/osutils.py

bzrlib/revision.py

bzrlib/store.py

bzrlib/textui.py

bzrlib/trace.py

bzrlib/tree.py

bzrlib/xml.py

doc/todo-from-arch.txt

*****************************************
Opportunities for improvement on GNU Arch
*****************************************

[note that this document is rather out of date in 2005-08]

GNU Arch is one influence on bazaar-ng. There are several things we
would change from Arch in Bazaar to (we hope) improve the user
experience.

The core design of Arch is good, brilliant even. It can scale from
small projects to large ones, and is a good foundation for building
tools on top. However, the design is far too complex, both in
concepts and execution. So the plan is to cut out as many things as
we can, add a few other good concepts from other systems, and try to
make it into a whole that is consistent and understandable.

Good bits to keep
-----------------

* Roll-up changesets

No other system is able to express this valuable idea: "I merged all
these changes from other people; here is the result."

However, it should *also* be possible to bring in perfect-fit
patches without creating a new commit.

* Star-merge

Find a common ancestor on diverged and cross-merged branches.

* Apply isolated changesets.

We should extend this by having a good way to send changesets by
email, preferably readable even by people who are not using Arch.

* GPG signing of commits.

Open source hackers almost all have GPG keys already, and GPG deals
with a lot of PKI functions to do with propagating, signing and
revoking keys.

Signed commits are interesting in many ways, not least for
detecting intrusion into code servers.

* Anonymous downloads can be done without an active server.

Good for security; also very good for people who do not have a
permanently-connected machine on which they can install their own
software, or whose machine is very tightly secured.

It's neat that you can upload over only sftp/ftp, but I'm not sure
it's really worth the hassle; getting properly atomic operations
over remote-file protocols is hard.

* Clean and transparent storage format.

This is a neat hack, and gives people assurance that they can get
their data back out again even if the tool disappears. Very nice.
(Bazaar-NG won't keep the exact same format, but the ideas will be
similar.)

* Relatively easily parseable/scriptable shell interface. Good for
people writing web/emacs/editor/IDE interfaces, or scripts based on it.

* Automatically build (and hardlink) revision libraries, with
consistency checks.

I don't know how many people want *every* revision in a library, but
it can be handy to have a few key ones.

In general making use of hardlinks when they are available and safe
is nice.

* Rely on ssh for remote access, authentication, and confidentiality.

* Patch headers separate from patch bodies. (Sometimes you only want
one.)

* Autogeneration of Changelogs -- but should be in GNU format, at
least optionally. I'm not convinced auto-updating them in the tree
is worthwhile; it makes merges weird.

* Sealing branches.

It seems useful to prevent accidental commits to things that are
meant to be stable. However, the set-once nature of sealing is
undesirable, because people can make mistakes or want to seal more
than once.

One possibility is to have a voluntary write-protect flag set on
branches that should not normally be updated. One can remove the
flag if it turns out it was set wrongly.

* ``resolved`` command in Bazaar-1.1

Good for preventing accidental breakage.

* Multi-level undo -- though could perhaps be more understandable,
perhaps through ``undo-history``.

Bits to cut out
---------------

One lesson from usability design is that it does not always work to
have a complex model and then try to hide complexity in the user
interface. If you want something to be a joy to use, that must be
designed in from the bottom up.

(Some developers may react to tla by thinking "eww, how gross" on
particular points. As much as possible we might like to fix these.)

* General impression that the tool is telling you how to run your life.

* Non-standard terminology

Arch uses terms like "version" and "category" in ways that are
confusing to people accustomed to other version control systems.
This is not helpful.

Therefore: development proceeds on a *branch*, which is a series of
*revisions*. Simple and obvious.

* Too many commands.

* Command-line options are weirdly inconsistent with other systems,
with each other, and with what people would like to do.
For example, I would think the obvious usage is ``bzr diff [FILE]``,
but ``tla diff`` does not let you specify a file at all.

Most commands should take filenames as their argument: log, diff,
add, commit, etc.

* Despite having too many commands, there are massive and glaring
gaps, such as reverting a single file or a tree.

* Commands are too different from what people are used to in CVS, and
often not for a good reason.

* Identifiers are too long. In part this is because Arch tries to
have identifiers which are both human-assigned and universally unique.

* Archive names are probably unnecessary.

* Part of the reason for complexity in archives is that the Arch
design wants to be able to go and find patches on other branches at
a later time. (This is not really implemented or used at the
moment.)

I think the complexity is unjustified: changesets and revisions have
universally unique names so they can simply be archived, either on
the machine of the person who wants them or on a central site like
supermirror.

* The tool is *unforgiving*; if people create a branch with the wrong
name it will be around forever.

* Branches are heavyweight; a record always persists in the archive.
Sometimes it is good to create micro-branches, try something out,
and then discard them. If nobody wants the changes, there is no
reason for the tool to keep them.

* Working offline requires creating a new branch and merging back and
forth. This is both more work than it should be, and also pollutes
the "story" told by branching.

As much as possible, the *accidental* difference of the location of
the repository should not affect the *semantics* of branches.

(However, some merging may obviously be necessary when there is
divergence.)

* Archive registration. This causes confusion and is unnecessary.

Proposed solutions such as archive aliases or an additional command
to register-and-get make it worse.

* Weird file names (``++`` and ``,,``), which persist in user
directories and cause breakage of many tools. Gives a bad
impression, and it's even worse when people have to interact with
them.

* Overly-long identifiers. (One advantage of pointing to branches
using filenames or URLs is that the length of the path depends on
how close it is to the user's location, and they can more easily use

* Too slow by default.

Arch can be made fast, but in the hands of a nonexpert user it is
often slow. For most users, disk is cheaper than CPU time, which is
cheaper than network roundtrips. The performance model should be
transparent -- users should not be surprised that something is slow.

* Tagging onto branches.

Unifying tags and commits is interesting, but the result is hard to
mentally model; even Arch maintainers can't say exactly how it is
supposed to work in some cases.

* Reinventing the world from scratch in libhackerlab/frob/pika/xl.

Those are all fine projects and may be useful in the future, but
they are totally unnecessary to write a great version control
system. It is not an enormous project; it is not CPU-cycle
critical; something like Python will be fine.

* Lack (for the moment) of an active server.

Given that network traffic is the most expensive thing, we can
possibly get a better solution by having intelligence on both sides
of the link. Suppose we want to get just one file from a previous
revision...

* Poor Windows/Mac support.

Even though many developers only work on Linux, this still holds a
tool back. The reason is this: at least some projects have some
developers on Windows some of the time. Those projects can't switch
to Arch. Most people want to only learn one tool deeply, so it
won't be Arch.

Don't make any overly Unixy assumptions. Avoid too-cute filesystem
dependencies.

Being in Python should help with portability: people do need to
install it, but many developers will already have it and the total
burden is possibly less than that of installing the requisite C
libraries.

* Quirky filename support.

Files with non-ASCII names, or names containing whitespace, tend to
be handled poorly, perhaps partly because of Arch's shell heritage.

By swallowing XML we do at least get automatic quoting of weird
strings, and we will always use UTF-8 for internal storage.

* Complex file-id-tagging

Nobody should be expected to understand this. There are two basic
cases: people want to auto-add everything, and want to add by hand.
Both can be reasonably accommodated in a simpler system.

* Complex naming-convention regexps in ``.arch-inventory`` and
``{arch}/id-tagging-method``. (The fact that there are two
overlapping mechanisms with very different names is also bad.)

All this complexity basically just comes down to versioned, ignored,
unknown, the same as in every other system. So we might as well
just have that.

There are relatively few cases where regexps help more than globs,
and people do find them more complex. Even experienced users can
forget to escape ``\.``. We can have a bit of flexibility with
(say) zsh-style extended globs like ``*.(pyo|pyc)``.

* Some files inside ``{arch}`` are meant to be edited by the user, and
some are not. This is a flaw common to other systems, including
Bitkeeper. The user should be clear on whether they should touch
things in a directory or not.

* Source-librarian function works poorly.

It is not the place of a tool to force people to stay organized; it
should just facilitate it. In any case, a library without
descriptive text is of little use. So bazaar-ng does not force
three-level naming but rather lets people arrange their own trees,
and put on their own descriptions (either within the tree, or by
e.g. having a wiki page listing branches, descriptions and URLs.)

* Whining about inode mismatches on pristines/revlibs.

It's fine that there is validation, but the tool should not show off
its limitations. Just do the right thing.

* More generally, not quite enough consistency/safety checking.

* Unclear which commands work on subdirs and which work on the whole
tree.

* Hard to share work on a single branch -- though still not really too
bad.

* Lack of partial commits of added/deleted files.

* Separate id tags for each file; simple implementation but probably
costs too much disk space.

* Way too many deeply-nested directories; should be just one.

* ``.listing`` files are ugly and a point of failure. They can cause
trouble on some servers which limit access to dot files.

Isn't it possible to have the top-level file be predictable and find
everything else needed from there?

* Summary separate from log message.

Simpler to just have one message, and let people extract the first
line/sentence if they wish.

Rather than 'keywords', let arbitrary properties be attached to the
revision at the time of commit.
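
The last two points can be sketched together in Python: a revision carries a single message plus a free-form property map, and a summary is derived on demand instead of being stored separately. The ``Revision`` class, its fields, and the property names below are hypothetical, chosen only for illustration; they are not an actual bazaar-ng interface.

```python
from dataclasses import dataclass, field


@dataclass
class Revision:
    """Hypothetical revision record: one message plus free-form properties."""

    message: str
    # Arbitrary key/value pairs attached at commit time (illustrative only).
    properties: dict = field(default_factory=dict)

    def summary(self) -> str:
        """Return the first non-blank line of the message, or '' if none."""
        for line in self.message.splitlines():
            if line.strip():
                return line.strip()
        return ""


rev = Revision(
    message="Fix quoting of odd filenames\n\nNames with spaces were split.",
    properties={"reviewed-by": "alice", "bug": "1234"},
)
print(rev.summary())
```

Deriving the summary keeps the stored data to one field, so a tool that wants a one-line log view and a tool that wants the full text read the same message.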

Simpler disconnected operation
------------------------------

A basic job of a distributed VCS is to make it easy to work on an
offline laptop. Arch can do this in a few ways, but none of them are
really simple.

http://wiki.gnuarch.org/moin.cgi/mini_5fTravellingOftenWithArch

Yaron Minsky writes (2005-01-18):

I was wondering what people considered to be a good setup for using
Arch on a laptop. Here's the basic situation. I have a few projects
that reside in arch repositories on my desktop computer. Basically,
I'd like to be able to do commits from my laptop, and have those
commits eventually migrate up to the main repository. I understand
that the right way of doing this is to set up archives on the laptop.
But what's the cleanest way of doing this? And is there some way of
making the commits I do on the laptop show up cleanly and individually
on the desktop once they are merged in?

Tagging-method
--------------

The baz default is much less strict.

Much of tla depends on being able to categorize files. Some hangovers
from larch -- e.g. precious and backup are essentially the same, and
junk is never deleted today.

Automatic version control with 'untagged-source source'. But this is
deprecated for baz?

Annoyed by

- defaults
- having the feature at all
- complex way to define it

Default of 166 lines.

Remove id-tagging-method command or at most make it read-only. If
people really want to use deprecated methods they can just edit the
file.

So we can ship a default id-tagging which works the same as CVS/Svn:
give warnings for files that are not known to be junk. This is the
default in baz right now.

Also we have .arch-inventory, which is per-directory.

Why not have 'baz ignore FILENAME'? To remove ignores, perhaps you
have to edit the .arch-inventory. Print "FILTER added to
PATH/.arch-inventory"; create and baz-add this file if it doesn't exist.
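
A rough Python sketch of that behaviour follows. It is an illustration under stated assumptions, not how baz works: the helper name ``add_ignore``, the file name ``.ignore-sketch``, and the one-pattern-per-line layout are all hypothetical and deliberately do not reproduce the real ``.arch-inventory`` syntax. The return value tells the caller whether the file was newly created and so still needs adding to version control.

```python
import os
import tempfile


def add_ignore(directory, pattern, filename=".ignore-sketch"):
    """Append `pattern` to a per-directory ignore file.

    Returns True if the file had to be created (the caller would then add
    it to version control), False if it already existed.
    The file name and layout are assumptions made for this sketch.
    """
    path = os.path.join(directory, filename)
    created = not os.path.exists(path)
    existing = []
    if not created:
        with open(path, encoding="utf-8") as f:
            existing = f.read().splitlines()
    if pattern not in existing:
        # Only append (and report) when the pattern is not already listed.
        with open(path, "a", encoding="utf-8") as f:
            f.write(pattern + "\n")
        print(f"{pattern} added to {path}")
    return created


# Usage: the first call creates the file, later calls only append.
with tempfile.TemporaryDirectory() as tree:
    add_ignore(tree, "*.pyc")
    add_ignore(tree, "*.o")
```

Removing an ignore is left to a text editor here, as the note above suggests.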

Docs should perhaps emphasize .arch-inventory as the basic method and
only mention =tagging-method as an advanced topic.

Should this really be regexps, or just file globs?
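
One way to weigh that question is to try both in a few lines of Python. The sketch below sorts names into the three states mentioned earlier (versioned, ignored, unknown) using plain globs from the standard ``fnmatch`` module, then shows the unescaped-dot mistake that regexps invite. The ``classify`` helper and the pattern list are hypothetical. ``fnmatch`` does not understand zsh-style alternation such as ``*.(pyo|pyc)``, so the two extensions are listed as separate globs.

```python
import fnmatch
import re

# Illustrative ignore list; alternation is spelled out as two globs.
IGNORE_GLOBS = ["*.pyc", "*.pyo", "*~"]


def classify(name, versioned, ignore_globs=IGNORE_GLOBS):
    """Return 'versioned', 'ignored' or 'unknown' for a file name."""
    if name in versioned:
        return "versioned"
    if any(fnmatch.fnmatchcase(name, glob) for glob in ignore_globs):
        return "ignored"
    return "unknown"


versioned = {"branch.py", "README"}
for name in ["branch.py", "branch.pyc", "notes.txt"]:
    print(name, classify(name, versioned))

# The regexp pitfall: an unescaped dot matches any character, so this
# pattern also accepts a name that has no ".pyc" extension at all.
unescaped = re.fullmatch(r".*.pyc", "xpyc") is not None
escaped = re.fullmatch(r".*\.pyc", "xpyc") is not None
glob_hit = fnmatch.fnmatchcase("xpyc", "*.pyc")
print(unescaped, escaped, glob_hit)
```

The glob treats the dot literally without any escaping, which is the practical argument for preferring globs when the extra power of regexps is not needed.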
