less work you do.

Avoiding work: avoiding reading parent data
-------------------------------------------

We would like to avoid the work of reading any data about the parent
revisions. We should at least try to avoid reading anything from the
repository; we can also consider whether it is possible or useful to hold
less parent information in the working tree.

When a commit of selected files is requested, the committed snapshot is a
composite of some directories from the parent revision and some from the
working tree. In this case it is logically necessary to have the parent
inventory information.

If file last-change information or per-file graph information is stored
then it must be available from the parent trees.

If the Branch's storage method does delta compression at commit time it
may need to retrieve file or inventory texts from the repository.

It is desirable to avoid roundtrips to the Repository during commit,
particularly because it may be remote. If the WorkingTree can determine
by itself that a text was in the parent and therefore should be in the
Repository that avoids one roundtrip per file.
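
A minimal sketch of that idea, assuming the working tree keeps a cached
mapping from file id to the content hash recorded in the parent tree. The
names ``texts_needing_upload`` and ``parent_sha1s`` are hypothetical and are
not part of any existing Bazaar API.

```python
import hashlib


def texts_needing_upload(working_texts, parent_sha1s):
    """Return the file ids whose text must be sent to the repository.

    working_texts: dict of file_id -> bytes currently in the working tree.
    parent_sha1s: dict of file_id -> hex sha1 cached from the parent tree
        (a hypothetical cache held by the working tree).
    """
    changed = []
    for file_id, text in sorted(working_texts.items()):
        sha1 = hashlib.sha1(text).hexdigest()
        # A text identical to the parent's is assumed to be in the
        # repository already, so no roundtrip is spent on it.
        if parent_sha1s.get(file_id) != sha1:
            changed.append(file_id)
    return changed
```

Only the ids returned here would need any repository traffic; everything
else is decided from information the tree already holds.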

There is a possibility here that the parent revision is not stored, or not
correctly stored, in the repository the tree is being committed into, and
so the committed tree would not be reconstructable. We could check that
the parent revision is present in the inventory and rely on the invariant
that if a revision is present, everything to reconstruct it will be

Complications of commit
-----------------------

Bazaar (as of 0.17) does not support selective-file commit of a merge;
this could be done if we decide how it should be recorded - is this to be
stored as an overall merge revision; as a preliminary non-merge revision;
or will the per-file graph diverge from the revision graph?

There are several conditions that may cause the commit to be refused; the
checks for them may be activated or deactivated by options.

* presence of conflicts in the tree

* presence of unknown files

* the working tree basis is not up to date with the branch tip

* the local branch is not up to date with the master branch, if there
  is one and --local is not specified

* an empty commit message is given

* a hook flags an error

* a "pointless" commit, with no inventory changes
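
As a rough illustration of how such option-controlled checks might be
expressed, here is a sketch. ``CommitRefused``, ``TreeState`` and
``check_commit_allowed`` are hypothetical names, and the facts in
``TreeState`` are assumed to have been gathered during a single walk of the
tree; the master-branch and hook checks are left out for brevity.

```python
from dataclasses import dataclass


class CommitRefused(Exception):
    """Raised when a pre-commit check fails (hypothetical exception)."""


@dataclass
class TreeState:
    """Facts gathered while walking the tree (hypothetical structure)."""
    has_conflicts: bool = False
    unknown_files: tuple = ()
    basis_is_branch_tip: bool = True
    has_changes: bool = True


def check_commit_allowed(state, message, strict=False, allow_pointless=False):
    """Raise CommitRefused for the first enabled check that fails."""
    if state.has_conflicts:
        raise CommitRefused("conflicts in the tree")
    if strict and state.unknown_files:
        raise CommitRefused("unknown files present")
    if not state.basis_is_branch_tip:
        raise CommitRefused("working tree is out of date with the branch")
    if not message:
        raise CommitRefused("empty commit message")
    if not allow_pointless and not state.has_changes:
        raise CommitRefused("pointless commit: no changes")
```

Raising an exception keeps the checks independent of how the commit is
aborted once tree changes have started to be recorded.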

Most of these require walking the tree and can be easily done while
recording the tree shape. This does require that it be possible to abort
the commit after the tree changes have been recorded. It could be ok to
either leave the unreachable partly-committed records in the repository,

* when automatically adding new files or deleting missing files during
  commit, they must be noted during commit and written into the working

* refuse "pointless" commits with no file changes - should be easy by
  just refusing to do the final step of storing a new overall inventory

* heuristic detection of renames between add and delete (out of scope for

* pushing changes to a master branch if any

* running hooks, pre and post commit

* prompting for a commit message if necessary, including a list of the
  changes that have already been observed

* if there are tree references and recursing into them is enabled, then

Updates that need to be made in the working tree, either on conclusion
of commit or during the scan, include

* Changes made to the tree shape, including automatic adds, renames or

* For trees (eg dirstate) that cache parent inventories, the old parent
  information must be removed and the new one inserted

* The tree hashcache information should be updated to reflect the stat
  value at which the file was the same as the committed version, and the
  content hash it was observed to have. This needs to be done carefully to
  prevent inconsistencies if the file is modified during or shortly after
  the commit. Perhaps it would work to read the mtime of the file before we
  read its text to commit.
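
The stat-before-read idea from the last item can be sketched as follows.
This is only an illustration: ``read_for_commit`` is a hypothetical helper,
the (size, mtime) fingerprint is a simplification of what a real hashcache
would store, and the two-second safety margin is an assumed value.

```python
import hashlib
import os
import time


def read_for_commit(path, now=None):
    """Return (sha1_hex, cacheable_fingerprint_or_None) for the file at path.

    The stat is taken before the text is read. The fingerprint is only
    returned for caching if it is unchanged after the read and the mtime is
    not so recent that a later same-timestamp write could go unnoticed.
    """
    before = os.stat(path)
    with open(path, "rb") as f:
        sha1 = hashlib.sha1(f.read()).hexdigest()
    after = os.stat(path)
    now = time.time() if now is None else now
    fingerprint = (before.st_size, before.st_mtime)
    unchanged = fingerprint == (after.st_size, after.st_mtime)
    # Assumed safety margin; filesystems differ in mtime granularity.
    settled = before.st_mtime < now - 2.0
    return sha1, (fingerprint if unchanged and settled else None)
```

When the second element is ``None`` the file would simply be re-hashed the
next time it is examined, which is slower but never inconsistent.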

* option for local-only commit on a bound branch

* option for strict commits (fail if there are unknown or missing files)

* option to allow "pointless" commits (with no tree changes)

(This is rather a lot of options to pass individually; for code tidiness,
some of them should perhaps be combined into objects.)

>>> Branch.commit(from_tree, message, files_to_commit, ...)
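
One way the options might be grouped into a single object is sketched
below. ``CommitOptions`` and ``describe_commit`` are hypothetical names used
only for illustration; the stand-in function does not commit anything and
merely reports what it was asked to do.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class CommitOptions:
    """Groups the boolean commit options (hypothetical class)."""
    local: bool = False            # local-only commit on a bound branch
    strict: bool = False           # fail on unknown or missing files
    allow_pointless: bool = False  # permit commits with no tree changes


def describe_commit(message, files_to_commit=None, options=CommitOptions()):
    """Stand-in for a commit entry point; only reports what was requested."""
    scope = "all files" if files_to_commit is None else sorted(files_to_commit)
    return {"message": message, "scope": scope, "options": options}
```

Because the dataclass is frozen, a caller derives a variant with
``replace(CommitOptions(), strict=True)`` rather than mutating shared state.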

There will be different implementations of this for different Branch
classes, whether for foreign branches or Bazaar repositories using

For a dirstate tree the iteration of changes from the parent can easily be
done within its own iter_changes.

Dirstate inventories may be most easily updated in a single operation at
the end; however it may be best to accumulate data as we proceed through
the tree rather than revisiting it at the end.

In the 0.17 model the commit operation needs to know the per-file parents
and per-file last-changed revision.

(In this and other operations we must avoid having multiple layers walk
over the tree separately. For example, it is no good to have the Command
layer walk the tree to generate a list of all file ids to commit, because
the tree will also be walked later. The layers that do need to operate
per-file should probably be bound together in a per-dirblock iterator,
rather than each iterating independently.)
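
A sketch of binding several per-file consumers to one walk, using the
filesystem as a stand-in for a working tree. ``walk_dirblocks`` and
``walk_once`` are hypothetical helpers, not existing Bazaar functions.

```python
import os


def walk_dirblocks(root):
    """Yield (relative_dir, sorted_file_names) once per directory."""
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make the traversal order deterministic
        rel = os.path.relpath(dirpath, root)
        yield ("" if rel == "." else rel), sorted(filenames)


def walk_once(root, handlers):
    """Feed every directory block to each handler in a single walk."""
    for dirblock in walk_dirblocks(root):
        for handler in handlers:
            handler(dirblock)
```

Each layer contributes a handler instead of starting its own traversal, so
the tree is read from disk only once however many layers need per-file
information.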

Branch->Tree interface
----------------------

The Branch commit code needs to ask the Tree what should be committed, in
terms of changes from the parent revisions. If the Tree holds all the
necessary parent tree information itself it can do it single handed;
otherwise it may need to ask the Repository for parent information.

This should be a streaming interface, probably like iter_changes returning
information per directory block.

The interface should not return a block for directories that are
recursively unchanged.
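
The shape of such an interface can be sketched with a generator over a toy
in-memory model. Everything here is an assumption made for illustration:
trees are nested dicts, ``iter_changed_blocks`` is a hypothetical name, and
the real iter_changes returns richer per-entry information.

```python
def iter_changed_blocks(parent, working, path=""):
    """Yield (dir_path, [(name, change), ...]) for each changed directory.

    Trees are nested dicts: a dict value is a subdirectory, any other value
    stands for a file's content hash (hypothetical in-memory model).
    """
    if parent == working:
        return  # recursively unchanged: emit no block for this subtree
    entries = []
    subdirs = []
    for name in sorted(set(parent) | set(working)):
        old, new = parent.get(name), working.get(name)
        if old == new:
            continue
        if old is None:
            change = "added"
        elif new is None:
            change = "removed"
        else:
            change = "modified"
        entries.append((name, change))
        if isinstance(old, dict) or isinstance(new, dict):
            subdirs.append(name)
    yield path, entries
    for name in subdirs:
        old, new = parent.get(name), working.get(name)
        yield from iter_changed_blocks(
            old if isinstance(old, dict) else {},
            new if isinstance(new, dict) else {},
            f"{path}/{name}" if path else name,
        )
```

In this sketch a directory containing a change is itself reported as
modified in its parent's block, so the consumer sees every ancestor of a
changed file without asking for it.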

The tree's idea of what is possibly changed may be more conservative than
that of the branch. For example the tree may report on merges of files
where the text is identical to the parents: this must be recorded for
Bazaar branches that record per-file ancestry but is not necessary for all
branches. If the tree is responsible for determining when directories have
been recursively modified then it will report on all the parents of such
files. There are several implementation options:

1. Return all files and directories the branch might want to commit, even
   if the branch ends up taking no action on them.

2. When starting the iteration, the branch can specify what type of change
   is considered interesting.

Since these types of changes are probably (??) rare compared to files that
are either completely unmodified or substantially modified, the first may
be the best and simplest option.

Open question: per-file graphs
------------------------------

**XXX:** If we want to retain explicitly stored per-file graphs, it would
seem that we do need to record per-file parents. We have not yet finally
settled that we do want to remove them or treat them as a cache. This api
stack is still ok whether we do or not, but the internals of it may