There are several variants of serialised tree shape in use by bzr. To date
these have been mostly XML-based, though plugins have offered non-XML versions.
for the working tree and one for each parent tree, interleaved to allow
efficient diff and status operations.
All the XML serialized forms write to and read from a single byte string, whose
hash is then the inventory validator for the commit object.
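A rough sketch of that relationship - the helper below is illustrative, not
bzrlib's actual API; it simply hashes the serialised byte string, which is
what stands in for the validator recorded against the commit object::

  import hashlib

  def inventory_validator(inventory_bytes):
      # ``inventory_bytes`` is assumed to be the single byte string produced
      # by one of the XML serialisers; the hex digest plays the role of the
      # inventory validator for the commit object.
      return hashlib.sha1(inventory_bytes).hexdigest()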
that an inventory is stored as. We have a number of goals we want to achieve:
1. Allow commit to write less than the full tree's data into the repository
2. Allow the data that is written to be calculated without examining every
versioned path in the tree.
3. Generate the exact same representation for a given inventory regardless of
The XML-based implementation we use today layers the inventory as a bytestring
which is stored under a single key; the bytestring is then compressed as a
delta against the bytestring of its left-hand parent by the knit code.
2. Fails - generating a new XML representation needs full tree data.
3. Succeeds - the inventory layer accesses the bytestring, which is
4. Fails - we have to reconstruct both inventories as trees and then delta
-------------------------------------------------------------------------
* Split up the logical document into smaller serialised fragments. For
  instance hash buckets or nodes in a tree of some sort. By serialising in
  smaller units, we can increase the number of smaller units rather than
  their size as the tree grows; as long as two similar trees have similar
  serialised forms, the amount of content shared between them should be
  quite high.
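As a concrete, simplified sketch of the hash-bucket variant - the bucket
count, key hashing and fragment format here are assumptions, not a proposed
on-disk format::

  import hashlib

  def fragment_inventory(entries, num_buckets=256):
      # ``entries`` maps file_id (a byte string) to that entry's serialised
      # bytes.  Each file id hashes to a stable bucket, so two similar trees
      # produce mostly identical fragments and only a few differing ones.
      buckets = {}
      for file_id in sorted(entries):
          bucket = int(hashlib.sha1(file_id).hexdigest(), 16) % num_buckets
          buckets.setdefault(bucket, []).append(entries[file_id])
      # Each bucket becomes one independently stored fragment.
      return dict((bucket, b'\n'.join(parts))
                  for bucket, parts in buckets.items())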
* Working tree to arbitrary history revision deltas/comparisons can be scaled
  up by doing a two-step (fixed at two!) delta combining - delta(tree, basis)
  and then combining that with delta(basis, arbitrary_revision) using the
  repository's ability to get a delta cheaply.
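One way to picture that two-step combining - the delta representation here,
a dict of file_id to (old_entry, new_entry) pairs, is an illustrative
assumption rather than bzrlib's inventory delta format::

  def combine_deltas(delta_old_to_basis, delta_basis_to_new):
      # Combine delta(old, basis) with delta(basis, new) into delta(old, new).
      # ``None`` entries mark an item absent from that side.
      combined = dict(delta_old_to_basis)
      for file_id, (basis_entry, new_entry) in delta_basis_to_new.items():
          if file_id in combined:
              old_entry = combined[file_id][0]
          else:
              old_entry = basis_entry
          if old_entry == new_entry:
              # The second delta undid the first; drop the item entirely.
              combined.pop(file_id, None)
          else:
              combined[file_id] = (old_entry, new_entry)
      return combined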
* The key primitives we need seem to be:
The path and content maps are populated simply by serialising every inventory
entry and inserting it into both the path map and the content map. The maps
start with just a single leaf node with an empty prefix.
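A minimal sketch of that population step - the map keying (path for the path
map, file id for the content map) and the plain dicts standing in for
single-leaf CHK maps are assumptions made for illustration::

  def populate_maps(inventory_entries, serialise_entry):
      # ``inventory_entries`` yields (path, entry) pairs and
      # ``serialise_entry`` turns one entry into its byte form.
      path_map = {}      # starts as one leaf with an empty prefix
      content_map = {}   # likewise
      for path, entry in inventory_entries:
          entry_bytes = serialise_entry(entry)
          path_map[path] = entry_bytes
          content_map[entry.file_id] = entry_bytes
      return path_map, content_map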
formerly linked. (This will normally bubble down due to keeping densely
To shrink the prefix of a leaf node, create an internal node with the same
prefix, then choose a width for the internal node such that the contents
of the leaf all fit into new leaves obeying the min_size and max_size rules.
The largest prefix possible should be chosen, to obey the
higher-nodes-are-denser rule. That rule also gives room in leaf nodes for
growth without affecting the parent node packing.
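A sketch of that prefix-shrinking step - the helper names, the size test and
the widest-first search are assumptions, not the real CHK map code::

  def shrink_leaf_prefix(leaf_items, prefix, min_size, max_size, size_of):
      # ``leaf_items`` maps full keys (byte strings sharing ``prefix``) to
      # values; ``size_of`` estimates the serialised size of one candidate
      # leaf's items.  Returns (width, children) where ``children`` maps each
      # new child prefix to the items of one new leaf.
      longest = max(len(key) for key in leaf_items)
      # Prefer the largest child prefix (higher-nodes-are-denser), backing
      # off until every new leaf obeys the min_size and max_size rules.
      for width in range(longest - len(prefix), 0, -1):
          children = {}
          for key, value in leaf_items.items():
              children.setdefault(key[:len(prefix) + width], {})[key] = value
          if all(min_size <= size_of(items) <= max_size
                 for items in children.values()):
              return width, children
      raise ValueError("no usable split width for prefix %r" % prefix)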
#. Update the CHK pointers - serialise every altered node to generate a CHK,
   and update the CHK placeholder in the node's parent; then reserialise the
Currently we attempt to expand/interpret the request so that the user is not
required to understand all the internal constraints of the system: if they
request 'foo/bar' we automatically include foo. This works but can surprise
the user when things they didn't explicitly request are committed, and on
review we have been creating inconsistent deltas by the way we do this. We
need to avoid all the known causes of inconsistency in our delta creation
logic.
Different trees can use different algorithms to expand the request as long as
they produce consistent deltas. As part of getting a consistent UI we require
that all trees expand the paths requested downwards. Beyond that, as long as
the delta is consistent, it is up to the tree.
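The required downward expansion itself is simple; a sketch follows, where
paths are '/'-separated and relative to the tree root, standing in for
whatever each tree implementation actually does::

  def expand_request_downwards(requested_paths, all_tree_paths):
      # Everything at or below a requested path is selected.
      requested = set(requested_paths)
      expanded = set()
      for path in all_tree_paths:
          for req in requested:
              if path == req or path.startswith(req + '/'):
                  expanded.add(path)
                  break
      return expanded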
Given two trees, source and target, and a set of selected file ids to check
for changes (and to include in a delta between them if changed), we have to
expand that set by the following rules to get consistent deltas. The test for
consistency is that if the resulting delta is applied to source, to create a
third tree 'output', and the paths in the delta match the paths in source and
output, only one file id is at each path in output, and no file ids are
missing parents, then the delta is consistent.
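A sketch of checking those consistency conditions - the shapes of ``source``
(file_id to (path, parent_id)) and of the delta rows are assumptions for
illustration, not the real inventory delta format::

  def is_consistent_delta(source, delta):
      output = dict(source)
      for old_path, new_path, file_id, parent_id in delta:
          # Old paths named in the delta must match what source says.
          if old_path is not None:
              if file_id not in source or source[file_id][0] != old_path:
                  return False
          if new_path is None:
              output.pop(file_id, None)
          else:
              output[file_id] = (new_path, parent_id)
      # Only one file id may end up at each path in output.
      paths = [path for path, _ in output.values()]
      if len(paths) != len(set(paths)):
          return False
      # No file id may be left with a parent missing from output.
      for path, parent_id in output.values():
          if parent_id is not None and parent_id not in output:
              return False
      return True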
Firstly, the parent ids to the root for all of the file ids that have actually
changed must be considered. Unless they are all examined the paths in the delta
Secondly, when an item included in the delta has a new path which is the same
as a path in source, the file id of that path in source must be included.
Failing to do this leads to multiple ids trying to share a path in output.
Thirdly, when an item changes its kind from 'directory' to anything else in the
delta, all of the direct children of the directory in source must be included.
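Putting the three rules together, a sketch of the expansion - the tree
methods used here (``has_id``, ``parent_id``, ``path``, ``id_at_path``,
``kind``, ``children``) are illustrative names rather than the real Tree API,
and for brevity the rules are applied to every selected id rather than only
to the ids that actually changed::

  def expand_selected_ids(selected_ids, source, target):
      expanded = set(selected_ids)
      for file_id in list(selected_ids):
          # Rule 1: include the parent ids up to the root in both trees.
          for tree in (source, target):
              if not tree.has_id(file_id):
                  continue
              parent = tree.parent_id(file_id)
              while parent is not None and parent not in expanded:
                  expanded.add(parent)
                  parent = tree.parent_id(parent)
          # Rule 2: if the new path is already occupied in source, include
          # the file id occupying it, so ids do not fight over one path.
          if target.has_id(file_id):
              occupant = source.id_at_path(target.path(file_id))
              if occupant is not None:
                  expanded.add(occupant)
          # Rule 3: a directory changing kind drags in its direct children
          # from source.
          if (source.has_id(file_id) and target.has_id(file_id)
                  and source.kind(file_id) == 'directory'
                  and target.kind(file_id) != 'directory'):
              expanded.update(source.children(file_id))
      return expanded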