============================================
Merge Directive format 2 and Bundle format 4
============================================

Format 4 is designed to be a compact format for storing revision metadata that
can be generated quickly and installed into a repository efficiently. It is
not intended to be human-readable; that responsibility has been given to merge

This is the fourth format to see public use. Previous versions were 0.7, 0.8,
and 0.9. Only 0.7's version number was aligned with a Bazaar release.

Merge Directive format 2 represents a request to perform a certain merge. It
provides access to all the data necessary to perform that merge, by including
a branch URL or a bundle payload. It typically will include a preview of
what applying the patch would do.

Bundle Format 4 is designed to be a compact format for storing revision
metadata that can be generated quickly and installed into a repository
efficiently. It is not intended to be human-readable.

These two formats, taken together, can be viewed as the successor of Bundle
format 0.9, so their specifications are combined. It is expected that in the
future, bundle and merge-directive formats will vary independently.

This is the fourth bundle format to see public use. Previous versions were
0.7, 0.8, and 0.9. Only 0.7's version number was aligned with a Bazaar

- Container format 1
- Multiparent diffs

This format was designed to trade human-readability for speed and compactness.
It does not contain a human-readable "prelude" patch.

Relationship to merge directives
--------------------------------

A merge directive specifies a merge command to apply and a preview of what that
command would do. Merge directives may contain a format-4 bundle. The
bundle's job is to provide the data needed to perform that merge command.

It is recommended that the bundle be provided in a bzip-compressed,
mime64-encoded format, to ensure compactness and resistance to email-transport

A preview/overview patch may be provided by the merge directive.

Merge Directives fulfil the role previous bundle formats had of requesting a
merge to be performed, but are a more flexible way of doing so. With the
introduction of these two formats, there is a clear split between "directive",
which is a request to merge (and therefore signable), and "bundle", which is

Merge Directive format 2 may provide a patch preview of the change being
requested. If a preview is supplied, the receiving client will verify that
the actual change matches the preview.

Merge Directive format 2 also includes a testament hash, to ensure that if a
branch is used, the branch cannot be subverted to cause the wrong changes to be

Bundle format 4 is designed to trade human-readability for speed and
compactness. It does not contain a human-readable "prelude" patch.

Merge Directive 2 Contents
--------------------------

This format consists of three sections, in the following order.

Patch-RIO command section
~~~~~~~~~~~~~~~~~~~~~~~~~

This section is identical to the corresponding section in Format 1 merge
directives, except as noted below. It is mandatory. It is terminated by a
line reading ``#`` that is not preceded by a line ending with ``\``.
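
The termination rule above can be sketched as a small scan over the lines of a
directive. This is only an illustration: the helper name
``find_command_section_end`` and the sample lines are invented for the example
and are not taken from the Bazaar source.

```python
def find_command_section_end(lines):
    """Return the index of the line that terminates the command section.

    The terminator is a line reading "#" whose predecessor does not end
    with a backslash. Returns None when no such line exists.
    """
    previous = None
    for index, line in enumerate(lines):
        stripped = line.rstrip("\n")
        continued = previous is not None and previous.endswith("\\")
        if stripped == "#" and not continued:
            return index
        previous = stripped
    return None


# Hypothetical directive fragment, made up for this sketch.
sample = [
    "# revision_id: example@example.com-1\\\n",
    "#\n",  # follows a line ending in a backslash, so not a terminator
    "# base_revision_id: example@example.com-0\n",
    "#\n",  # terminates the section
    "# Begin patch\n",
]
end_index = find_command_section_end(sample)
```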

This format adds a new piece of information, the ``base_revision_id``. This is
a suggested base revision for merging. It may be supplied by the user. If
not, it is calculated using the standard merge base algorithm, with the
``revision_id`` and target branch's ``last_revision`` as its inputs.

When merging, clients should use the ``base_revision_id`` when it is not
already present in the ancestry of the ``last_revision`` of the target branch.
If it is already present, clients should calculate a merge base in the normal

This section is optional. It begins with the line ``# Begin patch``. It is
terminated by the end-of-file or by the beginning of a bundle section.

Its contents are a unified diff, as per the ``bzr diff`` command. The FROM
revision is the ``base_revision_id`` specified in the Patch-RIO section.
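
As a rough sketch of how a reader might separate the sections using the
``# Begin patch`` and ``# Begin bundle`` marker lines, consider the following.
The function name ``split_directive`` and the sample text are hypothetical, and
the sketch assumes the bundle section runs to the end of the file. It also
leaves the command section's ``#`` terminator line in place rather than
interpreting it.

```python
PATCH_MARKER = "# Begin patch"
BUNDLE_MARKER = "# Begin bundle"


def split_directive(text):
    """Split directive text into command, patch and bundle line lists.

    Optional sections that are absent are reported as None.
    """
    sections = {"command": [], "patch": None, "bundle": None}
    current = "command"
    for line in text.splitlines():
        if line == PATCH_MARKER and current == "command":
            current = "patch"
            sections["patch"] = []
        elif line == BUNDLE_MARKER and current in ("command", "patch"):
            current = "bundle"
            sections["bundle"] = []
        else:
            sections[current].append(line)
    return sections


# Hypothetical input, invented for this sketch.
example = (
    "# revision_id: example@example.com-1\n"
    "#\n"
    "# Begin patch\n"
    "=== modified file 'README'\n"
    "# Begin bundle\n"
    "QmFzZTY0\n"
)
parts = split_directive(example)
```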

This section is optional, but if it is not supplied, a source_branch must be
supplied. It begins with the line ``# Begin bundle``, and is terminated by the

The contents are a base-64 encoded bundle. This may be any bundle format, but
formats 4+ are strongly recommended. The base revision is the newest revision
in the source branch which is an ancestor of all revisions not present in
target which are ancestors of revision_id.

This base revision may or may not be the same as the ``base_revision_id``. In
particular, the ``base_revision_id`` may specify a cherry-pick, but all the
ancestors of the ``base_revision_id`` should be installed in the target
repository before performing such a merge.

Bazaar revision bundles begin with a format marker that reads
``# Bazaar revision bundle v4`` in plaintext. The remainder of the file is a
``Bazaar pack format 1`` container. The container is compressed using bzip2.

Putting the format marker in plaintext ensures that old clients will give good
diagnostics, but renders the file unreadable by standard bzip2 utilities.
``bzr bundle-info -v`` can be used to dump the unencoded output.
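
The layout described above (plaintext marker, then a bzip2-compressed
container) might be read along the following lines. This is a sketch under
stated assumptions, not Bazaar's reader: it assumes the marker is the entire
first line and that everything after that line's newline is a single bzip2
stream. Real bundles may carry further plaintext header bytes that this sketch
does not handle. The name ``read_container_bytes`` is invented here.

```python
import bz2

FORMAT_MARKER = b"# Bazaar revision bundle v4"


def read_container_bytes(data):
    """Check the plaintext marker and return the decompressed remainder.

    Assumption: the bytes after the first newline form one bzip2 stream.
    """
    first_line, newline, remainder = data.partition(b"\n")
    if first_line != FORMAT_MARKER or not newline:
        raise ValueError("not a format 4 bundle: %r" % first_line[:40])
    return bz2.decompress(remainder)


# Round trip with a stand-in payload instead of a real pack container.
payload = b"stand-in for a Bazaar pack format 1 container"
blob = FORMAT_MARKER + b"\n" + bz2.compress(payload)
recovered = read_container_bytes(blob)
```

Because the marker precedes the compressed stream, passing ``blob`` directly to
``bz2.decompress`` fails, which matches the note that standard bzip2 utilities
cannot read the file.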

Format 4 records revision and inventory records in their repository
serialization format. This minimizes translation and compression costs
in the common case, where the sender and receiver use the same serialization
format for their repository. Steps have been taken to ensure a faithful
conversion when serialization formats are mismatched.

The bundle format creates a single bundle-level record out of two container
records. The first container record contains metainfo as a Bencoded dict. The
second container record contains the body.

The bundle record name is associated with the metainfo record. The body record

The bundle format creates a single meta-record out of two. The first record
contains metainfo as a Bencoded dict. The second record contains the body.

:record_kind: The storage strategy of the record. May be "fulltext" (the
  record body contains the full text of the value), "mpdiff" (the record body
  contains a multi-parent diff of the value), or "header" (no record body).

:record_kind: The storage strategy of the record. May be ``fulltext`` (the
  record body contains the full text of the value), ``mpdiff`` (the record
  body contains a multi-parent diff of the value), or ``header`` (no record
:parents: Used in fulltext and mpdiff records. The revisions that should be
  noted as parents of this revision in the repository. For mpdiffs, this is
  also the list of build-parents.
:sha1: Used in mpdiff records. The sha-1 hash of the full-text value.
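
To make the "Bencoded dict" concrete, here is a minimal bencode encoder applied
to a metainfo dictionary with the fields listed above. The encoder follows the
general bencoding rules (integers, byte strings, lists, and dicts with sorted
keys). Representing ``parents`` as a list of byte strings and the revision ids
shown are assumptions made for illustration.

```python
def bencode(value):
    """Encode ints, bytes, lists/tuples and dicts using bencoding."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, (list, tuple)):
        return b"l" + b"".join(bencode(item) for item in value) + b"e"
    if isinstance(value, dict):
        # Bencoded dicts store their keys in sorted order.
        body = b"".join(bencode(key) + bencode(value[key])
                        for key in sorted(value))
        return b"d" + body + b"e"
    raise TypeError("cannot bencode %r" % type(value))


# Hypothetical metainfo for an mpdiff record.
metainfo = {
    b"record_kind": b"mpdiff",
    b"parents": [b"rev-1"],
}
encoded = bencode(metainfo)
```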

Bundle record naming
~~~~~~~~~~~~~~~~~~~~~

All bundle records have a single name, which is associated with the metainfo.
(The body records are anonymous). Records are named according to the body's
content-kind, revision-id, and file-id.

All bundle records have a single name, which is associated with the metainfo
container record. Records are named according to the body's content-kind,
revision-id, and file-id.
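
One way to picture such a name is as the three components joined by a
separator. The sketch below is purely illustrative: the ``/`` separator, the
helper names, and the sample ids are assumptions that the text above does not
confirm.

```python
SEPARATOR = "/"  # assumed separator, chosen only for illustration


def encode_name(content_kind, revision_id=None, file_id=None):
    """Build a record name from content-kind, revision-id and file-id."""
    if file_id is not None and revision_id is None:
        raise ValueError("a file_id requires a revision_id")
    parts = [content_kind]
    parts.extend(part for part in (revision_id, file_id) if part is not None)
    for part in parts:
        if SEPARATOR in part:
            raise ValueError("component contains the separator: %r" % part)
    return SEPARATOR.join(parts)


def decode_name(name):
    """Return (content_kind, revision_id, file_id), padding with None."""
    parts = name.split(SEPARATOR)
    if len(parts) > 3:
        raise ValueError("too many components in %r" % name)
    parts.extend([None] * (3 - len(parts)))
    return tuple(parts)


# Hypothetical identifiers.
name = encode_name("file", "rev-1", "readme-id")
```

A ``header`` record such as the info record, which has no revision-id or
file-id, would then be named by its content-kind alone.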

Content-kind may be one of:

The next records are revision and signature fulltexts. They are interleaved
and topologically sorted.

The info record has type ``header``. It has no revision_id or file_id.
Its metadata contains:

:serializer: A string describing the serialization format used for inventory
  and revision data. May be ``xml5``, ``xml6`` or ``xml7``.
:supports_rich_root: 1 if the source repository supports rich roots,
:supports_tree_references: 1 if the source repository supports subtree
  references, 0 otherwise.

Implementation notes

- knit deltas contain almost enough information to extract the original
  SequenceMatcher.get_matching_blocks() call used to produce them. Combining
  that information with the relevant fulltexts allows us to avoid performing
  sequence matching on any fulltexts for which we have deltas.

- MultiParent deltas contain get_matching_blocks output almost verbatim, but
  if there is more than one parent, the information about the leftmost parent
  may be incomplete. However, for single-parent multiparent diffs, we can
  extract the SequenceMatcher.get_matching_blocks output, and therefore
  the SequenceMatcher.get_opcodes output used to create knit deltas.

- MultiParent deltas contain ``get_matching_blocks`` output almost verbatim,
  but if there is more than one parent, the information about the leftmost
  parent may be incomplete. However, for single-parent multiparent diffs, we
  can extract the ``SequenceMatcher.get_matching_blocks`` output, and therefore
  the ``SequenceMatcher.get_opcodes`` output used to create knit deltas.
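
The step from ``get_matching_blocks`` output to ``get_opcodes`` output can be
illustrated with the standard library's ``difflib.SequenceMatcher``. This is a
sketch that assumes the matcher Bazaar uses exposes the same two methods as
``difflib``. The helper ``opcodes_from_matching_blocks`` is written for this
example, and the sample texts are made up.

```python
from difflib import SequenceMatcher


def opcodes_from_matching_blocks(blocks):
    """Derive (tag, i1, i2, j1, j2) opcodes from matching blocks."""
    opcodes = []
    i = j = 0
    for a_start, b_start, size in blocks:
        # Anything between the previous block and this one is a change.
        if i < a_start and j < b_start:
            opcodes.append(("replace", i, a_start, j, b_start))
        elif i < a_start:
            opcodes.append(("delete", i, a_start, j, b_start))
        elif j < b_start:
            opcodes.append(("insert", i, a_start, j, b_start))
        i, j = a_start + size, b_start + size
        # The final block is a zero-length sentinel and yields no opcode.
        if size:
            opcodes.append(("equal", a_start, i, b_start, j))
    return opcodes


old_text = ["a\n", "b\n", "c\n", "d\n"]
new_text = ["a\n", "x\n", "c\n", "d\n", "e\n"]
matcher = SequenceMatcher(None, old_text, new_text)
derived = opcodes_from_matching_blocks(matcher.get_matching_blocks())
```

With the matching blocks in hand, no further sequence matching against the
fulltexts is needed to obtain the opcodes.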

Installing data across serialization mismatches
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This is much slower, of course. But since the fulltext is verified
at step 5, it should be just as safe as any other conversion.

Note that there may be model differences requiring additional changes. These
differences are described by the "supports_rich_root" and
"supports_tree_references" values in the info record.

A subset of xml6 and xml7 records are compatible with xml5 (i.e. those that
were converted from xml5 originally).

When installing from a supports_tree_references bundle to a repository that
does not support tree references, clients should halt if they encounter a
record containing a tree reference.

When installing from a supports_rich_root bundle to a repository that does not
support rich roots, clients should halt if they encounter an inventory record
whose root directory revision-id does not match the inventory revision id.

When installing from a bundle that does not support rich roots to a repository
that does, additional knits should be added for the root directory, with a
revision for each inventory revision.
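
The three rules above can be summarised as a single check per inventory
record. The sketch below is hypothetical throughout: the exception class, the
function name, and the record dictionary keys (``revision_id``,
``root_revision_id``, ``has_tree_reference``) are invented for illustration
and do not reflect Bazaar's actual data structures.

```python
class IncompatibleRecord(Exception):
    """Raised when a record cannot be installed into the target."""


def check_inventory_record(info, target_rich_root, target_tree_refs, record):
    """Apply the halting rules; return True if root knits must be added.

    info holds the bundle's supports_rich_root and
    supports_tree_references flags (1 or 0) from the info record.
    """
    if info["supports_tree_references"] and not target_tree_refs:
        if record["has_tree_reference"]:
            raise IncompatibleRecord("record contains a tree reference")
    if info["supports_rich_root"] and not target_rich_root:
        if record["root_revision_id"] != record["revision_id"]:
            raise IncompatibleRecord("root revision differs from inventory")
    # A plain-root bundle going into a rich-root repository needs extra
    # root directory knits, one revision per inventory revision.
    return bool(target_rich_root and not info["supports_rich_root"])


# Hypothetical values for a quick check.
plain_info = {"supports_rich_root": 0, "supports_tree_references": 0}
plain_record = {"revision_id": "rev-1", "root_revision_id": "rev-1",
                "has_tree_reference": False}
needs_root_knits = check_inventory_record(plain_info, True, False,
                                          plain_record)
```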