~bzr-pqm/bzr/bzr.dev

2499.4.2 by Andrew Bennetts
Use standard heading markers.
1
================
2499.4.1 by Andrew Bennetts
First draft of container format developer doc, based on discussion with Robert.
2
Container format
2499.4.2 by Andrew Bennetts
Use standard heading markers.
3
================
2499.4.1 by Andrew Bennetts
First draft of container format developer doc, based on discussion with Robert.
4
5
Status
6
======
7
2499.4.3 by Andrew Bennetts
Updates in response to feedback on mailing list and in person.
8
:Date: 2007-06-07
2499.4.1 by Andrew Bennetts
First draft of container format developer doc, based on discussion with Robert.
9
10
This document describes the proposed container format for streaming and
11
storing collections of data in Bazaar.  Initially this will be used for
12
streaming revision data for incremental push/pull in the smart server for
2499.4.4 by Andrew Bennetts
Typos and clarfications thanks to Aaron.
13
0.18, but the intention is that this will be the basis for much more
2499.4.1 by Andrew Bennetts
First draft of container format developer doc, based on discussion with Robert.
14
than just that use case.
15
16
In particular, this document currently focuses almost exclusively on the
2499.4.3 by Andrew Bennetts
Updates in response to feedback on mailing list and in person.
17
streaming case, and not the on-disk storage case.  It also does not
18
discuss the APIs used to manipulate containers and their records.
19
2499.4.1 by Andrew Bennetts
First draft of container format developer doc, based on discussion with Robert.
20
21
.. contents::
22
2499.4.3 by Andrew Bennetts
Updates in response to feedback on mailing list and in person.
23
24
Motivation
25
==========
26
27
To create a low-level file format which is suitable for solving the smart
28
server latency problem and whose layout and requirements are extendable in
29
future versions of Bazaar, and with no requirements that the smart server
30
does not have today.
31
32
33
Terminology
34
===========
35
36
A **container** is a streamable file that contains a series of
37
**records**.  Records may have **names**, and consist of bytes.
2499.4.1 by Andrew Bennetts
First draft of container format developer doc, based on discussion with Robert.
38
39
40
Use Cases
41
=========
42
43
Here's a brief description of use cases this format is intended to
44
support.
45
46
Streaming data between a smart server and client
2499.4.2 by Andrew Bennetts
Use standard heading markers.
47
------------------------------------------------
2499.4.1 by Andrew Bennetts
First draft of container format developer doc, based on discussion with Robert.
48
49
It would be nice if we could combine multiple containers into a single
50
stream by something no more expensive than concatenation (e.g. by omitting
2499.4.4 by Andrew Bennetts
Typos and clarfications thanks to Aaron.
51
end/start marker pairs).
52
53
This doesn't imply that such a combination necessarily produces a valid
54
container (e.g. care must be taken to ensure that names are still unique
55
in the combined container), or even a useful container.  It is simply that
56
the cost of assembling a new combined container is practically as cheap as
57
simple concatenation.
2499.4.1 by Andrew Bennetts
First draft of container format developer doc, based on discussion with Robert.
58
59
Incremental push or pull
2499.4.2 by Andrew Bennetts
Use standard heading markers.
60
~~~~~~~~~~~~~~~~~~~~~~~~
2499.4.1 by Andrew Bennetts
First draft of container format developer doc, based on discussion with Robert.
61
62
Consider the use case of incremental push/pull, which is currently (0.16)
63
very slow on high-latency links due to the large number of round trips.
64
What we'd like is something like the following.
65
66
A client will make a request meaning "give me the knit contents for these
67
revision IDs" (how the client determines which revision IDs it needs is
68
unimportant here).  In response, the server streams a single container of:
69
70
  * one record per file-id:revision-id knit gzip contents and graph data,
71
  * one record per inventory:revision-id knit gzip contents and graph
72
    data,
73
  * one record per revision knit gzip contents,
74
  * one record per revision signature,
75
  * end marker record.
76
77
in that order.
78
79
Persistent storage on disk
2499.4.2 by Andrew Bennetts
Use standard heading markers.
80
--------------------------
2499.4.1 by Andrew Bennetts
First draft of container format developer doc, based on discussion with Robert.
81
82
We want a storage format that allows lock-free writes, which suggests a
83
format that uses *rename into place*, and *do not modify after writing*.
84
85
Usable before deep model changes to Bazaar
2499.4.2 by Andrew Bennetts
Use standard heading markers.
86
------------------------------------------
2499.4.1 by Andrew Bennetts
First draft of container format developer doc, based on discussion with Robert.
87
88
We want a format we can use and refine sooner rather than later.  So it
89
should be usable before the anticipated model changes for Bazaar "1.0"
90
land, while not conflicting with those changes either.
91
92
Specifically, we'd like to have this format in Bazaar 0.18.
93
94
Examples of possible record content
2499.4.3 by Andrew Bennetts
Updates in response to feedback on mailing list and in person.
95
-----------------------------------
2499.4.1 by Andrew Bennetts
First draft of container format developer doc, based on discussion with Robert.
96
2499.4.3 by Andrew Bennetts
Updates in response to feedback on mailing list and in person.
97
  * full texts of file versions
2499.4.1 by Andrew Bennetts
First draft of container format developer doc, based on discussion with Robert.
98
  * deltas of full texts
99
  * revisions
100
  * inventories
101
  * inventory as tree items e.g. the inventory data for 20 files
102
  * revision signatures
103
  * per-file graph data
104
  * annotation cache
105
106
2499.4.3 by Andrew Bennetts
Updates in response to feedback on mailing list and in person.
107
Characteristics
108
===============
109
110
Some key aspects of the described format are discussed in this section.
111
112
No length-prefixing of entire container
113
---------------------------------------
114
2499.4.4 by Andrew Bennetts
Typos and clarfications thanks to Aaron.
115
The overall container is not length-prefixed.  Instead there is an end
2499.4.3 by Andrew Bennetts
Updates in response to feedback on mailing list and in person.
116
marker so that readers can determine when they have read the entire
117
container.  This also does not conflict with the goal of allowing
118
single-pass writing.
119
120
Structured as a self-contained series of records
121
------------------------------------------------
122
123
The container contains a series of *records*.  Each record is
124
self-delimiting.  Record markers are lightweight.  The overhead in terms
125
of bytes and processing for records in this container vs. the raw contents
126
of those records is minimal.
127
128
Addressing records
129
------------------
130
131
There is a requirement that each object can be given an arbitrary name.
132
Some revision control systems address all content by the SHA-1 digest of
133
that content, but this scheme is unsatisfactory for Bazaar's revision
134
objects.  We can still allow addressing by SHA-1 digest for those content
135
types where it makes sense.
136
137
Some proposed object names:
138
139
  * to name a revision: "``revision:``\ *revision-id*".  e.g., 
140
    `revision:pqm@pqm.ubuntu.com-20070531210833-8ptk86ocu822hjd5`.
141
  * to name an inventory delta: "``inventory.delta:``\ *revision-id*".  e.g.,
142
    `inventory.delta:pqm@pqm.ubuntu.com-20070531210833-8ptk86ocu822hjd5`.
143
144
It seems likely that we may want to have multiple names for an object.
145
This format allows that (by allowing multiple ``name`` headers in a Bytes
146
record).
147
148
Although records are in principle addressable by name, this specification
149
alone doesn't provide for efficient access to a particular record given
4031.3.1 by Frank Aspell
Fixing various typos
150
its name.  It is intended that separate indexes will be maintained to
2499.4.3 by Andrew Bennetts
Updates in response to feedback on mailing list and in person.
151
provide this.
152
153
It is acceptable to have records with no explicit name, if the expected
154
use of them does not require them.  For example:
155
156
  * a record's content could be self-describing in the context of a
157
    particular container, or
158
  * a record could be accessed via an index based on SHA-1, or 
159
  * when streaming, the first record could be treated specially.
160
161
Reasonably cheap for small records
162
----------------------------------
163
164
The overhead for storing fairly short records (tens of bytes, rather than
165
thousands or millions) is minimal.  The minimum overhead is 3 bytes plus
166
the length of the decimal representation of the *length* value (for a
167
record with no name).
168
169
170
Specification
171
=============
172
2499.4.4 by Andrew Bennetts
Typos and clarfications thanks to Aaron.
173
This describes just a basic layer for storing a simple series of
174
"records".  This layer has no intrinsic understanding of the contents of
175
those records.
2499.4.3 by Andrew Bennetts
Updates in response to feedback on mailing list and in person.
176
177
The format is:
178
2506.2.11 by Andrew Bennetts
Keep container-format.txt up to date with changes to the code.
179
  * a **container lead-in**, "``Bazaar pack format 1 (introduced in 0.18)\n``",
2499.4.3 by Andrew Bennetts
Updates in response to feedback on mailing list and in person.
180
  * followed by one or more **records**.
181
182
A record is:
183
184
  * a 1 byte **kind marker**.
185
  * 0 or more bytes of record content, depending on the record type.
186
187
Record types
188
------------
189
190
End Marker
191
~~~~~~~~~~
192
193
An **End Marker** record:
194
195
  * has a kind marker of "``E``",
196
  * no content bytes.
197
198
End Marker records signal the end of a container.
199
200
Bytes
201
~~~~~
202
203
A **Bytes** record:
204
205
  * has a kind marker of "``B``",
206
  * followed by a mandatory **content length** [1]_: 
207
    "*number*\ ``\n``", where *number* is in decimal, e.g::
208
209
      1234
210
211
  * followed by zero or more optional **names**: 
212
    "*name*\ ``\n``", e.g.::
213
214
      revision:pqm@pqm.ubuntu.com-20070531210833-8ptk86ocu822hjd5
215
216
  * followed by an **end of headers** byte: "``\n``",
217
  * followed by some **bytes**, exactly as many as specified by the length
218
    prefix header.
219
220
So a Bytes record is a series of lines encoding the length and names (if
221
any) followed by a body.
222
223
For example, this is a possible Bytes record (including the kind marker)::
224
225
  B26
226
  example-name1
227
  example-name2
228
229
  abcdefghijklmnopqrstuvwxyz
230
231
232
Names
233
-----
234
235
Names should be UTF-8 encoded strings, with no whitespace.  Names should
236
be unique within a single container, but no guarantee of uniqueness
237
outside of the container is made by this layer.  Names need to be at least
238
one character long.
239
240
241
.. [1] This requires that the writer of a record knows the full length of
242
  the record up front, which typically means it will need to buffer an
243
  entire record in memory.  For the first version of this format this is
244
  considered to be acceptable.
245
2499.4.1 by Andrew Bennetts
First draft of container format developer doc, based on discussion with Robert.
246
..
247
   vim: ft=rst tw=74 ai
248