~bzr-pqm/bzr/bzr.dev

« back to all changes in this revision

Viewing changes to doc/developers/container-format.txt

Committer: Aaron Bentley
Date: 2007-06-14 04:36:50 UTC
mfrom: (2527 +trunk)
mto: This revision was merged to the branch mainline in revision 2528.
Revision ID: aaron.bentley@utoronto.ca-20070614043650-q2yb3mo03kcr8l2p

Merge bzr.dev

files added:
bzrlib/tests/branch_implementations/test_sprout.py

bzrlib/tests/test_lsprof.py

doc/developers/container-format.txt

doc/developers/profiling.txt

files modified:
Makefile

NEWS

README

bzrlib/__init__.py

bzrlib/branch.py

bzrlib/builtins.py

bzrlib/commands.py

bzrlib/dirstate.py

bzrlib/help_topics.py

bzrlib/lsprof.py

bzrlib/symbol_versioning.py

bzrlib/tests/__init__.py

bzrlib/tests/blackbox/test_init.py

bzrlib/tests/branch_implementations/__init__.py

bzrlib/tests/branch_implementations/test_branch.py

bzrlib/tests/test_dirstate.py

bzrlib/tests/test_lockdir.py

bzrlib/tests/test_selftest.py

bzrlib/tests/test_urlutils.py

bzrlib/urlutils.py

doc/developers/index.txt

doc/tutorial.txt

Show diffs side-by-side

added added

removed removed

doc/developers/container-format.txt

================

Container format

================

Status

======

:Date: 2007-06-07

This document describes the proposed container format for streaming and

storing collections of data in Bazaar. Initially this will be used for

streaming revision data for incremental push/pull in the smart server for

0.18, but the intention is that this will be the basis for much more

than just that use case.

In particular, this document currently focuses almost exclusively on the

streaming case, and not the on-disk storage case. It also does not

discuss the APIs used to manipulate containers and their records.

.. contents::

Motivation

==========

To create a low-level file format which is suitable for solving the smart

server latency problem and whose layout and requirements are extendable in

future versions of Bazaar, and with no requirements that the smart server

does not have today.

Terminology

===========

A **container** is a streamable file that contains a series of

**records**. Records may have **names**, and consist of bytes.

Use Cases

=========

Here's a brief description of use cases this format is intended to

support.

Streaming data between a smart server and client

------------------------------------------------

It would be nice if we could combine multiple containers into a single

stream by something no more expensive than concatenation (e.g. by omitting

end/start marker pairs).

This doesn't imply that such a combination necessarily produces a valid

container (e.g. care must be taken to ensure that names are still unique

in the combined container), or even a useful container. It is simply that

the cost of assembling a new combined container is practically as cheap as

simple concatenation.

Incremental push or pull

~~~~~~~~~~~~~~~~~~~~~~~~

Consider the use case of incremental push/pull, which is currently (0.16)

very slow on high-latency links due to the large number of round trips.

What we'd like is something like the following.

A client will make a request meaning "give me the knit contents for these

revision IDs" (how the client determines which revision IDs it needs is

unimportant here). In response, the server streams a single container of:

* one record per file-id:revision-id knit gzip contents and graph data,

* one record per inventory:revision-id knit gzip contents and graph

data,

* one record per revision knit gzip contents,

* one record per revision signature,

* end marker record.

in that order.

Persistent storage on disk

--------------------------

We want a storage format that allows lock-free writes, which suggests a

format that uses *rename into place*, and *do not modify after writing*.

Usable before deep model changes to Bazaar

------------------------------------------

We want a format we can use and refine sooner rather than later. So it

should be usable before the anticipated model changes for Bazaar "1.0"

land, while not conflicting with those changes either.

Specifically, we'd like to have this format in Bazaar 0.18.

Examples of possible record content

-----------------------------------

* full texts of file versions

* deltas of full texts

* revisions

100

* inventories

101

* inventory as tree items e.g. the inventory data for 20 files

102

* revision signatures

103

* per-file graph data

104

* annotation cache

105

106

107

Characteristics

108

===============

109

110

Some key aspects of the described format are discussed in this section.

111

112

No length-prefixing of entire container

113

---------------------------------------

114

115

The overall container is not length-prefixed. Instead there is an end

116

marker so that readers can determine when they have read the entire

117

container. This also does not conflict with the goal of allowing

118

single-pass writing.

119

120

Structured as a self-contained series of records

121

------------------------------------------------

122

123

The container contains a series of *records*. Each record is

124

self-delimiting. Record markers are lightweight. The overhead in terms

125

of bytes and processing for records in this container vs. the raw contents

126

of those records is minimal.

127

128

Addressing records

129

------------------

130

131

There is a requirement that each object can be given an arbitrary name.

132

Some revision control systems address all content by the SHA-1 digest of

133

that content, but this scheme is unsatisfactory for Bazaar's revision

134

objects. We can still allow addressing by SHA-1 digest for those content

135

types where it makes sense.

136

137

Some proposed object names:

138

139

* to name a revision: "``revision:``\ *revision-id*". e.g.,

140

`revision:pqm@pqm.ubuntu.com-20070531210833-8ptk86ocu822hjd5`.

141

* to name an inventory delta: "``inventory.delta:``\ *revision-id*". e.g.,

142

`inventory.delta:pqm@pqm.ubuntu.com-20070531210833-8ptk86ocu822hjd5`.

143

144

It seems likely that we may want to have multiple names for an object.

145

This format allows that (by allowing multiple ``name`` headers in a Bytes

146

record).

147

148

Although records are in principle addressable by name, this specification

149

alone doesn't provide for efficient access to a particular record given

150

its name. It is intended that seperate indexes will be maintained to

151

provide this.

152

153

It is acceptable to have records with no explicit name, if the expected

154

use of them does not require them. For example:

155

156

* a record's content could be self-describing in the context of a

157

particular container, or

158

* a record could be accessed via an index based on SHA-1, or

159

* when streaming, the first record could be treated specially.

160

161

Reasonably cheap for small records

162

----------------------------------

163

164

The overhead for storing fairly short records (tens of bytes, rather than

165

thousands or millions) is minimal. The minimum overhead is 3 bytes plus

166

the length of the decimal representation of the *length* value (for a

167

record with no name).

168

169

170

Specification

171

=============

172

173

This describes just a basic layer for storing a simple series of

174

"records". This layer has no intrinsic understanding of the contents of

175

those records.

176

177

The format is:

178

179

* a **container lead-in**, "``bzr pack format 1\n``",

180

* followed by one or more **records**.

181

182

A record is:

183

184

* a 1 byte **kind marker**.

185

* 0 or more bytes of record content, depending on the record type.

186

187

Record types

188

------------

189

190

End Marker

191

~~~~~~~~~~

192

193

An **End Marker** record:

194

195

* has a kind marker of "``E``",

196

* no content bytes.

197

198

End Marker records signal the end of a container.

199

200

Bytes

201

~~~~~

202

203

A **Bytes** record:

204

205

* has a kind marker of "``B``",

206

* followed by a mandatory **content length** [1]_:

207

"*number*\ ``\n``", where *number* is in decimal, e.g::

208

209

1234

210

211

* followed by zero or more optional **names**:

212

"*name*\ ``\n``", e.g.::

213

214

revision:pqm@pqm.ubuntu.com-20070531210833-8ptk86ocu822hjd5

215

216

* followed by an **end of headers** byte: "``\n``",

217

* followed by some **bytes**, exactly as many as specified by the length

218

prefix header.

219

220

So a Bytes record is a series of lines encoding the length and names (if

221

any) followed by a body.

222

223

For example, this is a possible Bytes record (including the kind marker)::

224

225

B26

226

example-name1

227

example-name2

228

229

abcdefghijklmnopqrstuvwxyz

230

231

232

Names

233

-----

234

235

Names should be UTF-8 encoded strings, with no whitespace. Names should

236

be unique within a single container, but no guarantee of uniqueness

237

outside of the container is made by this layer. Names need to be at least

238

one character long.

239

240

241

.. [1] This requires that the writer of a record knows the full length of

242

the record up front, which typically means it will need to buffer an

243

entire record in memory. For the first version of this format this is

244

considered to be acceptable.

245

246

247

vim: ft=rst tw=74 ai

248

Older »