~bzr-pqm/bzr/bzr.dev

Viewing changes to doc/revfile.txt

Committer: Martin Pool
Date: 2005-05-03 02:39:45 UTC
Revision ID: mbp@sourcefrog.net-20050503023945-542829ff748301e8

- more documentation of revfile+annotation

files modified:
bzrlib/revfile.py

doc/revfile.txt

doc/revfile.txt

the regions of bytes changed into corresponding updates to the origin
annotations.

Annotations can also be delta-compressed; we only need to add new
annotation data when there is a text insertion.

(It is possible in a merge to have a change of annotation when
there is no text change, though this seems unlikely. This can
still be represented as a "pointless" delta, plus an update to the
annotations.)

Tools
-----

The revfile module can be invoked as a program to give low-level
access for data recovery, debugging, etc.

Format
======

Index file
----------

The index file is a series of fixed-length records::

  byte[16]    UUID of revision
  byte[20]    SHA-1 of expanded text (as binary, not hex)
  uint32      flags: 1=zlib compressed
  uint32      sequence number this is based on, or -1 for full text
  uint32      offset in text file of start
  uint32      length of compressed delta in text file
  uint32[3]   reserved

Total 64 bytes.
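
As a rough illustration of the record layout above, the following Python sketch packs and unpacks one index record with the standard ``struct`` module. The byte order is not stated in this document, so big-endian is assumed; storing "-1" as ``0xffffffff`` is likewise an assumption, and the names ``INDEX_RECORD``, ``NO_BASE``, ``pack_index_record`` and ``unpack_index_record`` are invented for the example.

```python
import hashlib
import struct
import uuid

# Assumed big-endian; the text does not specify a byte order.
# 16s = UUID, 20s = binary SHA-1, 4 x uint32, 12x = three reserved uint32s.
INDEX_RECORD = struct.Struct(">16s20sIIII12x")
assert INDEX_RECORD.size == 64

FLAG_ZLIB = 1
NO_BASE = 0xFFFFFFFF  # assumed encoding of "-1 for full text" in a uint32


def pack_index_record(rev_uuid, sha1, flags, base, offset, length):
    """Return the 64-byte encoding of one index record."""
    if len(rev_uuid) != 16 or len(sha1) != 20:
        raise ValueError("UUID must be 16 bytes and SHA-1 20 bytes")
    return INDEX_RECORD.pack(rev_uuid, sha1, flags, base, offset, length)


def unpack_index_record(data):
    """Decode a 64-byte record into a tuple of its six fields."""
    return INDEX_RECORD.unpack(data)


# Example: an uncompressed full text stored at offset 0 of the text file.
text = b"hello world\n"
record = pack_index_record(
    uuid.uuid4().bytes, hashlib.sha1(text).digest(), 0, NO_BASE, 0, len(text)
)
fields = unpack_index_record(record)
```

The reserved words are written as zero bytes by the ``12x`` padding code and are skipped when unpacking.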

The header is also 64 bytes, for tidiness and easy calculation. For
this format the header must be ``bzr revfile v2\n`` padded with
``\xff`` to 64 bytes.
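
Because the header and each record are both 64 bytes, the file offset of a record follows directly from its index. A minimal Python sketch of building and checking the header and of the offset calculation; the function and constant names are hypothetical and not taken from ``bzrlib/revfile.py``.

```python
MAGIC = b"bzr revfile v2\n"
HEADER_SIZE = 64
RECORD_SIZE = 64


def make_header():
    """Return the 64-byte header: the magic string padded with 0xff."""
    return MAGIC.ljust(HEADER_SIZE, b"\xff")


def check_header(data):
    """Raise ValueError unless data starts with a valid header."""
    if data[:HEADER_SIZE] != make_header():
        raise ValueError("not a revfile v2 index")


def record_offset(idx):
    """Byte offset in the index file of record number idx."""
    return HEADER_SIZE + idx * RECORD_SIZE


header = make_header()
check_header(header)
```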

The first record after the header is index 0. A record's base index
must be less than its own index.
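
One consequence of this rule is that following base pointers always terminates at a full text. The sketch below checks the rule and lists the records needed to rebuild a given revision. It works on a plain list of base indexes and assumes "-1" is stored as ``0xffffffff``; ``check_bases`` and ``delta_chain`` are illustrative names only.

```python
NO_BASE = 0xFFFFFFFF  # assumed uint32 encoding of "-1 for full text"


def check_bases(bases):
    """Raise ValueError if any record is based on itself or a later record."""
    for idx, base in enumerate(bases):
        if base != NO_BASE and base >= idx:
            raise ValueError("record %d has invalid base %d" % (idx, base))


def delta_chain(bases, idx):
    """Return the record indexes needed to rebuild idx, full text first."""
    chain = [idx]
    while bases[chain[-1]] != NO_BASE:
        chain.append(bases[chain[-1]])
    chain.reverse()
    return chain


# Records 0 and 3 are full texts; 1, 2 and 4 are deltas.
bases = [NO_BASE, 0, 1, NO_BASE, 2]
check_bases(bases)
chain = delta_chain(bases, 4)
```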

The SHA-1 is redundant with the inventory but stored just as a check
on the compression methods and so that the file can be validated
without reference to any other information.
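
The self-contained check this allows could look roughly like the following, using Python's ``hashlib``; ``verify_text`` is a hypothetical helper, and the expanded text is assumed to have been rebuilt already.

```python
import hashlib


def verify_text(expanded_text, stored_sha1):
    """Return True if the text matches the 20-byte binary SHA-1 from the index."""
    return hashlib.sha1(expanded_text).digest() == stored_sha1


sample = b"hello world\n"
digest = hashlib.sha1(sample).digest()  # binary form, as stored in the index
```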

Each byte in the text file should be included by at most one delta.
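
A validator could test this by collecting the (offset, length) pair from every index record and looking for overlaps. A small sketch, with ``find_overlap`` as an invented name:

```python
def find_overlap(regions):
    """Return the first pair of overlapping (offset, length) regions, or None."""
    # Zero-length regions cover no bytes, so they cannot overlap anything.
    ordered = sorted(r for r in regions if r[1] > 0)
    for prev, cur in zip(ordered, ordered[1:]):
        if prev[0] + prev[1] > cur[0]:
            return prev, cur
    return None


clean = find_overlap([(0, 10), (10, 5), (15, 1)])
clash = find_overlap([(0, 10), (5, 5)])
```

Sorting by offset means only neighbouring regions need comparing: if any two regions overlap, some adjacent pair in the sorted order overlaps too.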

Deltas
------

Deltas to the text are stored as a series of variable-length records::

  uint32      idx
  uint32      m
  uint32      n
  uint32      l
  byte[l]     new

This describes a change originally introduced in the revision
described by *idx* in the index.

This indicates that the region [m:n] of the input file should be
replaced by the text *new*. If m==n this is a pure insertion of l
bytes. If l==0 this is a pure deletion of (n-m) bytes.
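
To make the record format concrete, here is a hedged Python sketch that parses a run of delta records and applies them to a base text. It makes two assumptions that this document does not settle: fields are big-endian, and every [m:n] range refers to the unmodified base text and the ranges do not overlap. ``parse_deltas`` and ``apply_deltas`` are hypothetical names.

```python
import struct

DELTA_HEADER = struct.Struct(">IIII")  # idx, m, n, l; byte order assumed


def parse_deltas(data):
    """Yield (idx, m, n, new) tuples from a run of delta records."""
    pos = 0
    while pos < len(data):
        idx, m, n, length = DELTA_HEADER.unpack_from(data, pos)
        pos += DELTA_HEADER.size
        new = data[pos:pos + length]
        if len(new) != length:
            raise ValueError("truncated delta record")
        pos += length
        yield idx, m, n, new


def apply_deltas(base, deltas):
    """Apply deltas whose [m:n] ranges all refer to the base text (assumed)."""
    out = []
    last = 0
    for idx, m, n, new in sorted(deltas, key=lambda d: (d[1], d[2])):
        if m < last or n < m or n > len(base):
            raise ValueError("overlapping or out-of-range delta")
        out.append(base[last:m])  # unchanged bytes before the region
        out.append(new)           # replacement (empty for a pure deletion)
        last = n
    out.append(base[last:])       # unchanged tail
    return b"".join(out)


base = b"hello cruel world\n"
raw = (
    DELTA_HEADER.pack(1, 6, 12, 0)            # pure deletion of b"cruel "
    + DELTA_HEADER.pack(2, 17, 17, 1) + b"!"  # pure insertion before b"\n"
)
result = apply_deltas(base, parse_deltas(raw))
```

The *idx* field is carried through untouched here; an annotating reader would use it to attribute the inserted bytes to the revision that introduced them.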

Open issues
===========

* It might be useful to directly indicate which mergers included
  which lines. We do have that information in the revision history
  though, so there seems no need to store it for every line.

* Should we also store full-texts as a transitional step?

* Storing the annotations with the text is reasonably simple and
  compact, but means that we always need to process the annotation
  structure even when we only want the text. In particular it means
  that full-texts cannot simply be copied out but must instead be
  composed from chunks. That seems inefficient since it is probably
  common to only want the text.
