~bzr-pqm/bzr/bzr.dev

2592.1.1 by Robert Collins
Some repository needs documentation.
1
============
2
Repositories
3
============
4
5
Status
6
======
7
8
:Date: 2007-07-08
9
10
This document describes the services repositories offer and need to offer
11
within brlib.
12
13
14
.. contents::
15
16
17
Motivation
18
==========
19
20
To provide clarity to API and performance tradeoff decisions by
21
centralising the requirements placed upon repositories.
22
23
24
Terminology
25
===========
26
27
A **repository** is a store of historical data for bzr.
28
29
30
Command Requirements
31
====================
32
33
==================  ====================
34
Command             Needed services
35
==================  ====================
36
Add                 None
37
Annotate            Annotated file texts, revision details
38
Branch              Fetch, Revision parents, Inventory contents, All file texts
39
Bundle              Maximally compact diffs (file and inventory), Revision graph
40
                    difference, Revision texts.
41
Commit              Insert new texts, insert new inventory via delta, insert
42
                    revision, insert signature
43
Fetching            Revision graph difference, ghost identification, stream data
44
                    introduced by a set of revisions in some cheap form, insert
45
                    data from a stream, validate data during insertion.
46
Garbage Collection  Exclusive lock the repository preventing readers.
2592.1.45 by Robert Collins
Tweak documentation as per Aaron's review.
47
Revert              Delta from working tree to historical tree, and then 
48
                    arbitrary file access to obtain the texts of differing
49
                    files.
2592.1.1 by Robert Collins
Some repository needs documentation.
50
Uncommit            Revision graph access.
51
Status              Revision graph access, revision text access, file
52
                    fingerprint information, inventory differencing.
53
Diff                As status but also file text access.
54
Merge               As diff but needs up to twice as many file texts -
55
                    base and other for each changed file. Also an initial
56
                    fetch is needed.
57
Log                 Revision graph (entire at the moment) access,
58
                    sometimes status between adjacent revisions. Log of a
2592.1.45 by Robert Collins
Tweak documentation as per Aaron's review.
59
                    file needs per-file-graph. Dominator caching or
60
                    similar tools may be needed to prevent entire graph
61
                    access.
62
Missing             Revision graph access, and revision texts to show
63
                    output.
2592.1.1 by Robert Collins
Some repository needs documentation.
64
Update              As for merge, but twice.
65
==================  ====================
66
67
Data access patterns
68
====================
69
70
Ideally we can make our data access for commands such as branch to
71
dovetail well with the native storage in the repository, in the common
2592.1.45 by Robert Collins
Tweak documentation as per Aaron's review.
72
case. Doing this may require choosing the behaviour of some commands to
73
allow us to have a smaller range of access patterns which we can optimise
74
more heavily. Alternatively if each command is very predicable in its
75
data access pattern we may be able to hint to the low level layers which
76
pattern is needed on a per command basis to get efficient behaviour.
2592.1.1 by Robert Collins
Some repository needs documentation.
77
78
===================  ===================================================
79
Command              Data access pattern
80
===================  ===================================================
81
Annotate-cached      Find text name in an inventory, Recreate one text,
82
                     recreate annotation regions
83
Annotate-on demand   Find file id from name, then breadth-first pre-order
84
                     traversal of versions-of-the-file until the annotation
85
                     is complete.
86
Branch               Fetch, possibly taking a copy of any file present in a
87
                     nominated revision when it is validated during fetch.
88
Bundle               Revision-graph as for fetch; then inventories for
89
                     selected revision_ids to determine file texts, then
90
                     mp-parent deltas for all determined file texts.
91
Commit               Something like basis-inventories read to determine
92
                     per-file graphs, insertion of new texts (which may
93
                     be delta compressed), generation of annotation
94
                     regions if the repository is configured to do so,
95
                     finalisation of the inventory pointing at all the new
96
                     texts and finally a revision and possibly signature.
97
Fetching             Revision-graph searching to find the graph difference.
98
                     Scan the inventory data introduced during the selected
99
                     revisions, and grab the on disk data for the found
100
                     file texts, annotation region data, per-file-graph
101
                     data, piling all this into a stream. 
102
Garbage Collection   Basically a mass fetch of all the revisions which
103
                     branches point at, then a bait and switch with the old
104
                     repository thus removing unreferenced data.
105
Revert               Revision graph access for the revision being reverted
106
                     to, inventory extraction of that revision,
107
                     dirblock-order file text extract for files that were
108
                     different.
109
Uncommit             Revision graph access to synthesise pending-merges 
110
                     linear access down left-hand-side, with is_ancestor
111
                     checks between all the found non-left-hand-side
112
                     parents.
113
Status               Lookup the revisions added by pending merges and their
114
                     commit messages. Then an inventory difference between
115
                     the trees involved, which may include a working tree.
116
                     If there is a working tree involved then the file 
117
                     fingerprint for cache-misses on files will be needed.
118
                     Note that dirstate caches most of this making
119
                     repository performance largely irrelevant: but if it
120
                     was fast enough dirstate might be able to be simpler/
121
Diff                 As status but also file text access for every file
122
                     that is different - either one text (working tree
123
                     diff) or a diff of two (revision to revision diff).
124
Merge                As diff but needs up to twice as many file texts -
125
                     base and other for each changed file. Also an initial
126
                     fetch is needed. Note that the access pattern is
127
                     probably id-based at the moment, but that may be
128
                     'fixed' with the iter_changes based merge. Also note
129
                     that while the texts from OTHER are the ones accessed,
130
                     this is equivalent to the **newest** form of each text
131
                     changed from BASE to OTHER. And as the repository
132
                     looks at when data is introduced, this should be the
133
                     pattern we focus on for merge.
134
Log                  Revision graph (entire at the moment) access, log of a
135
                     file wants a per-file-graph. Log -v will want
136
                     newest-first inventory deltas between revisions.
137
Missing              Revision graph access, breadth-first pre-order.
138
Update               As for merge, but twice.
139
===================  ===================================================
140
141
Patterns used
142
-------------
143
2592.1.45 by Robert Collins
Tweak documentation as per Aaron's review.
144
Note that these are able to be changed by changing what we store. For
145
instance if the repository satisfies mpdiff requests, then bundle can be
146
defined in terms of mpdiff lookups rather than file text lookups
147
appropriate to create mpdiffs. If the repository satisfies full text
148
requests only, then you need the topological access to build up the
149
desired mpdiffs.
150
2592.1.1 by Robert Collins
Some repository needs documentation.
151
=========================================== =========
152
Pattern                                     Commands
153
=========================================== =========
154
Single file text                            annotate, diff
155
Files present in one revision               branch
156
Newest form of files altered by revisions   merge, update?
157
Topological access to file versions/deltas  annotate-uncached
158
Stream all data required to recreate revs   branch (lightweight)
159
Stream file texts in topological order      bundle
160
Write full versions of files, inv, rev, sig commit
161
Write deltas of files, inv for one tree     commit
162
Stream all data introduced by revs          fetch
163
Regenerate/combine deltas of many trees     fetch, pack
164
Reconstruct all texts and validate trees    check, fetch
165
Revision graph walk                         fetch, pack, uncommit,
166
                                            annotate-uncached,
167
                                            merge, log, missing
168
Top down access multiple invs concurrently  status, diff, merge?, update?
169
Concurrent access to N file texts           diff, merge
170
Iteration of inventory deltas               log -v, fetch?
171
=========================================== =========
172
173
Facilities to scale well
174
========================
175
176
Indices
177
-------
178
179
We want < linear access to all data in the repository. This suggests
180
everything is indexed to some degree.
181
182
Often we know the kind of data we are accessing; which allows us to
183
partition our indices if that will help (e.g. by reducing the total index
184
size for queries that only care about the revision graph).
185
186
Indices that support our data access patterns will usually display
187
increased locality of reference, reducing the impact of a large indices
188
without needing careful page size management or other tricks.
189
2592.1.2 by Robert Collins
More repository doco.
190
We need repository wide indices. For the current repositories this is
191
achieved by dividing the keyspace (revisions, signatures, inventories,
192
per-fileid) and then having an append only index within each keyspace.
193
For pack based repositories we will want some means to query the index of
194
each component pack, presumably as a single logical index.
195
196
It would be nice if indexing was made cleanly separate from storage. So
197
that suggests indices don't know the meaning of the lookup; indices which
198
offer particular ordering, or graph walking facilities will clearly need
199
that information, but perhaps they don't need to know the semantics ?
200
201
Index size
202
~~~~~~~~~~
203
204
Smaller indexes are good. We could go with one big index, or a different
205
index for different operation styles. As multiple indices will occupy more
206
space in total we should consider carefully about adding indices.
207
208
Index ordering
209
~~~~~~~~~~~~~~
210
211
Looking at the data access patterns some operations such as graph walking
212
can clearly be made more efficient by offering direct iteration rather
213
than repeated reentry into the index - so having indices that support
214
iteration in such a style would be useful eventually.
215
216
Changing our current indexes
217
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
218
2592.1.3 by Robert Collins
More speculation and repository docs.
219
We can consider introducing cleaner indices in advance of a full pack
220
based repository.
2592.1.2 by Robert Collins
More repository doco.
221
222
There are many possibilities for this, but I've chosen one that seems ok
223
to me for illustration.
224
225
A key element is to consider when indices are updated. I think that the
226
update style proposed for pack based repositories - write once, then when
2592.1.3 by Robert Collins
More speculation and repository docs.
227
we group data again rewrite a new single index - is sufficent.
228
229
Replace .kndx
230
^^^^^^^^^^^^^
231
232
We could discard the per-knit .kndx by writing a new index at the end of
233
every bzr transaction indexing the new data introduced by the bzr
2592.1.34 by Robert Collins
Cleanup docs.
234
operation. e.g. at the end of fetch. This can be based on the new
235
``GraphIndex`` index type.
236
237
Encoding a knit entry into a ``GraphIndex`` can be done as follows:
238
239
* Change the key to include a prefix of the knit name, to allow filtering
240
  out of data from different knits.
241
* Encode the parents from the knit as the zeroth node reference list.
242
* If the knit hunk was delta compressed encode the node it was delta
243
  compressed against as the 1st node reference list (otherwise the 1st
244
  node reference list will be empty to indicate no compression parents).
245
* For the value encode similarly to the current knit format the byte
246
  offset for the data record in the knit, the byte length for the data
247
  record in the knit and the no-end-of-line flag.
2592.1.2 by Robert Collins
More repository doco.
248
249
Its important to note that knit repositories cannot be regenerated by
2592.1.34 by Robert Collins
Cleanup docs.
250
scanning .knits, so a mapped index is still irreplaceable and must be
251
transmitted on push/pull.
2592.1.3 by Robert Collins
More speculation and repository docs.
252
253
A potential improvement exists by specialising this further to not record
254
data that is not needed - e.g. an index of revisions does not need to
255
support a pointer to a parent compressed text as revisions.knit is not
256
delta-compressed ever. Likewise signatures do not need the parent pointers
2592.1.34 by Robert Collins
Cleanup docs.
257
at all as there is no 'signature graph'.
2592.1.2 by Robert Collins
More repository doco.
258
2592.1.1 by Robert Collins
Some repository needs documentation.
259
Data 
2592.1.2 by Robert Collins
More repository doco.
260
----
261
262
Moving to pack based repositories
263
---------------------------------
264
265
We have a number of challenges to solve.
266
267
Naming of files
268
~~~~~~~~~~~~~~~
269
270
As long as the file name is unique it does not really matter. It might be
2592.1.3 by Robert Collins
More speculation and repository docs.
271
interesting to have it be deterministic based on content, but there are no
272
specific problems we have solved by doing that, and doing so would require
273
hashing the full file. OTOH hashing the full file is a cheap way to detect
2592.3.84 by Robert Collins
Start of autopacking logic.
274
bit-errors in transfer (such as windows corruption). Non-reused file names
275
are required for data integrity, as clients having read an index will
276
readv at arbitrary times later. 
2592.1.3 by Robert Collins
More speculation and repository docs.
277
278
Discovery of files
279
~~~~~~~~~~~~~~~~~~
280
2592.1.44 by Robert Collins
Remove some unneeded index iteration by checking if we have found all keys, and grammar improvements from Aaron's review.
281
With non-listable transports how should the collection of pack/index files
2592.1.3 by Robert Collins
More speculation and repository docs.
282
be found ? Initially record a list of all the pack/index files from
283
write actions. (Require writable transports to be listable). We can then
284
use a heuristic to statically combine pack/index files later.
2592.1.2 by Robert Collins
More repository doco.
285
286
Housing files
287
~~~~~~~~~~~~~
288
289
Combining indices on demand
290
~~~~~~~~~~~~~~~~~~~~~~~~~~~
291
292
Merging data on push
293
~~~~~~~~~~~~~~~~~~~~
294
295
A trivial implementation would be to make a pack which has just the data
296
needed for the push, then send that. More sophisticated things would be
297
streaming single-pass creation, and also using this as an opportunity to
298
increase the packedness of the local repo.
299
300
Choosing compression/delta support
301
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
302
2617.6.8 by Robert Collins
Review feedback and documentation.
303
Caching and writeing of data
304
============================
305
306
Repositories try to provide a consistent view of the data within them
307
within a 'lock context'. 
308
309
Locks
310
-----
311
312
Locks come in two flavours - read locks and write locks. Read locks allow
313
data to be read from the repository. Write locks allow data to be read and
314
signal that you intend to write data at some point. The actual writing of
315
data must take place within a Write Group.
316
317
Write locks provide a cache of repository data during the period of the
318
write lock, and allow write_groups to be acquired. For some repositories
319
the presence of a write lock is exclusive to a single client, for others
320
which are lock free or use server side locks (e.g.  svn), the write lock
321
simply provides the cache context. 
322
323
Write Groups
324
------------
325
326
Write groups are the only allowed means for inserting data into a
327
repository.  These are created by ``start_write_group``, and concluded by
328
either ``commit_write_group`` or ``abort_write_group``.  A write lock must
329
be held on the repository for the entire duration.  At most one write
330
group can be active on a repository at a time.
331
332
Write groups signal to the repository the window during which data is
333
actively being inserted. Several write groups could be committed during a
334
single lock.
335
336
There is no guarantee that data inserted during a write group will be
337
invisible in the repository if the write group is not committed.
338
Specifically repositories without atomic insertion facilities will be
339
writing data as it is inserted within the write group, and may not be able
340
to revert that data - e.g. in the event of a dropped SFTP connection in a
341
knit repository, inserted file data will be visible in the repository. Some
342
repositories have an atomic insertion facility, and for those
343
all-or-nothing will apply.
344
345
The precise meaning of a write group is format specific. For instance a
346
knit based repository treats the write group methods as dummy calls,
347
simply meeting the api that clients will use. A pack based repository will
348
open a new pack container at the start of a write group, and rename it
349
into place at commit time.
2592.1.2 by Robert Collins
More repository doco.
350
2592.1.1 by Robert Collins
Some repository needs documentation.
351
352
..
353
   vim: ft=rst tw=74 ai
354