This document describes the services repositories offer and need to offer
within bzrlib.

To provide clarity to API and performance tradeoff decisions by
centralising the requirements placed upon repositories.

A **repository** is a store of historical data for bzr.

================== ====================
Command            Needed services
================== ====================
Annotate           Annotated file texts, revision details.
Branch             Fetch, revision parents, inventory contents, all file
                   texts.
Bundle             Maximally compact diffs (file and inventory), revision
                   graph difference, revision texts.
Commit             Insert new texts, insert new inventory via delta, insert
                   revision, insert signature.
Fetching           Revision graph difference, ghost identification, stream
                   data introduced by a set of revisions in some cheap form,
                   insert data from a stream, validate data during insertion.
Garbage Collection Exclusively lock the repository, preventing readers.
Revert             Delta from working tree to historical tree, and then
                   arbitrary file access to obtain the texts of differing
                   files.
Uncommit           Revision graph access.
Status             Revision graph access, revision text access, file
                   fingerprint information, inventory differencing.
Diff               As status but also file text access.
Merge              As diff but needs up to twice as many file texts -
                   base and other for each changed file. Also an initial
                   fetch is needed.
Log                Revision graph (entire at the moment) access,
                   sometimes status between adjacent revisions. Log of a
                   file needs per-file-graph. Dominator caching or
                   similar tools may be needed to prevent entire graph
                   access.
Missing            Revision graph access, and revision texts to show
                   output.
Update             As for merge, but twice.
================== ====================

Ideally we can make the data access of commands such as branch dovetail
well with the native storage in the repository, in the common case. Doing
this may require choosing the behaviour of some commands so that we have a
smaller range of access patterns which we can optimise more heavily.
Alternatively, if each command is very predictable in its data access
pattern, we may be able to hint to the low level layers which pattern is
needed on a per-command basis to get efficient behaviour.

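As a sketch of that hinting alternative, in Python - the names here
(``AccessHint``, ``get_record_stream``) are hypothetical illustrations of a
per-command hint, not an existing bzrlib interface::

  class AccessHint:
      """Access patterns a command can declare to the storage layer."""
      UNORDERED = 'unordered'        # any order; cheapest for the store
      TOPOLOGICAL = 'topological'    # parents before children

  def stream_texts_for_diff(repository, keys):
      # Diff only needs each changed text once, in any order.
      return repository.get_record_stream(keys, AccessHint.UNORDERED)

  def stream_texts_for_annotate(repository, keys):
      # Annotate must see every version after its parents.
      return repository.get_record_stream(keys, AccessHint.TOPOLOGICAL)
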
=================== ===================================================
Command             Data access pattern
=================== ===================================================
Annotate-cached     Find text name in an inventory, recreate one text,
                    recreate annotation regions.
Annotate-on-demand  Find file id from name, then breadth-first pre-order
                    traversal of versions-of-the-file until the annotation
                    is complete.
Branch              Fetch, possibly taking a copy of any file present in a
                    nominated revision when it is validated during fetch.
Bundle              Revision-graph as for fetch; then inventories for
                    selected revision_ids to determine file texts, then
                    mp-parent deltas for all determined file texts.
Commit              Something like basis-inventories read to determine
                    per-file graphs, insertion of new texts (which may
                    be delta compressed), generation of annotation
                    regions if the repository is configured to do so,
                    finalisation of the inventory pointing at all the new
                    texts and finally a revision and possibly signature.
Fetching            Revision-graph searching to find the graph difference.
                    Scan the inventory data introduced during the selected
                    revisions, and grab the on disk data for the found
                    file texts, annotation region data, per-file-graph
                    data, piling all this into a stream.
Garbage Collection  Basically a mass fetch of all the revisions which
                    branches point at, then a bait and switch with the old
                    repository, thus removing unreferenced data.
Revert              Revision graph access for the revision being reverted
                    to, inventory extraction of that revision,
                    dirblock-order file text extraction for files that
                    were modified.
Uncommit            Revision graph access to synthesise pending-merges:
                    linear access down the left-hand side, with is_ancestor
                    checks between all the found non-left-hand-side
                    revisions.
Status              Lookup the revisions added by pending merges and their
                    commit messages. Then an inventory difference between
                    the trees involved, which may include a working tree.
                    If there is a working tree involved then the file
                    fingerprint for cache-misses on files will be needed.
                    Note that dirstate caches most of this, making
                    repository performance largely irrelevant: but if it
                    was fast enough dirstate might be able to be simpler.
Diff                As status but also file text access for every file
                    that is different - either one text (working tree
                    diff) or a diff of two (revision to revision diff).
Merge               As diff but needs up to twice as many file texts -
                    base and other for each changed file. Also an initial
                    fetch is needed. Note that the access pattern is
                    probably id-based at the moment, but that may be
                    'fixed' with the iter_changes based merge. Also note
                    that while the texts from OTHER are the ones accessed,
                    this is equivalent to the **newest** form of each text
                    changed from BASE to OTHER. And as the repository
                    looks at when data is introduced, this should be the
                    pattern we focus on for merge.
Log                 Revision graph (entire at the moment) access, log of a
                    file wants a per-file-graph. Log -v will want
                    newest-first inventory deltas between revisions.
Missing             Revision graph access, breadth-first pre-order.
Update              As for merge, but twice.
=================== ===================================================

Note that these patterns can be changed by changing what we store. For
instance, if the repository satisfies mpdiff requests, then bundle can be
defined in terms of mpdiff lookups rather than the file text lookups
needed to create mpdiffs. If the repository satisfies full text requests
only, then you need the topological access to build up the desired
mpdiffs.

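To make the full-text-only case concrete, a minimal sketch - ``get_text``,
``parents_of`` and ``make_mpdiff`` are placeholder callables; the point is
only that each key's parent texts must already be on hand, hence the
topological ordering::

  def make_all_mpdiffs(get_text, parents_of, make_mpdiff, keys_topological):
      """Build mpdiffs from a store that only serves full texts (sketch)."""
      texts = {}
      diffs = {}
      for key in keys_topological:   # parents are always visited first
          text = get_text(key)
          parent_texts = [texts[parent] for parent in parents_of(key)]
          diffs[key] = make_mpdiff(text, parent_texts)
          texts[key] = text          # keep for children that need it
      return diffs
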
=========================================== =========
Facility                                    Commands
=========================================== =========
Single file text                            annotate, diff
Files present in one revision               branch
Newest form of files altered by revisions   merge, update?
Topological access to file versions/deltas  annotate-uncached
Stream all data required to recreate revs   branch (lightweight)
Stream file texts in topological order      bundle
Write full versions of files, inv, rev, sig commit
Write deltas of files, inv for one tree     commit
Stream all data introduced by revs          fetch
Regenerate/combine deltas of many trees     fetch, pack
Reconstruct all texts and validate trees    check, fetch
Revision graph walk                         fetch, pack, uncommit
Top down access multiple invs concurrently  status, diff, merge?, update?
Concurrent access to N file texts           diff, merge
Iteration of inventory deltas               log -v, fetch?
=========================================== =========

Facilities to scale well
========================

Indices
-------

We want sub-linear access to all data in the repository. This suggests
everything is indexed to some degree.

Often we know the kind of data we are accessing, which allows us to
partition our indices if that will help (e.g. by reducing the total index
size for queries that only care about the revision graph).

Indices that support our data access patterns will usually display
increased locality of reference, reducing the impact of large indices
without needing careful page size management or other tricks.

We need repository-wide indices. For the current repositories this is
achieved by dividing the keyspace (revisions, signatures, inventories,
per-fileid) and then having an append-only index within each keyspace.
For pack based repositories we will want some means to query the index of
each component pack, presumably as a single logical index.

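A minimal sketch of such a single logical index; the ``iter_entries``
method and its contract here are assumptions for illustration::

  class CombinedIndexSketch:
      """Query several per-pack indices as one logical index."""

      def __init__(self, indices):
          # Most-recently-written packs first tends to hit sooner.
          self.indices = list(indices)

      def iter_entries(self, keys):
          """Yield (key, value) for each key found in any component."""
          missing = set(keys)
          for index in self.indices:
              if not missing:
                  return
              for key, value in index.iter_entries(tuple(missing)):
                  missing.discard(key)
                  yield key, value
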
It would be nice if indexing were cleanly separated from storage. That
suggests indices should not know the meaning of the lookup; indices which
offer particular orderings or graph walking facilities will clearly need
that information, but perhaps they don't need to know the semantics?

Smaller indices are good. We could go with one big index, or a different
index for different operation styles. As multiple indices will occupy more
space in total, we should think carefully before adding them.

Looking at the data access patterns, some operations such as graph walking
can clearly be made more efficient by offering direct iteration rather
than repeated re-entry into the index - so having indices that support
iteration in such a style would eventually be useful.

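For instance, a graph walk can batch one index query per generation rather
than re-entering the index once per revision; a sketch, assuming a
``get_parent_map(keys)`` call that returns a dict of key to parent keys::

  def iter_ancestry(index, heads):
      """Breadth-first walk of a revision graph from ``heads`` (sketch)."""
      pending = set(heads)
      seen = set()
      while pending:
          parent_map = index.get_parent_map(pending)  # one batched query
          seen.update(pending)
          for key in parent_map:
              yield key
          pending = set(parent for parents in parent_map.values()
                        for parent in parents if parent not in seen)
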
Changing our current indexes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

We can consider introducing cleaner indices in advance of a full pack
based repository.

There are many possibilities for this, but I've chosen one that seems OK
to me for illustration.

A key element is to consider when indices are updated. I think that the
update style proposed for pack based repositories - write once, then
rewrite a single new index when we regroup data - is sufficient.

We could discard the per-knit .kndx by writing a new index at the end of
every bzr transaction, indexing the new data introduced by that operation
- e.g. at the end of fetch. This can be based on the new ``GraphIndex``
index type.

Encoding a knit entry into a ``GraphIndex`` can be done as follows:

* Change the key to include a prefix of the knit name, to allow filtering
  out of data from different knits.
* Encode the parents from the knit as the zeroth node reference list.
* If the knit hunk was delta compressed, encode the node it was delta
  compressed against as the first node reference list (otherwise the
  first node reference list will be empty to indicate no compression
  parent).
* For the value, encode - much as the current knit format does - the byte
  offset of the data record in the knit, the byte length of the data
  record, and the no-end-of-line flag.

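Put together, the scheme might look like this; a sketch against
``bzrlib.index.GraphIndexBuilder``, with the key prefixing and the exact
value layout being assumptions rather than a settled format::

  from bzrlib.index import GraphIndexBuilder

  def add_knit_record(builder, knit_name, version, parents,
                      compression_parent, offset, length, no_eol):
      """Add one knit record to a two-reference-list index (sketch)."""
      key = ('%s:%s' % (knit_name, version),)
      parent_refs = [('%s:%s' % (knit_name, p),) for p in parents]
      if compression_parent is None:
          compression_refs = []   # a fulltext: no compression parent
      else:
          compression_refs = [('%s:%s' % (knit_name, compression_parent),)]
      # Value: no-end-of-line flag, then byte offset and length in the knit.
      flag = 'N' if no_eol else ' '
      value = '%s %d %d' % (flag, offset, length)
      builder.add_node(key, value, (parent_refs, compression_refs))

  # The corresponding builder is created with two reference lists:
  # builder = GraphIndexBuilder(reference_lists=2)
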
It's important to note that knit repositories cannot be regenerated by
scanning .knits, so a mapped index is still irreplaceable and must be
transmitted on push/pull.

A potential improvement is to specialise this further and not record data
that is not needed - e.g. an index of revisions does not need to support
a pointer to a parent compressed text, as revisions.knit is never
delta-compressed. Likewise signatures do not need the parent pointers at
all, as there is no 'signature graph'.

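With the builder sketch above, this specialisation is simply a matter of
how many reference lists each index is constructed with (illustrative)::

  from bzrlib.index import GraphIndexBuilder

  # Revisions: parent graph wanted, but never delta-compressed.
  revision_index = GraphIndexBuilder(reference_lists=1)

  # Signatures: no graph at all, so no reference lists.
  signature_index = GraphIndexBuilder(reference_lists=0)

  # File texts: parent graph plus an optional compression parent.
  text_index = GraphIndexBuilder(reference_lists=2)
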
Moving to pack based repositories
---------------------------------

We have a number of challenges to solve.

As long as each pack file's name is unique it does not really matter how
it is chosen. It might be interesting to make the name deterministic
based on content, but there are no specific problems we have solved by
doing that, and doing so would require hashing the full file. OTOH
hashing the full file is a cheap way to detect bit-errors in transfer
(such as Windows corruption). Non-reused file names are required for data
integrity, as clients having read an index will readv at arbitrary times
later.

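A sketch of write-once, content-derived naming; the directory layout and
the choice of md5 are assumptions::

  import hashlib
  import os

  def write_new_pack(repo_dir, data):
      """Write a pack named after its content hash (sketch).

      Unique content gives a unique, never-reused name, and the hash
      doubles as a cheap end-to-end check against transfer bit-errors.
      """
      name = hashlib.md5(data).hexdigest() + '.pack'
      final_path = os.path.join(repo_dir, 'packs', name)
      tmp_path = final_path + '.tmp'
      with open(tmp_path, 'wb') as pack_file:
          pack_file.write(data)
      os.rename(tmp_path, final_path)  # readers never see a partial pack
      return name
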
With non-listable transports, how should the collection of pack/index
files be found? Initially, record a list of all the pack/index files
produced by write actions (requiring writable transports to be listable).
We can then use a heuristic to statically combine pack/index files later.

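A sketch of such a recorded list - a single ``pack-names`` file rewritten
atomically on each write action (the file name is an assumption)::

  import os

  def record_pack_name(repo_dir, new_pack_name):
      """Add a pack to the repository's name list (sketch).

      Readers on non-listable transports fetch this one file instead of
      listing the directory; the whole-file rewrite keeps each update a
      single atomic rename.
      """
      names_path = os.path.join(repo_dir, 'pack-names')
      names = set()
      if os.path.exists(names_path):
          with open(names_path) as names_file:
              names = set(names_file.read().split())
      names.add(new_pack_name)
      tmp_path = names_path + '.tmp'
      with open(tmp_path, 'w') as names_file:
          names_file.write('\n'.join(sorted(names)) + '\n')
      os.rename(tmp_path, names_path)
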
Combining indices on demand
~~~~~~~~~~~~~~~~~~~~~~~~~~~

A trivial implementation would be to make a pack which has just the data
needed for the push, then send that. More sophisticated approaches would
be streaming single-pass creation, and also using this as an opportunity
to increase the packedness of the local repo.

Choosing compression/delta support
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Caching and writing of data
===========================

Repositories try to provide a consistent view of their data within a
'lock context'.

Locks come in two flavours - read locks and write locks. Read locks allow
data to be read from the repository. Write locks allow data to be read and
signal that you intend to write data at some point. The actual writing of
data must take place within a write group.

Write locks provide a cache of repository data during the period of the
write lock, and allow write groups to be acquired. For some repositories
the presence of a write lock is exclusive to a single client; for others,
which are lock free or use server side locks (e.g. svn), the write lock
simply provides the cache context.

Write groups are the only allowed means for inserting data into a
repository. These are created by ``start_write_group``, and concluded by
either ``commit_write_group`` or ``abort_write_group``. A write lock must
be held on the repository for the entire duration. At most one write
group can be active on a repository at a time.

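The shape this gives client code, as a sketch - ``insert_some_data`` is a
placeholder; the lock and write group methods are the ones named above::

  repository.lock_write()
  try:
      repository.start_write_group()
      try:
          insert_some_data(repository)   # placeholder for real insertion
      except:
          repository.abort_write_group()
          raise
      else:
          repository.commit_write_group()
  finally:
      repository.unlock()
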
Write groups signal to the repository the window during which data is
actively being inserted. Several write groups could be committed during a
single lock.

There is no guarantee that data inserted during a write group will be
invisible in the repository if the write group is not committed.
Specifically, repositories without atomic insertion facilities will be
writing data as it is inserted within the write group, and may not be
able to revert that data - e.g. in the event of a dropped SFTP connection
in a knit repository, inserted file data will be visible in the
repository. Some repositories have an atomic insertion facility, and for
those all-or-nothing will apply.

The precise meaning of a write group is format specific. For instance, a
knit based repository treats the write group methods as dummy calls,
simply meeting the API that clients will use. A pack based repository
will open a new pack container at the start of a write group, and rename
it into place at commit time.

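A sketch of that pack behaviour, pairing with the write group calls above;
the directory names and single-file simplification are assumptions::

  import os

  class PackContainerSketch:
      """Open a pack on start_write_group, rename into place on commit."""

      def __init__(self, repo_dir, pack_name):
          self.final_path = os.path.join(repo_dir, 'packs', pack_name)
          self.upload_path = os.path.join(repo_dir, 'upload', pack_name)
          self.pack_file = None

      def start_write_group(self):
          self.pack_file = open(self.upload_path, 'wb')

      def commit_write_group(self):
          self.pack_file.close()
          # The rename is the commit point: nothing references the new
          # pack until it appears under 'packs/'.
          os.rename(self.upload_path, self.final_path)

      def abort_write_group(self):
          self.pack_file.close()
          os.remove(self.upload_path)  # staged data is discarded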