2495.2.2 by Aaron Bentley
Add initial push/pull analysis

2506.1.1 by Alexander Belchenko
sanitize developers docs

Initial push / pull
===================

Optimal case
------------
(a motivating example of ultimate performance)
Assume there is a file with exactly the right data in compressed form.  This
may be a tarred branch, a bundle, or a blob format.  Performance in this case
scales with the size of the file.

Disk case
---------
Assume current repo format.  Attempt to achieve parity with ``cp -r``.  Read
each file only 1 time.

- read knit graph for revisions
- write filtered copy of revision knit O(d+a)
- write filtered copy of knit index O(d)
- Open knit index for inventory
- Write a filtered copy of inventory knit and simultaneously note all referenced
  file-ids O(b+d)
- Write filtered copy of inventory knit index O(d)
- For each referenced file-id:

  - Open knit index for each file knit O(e)
  - If irrelevant data is within an acceptable threshold, hard-link O(f)
  - Otherwise write filtered copy of text knit and simultaneously write
    the fulltext to tree transform O(h)

- Write format markers O(1)

:a: size of aggregate revision metadata
:b: size of inventory changes for all revisions
:c: size of text changes for all files and all revisions (e * g)
:d: number of relevant revisions
:e: number of relevant versioned files
:f: size of the particular versioned file knit index
:g: size of the filtered versioned file knit
:h: size of the versioned file fulltext
:i: size of the largest file fulltext

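
The per-file choice in the disk case (hard-link when little of the knit is
irrelevant, otherwise write a filtered copy) could be sketched roughly as
below.  This is an illustration only, not bzrlib code: ``transfer_knit``,
``irrelevant_fraction`` and the 25% default threshold are assumptions made for
the example, and a real filtered copy would also have to rewrite the knit
index to match the new offsets.

```python
import os


def irrelevant_fraction(total_bytes, relevant_bytes):
    """Fraction of a knit file that the target does not need."""
    if total_bytes == 0:
        return 0.0
    return (total_bytes - relevant_bytes) / float(total_bytes)


def transfer_knit(src, dst, relevant_ranges, max_irrelevant=0.25):
    """Hard-link src to dst, or copy only the relevant byte ranges.

    relevant_ranges is a list of (offset, length) pairs (hypothetical input,
    assumed to come from the already-read knit index).  Returns the name of
    the strategy that was actually used.
    """
    total = os.path.getsize(src)
    relevant = sum(length for _offset, length in relevant_ranges)
    if irrelevant_fraction(total, relevant) <= max_irrelevant:
        try:
            os.link(src, dst)
            return "hard-link"
        except OSError:
            pass  # e.g. cross-device link; fall back to copying
    # Filtered copy: read each needed range once, in file order.
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        for offset, length in sorted(relevant_ranges):
            fin.seek(offset)
            fout.write(fin.read(length))
    return "filtered-copy"
```

The threshold trades disk space in the target for copy time: a hard-link
costs nothing to write but carries the irrelevant data along with it.
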
Smart Network Case
------------------

Phase 1
~~~~~~~
Push: ask if there is a repository, and if not, what formats are okay
Pull: Nothing

Phase 2
~~~~~~~
Push: send initial push command, streaming data in acceptable format, following
disk case strategy
Pull: receive initial pull command, specifying format

Pull client complexity: O(a), memory cost O(1)
Push client complexity: processing and memory cost same as disk case

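
The O(1) memory cost for the pull client follows from consuming the stream
in fixed-size chunks.  A minimal sketch, assuming the server response is
exposed as a file-like object (``receive_stream`` and its ``chunk_size``
parameter are hypothetical names, not part of any existing smart protocol
code):

```python
import io


def receive_stream(source, sink, chunk_size=64 * 1024):
    """Copy source to sink one chunk at a time; return bytes copied.

    At most one chunk is held in memory, so the memory cost does not grow
    with the size of the stream.
    """
    copied = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        sink.write(chunk)
        copied += len(chunk)
    return copied


# Usage with in-memory stand-ins for the network stream and the target file.
incoming = io.BytesIO(b"x" * 200000)
target = io.BytesIO()
total = receive_stream(incoming, target)
```
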
Dumb Network Case
-----------------
Pull: same as disk case, but request all file knit indices at once and request
all file knits at once.
Push: same as disk case, but write all files at once.

Wants
-----
- Read partial graph
- Read multiple segments of multiple files on HTTP and SFTP
- Write multiple files over SFTP
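
Reading multiple segments of one file over HTTP is usually done by merging
nearby byte ranges and sending them in a single ``Range`` header.  The sketch
below shows one way this could look; ``coalesce``, ``http_range_header`` and
``max_gap`` are names invented for the example, and servers are not obliged
to honour multi-range requests.

```python
def coalesce(ranges, max_gap=0):
    """Merge (offset, length) pairs that touch or lie within max_gap bytes."""
    merged = []
    for offset, length in sorted(ranges):
        if length <= 0:
            continue  # nothing to read for an empty range
        end = offset + length
        if merged and offset <= merged[-1][1] + max_gap:
            # Extend the previous range instead of starting a new one.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([offset, end])
    return [(start, end - start) for start, end in merged]


def http_range_header(ranges):
    """Format (offset, length) pairs as an HTTP Range header value."""
    parts = ["%d-%d" % (offset, offset + length - 1)
             for offset, length in coalesce(ranges)]
    return "bytes=" + ",".join(parts)
```

A larger ``max_gap`` reads some unneeded bytes but produces fewer ranges,
which matters on high-latency links.
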