What should be in the roadmap?
==============================

A good roadmap provides a place for contributors to look for tasks, it
provides users with a sense of when we will fix things that are
affecting them, and it also allows us all to agree about where we are
headed. So the roadmap should contain enough things to let all this
happen.

I think that it needs to contain the analysis work which is required, a
list of the use cases to be optimised, the disk changes required, and
the broad sense of the API changes required. It also needs to list the
inter-dependencies between these things: we should aim for a large
surface area of 'ready to be worked on' items, which makes it easy to
improve performance without having to work in lockstep with other
developers.

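As a rough illustration of the 'ready to be worked on' idea, the sketch
below records roadmap items together with the items they depend on, and
lists the ones that can be picked up right now. The item names and the
``ready_items`` helper are hypothetical and are not taken from the
actual roadmap; this is only one possible way to model it.

```python
# Hypothetical roadmap items mapped to the items they depend on.
# The names are illustrative only; they are not the actual roadmap.
ROADMAP = {
    "analyse status": [],
    "analyse commit": [],
    "design disk format": ["analyse status", "analyse commit"],
    "design tree api": ["analyse status"],
    "implement disk format": ["design disk format"],
}


def ready_items(roadmap, done):
    """Return the items that are not done but whose dependencies all are."""
    done = set(done)
    return sorted(
        item
        for item, deps in roadmap.items()
        if item not in done and all(dep in done for dep in deps)
    )


if __name__ == "__main__":
    # Nothing finished yet: only items without dependencies are ready.
    print(ready_items(ROADMAP, done=[]))
    # Finishing one analysis unblocks the design work that needs only it.
    print(ready_items(ROADMAP, done=["analyse status"]))
```

The more items such a listing returns at any one time, the more people
can work in parallel without waiting on each other.
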
Clearly the analysis step is an immediate bottleneck - we cannot tell if
an optimisation for use case A is a pessimisation for use case B until
we have analysed both A and B. I propose that we complete the analysis
of, say, a dozen core use cases end to end during the upcoming sprint in
London. We should then be able to fork() for much of the detailed design
work and regroup with disk and API changes shortly thereafter.

I suspect that clarity of layering will make a big difference to
developer parallelism, so another proposal I have is for us to look at
the APIs for Branch and Repository in London in the light of what we
have learnt over the last few years.

What should the final system look like, how is it different to what we have today?
==================================================================================
One of the things I like the most about bzr is its rich library API, and
I've heard this from numerous other folk. So anything that will remove
that should be considered a last resort.

Similarly, our relatively excellent cross-platform support is critical
for projects that are themselves cross-platform, and that's a
considerable number these days.

And of course, our focus on doing the right thing is what differentiates
us from some of the other VCSs, so we should be focusing on doing the
right thing quickly :).

What we have today, though, has grown organically in response to us
identifying bottlenecks over several iterations of back end storage,
branch metadata and the local tree representation. I think we are
largely past that and able to describe the ideal characteristics of the
major actors in the system - primarily Tree, Branch, Repository - based
on what we have learnt.

What use cases should be covered?
=================================
My list of use cases is probably not complete - it's just the ones I
happen to see a lot :). I think each should be analysed comprehensively
so we don't need to say 'push over the network' - it's implied in the
scaling analysis that both semantic and file operation latency will be
considered.

These use cases are ordered roughly by ease of benchmarking and by
frequency of use. This ordering is so that when people are comparing bzr
they are going to hit use cases we have optimised; and so that as we
speed things up our existing users will have the things they do the most
optimised.

* status tree
* status subtree
* commit
* commit to a bound branch
* incremental push/pull
* log
* log path
* add
* initial push or pull [both to a new repo and an existing repo with
  different data in it]
* diff tree
* diff subtree

* revert tree
|
81 |
* revert subtree |
|
82 |
* merge from a branch |
|
83 |
* merge from a bundle |
|
84 |
* annotate |
|
85 |
* create a bundle against a branch |
|
86 |
* uncommit |
|
87 |
* missing |
|
88 |
* update |
|
89 |
* cbranch |
|
90 |
||
2506.1.1
by Alexander Belchenko
sanitize developers docs |
91 |
How is development on the roadmap coordinated? |
92 |
============================================== |
|
2481.1.3
by Robert Collins
Add the performance roadmap rationale. |
93 |
|
94 |
I think we should hold regular get-togethers (on IRC) to coordinate on
our progress, because this is a big task and it's a lot easier to start
helping out some area which is having trouble if we have kept in contact
about each area's progress. This might be weekly or fortnightly or some
such.

We need a shared space to record the results of the analysis and the
roadmap as we go forward. Given that we'll need to update these as new
features are considered, I propose that we use doc/design as a working
space, and as we analyse use cases we include them in there - including
the normal review process for each patch. We also need documentation
about doing performance tuning - not the minutiae, though that is
needed, but about how to effectively choose things to optimise which
will give the best return on time spent - that is what the roadmap
should help with, but this looks to be a large project and an overview
will be of great assistance I think. We want to help everyone that
wishes to contribute to performance to do so effectively.

Finally, it's important to note that coding is not the only contribution
- testing, giving feedback on current performance, helping with the
analysis are all extremely important tasks too and we probably want to
have clear markers of where that should be done to encourage such
contributions.