Bazaar Windows Shell Extension Options
======================================

.. contents:: :local:

Introduction
------------

This document details the implementation strategy chosen for the
Bazaar Windows Shell Extensions, otherwise known as TortoiseBzr, or TBZR.
As justification for the strategy, it also describes the general architecture
of Windows Shell Extensions, then looks at the C++ implemented TortoiseSvn
and the Python implemented TortoiseBzr, and discusses alternative
implementation strategies, and the reasons they were not chosen.

The following points summarize the strategy:

* The main shell extension code will be implemented in C++, and be as thin as
  possible.  It will not directly do any VCS work, but instead will perform
  all operations via either external applications or an RPC server.

* Most VCS operations will be performed by external applications.  For
  example, committing changes or viewing history will spawn a child
  process that provides its own UI.

* For operations where spawning a child process is not practical, an
  external RPC server will be implemented in Python and will directly use
  the VCS library. In the short term, there will be no attempt to create a
  general purpose RPC mechanism; instead, the focus will be on keeping the
  C++ RPC client as thin, fast and dumb as possible.

Background Information
----------------------
3327.1.1 by Mark Hammond
Add document outlining strategies for TortoiseBzr.
34
35
The facts about shell extensions
3327.1.4 by Mark Hammond
based on the suggestions of a few people, make the tone of the document
36
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Well - the facts as I understand them :)

Shell Extensions are COM objects. They are implemented as DLLs which are
loaded by the Windows shell. There is no facility for shell extensions to
exist in a separate process - DLLs are the only option, and they are loaded
into other processes which take advantage of the Windows shell (although
obviously this DLL is free to do whatever it likes).

For the sake of this discussion, there are 2 categories of shell extensions:

* Ones that create a new "namespace". The file-system itself is an example of
  such a namespace, as is the "Recycle Bin". For a user-created example,
  picture a new tree under "My Computer" which allows you to browse a remote
  server - it creates a new, stand-alone tree that doesn't really interact
  with the existing namespaces.

* Ones that enhance existing namespaces, including the filesystem. An example
  would be an extension which uses Icon Overlays to modify how existing files
  on disk are displayed, or which adds items to their context menu.

The latter category is the kind of shell extension relevant for TortoiseBzr,
and it has an important implication - it will be pulled into any process
which uses the shell to display a list of files. While this is somewhat
obvious for Windows Explorer (which many people consider the shell), every
other process that shows a FileOpen/FileSave dialog will have these shell
extensions loaded into its process space. This may surprise many people - the
simple fact of allowing the user to select a filename will result in an
unknown number of DLLs being loaded into your process. For a concrete
example, when notepad.exe first starts with an empty file it is using around
3.5MB of RAM. As soon as the FileOpen dialog is loaded, TortoiseSvn loads
well over 20 additional DLLs, including the MSVC8 runtime, into the Notepad
process, causing its memory usage (as reported by task manager) to more than
double - all without doing anything tortoise specific at all. (In fairness,
this illustration is contrived - the code from these DLLs is already in
memory and there is no reason to suggest TSVN adds any other unreasonable
burden - but the general point remains valid.)

This has wide-ranging implications. It means that such shell extensions
should be developed using a tool which can never cause conflict with
arbitrary processes. For this very reason, MS recommend against using .NET
to write shell extensions[1], as there is a significant risk of being loaded
into a process that uses a different version of the .NET runtime, and this
will kill the process. Similarly, a Python implemented shell extension may
well conflict badly with other Python implemented applications (and will
certainly kill them in some situations). A similar issue exists with GUI
toolkits - using (say) PyGTK directly in the shell extension would need to
be avoided (which it currently is, as best I can tell). It should also be
obvious that the shell extension will be in many processes simultaneously,
meaning use of a simple log-file (for example) is problematic.

In practice, there is only 1 truly safe option - a low-level language (such
as C/C++) which makes use of only the win32 API, and a static version of the
C runtime library if necessary. Obviously, this sucks from our POV. :)

[1]: http://blogs.msdn.com/oldnewthing/archive/2006/12/18/1317290.aspx

Analysis of TortoiseSVN code
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

TortoiseSVN is implemented in C++. It relies on an external process to
perform most UI (such as diff, log, commit etc.) commands, but it appears to
directly embed the SVN C libraries for the purposes of obtaining status for
icon overlays, context menu, drag&drop, etc.

The use of an external process to perform commands is fairly simplistic in
terms of parent and modal windows. For example, when selecting "Commit", a
new process starts and *usually* ends up as the foreground window, but it may
occasionally be lost underneath the window which created it, and the user may
accidentally start many processes when they only need 1. Best I can tell, this
isn't necessarily a limitation of the approach, just the implementation.

The advantage of using the external process is that it keeps all the UI code
outside Windows explorer - only the minimum needed to perform operations
directly needed by the shell is part of the "shell extension" and the rest
of TortoiseSvn is "just" a fairly large GUI application implementing many
commands. The command-line to the app has even been documented for people who
wish to automate tasks using that GUI. This GUI is also implemented in C++
using Windows resource files.

TortoiseSvn has an option (enabled by default) which enables a cache using a
separate process, aptly named TSVNCache.exe. It uses a named pipe to accept
connections from other processes for various operations. When enabled, TSVN
fetches most (all?) status information from this process, but it also has the
option to talk directly to the VCS, along with options to disable functionality
in various cases.

There doesn't seem to be a good story for logging or debugging - which is
what you expect from C++ based apps. :( Most of the heavy lifting is done by
the external application, which might offer better facilities.

Analysis of existing TortoiseBzr code
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The existing code is actually quite cool given its history (SoC student,
etc), so this should not be taken as criticism of the implementer nor of the
implementation. Indeed, many criticisms are also true of the TortoiseSvn
implementation - see above. However, I have attempted to list the bad things
rather than the good things so a clear future strategy can be agreed, with
all limitations understood.

The existing TortoiseBzr code has been ported into Python from other tortoise
implementations (probably svn). This means it is very nice to implement and
develop, but suffers the problems described above - it is likely to conflict
with other Python based processes, and it means the entire CPython runtime
and libraries are pulled into many arbitrary processes.

The existing TortoiseBzr code pulls in the bzrlib library to determine the
path of the bzr library, and also to determine the status of files, but uses
an external process for most GUI commands - ie, very similar to TortoiseSvn
as described above - and as such, all comments above apply equally here - but
note that the bzr library *is* pulled into the shell, and therefore every
application using the shell. The GUI in the external application is written
in PyGTK, which may not offer the best Windows "look and feel", but that
discussion is beyond the scope of this document.

It has a better story for logging and debugging for the developer - but not
for diagnosing issues in the field - although again, much of the heavy
lifting remains done by the external application.

It uses a rudimentary in-memory cache for the status of files and
directories, the implementation of which isn't really suitable (ie, no
theoretical upper bound on cache size), and also means that there is no
sharing of cached information between processes, which is unfortunate (eg,
imagine a user using Windows explorer, then switching back to their editor)
and also error prone (it's possible the editor will check the file in,
meaning Windows explorer will be showing stale data). This may be possible to
address via file-system notifications, but a shared cache would be preferred
(although clearly more difficult to implement).

One tortoise port recently announced a technique for all tortoise ports to
share the same icon overlays to help work around a limitation in Windows on
the total number of overlays (it's limited to 15, due to the number of bits
reserved in a 32bit int for overlays). TBZR needs to take advantage of that
(but to be fair, this overlay sharing technique was probably done after the
TBZR implementation).

The current code appears to recursively walk a tree to check if *any* file in
the tree has changed, so it can reflect this in the parent directory status.
This is almost certainly an evil thing to do (Shell Extensions are optimized
so that a folder doesn't even need to look in its direct children for another
folder, let alone recurse for any reason at all. It may be a network mounted
drive that doesn't perform at all.)

Although somewhat dependent on bzr itself, we need a strategy for binary
releases (ie, it assumes python.exe, etc) and integration into an existing
"blessed" installer.

Trivially, the code is not PEP8 compliant and was written by someone fairly
inexperienced with the language.

Detailed Implementation Strategy
--------------------------------

We will create a hybrid Python and C++ implementation.  In this model, we
would still use something like TSVNCache.exe (this external
process doesn't have the same restrictions as the shell extension itself) but
go one step further - use this remote process for *all* interactions with
bzr, including status and other "must be fast" operations. This would allow
the shell extension itself to be implemented in C++, but still take advantage
of Python for much of the logic.

A pragmatic implementation strategy will be used to work towards the above
infrastructure - we will initially keep the shell extension implemented in
Python, but without using bzrlib. This allows us to focus on this
shared-cache/remote-process infrastructure without immediately
re-implementing a shell extension in C++. Longer term, once the
infrastructure is in place and as optimized as possible, we can move to C++
code in the shell calling our remote Python process. This port should try to
share as much code as possible with TortoiseSvn, including overlay handlers.

External Command Processor
~~~~~~~~~~~~~~~~~~~~~~~~~~

The external command application (ie, the app invoked by the shell extension
to perform commands) can remain as-is, and remain a "shell" for other
external commands. The implementation of this application is not particularly
relevant to the shell extension, just the interface to the application (ie,
its command-line) is. In the short term this will remain PyGTK and will only
change if there is compelling reason - cross-platform GUI tools are a better
fit for Bazaar than Windows specific ones, although native look-and-feel is
important. Either way, this can change independently from the shell
extension.

Performance considerations
~~~~~~~~~~~~~~~~~~~~~~~~~~

As discussed above, the model used by Tortoise is that most "interesting"
things are done by external applications. Most Tortoise implementations
show read-only columns in the "detail" view, and show a few read-only
properties in the "Properties" dialog - but most of these properties are
"state" related (eg, revision number), or editing of others is done by
launching an external application. This means that the shell extension itself
really has 2 basic requirements WRT RPC: 1) get the local state of a file and
2) get some named state-related "properties" for a file. Everything else can
be built on that.

There are 2 aspects of the shell integration which are performance critical -
the "icon overlays" and "column providers".

The short story with Icon Overlays is that we need to register 12 global
"overlay providers" - one for each state we show. Each provider is called for
every icon ever shown in Windows explorer or in any application's FileOpen
dialog. While most versions of Windows update icons in the background, we
still need to perform well. On the positive side, this just needs the simple
"local state" of a file - information that can probably be carried in a
single byte. On the negative side, it is the shell which makes a synchronous
call to us with a single filename as an arg, which makes it difficult to
"batch" multiple status requests into a single RPC call.
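
To illustrate how little data an overlay query needs, here is a minimal
Python sketch of carrying a file's local state in a single byte. The state
names and numeric values are hypothetical, chosen only for illustration; this
document does not define the actual set of states.

```python
import enum


class LocalState(enum.IntEnum):
    """Hypothetical local states; names and values are illustrative only."""
    UNVERSIONED = 0
    UNCHANGED = 1
    MODIFIED = 2
    ADDED = 3
    DELETED = 4
    CONFLICTED = 5


def encode_state(state):
    """Pack a state into exactly one byte for the wire."""
    return bytes([int(state)])


def decode_state(payload):
    """Unpack a one-byte payload back into a LocalState."""
    if len(payload) != 1:
        raise ValueError("expected exactly one byte, got %d" % len(payload))
    return LocalState(payload[0])
```

With a reply this small, the cost of an overlay query is dominated by the
round trip itself rather than by the data being transferred.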
246
247
The story with columns is messier - these have changed significantly for
248
Vista and the new system may not work with the VCS model (see below).
249
However, if we implement this, it will be fairly critical to have
250
high-performance name/value pairs implemented, as described above.
251
252
Note that the nature of the shell implementation means we will have a large
253
number of "unrelated" handlers, each called somewhat independently by the
254
shell, often for information about the same file (eg, imagine each of our
255
overlay providers all called in turn with the same filename, followed by our
256
column providers called in turn with the same filename. However, that isn't
257
exactly what happens!). This means we will need a kind of cache, geared
258
towards reducing the number of status or property requests we make to the RPC
259
server.
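
One way such a client-side cache might look is sketched below in Python. It
is only an illustration, assuming a bounded least-recently-used cache with a
short expiry time; the class name, the ``fetch`` callback (which stands in
for a real RPC call) and the default limits are all hypothetical.

```python
import time
from collections import OrderedDict


class StatusCache:
    """Bounded LRU cache of per-path results with a time-to-live."""

    def __init__(self, fetch, max_entries=1024, ttl=2.0, clock=time.monotonic):
        self._fetch = fetch            # callable(path) -> status; stands in for the RPC
        self._max_entries = max_entries
        self._ttl = ttl                # seconds before an entry is considered stale
        self._clock = clock
        self._entries = OrderedDict()  # path -> (expiry time, status)

    def get(self, path):
        now = self._clock()
        hit = self._entries.get(path)
        if hit is not None and hit[0] > now:
            self._entries.move_to_end(path)    # mark as most recently used
            return hit[1]
        status = self._fetch(path)             # miss or stale: ask the server
        self._entries[path] = (now + self._ttl, status)
        self._entries.move_to_end(path)
        while len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)  # evict the least recently used
        return status
```

If several handlers ask about the same path in quick succession, only the
first call reaches ``fetch``; the rest are answered locally until the entry
expires. The upper bound on entries keeps memory use predictable in every
process that loads the extension.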

We will also allow all of the above to be disabled via user preferences.
Thus, Icon Overlays could be disabled if it did cause a problem for some
people, for example.

RPC options
~~~~~~~~~~~

Due to the high number of calls for icon overlays, the RPC overhead must be
kept as low as possible. Due to the client side being implemented in C++,
reducing complexity is also a goal. Our requirements are quite simple and
there are no existing RPC options we can leverage. It does not seem prudent
to build an XMLRPC solution for tbzr - which is not to preclude the use of
such a server in the future, but tbzr need not become the "pilot" project for
an XMLRPC server given these constraints.

I propose that a custom RPC mechanism, built initially using Windows-specific
named pipes, be used. A binary format, designed with an eye towards
implementation speed and C++ simplicity, will be used. If we succeed here, we
can build on that infrastructure, and even replace it should other more
general frameworks materialize.
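
As an illustration of what such a binary format might look like, the Python
sketch below frames a status request as a small fixed-size header followed by
the path. The layout (a 1-byte opcode, a 2-byte little-endian length, then
the UTF-8 encoded path) and the opcode value are assumptions made for this
example, not a format defined by this document.

```python
import struct

# Hypothetical wire layout: 1-byte opcode, 2-byte little-endian payload length.
HEADER = struct.Struct("<BH")
OP_GET_STATUS = 1  # illustrative opcode value


def pack_request(opcode, path):
    """Build one request frame: header followed by the UTF-8 encoded path."""
    payload = path.encode("utf-8")
    if len(payload) > 0xFFFF:
        raise ValueError("path too long for a 2-byte length field")
    return HEADER.pack(opcode, len(payload)) + payload


def unpack_request(frame):
    """Split a complete frame back into (opcode, path)."""
    opcode, length = HEADER.unpack_from(frame)
    payload = frame[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated frame")
    return opcode, payload.decode("utf-8")
```

A fixed-size header followed by a length-prefixed payload is straightforward
to produce from a thin C++ client, since no parsing beyond reading a few
bytes is needed on either side.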

FWIW, with a Python process at each end, my P4 2.4GHz machine can achieve
around 25000 "calls" per second across an open named pipe. C++ at one end
should increase this a little, but obviously any real work done by the Python
side of the process will be the bottleneck. However, this throughput would
appear sufficient to implement a prototype.

Vista versus XP
~~~~~~~~~~~~~~~

Let's try and avoid an OS advocacy debate :) But it is probably true that
TBZR will, over its life, be used on more Vista computers than XP ones. In
short, Vista has changed a number of shell related interfaces, and while TSVN
is slowly catching up (http://tortoisesvn.net/vistaproblems) they are a pain.

XP has IColumnProvider (as implemented by Tortoise), but Vista changes this
model. The new model is based around "file types" (eg, .jpg files) and it
appears each file type can only have 1 provider! TSVN also seems to think the
Vista model isn't going to work (see previous link). It's not clear how much
effort we should expend on a column system that has already been abandoned by
MS. I would argue we spend effort on other parts of the system (ie, the
external GUI apps themselves, etc) and see if a path forward does emerge for
Vista. We can re-evaluate this based on user feedback and more information
about features of the Vista property system.

Reuse of TSVNCache?
~~~~~~~~~~~~~~~~~~~

The RPC mechanism and the tasks performed by the RPC server (RPC, file system
crawling and watching, device notifications, caching) are very similar to
those already implemented for TSVN and analysis of that code shows that
it is not particularly tied to any VCS model.  As a result, consideration
should be given to making the best use of this existing debugged and
optimized technology.

Discussions with the TSVN developers have indicated that they would prefer us
to fork their code rather than introduce complexity and instability into
their code by attempting to share it. See the follow-ups to
http://thread.gmane.org/gmane.comp.version-control.subversion.tortoisesvn.devel/32635/focus=32651
for details.

For background, the TSVNCache process is fairly sophisticated - but mainly in
areas not related to source control. It has had various performance tweaks
and is smart in terms of minimizing its use of resources when possible. The
'cloc' utility counts ~5000 lines of C++ code, which weighs in at just under
200KB on disk (not including headers), so this is not a trivial application.
However, the code that is of most interest (the crawlers, watchers and cache)
is roughly 2500 lines of C++. Most of the source files only depend lightly
on SVN specifics, so it would not be a huge job to make the existing code
talk to Bazaar. The code is thread-safe, but not particularly thread-friendly
(ie, fairly coarse-grained locks are taken in most cases).

In practice, this gives us 2 options - "fork" or "port":

* Fork the existing C++ code, replacing the existing source-control code with
  code that talks to Bazaar. This would involve introducing a Python layer,
  but only at the layers where we need to talk to bzrlib. The bulk of the
  code would remain in C++.

  This would have the following benefits:

  - May offer significant performance advantages in some cases (eg, a
    cache-hit would never enter Python at all.)

  - Quickest time to a working prototype - the existing working code can be
    reused sooner.

  And the following drawbacks:

  - More complex to develop. People wishing to hack on it must be on Windows,
    know C++ and own the most recent MSVC8.

  - More complex to build and package: people making binaries must be on
    Windows and have the most recent MSVC8.

  - Is tied to Windows - it would be impractical for this to be
    cross-platform, even just for test purposes (although parts of it
    obviously could).

* Port the existing C++ code to Python. We would do this almost
  "line-for-line", and attempt to keep many optimizations in place (or at
  least document what the optimizations were for ones we consider dubious).
  For the Windows versions, pywin32 and ctypes would be leaned on - there
  would be no C++ at all.

  This would have the following benefits:

  - Only need Python and Python skills to hack on it.

  - No C++ compiler needed, which makes it easier to cut releases.

  - Python makes it easier to understand and maintain - it should appear much
    less complex than the C++ version.

  And the following drawbacks:

  - Will be slower in some cases - eg, a cache-hit will involve executing
    Python code.

  - Will take longer to get a minimal system working. In practice this
    probably means the initial versions will not be as sophisticated.

Given the above, there are two issues which prevent Python being the clear
winner: (1) will it perform OK? (2) How much longer to a prototype?

My gut feeling on (1) is that it will perform fine, given a suitable Python
implementation. For example, Python code that simply looked up a dictionary
would be fast enough - so it all depends on how fast we can make our cache.
Re (2), it should be possible to have a "stub" process (one that does almost
nothing in terms of caching or crawling, but can be connected to by the
shell) in 8 hours, and some crawling and caching in 40. Note that this is
separate from the work included for the shell extension itself (the
implementation of which is largely independent of the TBZRCache
implementation). So given the lack of a deadline for any particular feature
and the better long-term fit of using Python, the conclusion is that we
should "port" TSVNCache to Python for Bazaar.

Reuse of this code by Mercurial or other Python based VCS systems?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Incidentally, the hope is that this work can be picked up by the Mercurial
project (or anyone else who thinks it is of use). However, we will limit
ourselves to attempting to find a clean abstraction for the parts that talk
to the VCS (as good design would dictate regardless) and then try and assist
other projects in providing patches which work for both of us. In other
words, supporting multiple VCS systems is not an explicit goal at this stage,
but we would hope it is possible in the future.

Implementation plan
-------------------

The following is a high-level set of milestones for the implementation:

* Design the RPC mechanism used for icon overlays (ie, binary format used for
  communication).

* Create Python prototype of the C++ "shim": modify the existing TBZR Python
  code so that all references to "bzrlib" are removed. Implement the client
  side of the RPC mechanism and implement icon overlays using this RPC
  mechanism.

* Create initial implementation of RPC server in Python. This will use
  bzrlib, but will also maintain a local cache to achieve the required
  performance. File crawling and watching will not be implemented at this
  stage, but caching will (although cache persistence might be skipped).

* Analyze performance of prototype. Verify that technique is feasible and
  will offer reasonable performance and user experience.

* Implement file watching, crawling etc by "porting" TSVNCache code to
  Python, as described above.

* Implement C++ shim: replace the Python prototype with a light-weight C++
  version. We will fork the current TSVN sources, including its new
  support for sharing icon overlays (although advice on how to set up this
  fork is needed!)

* Implement property pages and context menus in C++. Expand RPC server as
  necessary.

* Create binary for alpha releases, then go round-and-round until it's baked.

Alternative Implementation Strategies
-------------------------------------

Only one credible alternative strategy was identified, as discussed below. No
languages other than Python and C++ were considered: Python because the bzr
library and existing extensions are written in it, and C++ for the reasons
outlined in the background on shell extensions above.

Implement Completely in Python
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This would keep the basic structure of the existing TBZR code, with the
shell extension continuing to pull in Python and all libraries used by Bzr
into various processes.

Although implementation simplicity is a key benefit to this option, it was
not chosen for various reasons, e.g. the use of Python means that there is a
larger chance of conflicting with existing applications, or even existing
Python implemented shell extensions. It will also increase the memory usage
of all applications which use the shell. While this may create problems for
only a small number of users, it may create a wider perception of instability
or resource hogging.