The real fix is perhaps to use some kind of weave, not so much for
storage efficiency as for fast annotation and therefore possible
annotation-based merge.

Now that we have recursive add, add is much faster. Adding all of the
linux 2.4.19 kernel tree takes only

finished, 5.460u/0.610s cpu, 0.010u/0.000s cum, 6.710 elapsed

However, the store code currently flushes to disk after every write,
which is probably excessive. So a commit takes

finished, 8.740u/3.950s cpu, 0.010u/0.000s cum, 156.420 elapsed

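To get a feel for what flushing after every write costs, here is a minimal sketch. It assumes "flushes to disk" means something like flush() plus os.fsync() per write; the actual store code may do something different. The names write_store and time_store are made up for illustration.

```python
import os
import tempfile
import time


def write_store(path, chunks, sync_every_write):
    """Write chunks to path, optionally forcing each one out to disk."""
    with open(path, "wb") as f:
        for chunk in chunks:
            f.write(chunk)
            if sync_every_write:
                f.flush()             # push Python's buffer to the OS
                os.fsync(f.fileno())  # ask the OS to push it to the disk
        # Leaving the with-block flushes and closes the file once.


def time_store(sync_every_write, count=200):
    """Return (elapsed seconds, bytes written) for one synthetic run."""
    chunks = [b"x" * 1024] * count
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "store")
        start = time.perf_counter()
        write_store(path, chunks, sync_every_write)
        elapsed = time.perf_counter() - start
        size = os.path.getsize(path)
    return elapsed, size


if __name__ == "__main__":
    for sync in (True, False):
        elapsed, size = time_store(sync)
        print(f"sync_every_write={sync}: {size} bytes in {elapsed:.4f}s")
```

Both runs write the same bytes; only the elapsed time should differ, and by how much depends heavily on the disk and filesystem.
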
Status is now also quite fast, despite that it still has to read all

mbp@hope% bzr status ~/work/linux-2.4.19
bzr status 5.51s user 0.79s system 99% cpu 6.337 total

strace shows much of this is in write(2), probably because of
logging. With more buffering on that file and all the explicit
flushes removed, that is reduced to

mbp@hope% time bzr status
bzr status 5.23s user 0.42s system 97% cpu 5.780 total

which is mostly opening, stating and reading files, as it should be.
Still a few too many stat calls.

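The effect of buffering the log file can be illustrated without strace by counting low-level writes directly. This is only a sketch: CountingRaw is a hypothetical stand-in for the real log file, and each call to its write() plays the role of one write(2).

```python
import io


class CountingRaw(io.RawIOBase):
    """Hypothetical raw file that counts low-level write calls."""

    def __init__(self):
        super().__init__()
        self.write_calls = 0

    def writable(self):
        return True

    def write(self, b):
        self.write_calls += 1
        return len(b)  # pretend everything was written


def log_lines(n, flush_each_line, buffer_size=64 * 1024):
    """Log n short lines and return how many raw writes that took."""
    raw = CountingRaw()
    buffered = io.BufferedWriter(raw, buffer_size=buffer_size)
    log = io.TextIOWrapper(buffered, encoding="utf-8")
    for i in range(n):
        log.write(f"stat file{i}\n")
        if flush_each_line:
            log.flush()  # explicit flush: forces a raw write every line
    log.close()          # flushes whatever is still buffered
    return raw.write_calls


if __name__ == "__main__":
    print("flush per line:", log_lines(1000, True))
    print("buffered:      ", log_lines(1000, False))
```

With explicit flushes there is one raw write per log line; without them the lines are batched into far fewer writes.
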
Now fixed up handling of root directory.

Without flushing everything to disk as it goes into the store:

mbp@hope% bzr commit -m 'import linux 2.4.19'
bzr commit -m 'import linux 2.4.19' 8.15s user 2.09s system 53% cpu 19.295 total

mbp@hope% time bzr diff
bzr diff 5.80s user 0.52s system 69% cpu 9.128 total
mbp@hope% time bzr status
bzr status 5.64s user 0.43s system 68% cpu 8.848 total

patch -p1 < ../linux.pkg/patch-2.4.20 1.67s user 0.96s system 90% cpu 2.905 total

The diff changes 3462 files according to diffstat.

branch format: Bazaar-NG branch, format 0.0.4

614 versioned subdirectories

That is, 3510 entries have changed, but there are 48 changed
directories so the count is exactly right (3510 - 48 = 3462)!

bzr commit -v -m 'import 2.4.20' 8.23s user 1.09s system 48% cpu 19.411 total

Kind of strange that this takes as much time as committing the whole
thing; I suppose it has to read every file.

This shows many files as being renamed; I don't know why that would

2969 files changed, 366643 insertions(+), 147759 deletions(-)

2969 files changed, 372168 insertions(+), 153284 deletions(-)

I wonder why it is not exactly the same? Maybe because the Python
diff algorithm is a bit different from GNU diff.

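One observation that fits that guess: the two summaries differ by the same 5525 lines in both insertions and deletions, so the net growth (218884 lines) is identical. Any two correct diffs of the same pair of trees must agree on the net change, but they are free to line things up differently. A small sketch using difflib; count_changes is a made-up helper, and difflib is assumed here to stand in for "the Python diff algorithm".

```python
import difflib


def count_changes(old_lines, new_lines):
    """Return (insertions, deletions) as SequenceMatcher aligns the lines."""
    matcher = difflib.SequenceMatcher(None, old_lines, new_lines,
                                      autojunk=False)
    insertions = deletions = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            deletions += i2 - i1
        if tag in ("replace", "insert"):
            insertions += j2 - j1
    return insertions, deletions


old = ["a", "b", "c", "d"]
new = ["a", "x", "c", "d", "e"]

# One valid diff: what SequenceMatcher finds.
matched = count_changes(old, new)

# Another valid (if wasteful) diff: delete every old line, insert every new.
naive = (len(new), len(old))

if __name__ == "__main__":
    print("matched:", matched, "net", matched[0] - matched[1])
    print("naive:  ", naive, "net", naive[0] - naive[1])
    # The two summaries from the notes above have the same net change too.
    print(366643 - 147759, 372168 - 153284)
```

Here the matched diff reports (2, 1) and the naive one (5, 4): different counts, same net change of one line.
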
A full check, retrieving all file texts once for the 2.4 kernel branch,
takes 10m elapsed, 1m CPU time. Lots of random IO and seeking.

mbp@hope% time python =bzr deleted --show-ids
README README-fa1d8447b4fd0140-adbf4342752f0fc3
python =bzr deleted --show-ids 1.55s user 0.09s system 96% cpu 1.701 total
mbp@hope% time python -O =bzr deleted --show-ids
README README-fa1d8447b4fd0140-adbf4342752f0fc3
python -O =bzr deleted --show-ids 1.47s user 0.10s system 101% cpu 1.547 total
mbp@hope% time python -O =bzr deleted --show-ids
README README-fa1d8447b4fd0140-adbf4342752f0fc3
python -O =bzr deleted --show-ids 1.49s user 0.07s system 99% cpu 1.565 total
mbp@hope% time python =bzr deleted --show-ids
README README-fa1d8447b4fd0140-adbf4342752f0fc3
python =bzr deleted --show-ids 1.55s user 0.08s system 99% cpu 1.637 total

Small but measurable improvement from python -O: about 1.48s versus
1.55s user time.

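For reference, what -O changes is narrow: it sets __debug__ to False and drops assert statements, so any gain depends on how much time the code spends in asserts. That this accounts for the difference measured above is an assumption. A quick way to see the behaviour, with run being a hypothetical helper:

```python
import subprocess
import sys

# Under normal Python the assert fires; under -O it is compiled away.
SNIPPET = "assert False, 'asserts are enabled'\nprint(__debug__)"


def run(flags):
    """Run SNIPPET in a child interpreter and return (exit code, stdout)."""
    result = subprocess.run(
        [sys.executable, *flags, "-c", SNIPPET],
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout.strip()


if __name__ == "__main__":
    print("python    :", run([]))
    print("python -O :", run(["-O"]))
```

Without -O the child exits with an AssertionError; with -O it prints False and exits cleanly.
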
Loading a large inventory through cElementTree is pretty quick; only
about 0.117s. By contrast, reading the inventory into our data
structure takes about 0.7s.

So I think the problem must be in converting everything to
InventoryEntries and back again every time.

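The comparison above can be reproduced in miniature. This sketch assumes a made-up inventory layout (an "entry" element with file_id, name, kind and parent_id attributes) and a hypothetical Entry class in place of InventoryEntry. It uses xml.etree.ElementTree, since cElementTree no longer exists as a separate module in current Python.

```python
import time
import xml.etree.ElementTree as ET


class Entry:
    """Hypothetical stand-in for an InventoryEntry."""

    __slots__ = ("file_id", "name", "kind", "parent_id")

    def __init__(self, file_id, name, kind, parent_id):
        self.file_id = file_id
        self.name = name
        self.kind = kind
        self.parent_id = parent_id


def make_inventory_xml(n):
    """Build a synthetic inventory with n file entries (layout assumed)."""
    root = ET.Element("inventory")
    for i in range(n):
        ET.SubElement(root, "entry", file_id=f"id-{i}", name=f"file{i}",
                      kind="file", parent_id="root")
    return ET.tostring(root)


def parse_only(xml_bytes):
    """Just parse the XML into an element tree."""
    return ET.fromstring(xml_bytes)


def parse_and_convert(xml_bytes):
    """Parse, then build one Entry object per element."""
    root = ET.fromstring(xml_bytes)
    return [Entry(e.get("file_id"), e.get("name"), e.get("kind"),
                  e.get("parent_id")) for e in root]


def timed(func, arg):
    """Return (elapsed seconds, result) for func(arg)."""
    start = time.perf_counter()
    result = func(arg)
    return time.perf_counter() - start, result


if __name__ == "__main__":
    xml_bytes = make_inventory_xml(50000)
    print("parse only:        %.3fs" % timed(parse_only, xml_bytes)[0])
    print("parse and convert: %.3fs" % timed(parse_and_convert, xml_bytes)[0])
```

The absolute numbers will not match the ones above; the point is only to separate parse time from object-construction time.
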
Thought about that way, it seems pretty inefficient: why create all
those objects when most of them aren't used on most invocations?
Instead perhaps the Inventory object should hold the ElementTree and
pull things out of it only as necessary? We can even have an index
pointing into the ElementTree by id, path, etc.

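A rough sketch of that idea, to see what it might look like. Everything here is hypothetical: the LazyInventory class, the XML layout and the attribute names are assumptions, not the real inventory format.

```python
import xml.etree.ElementTree as ET

SAMPLE = b"""<inventory>
<entry file_id="d1" name="src" kind="directory"/>
<entry file_id="f1" name="main.c" kind="file" parent_id="d1"/>
<entry file_id="f2" name="README" kind="file"/>
</inventory>"""


class LazyInventory:
    """Hold the parsed tree; build an id index and entries only on demand."""

    def __init__(self, xml_bytes):
        self._root = ET.fromstring(xml_bytes)
        self._by_id = None        # built the first time it is needed
        self.entries_built = 0    # how many entries were materialised

    def _index(self):
        if self._by_id is None:
            self._by_id = {e.get("file_id"): e
                           for e in self._root.iter("entry")}
        return self._by_id

    def __contains__(self, file_id):
        return file_id in self._index()

    def entry(self, file_id):
        """Materialise a single entry as a plain dict."""
        elem = self._index()[file_id]
        self.entries_built += 1
        return dict(elem.attrib)

    def path(self, file_id):
        """Walk parent_id links in the tree to build the path."""
        index = self._index()
        parts = []
        while file_id is not None:
            elem = index[file_id]
            parts.append(elem.get("name"))
            file_id = elem.get("parent_id")
        return "/".join(reversed(parts))


if __name__ == "__main__":
    inv = LazyInventory(SAMPLE)
    print(inv.path("f1"))        # src/main.c
    print(inv.entry("f2"))
    print(inv.entries_built)     # 1
```

Building the index still touches every element once, but it stores references into the tree rather than constructing an object per entry.
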
bzr deleted 1.46s user 0.08s system 98% cpu 1.561 total

Alternatively maybe keep an id2path and path2id cache? Keeping it
coherent may be hard...

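One way the coherence problem might be contained is to keep both maps behind a single object, so every change updates them together. The PathCache class below is purely hypothetical. It shows where the difficulty sits: renaming a directory has to rewrite the path of everything beneath it.

```python
class PathCache:
    """Keep id->path and path->id maps in step with each other."""

    def __init__(self):
        self._id2path = {}
        self._path2id = {}

    def add(self, file_id, path):
        if file_id in self._id2path or path in self._path2id:
            raise ValueError("duplicate id or path: %r, %r" % (file_id, path))
        self._id2path[file_id] = path
        self._path2id[path] = file_id

    def id2path(self, file_id):
        return self._id2path[file_id]

    def path2id(self, path):
        return self._path2id.get(path)

    def rename(self, old_path, new_path):
        """Move a file or directory, rewriting every cached path under it."""
        if old_path not in self._path2id:
            raise KeyError(old_path)
        prefix = old_path + "/"
        moved = [(p, i) for p, i in self._path2id.items()
                 if p == old_path or p.startswith(prefix)]
        for p, _ in moved:
            del self._path2id[p]
        for p, i in moved:
            new = new_path + p[len(old_path):]
            self._path2id[new] = i
            self._id2path[i] = new


if __name__ == "__main__":
    cache = PathCache()
    cache.add("d1", "src")
    cache.add("f1", "src/main.c")
    cache.rename("src", "lib")
    print(cache.id2path("f1"))           # lib/main.c
    print(cache.path2id("src/main.c"))   # None
```

The rename scans every cached path, which is simple but linear in the size of the tree; a real implementation would probably want something smarter.
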