2969 files changed, 372168 insertions(+), 153284 deletions(-)

I wonder why it is not exactly the same? Maybe because the Python
diff algorithm is a bit different from GNU diff's.
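
A rough sketch of how the counts could be produced on the Python side, assuming the Python diff in question is the standard library's difflib (the notes do not say which one was used). difflib documents that it does not try to produce a minimal edit sequence, so its insertion and deletion counts can both come out higher than GNU diff's. The helper name count_changes is made up for illustration.

```python
import difflib


def count_changes(old_lines, new_lines):
    """Count inserted and deleted lines, diffstat style, using difflib."""
    insertions = deletions = 0
    # autojunk=False turns off difflib's "popular line" heuristic.
    matcher = difflib.SequenceMatcher(None, old_lines, new_lines, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            deletions += i2 - i1
        if tag in ("replace", "insert"):
            insertions += j2 - j1
    return insertions, deletions
```

Whatever algorithm is used, insertions minus deletions always equals the change in line count, so two tools can only disagree by shifting both numbers by the same amount.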

A full check, retrieving all file texts once for the 2.4 kernel branch,
takes 10m elapsed, 1m CPU time. Lots of random IO and seeking.

mbp@hope% time python =bzr deleted --show-ids
README README-fa1d8447b4fd0140-adbf4342752f0fc3
python =bzr deleted --show-ids 1.55s user 0.09s system 96% cpu 1.701 total
mbp@hope% time python -O =bzr deleted --show-ids
README README-fa1d8447b4fd0140-adbf4342752f0fc3
python -O =bzr deleted --show-ids 1.47s user 0.10s system 101% cpu 1.547 total
mbp@hope% time python -O =bzr deleted --show-ids
README README-fa1d8447b4fd0140-adbf4342752f0fc3
python -O =bzr deleted --show-ids 1.49s user 0.07s system 99% cpu 1.565 total
mbp@hope% time python =bzr deleted --show-ids
README README-fa1d8447b4fd0140-adbf4342752f0fc3
python =bzr deleted --show-ids 1.55s user 0.08s system 99% cpu 1.637 total

Small but consistent improvement from python -O: about 1.48s user
time against 1.55s without it.

Loading a large inventory through cElementTree is pretty quick; only
about 0.117s. By contrast, reading the inventory into our data
structure takes about 0.7s.
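
One way to separate the two costs is to time the parse and the object construction independently. This is only a sketch: it uses xml.etree.ElementTree rather than cElementTree, assumes the inventory is a flat list of entry elements, and time_stages and build_entry are hypothetical names, not anything in bzr.

```python
import time
import xml.etree.ElementTree as ET


def time_stages(xml_text, build_entry):
    """Return (parse_seconds, build_seconds, entry_count)."""
    t0 = time.perf_counter()
    root = ET.fromstring(xml_text)  # stage 1: XML -> element tree
    t1 = time.perf_counter()
    # stage 2: element tree -> our own objects (build_entry is supplied
    # by the caller and stands in for whatever creates an entry object)
    entries = [build_entry(elt) for elt in root.iter("entry")]
    t2 = time.perf_counter()
    return t1 - t0, t2 - t1, len(entries)
```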

So I think the problem must be in converting everything to
InventoryEntries and back again every time.

Thought of that way, it seems pretty inefficient: why create all
those objects when most of them aren't used on most invocations?
Instead perhaps the Inventory object should hold the ElementTree and
pull things out of it only as necessary? We can even have an index
pointing into the ElementTree by id, path, etc.
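
A minimal sketch of that idea, not the actual Inventory class. It assumes entries are flat entry elements carrying a file_id attribute, and uses a plain dict as a stand-in for InventoryEntry. LazyInventory and its built counter are made up for illustration.

```python
import xml.etree.ElementTree as ET


class LazyInventory:
    """Keeps the parsed tree and builds entry objects only on demand."""

    def __init__(self, xml_text):
        self._root = ET.fromstring(xml_text)
        # One cheap pass: index elements by id without creating entries.
        self._by_id = {e.get("file_id"): e for e in self._root.iter("entry")}
        self._entries = {}  # file_id -> entry already materialised
        self.built = 0      # how many entries were actually constructed

    def __contains__(self, file_id):
        return file_id in self._by_id

    def __getitem__(self, file_id):
        entry = self._entries.get(file_id)
        if entry is None:
            elt = self._by_id[file_id]  # KeyError if the id is unknown
            entry = dict(elt.attrib)    # stand-in for InventoryEntry
            self._entries[file_id] = entry
            self.built += 1
        return entry
```

A command that touches only a handful of files then pays for the parse and the index, but builds only the entries it looks at.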

bzr deleted 1.46s user 0.08s system 98% cpu 1.561 total

Alternatively maybe keep an id2path and path2id cache? Keeping it
coherent may be hard...
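
A sketch of what such a cache might look like, assuming each entry records only its parent id and name. PathCache and its rename method are hypothetical. Coherence is handled in the crudest way possible, by throwing both maps away on any change, which is safe but gives up most of the benefit if changes are frequent.

```python
class PathCache:
    """Two-way id/path lookup over a parent-pointer table.

    entries maps file_id -> (parent_id, name); parent_id is None at top level.
    """

    def __init__(self, entries):
        self._entries = dict(entries)
        self._id2path = {}
        self._path2id = {}

    def id2path(self, file_id):
        path = self._id2path.get(file_id)
        if path is None:
            parent_id, name = self._entries[file_id]
            if parent_id is None:
                path = name
            else:
                path = self.id2path(parent_id) + "/" + name
            self._id2path[file_id] = path
            self._path2id[path] = file_id
        return path

    def path2id(self, path):
        if path not in self._path2id:
            # Cache miss: fill in every path, then look again.
            for file_id in self._entries:
                self.id2path(file_id)
        return self._path2id.get(path)

    def rename(self, file_id, new_parent_id, new_name):
        self._entries[file_id] = (new_parent_id, new_name)
        # Renaming a directory changes the path of everything under it,
        # so drop both maps rather than track which entries are stale.
        self._id2path.clear()
        self._path2id.clear()
```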