For a tree holding 2.4.18 (two copies), 2.4.19, 2.4.20:

With gzip -9::

    mbp@hope% du .bzr
    195110  .bzr/text-store
    20      .bzr/revision-store
    12355   .bzr/inventory-store
    216325  .bzr

    mbp@hope% du -s .
    523128  .

Without gzip:

This is actually a pretty bad example because of deleting and
re-importing 2.4.18, but still not totally unreasonable.

----

linux-2.4.0: 116399 kB

After adding everything: 119505 kB

::

    bzr status  2.68s user 0.13s system 84% cpu 3.330 total
    bzr commit 'import 2.4.0'  4.41s user 2.15s system 11% cpu 59.490 total

    242446  .
    122068  .bzr

----

Performance (2005-03-01)

To add all files from linux-2.4.18: about 70s, mostly inventory
serialization/deserialization.

To commit::

    - finished, 6.520u/3.870s cpu, 33.940u/10.730s cum - 134.040 elapsed

Interesting that it spends so long on external processing!  I wonder if
this is from running uuidgen?  Let's try generating ids internally.

Great, that cuts add down to::

    17.15s user 0.61s system 83% cpu 21.365 total

with no external command time.  The commit now seems to spend most of
its time copying to disk::

    - finished, 6.550u/3.320s cpu, 35.050u/9.870s cum - 89.650 elapsed

I wonder where the external time is now?  We were also using uuids for
revisions.

Let's remove everything and re-add.  Detecting that everything was
removed takes::

    - finished, 2.460u/0.110s cpu, 0.000u/0.000s cum - 3.430 elapsed

which may be mostly XML deserialization?  Just getting the previous
revision takes about this long::

    bzr invoked at Tue 2005-03-01 15:53:05.183741 EST +1100 by mbp@sourcefrog.net on hope
    arguments: ['/home/mbp/bin/bzr', 'get-revision-inventory',
    'mbp@sourcefrog.net-20050301044608-8513202ab179aff4-44e8cd52a41aa705']
    platform: Linux-2.6.10-4-686-i686-with-debian-3.1

    - finished, 3.910u/0.390s cpu, 0.000u/0.000s cum - 6.690 elapsed

Now committing the revision which removes all files should be fast.
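The internal id generation could look something like this.  This is
only a sketch of the idea (the exact format bzr uses differs): combine
the sanitized file name with the time and a few random bytes, so no
external uuidgen process ever has to be spawned.

```python
import os
import re
import time


def gen_file_id(name):
    """Generate a unique-enough file id without calling uuidgen.

    Sketch only: sanitize the name, then append the current time and
    some random bytes from the OS entropy pool.
    """
    # Keep only filesystem- and XML-safe characters from the name.
    safe = re.sub(r'[^\w.]', '', name)
    return '%s-%x-%s' % (safe, int(time.time()), os.urandom(8).hex())
```

Avoiding a fork/exec per added file is what removes the large "cum"
(child process) time seen in the measurements above.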
::

    - finished, 1.280u/0.030s cpu, 0.000u/0.000s cum - 1.320 elapsed

Now re-add with new code that doesn't call uuidgen::

    - finished, 1.990u/0.030s cpu, 0.000u/0.000s cum - 2.040 elapsed
    16.61s user 0.55s system 74% cpu 22.965 total

Status::

    - finished, 2.500u/0.110s cpu, 0.010u/0.000s cum - 3.350 elapsed

And commit::

Now patch up to 2.4.19.  There were some bugs in handling missing
directories, but with those fixed we do much better::

    bzr status  5.86s user 1.06s system 10% cpu 1:05.55 total

This is slow because it diffs every file; we should use mtimes etc. to
make this faster.  The cpu time is reasonable.  I see difflib is pure
Python; it might be faster to shell out to GNU diff when we need it.

Export is very fast::

    - finished, 4.220u/1.480s cpu, 0.010u/0.000s cum - 10.810 elapsed
    bzr export 1 ../linux-2.4.18.export1  3.92s user 1.72s system 21% cpu 26.030 total

Now to find and add the new changes::

    - finished, 2.190u/0.030s cpu, 0.000u/0.000s cum - 2.300 elapsed

::

    bzr commit 'import 2.4.19'  9.36s user 1.91s system 23% cpu 47.127 total

And the result is exactly right.  Try exporting::

    mbp@hope% bzr export 4 ../linux-2.4.19.export4
    bzr export 4 ../linux-2.4.19.export4  4.21s user 1.70s system 18% cpu 32.304 total

and the export is exactly the same as the tarball.

Now we can optimize the diff a bit more by not comparing files that
have the right SHA-1 from within the commit.

For comparison::

    patch -p1 < ../kernel.pkg/patch-2.4.20  1.61s user 1.03s system 13% cpu 19.106 total

Now status after applying the .20 patch.  With full-text
verification::

    bzr status  7.07s user 1.32s system 13% cpu 1:04.29 total

with that turned off::

    bzr status  5.86s user 0.56s system 25% cpu 25.577 total

After adding::

    bzr status  6.14s user 0.61s system 25% cpu 26.583 total

We should add some kind of profile counter for quick compares vs slow
compares.
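The mtime-based shortcut mentioned above could be sketched like this.
The cached fields here are hypothetical, not bzr's real inventory
format; the point is that when size and mtime both match the values
recorded at the last commit, the file can be assumed unchanged without
reading it, and the full SHA-1 ("full-text verification") only runs as
a fallback.

```python
import hashlib
import os


def file_unchanged(path, cached_size, cached_mtime, cached_sha1):
    """Return True if the working file matches the committed version.

    Quick compare: size and mtime match the cached values, so skip
    reading the file entirely.  Slow compare: hash the full text and
    compare SHA-1s.
    """
    st = os.stat(path)
    if st.st_size == cached_size and st.st_mtime == cached_mtime:
        return True  # quick compare; no read needed
    with open(path, 'rb') as f:
        sha1 = hashlib.sha1(f.read()).hexdigest()
    return sha1 == cached_sha1  # slow compare; full-text verification
```

A pair of module-level counters incremented on each branch would give
the quick-vs-slow profile counter suggested above.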
::

    bzr commit 'import 2.4.20'  7.57s user 1.36s system 20% cpu 43.568 total

export::

    finished, 3.940u/1.820s cpu, 0.000u/0.000s cum, 50.990 elapsed

also exports correctly now

.21::

    bzr commit 'import 2.4.1'  5.59s user 0.51s system 60% cpu 10.122 total

    265520  .
    137704  .bzr

import 2.4.2::

    317758  .
    183463  .bzr

With everything through to 2.4.29 imported, the .bzr directory is
1132MB, compared to 185MB for one tree.  The .bzr.log is 100MB!  So the
storage is 6.1 times larger, although we're holding 30 versions.  It's
pretty large, but I think not ridiculous.  By contrast, the tarball for
2.4.0 is 104MB, and the tarball plus uncompressed patches is 315MB.
Uncompressed, the text store is 1041MB.  So it is only three times
worse than patches, and could presumably be compressed at roughly
equal efficiency.  It is large, but it's also a very simple design and
perhaps adequate for the moment.

The text store with each file individually gzipped is 264MB, which is
also a very simple format and makes it less than twice the size of the
source tree.  This is actually rather pessimistic, because I think
there are some orphaned texts in there.  Measured by du, the
compressed full-text store is 363MB; also probably tolerable.

The real fix is perhaps to use some kind of weave, not so much for
storage efficiency as for fast annotation and therefore possible
annotation-based merge.

----

2005-03-25

Now that we have recursive add, add is much faster.  Adding all of the
linux 2.4.19 kernel tree takes only::

    finished, 5.460u/0.610s cpu, 0.010u/0.000s cum, 6.710 elapsed

However, the store code currently flushes to disk after every write,
which is probably excessive.  So a commit takes::

    finished, 8.740u/3.950s cpu, 0.010u/0.000s cum, 156.420 elapsed

Status is now also quite fast, despite that it still has to read all
the working copies::

    mbp@hope% bzr status ~/work/linux-2.4.19
    bzr status  5.51s user 0.79s system 99% cpu 6.337 total

strace shows much of this is in write(2), probably because of logging.
With more buffering on that file, and the explicit flushes removed,
that is reduced to::

    mbp@hope% time bzr status
    bzr status  5.23s user 0.42s system 97% cpu 5.780 total

which is mostly opening, stating and reading files, as it should be.
Still a few too many stat calls.

Now fixed up handling of the root directory.  Without flushing
everything to disk as it goes into the store::

    mbp@hope% bzr commit -m 'import linux 2.4.19'
    bzr commit -m 'import linux 2.4.19'  8.15s user 2.09s system 53% cpu 19.295 total

    mbp@hope% time bzr diff
    bzr diff  5.80s user 0.52s system 69% cpu 9.128 total

    mbp@hope% time bzr status
    bzr status  5.64s user 0.43s system 68% cpu 8.848 total

    patch -p1 < ../linux.pkg/patch-2.4.20  1.67s user 0.96s system 90% cpu 2.905 total

The diff changes 3462 files according to diffstat.

::

    branch format: Bazaar-NG branch, format 0.0.4
    in the working tree:
      8674 unchanged
      2463 modified
       818 added
       229 removed
         0 renamed
         0 unknown
         4 ignored
       614 versioned subdirectories

That is, 3510 entries have changed, but there are 48 changed
directories, so the count is exactly right!

::

    bzr commit -v -m 'import 2.4.20'  8.23s user 1.09s system 48% cpu 19.411 total

Kind of strange that this takes as much time as committing the whole
thing; I suppose it has to read every file.  This shows many files as
being renamed; I don't know why that would be.

Patch to 2.4.21::

    2969 files changed, 366643 insertions(+), 147759 deletions(-)

After auto-add::

    2969 files changed, 372168 insertions(+), 153284 deletions(-)

I wonder why it is not exactly the same?  Maybe because the Python
diff algorithm is a bit different from GNU diff.

----

2005-03-29

A full check, retrieving all file texts once, for the 2.4 kernel
branch takes 10m elapsed, 1m cpu time: lots of random IO and seeking.
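The logging fix amounts to giving the trace file a large buffer and
dropping the per-message flush.  The class and method names here are
hypothetical, not bzr's real trace module; the sketch just shows how
many small write(2) syscalls collapse into a few big ones.

```python
class TraceFile:
    """Sketch: buffer log output instead of flushing every message."""

    def __init__(self, path, buffer_size=64 * 1024):
        # A large buffer means write(2) only happens when it fills.
        self._f = open(path, 'a', buffering=buffer_size)

    def note(self, msg):
        self._f.write(msg + '\n')  # no flush() here

    def close(self):
        self._f.close()  # single flush when the file is closed
```

The trade-off is that an unflushed tail of the log is lost if the
process crashes, which is usually acceptable for a debug trace.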
----

::

    mbp@hope% time python =bzr deleted --show-ids
    README    README-fa1d8447b4fd0140-adbf4342752f0fc3
    python =bzr deleted --show-ids  1.55s user 0.09s system 96% cpu 1.701 total

    mbp@hope% time python -O =bzr deleted --show-ids
    README    README-fa1d8447b4fd0140-adbf4342752f0fc3
    python -O =bzr deleted --show-ids  1.47s user 0.10s system 101% cpu 1.547 total

    mbp@hope% time python -O =bzr deleted --show-ids
    README    README-fa1d8447b4fd0140-adbf4342752f0fc3
    python -O =bzr deleted --show-ids  1.49s user 0.07s system 99% cpu 1.565 total

    mbp@hope% time python =bzr deleted --show-ids
    README    README-fa1d8447b4fd0140-adbf4342752f0fc3
    python =bzr deleted --show-ids  1.55s user 0.08s system 99% cpu 1.637 total

A small but significant improvement from python -O.

----

Loading a large inventory through cElementTree is pretty quick: only
about 0.117s.  By contrast, reading the inventory into our own data
structure takes about 0.7s.  So the problem must be in converting
everything to InventoryEntries and back again every time.

Thought about that way, it seems pretty inefficient: why create all
those objects when most of them aren't used on most invocations?
Instead, perhaps the Inventory object should hold the ElementTree and
pull things out of it only as necessary?  We could even have an index
pointing into the ElementTree by id, path, etc.

As of r148::

    bzr deleted  1.46s user 0.08s system 98% cpu 1.561 total

Alternatively, maybe keep an id2path and path2id cache?  Keeping it
coherent may be hard...
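The lazy-inventory idea above might be sketched like this.  The element
and attribute names are illustrative, not bzr's actual inventory
schema: the object keeps the parsed ElementTree and builds an
id-to-element index only on first lookup, instead of eagerly
constructing an InventoryEntry for every entry on every invocation.

```python
import xml.etree.ElementTree as ET  # cElementTree in 2005

INVENTORY_XML = """\
<inventory>
  <entry file_id="README-1234" kind="file" name="README"/>
  <entry file_id="src-5678" kind="directory" name="src"/>
</inventory>"""


class LazyInventory:
    """Sketch: hold the ElementTree; index entries by id on demand."""

    def __init__(self, xml_text):
        self._root = ET.fromstring(xml_text)
        self._by_id = None  # built lazily on first lookup

    def __getitem__(self, file_id):
        if self._by_id is None:
            self._by_id = {e.get('file_id'): e
                           for e in self._root.iter('entry')}
        return self._by_id[file_id]
```

Any mutation of the tree would have to drop `_by_id` (and a path2id
cache, if added), which is exactly the coherence problem noted above.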