10
by mbp at sourcefrog
import more files from baz

For a tree holding 2.4.18 (two copies), 2.4.19, 2.4.20

With gzip -9:

mbp@hope% du .bzr
195110 .bzr/text-store
20 .bzr/revision-store
12355 .bzr/inventory-store
216325 .bzr
mbp@hope% du -s .
523128 .

Without gzip:

This is actually a pretty bad example because of deleting and
re-importing 2.4.18, but still not totally unreasonable.

----

linux-2.4.0: 116399 kB
after adding everything: 119505 kB
bzr status 2.68s user 0.13s system 84% cpu 3.330 total
bzr commit 'import 2.4.0' 4.41s user 2.15s system 11% cpu 59.490 total

242446 .
122068 .bzr


----

Performance (2005-03-01)

To add all files from linux-2.4.18: about 70s, mostly inventory
serialization/deserialization.

To commit:
- finished, 6.520u/3.870s cpu, 33.940u/10.730s cum
- 134.040 elapsed

Interesting that it spends so long on external processing! I wonder
if this is for running uuidgen? Let's try generating things
internally.

Great, this cuts it to 17.15s user 0.61s system 83% cpu 21.365 total
to add, with no external command time. The commit now seems to spend
most of its time copying to disk.

- finished, 6.550u/3.320s cpu, 35.050u/9.870s cum
- 89.650 elapsed
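
The change from running uuidgen to generating identifiers internally can be sketched roughly as below. This is only an illustration using Python's standard uuid module; the names gen_id_external and gen_id_internal are hypothetical and are not taken from bzr, and the id format bzr really uses is not reproduced here.

```python
import subprocess
import uuid


def gen_id_external():
    """Spawn the uuidgen program: one fork/exec for every id requested."""
    proc = subprocess.run(
        ["uuidgen"], capture_output=True, text=True, check=True
    )
    return proc.stdout.strip()


def gen_id_internal():
    """Build a random UUID inside the interpreter, with no child process."""
    return str(uuid.uuid4())
```

When adding tens of thousands of files, the external variant pays for one process start per file, which would be consistent with the large "cum" (child process) times seen above.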

I wonder where the external time is now? We were also using uuids()
for revisions.

Let's remove everything and re-add. Detecting everything was removed
takes
- finished, 2.460u/0.110s cpu, 0.000u/0.000s cum
- 3.430 elapsed

which may be mostly XML deserialization?

Just getting the previous revision takes about this long:

bzr invoked at Tue 2005-03-01 15:53:05.183741 EST +1100
by mbp@sourcefrog.net on hope
arguments: ['/home/mbp/bin/bzr', 'get-revision-inventory', 'mbp@sourcefrog.net-20050301044608-8513202ab179aff4-44e8cd52a41aa705']
platform: Linux-2.6.10-4-686-i686-with-debian-3.1
- finished, 3.910u/0.390s cpu, 0.000u/0.000s cum
- 6.690 elapsed

Now committing the revision which removes all files should be fast.

- finished, 1.280u/0.030s cpu, 0.000u/0.000s cum
- 1.320 elapsed

Now re-add with new code that doesn't call uuidgen:

- finished, 1.990u/0.030s cpu, 0.000u/0.000s cum
- 2.040 elapsed

16.61s user 0.55s system 74% cpu 22.965 total

Status::

  - finished, 2.500u/0.110s cpu, 0.010u/0.000s cum
  - 3.350 elapsed

And commit::

Now patch up to 2.4.19. There were some bugs in handling missing
directories, but with that fixed we do much better::

  bzr status 5.86s user 1.06s system 10% cpu 1:05.55 total

This is slow because it's diffing every file; we should use mtimes etc
to make this faster. The cpu time is reasonable.
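
One way the mtime idea might look: remember each file's size and modification time at commit, and only read the file when those differ. This is a sketch under assumptions, not bzr code; Fingerprint, take_fingerprint and probably_unchanged are made-up names, and a real version would have to worry about files changed within the timestamp granularity.

```python
import os
from collections import namedtuple

# Cheap per-file metadata recorded at commit time (hypothetical structure).
Fingerprint = namedtuple("Fingerprint", "size mtime_ns")


def take_fingerprint(path):
    """Read size and mtime with a single stat call; the file is not opened."""
    st = os.stat(path)
    return Fingerprint(st.st_size, st.st_mtime_ns)


def probably_unchanged(path, recorded):
    """True if size and mtime still match what was recorded earlier."""
    return take_fingerprint(path) == recorded
```

Only files for which probably_unchanged returns False would then need a full diff.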

I see difflib is pure Python; it might be faster to shell out to GNU
diff when we need it.
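
Shelling out to diff could look something like the following. The function name unified_diff_text is hypothetical; the sketch assumes a diff program that accepts -u is on the PATH, and falls back to difflib otherwise. Whether this is actually faster has not been measured here.

```python
import difflib
import shutil
import subprocess


def unified_diff_text(old_path, new_path):
    """Return a unified diff of two text files ('' if they are identical)."""
    if shutil.which("diff"):
        # diff exits 0 for identical files, 1 for differences, >1 on trouble.
        proc = subprocess.run(
            ["diff", "-u", old_path, new_path],
            capture_output=True, text=True,
        )
        if proc.returncode > 1:
            raise RuntimeError(proc.stderr)
        return proc.stdout
    # Fallback: pure-Python difflib when no diff program is installed.
    with open(old_path) as f:
        old_lines = f.readlines()
    with open(new_path) as f:
        new_lines = f.readlines()
    return "".join(
        difflib.unified_diff(old_lines, new_lines, old_path, new_path)
    )
```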

Export is very fast::

  - finished, 4.220u/1.480s cpu, 0.010u/0.000s cum
  - 10.810 elapsed

  bzr export 1 ../linux-2.4.18.export1 3.92s user 1.72s system 21% cpu 26.030 total


Now to find and add the new changes::

  - finished, 2.190u/0.030s cpu, 0.000u/0.000s cum
  - 2.300 elapsed


::

  bzr commit 'import 2.4.19' 9.36s user 1.91s system 23% cpu 47.127 total

And the result is exactly right. Try exporting::

  mbp@hope% bzr export 4 ../linux-2.4.19.export4
  bzr export 4 ../linux-2.4.19.export4 4.21s user 1.70s system 18% cpu 32.304 total

and the export is exactly the same as the tarball.

Now we can optimize the diff a bit more by not comparing files that
have the right SHA-1 from within the commit.
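
A minimal sketch of that SHA-1 shortcut, assuming the hash recorded at commit time is available as a hex string. The names sha1_of_file and needs_diff are illustrative only.

```python
import hashlib


def sha1_of_file(path, chunk_size=1 << 16):
    """Hash a file in chunks so large files are not read into memory at once."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def needs_diff(path, recorded_sha1):
    """False when the working file hashes to the value stored at commit."""
    return sha1_of_file(path) != recorded_sha1
```

Hashing still reads every byte, so this saves the diff itself rather than the disk I/O.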

For comparison::

  patch -p1 < ../kernel.pkg/patch-2.4.20 1.61s user 1.03s system 13% cpu 19.106 total


Now status after applying the .20 patch. With full-text verification::

  bzr status 7.07s user 1.32s system 13% cpu 1:04.29 total

with that turned off::

  bzr status 5.86s user 0.56s system 25% cpu 25.577 total

After adding::

  bzr status 6.14s user 0.61s system 25% cpu 26.583 total

Should add some kind of profile counter for quick compares vs slow
compares.
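
Such a counter could be as small as the sketch below. It assumes a quick compare means "metadata matched, file treated as unchanged" and a slow compare means falling back to a full content check; compare_stats and files_differ are hypothetical names.

```python
from collections import Counter

# Tally of which comparison path was taken (hypothetical global).
compare_stats = Counter()


def files_differ(old_meta, new_meta, slow_compare):
    """Compare cheaply by metadata first and count which path was used.

    slow_compare is a zero-argument callable that returns True when the
    file contents really differ.
    """
    if old_meta == new_meta:
        compare_stats["quick"] += 1
        return False
    compare_stats["slow"] += 1
    return slow_compare()
```

Printing compare_stats at the end of a status run would show how often the cheap path was enough.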

bzr commit 'import 2.4.20' 7.57s user 1.36s system 20% cpu 43.568 total

export: finished, 3.940u/1.820s cpu, 0.000u/0.000s cum, 50.990 elapsed

also exports correctly

now .21

bzr commit 'import 2.4.1' 5.59s user 0.51s system 60% cpu 10.122 total

265520 .
137704 .bzr

import 2.4.2
317758 .
183463 .bzr


With everything through to 2.4.29 imported, the .bzr directory is
1132MB, compared to 185MB for one tree. The .bzr.log is 100MB! So
the storage is 6.1 times larger, although we're holding 30 versions.
It's pretty large but I think not ridiculous. By contrast the tarball
for 2.4.0 is 104MB, and the tarball plus uncompressed patches are
315MB.

Uncompressed, the text store is 1041MB. So it is only three times
worse than patches, and could be compressed at presumably roughly
equal efficiency. It is large, but also a very simple design and
perhaps adequate for the moment. The text store with each file
individually gzipped is 264MB, which is also a very simple format and
makes it less than twice the size of the source tree.
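
The ratios quoted above can be checked with a few lines of arithmetic. The variable names are just labels for the figures in the text; nothing new is measured.

```python
# Sizes in MB, as quoted in the notes above.
bzr_dir = 1132
one_tree = 185
tarball_plus_patches = 315
text_store_raw = 1041
text_store_gzip = 264

ratio_to_tree = bzr_dir / one_tree                        # .bzr vs one tree
ratio_to_patches = text_store_raw / tarball_plus_patches  # raw store vs patches
ratio_gzip_to_tree = text_store_gzip / one_tree           # gzip store vs one tree

print(f"{ratio_to_tree:.1f} {ratio_to_patches:.1f} {ratio_gzip_to_tree:.1f}")
# prints: 6.1 3.3 1.4
```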

This is actually rather pessimistic because I think there are some
orphaned texts in there.

Measured by du, the compressed full-text store is 363MB; also probably
tolerable.
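
For illustration, a store that gzips each text individually might look like this. It is a sketch only: keying files by their SHA-1 and the put_text/get_text names are assumptions made for the example, not a description of the actual .bzr/text-store layout.

```python
import gzip
import hashlib
import os


def put_text(store_dir, data):
    """Store bytes gzip-compressed under their SHA-1 and return the key."""
    key = hashlib.sha1(data).hexdigest()
    path = os.path.join(store_dir, key + ".gz")
    if not os.path.exists(path):
        # Identical texts share a key, so each is written only once.
        with gzip.open(path, "wb", compresslevel=9) as f:
            f.write(data)
    return key


def get_text(store_dir, key):
    """Read a stored text back, decompressing it."""
    with gzip.open(os.path.join(store_dir, key + ".gz"), "rb") as f:
        return f.read()
```

Each file stays independently readable with plain gzip tools, which keeps the format simple at the cost of storing every version in full.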

The real fix is perhaps to use some kind of weave, not so much for
storage efficiency as for fast annotation and therefore possible
annotation-based merge.