For a tree holding 2.4.18 (two copies), 2.4.19, 2.4.20

With gzip -9:

mbp@hope% du .bzr
195110  .bzr/text-store
20      .bzr/revision-store
12355   .bzr/inventory-store
216325  .bzr
mbp@hope% du -s .
523128  .

Without gzip:

This is actually a pretty bad example because of deleting and
re-importing 2.4.18, but still not totally unreasonable.

----

linux-2.4.0: 116399 kB
after adding everything: 119505 kB
bzr status  2.68s user 0.13s system 84% cpu 3.330 total
bzr commit 'import 2.4.0'  4.41s user 2.15s system 11% cpu 59.490 total

242446  .
122068  .bzr

----

Performance (2005-03-01)

To add all files from linux-2.4.18: about 70s, mostly inventory
serialization/deserialization.

To commit:

- finished, 6.520u/3.870s cpu, 33.940u/10.730s cum
- 134.040 elapsed

Interesting that it spends so long on external processing! I wonder
if this is for running uuidgen? Let's try generating things
internally.
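
Roughly this (a sketch only, with a made-up function name and id
format, not bzr's actual scheme; the point is just that hostname plus
time plus randomness needs no fork/exec)::

    import random
    import socket
    import time

    def gen_id(prefix):
        # Build a unique id in-process instead of forking uuidgen
        # for every file: hostname + timestamp + 64 random bits is
        # unique enough for file and revision ids.
        return '%s-%s-%d-%016x' % (prefix, socket.gethostname(),
                                   int(time.time()),
                                   random.getrandbits(64))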

Great, this cuts it to 17.15s user 0.61s system 83% cpu 21.365 total
to add, with no external command time. The commit now seems to spend
most of its time copying to disk.

- finished, 6.550u/3.320s cpu, 35.050u/9.870s cum
- 89.650 elapsed

I wonder where the external time is now? We were also using uuids()
for revisions.

Let's remove everything and re-add. Detecting that everything was
removed takes

- finished, 2.460u/0.110s cpu, 0.000u/0.000s cum
- 3.430 elapsed

which may be mostly XML deserialization?

Just getting the previous revision takes about this long:

bzr invoked at Tue 2005-03-01 15:53:05.183741 EST +1100
by mbp@sourcefrog.net on hope
arguments: ['/home/mbp/bin/bzr', 'get-revision-inventory', 'mbp@sourcefrog.net-20050301044608-8513202ab179aff4-44e8cd52a41aa705']
platform: Linux-2.6.10-4-686-i686-with-debian-3.1
- finished, 3.910u/0.390s cpu, 0.000u/0.000s cum
- 6.690 elapsed

Now committing the revision which removes all files should be fast.

- finished, 1.280u/0.030s cpu, 0.000u/0.000s cum
- 1.320 elapsed

Now re-add with new code that doesn't call uuidgen:

- finished, 1.990u/0.030s cpu, 0.000u/0.000s cum
- 2.040 elapsed

16.61s user 0.55s system 74% cpu 22.965 total

Status::

- finished, 2.500u/0.110s cpu, 0.010u/0.000s cum
- 3.350 elapsed

And commit::

Now patch up to 2.4.19. There were some bugs in handling missing
directories, but with those fixed we do much better::

bzr status  5.86s user 1.06s system 10% cpu 1:05.55 total

This is slow because it's diffing every file; we should use mtimes
etc. to make this faster. The cpu time is reasonable.
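
The mtime idea would look something like this (a sketch only; it
assumes we cache each file's size and mtime at commit time, which we
don't do yet)::

    import os

    def maybe_changed(path, cached_size, cached_mtime):
        # Quick compare: if size and mtime both match what was
        # recorded at the last commit, assume the text is unchanged
        # and skip the expensive read-and-compare.
        st = os.stat(path)
        return (st.st_size != cached_size
                or st.st_mtime != cached_mtime)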

I see difflib is pure Python; it might be faster to shell out to GNU
diff when we need it.
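
Something like this, perhaps (a sketch; GNU diff exits 0 for
identical files, 1 for different, and 2 or more on trouble)::

    import subprocess

    def external_diff(old_path, new_path):
        # Run GNU diff rather than diffing in pure Python.
        proc = subprocess.Popen(['diff', '-u', old_path, new_path],
                                stdout=subprocess.PIPE)
        out, _ = proc.communicate()
        if proc.returncode not in (0, 1):
            raise RuntimeError('diff failed on %s' % new_path)
        return out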

Export is very fast::

- finished, 4.220u/1.480s cpu, 0.010u/0.000s cum
- 10.810 elapsed

bzr export 1 ../linux-2.4.18.export1  3.92s user 1.72s system 21% cpu 26.030 total

Now to find and add the new changes::

- finished, 2.190u/0.030s cpu, 0.000u/0.000s cum
- 2.300 elapsed

::

bzr commit 'import 2.4.19'  9.36s user 1.91s system 23% cpu 47.127 total

And the result is exactly right. Try exporting::

mbp@hope% bzr export 4 ../linux-2.4.19.export4
bzr export 4 ../linux-2.4.19.export4  4.21s user 1.70s system 18% cpu 32.304 total

and the export is exactly the same as the tarball.

Now we can optimize the diff a bit more by not comparing files whose
SHA-1 matches the one recorded at commit time.
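
That check is just this (a sketch; hashlib stands in here for
whichever SHA-1 module is actually used)::

    import hashlib

    def changed_since_commit(path, committed_sha1):
        # Hash the working file and compare against the SHA-1
        # recorded at commit time; only fall back to a full compare
        # when the digests differ.
        h = hashlib.sha1()
        f = open(path, 'rb')
        try:
            while True:
                block = f.read(65536)
                if not block:
                    break
                h.update(block)
        finally:
            f.close()
        return h.hexdigest() != committed_sha1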

For comparison::

patch -p1 < ../kernel.pkg/patch-2.4.20  1.61s user 1.03s system 13% cpu 19.106 total

Now status after applying the .20 patch. With full-text verification::

bzr status  7.07s user 1.32s system 13% cpu 1:04.29 total

with that turned off::

bzr status  5.86s user 0.56s system 25% cpu 25.577 total

After adding::

bzr status  6.14s user 0.61s system 25% cpu 26.583 total

Should add some kind of profile counter for quick compares vs slow
compares.
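
Even something trivial would do (hypothetical counters, reported at
the end of the command)::

    compare_counts = {'quick': 0, 'slow': 0}

    def count_compare(was_quick):
        # Bump one bucket from the fast (stat/SHA) path and the
        # other from the slow (full text) path, so we can see how
        # often the fast path actually wins.
        if was_quick:
            compare_counts['quick'] += 1
        else:
            compare_counts['slow'] += 1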

bzr commit 'import 2.4.20'  7.57s user 1.36s system 20% cpu 43.568 total

export: finished, 3.940u/1.820s cpu, 0.000u/0.000s cum, 50.990 elapsed

also exports correctly

now .21

bzr commit 'import 2.4.21'  5.59s user 0.51s system 60% cpu 10.122 total

265520  .
137704  .bzr

import 2.4.22
317758  .
183463  .bzr

With everything through to 2.4.29 imported, the .bzr directory is
1132MB, compared to 185MB for one tree. The .bzr.log is 100MB! So
the storage is 6.1 times larger, although we're holding 30 versions.
It's pretty large but I think not ridiculous. By contrast the tarball
for 2.4.0 is 104MB, and the tarball plus uncompressed patches are
315MB.

Uncompressed, the text store is 1041MB. So it is only three times
worse than patches, and could presumably be compressed at roughly
equal efficiency. It is large, but also a very simple design and
perhaps adequate for the moment. The text store with each file
individually gzipped is 264MB, which is also a very simple format and
makes it less than twice the size of the source tree.

This is actually rather pessimistic, because I think there are some
orphaned texts in there.

Measured by du, the compressed full-text store is 363MB; also probably
tolerable.

The real fix is perhaps to use some kind of weave, not so much for
storage efficiency as for fast annotation and therefore possible
annotation-based merge.

----

2005-03-25

Now that we have recursive add, add is much faster. Adding all of the
linux 2.4.19 kernel tree takes only

finished, 5.460u/0.610s cpu, 0.010u/0.000s cum, 6.710 elapsed

However, the store code currently flushes to disk after every write,
which is probably excessive. So a commit takes

finished, 8.740u/3.950s cpu, 0.010u/0.000s cum, 156.420 elapsed
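
The write path is effectively this (a sketch with made-up names, not
the real store code)::

    import os

    def store_text(path, data, sync_every_write):
        # With sync_every_write on, every stored text costs a flush
        # plus fsync, i.e. a disk round-trip per file. Letting the
        # OS buffer the writes and syncing once per commit would
        # amortize that cost.
        f = open(path, 'wb')
        try:
            f.write(data)
            if sync_every_write:
                f.flush()
                os.fsync(f.fileno())
        finally:
            f.close()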

Status is now also quite fast, despite the fact that it still has to
read all the working copies:

mbp@hope% bzr status ~/work/linux-2.4.19
bzr status  5.51s user 0.79s system 99% cpu 6.337 total

strace shows much of this is in write(2), probably because of
logging. With more buffering on that file, and all the explicit
flushes removed, that is reduced to

mbp@hope% time bzr status
bzr status  5.23s user 0.42s system 97% cpu 5.780 total

which is mostly opening, stat'ing and reading files, as it should be.
Still a few too many stat calls.

Now fixed up handling of the root directory.

Without flushing everything to disk as it goes into the store:

mbp@hope% bzr commit -m 'import linux 2.4.19'
bzr commit -m 'import linux 2.4.19'  8.15s user 2.09s system 53% cpu 19.295 total

mbp@hope% time bzr diff
bzr diff  5.80s user 0.52s system 69% cpu 9.128 total
mbp@hope% time bzr status
bzr status  5.64s user 0.43s system 68% cpu 8.848 total

patch -p1 < ../linux.pkg/patch-2.4.20  1.67s user 0.96s system 90% cpu 2.905 total

The diff changes 3462 files according to diffstat.

branch format: Bazaar-NG branch, format 0.0.4

in the working tree:
  8674 unchanged
  2463 modified
   818 added
   229 removed
     0 renamed
     0 unknown
     4 ignored
   614 versioned subdirectories

That is, 3510 entries have changed (2463 modified + 818 added + 229
removed); 48 of those are directories, which diffstat does not count,
so the count is exactly right!

bzr commit -v -m 'import 2.4.20'  8.23s user 1.09s system 48% cpu 19.411 total

Kind of strange that this takes as much time as committing the whole
thing; I suppose it has to read every file.

This shows many files as being renamed; I don't know why that would
be.

Patch to 2.4.21:

2969 files changed, 366643 insertions(+), 147759 deletions(-)

After auto-add:

2969 files changed, 372168 insertions(+), 153284 deletions(-)

I wonder why it is not exactly the same? Maybe because the Python
diff algorithm is a bit different from GNU diff.

----

2005-03-29

A full check, retrieving all file texts once for the 2.4 kernel
branch, takes 10m elapsed, 1m cpu time: lots of random IO and seeking.

----

mbp@hope% time python =bzr deleted --show-ids
README README-fa1d8447b4fd0140-adbf4342752f0fc3
python =bzr deleted --show-ids  1.55s user 0.09s system 96% cpu 1.701 total
mbp@hope% time python -O =bzr deleted --show-ids
README README-fa1d8447b4fd0140-adbf4342752f0fc3
python -O =bzr deleted --show-ids  1.47s user 0.10s system 101% cpu 1.547 total
mbp@hope% time python -O =bzr deleted --show-ids
README README-fa1d8447b4fd0140-adbf4342752f0fc3
python -O =bzr deleted --show-ids  1.49s user 0.07s system 99% cpu 1.565 total
mbp@hope% time python =bzr deleted --show-ids
README README-fa1d8447b4fd0140-adbf4342752f0fc3
python =bzr deleted --show-ids  1.55s user 0.08s system 99% cpu 1.637 total

small but significant improvement from Python -O

----

Loading a large inventory through cElementTree is pretty quick: only
about 0.117s. By contrast, reading the inventory into our data
structure takes about 0.7s.
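
A measurement along these lines (a sketch: the inventory location is
assumed, and the bundled ElementTree spelling is used here, whereas
cElementTree was a separate C extension module at the time)::

    import time
    from xml.etree import ElementTree

    start = time.time()
    tree = ElementTree.parse('.bzr/inventory')   # assumed path
    print('parse: %.3fs' % (time.time() - start))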

So I think the problem must be in converting everything to
InventoryEntries and back again every time.

Thought about that way it seems pretty inefficient: why create all
those objects when most of them aren't used on most invocations?
Instead perhaps the Inventory object should hold the ElementTree and
pull things out of it only as necessary? We can even have an index
pointing into the ElementTree by id, path, etc.

As of r148::

bzr deleted  1.46s user 0.08s system 98% cpu 1.561 total

Alternatively, maybe keep an id2path and path2id cache? Keeping it
coherent may be hard...
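
Both ideas together might look like this (a sketch only; the 'entry'
element and 'file_id' attribute names are assumptions, not
necessarily the real 0.0.4 inventory schema)::

    from xml.etree import ElementTree

    class LazyInventory(object):
        """Hold the parsed ElementTree and index into it lazily,
        instead of building an InventoryEntry per element up front."""

        def __init__(self, path):
            self._root = ElementTree.parse(path).getroot()
            self._by_id = None    # file_id -> element, built on demand

        def _index(self):
            if self._by_id is None:
                self._by_id = dict((e.get('file_id'), e)
                                   for e in self._root.iter('entry'))
            return self._by_id

        def __getitem__(self, file_id):
            return self._index()[file_id]

The same discipline would cover the id2path/path2id idea: any such
cache has to be dropped or updated on every add, remove or rename.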