Garbage Collection
==================
Garbage collection removes data that is no longer referenced from a repository.
Generally this involves locking the repository, scanning all its branches, and
then generating a new repository with less data.

Least work we can hope to perform
---------------------------------
* Read all branches to get the initial references: tips and tags.
* Read through the revision graph to find unreferenced revisions. A cheap HEADS
  list might help here by allowing comparison of the initial references to the
  HEADS: any unreferenced head is garbage.
* Walk out via inventory deltas to get the full set of texts and signatures to
  preserve.
* Copy the preserved data to a new repository.
* Swap the new repository into the original's place (a bait and switch).
* Remove the old repository.

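
The marking steps above can be sketched roughly as follows. This is only an
illustration under stated assumptions, not the repository API: the revision
graph is modelled as a plain dict mapping each revision id to its parent ids,
and ``find_live_revisions``, ``find_garbage_heads`` and ``collect_texts`` are
hypothetical names invented for this sketch.

.. code-block:: python

    from collections import deque


    def find_live_revisions(parents, initial_refs):
        """Return every revision reachable from the branch tips and tags.

        `parents` is an assumed mapping of revision id -> iterable of parent ids.
        """
        live = set()
        pending = deque(initial_refs)
        while pending:
            rev = pending.popleft()
            if rev in live:
                continue
            live.add(rev)
            pending.extend(parents.get(rev, ()))
        return live


    def find_garbage_heads(heads, initial_refs):
        """Heads that no branch tip or tag points at are garbage."""
        return set(heads) - set(initial_refs)


    def collect_texts(live, texts_introduced):
        """Gather the text keys to preserve for the live revisions.

        `texts_introduced` is an assumed mapping of revision id -> text keys
        named by that revision's inventory delta.
        """
        keep = set()
        for rev in live:
            keep.update(texts_introduced.get(rev, ()))
        return keep

If ``find_garbage_heads`` returns an empty set, every head is referenced and
the full graph walk can be skipped, since every revision is then an ancestor
of some referenced head.
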
One possibility for reducing this work would be to keep a set of grouped 'known
garbage free' data ('ancient history') which can be preserved in total should
its HEADS be fully referenced, and where the HEADS list is deliberately cheap
to read (e.g. at the top of some index).

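
A minimal sketch of that check, assuming each group exposes its HEADS as a
small set and that the set of live revisions has already been computed. The
name ``partition_groups`` and the dict-of-sets layout are hypothetical and
exist only for this illustration.

.. code-block:: python

    def partition_groups(groups, live_revisions):
        """Split groups into those kept whole and those needing a full scan.

        `groups` is an assumed mapping of group name -> set of head revision
        ids. A group whose heads are all live contains no garbage, because
        everything in it is an ancestor of one of those heads.
        """
        keep_whole = []
        needs_scan = []
        for name, heads in groups.items():
            if all(head in live_revisions for head in heads):
                keep_whole.append(name)
            else:
                needs_scan.append(name)
        return keep_whole, needs_scan

Only the groups in ``needs_scan`` would then need the revision-by-revision
walk; the others could be copied across unchanged.
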
Another possibility is to null out garbage data in place, without reducing the
repository's size.