Garbage collection removes data that is no longer referenced from a repository.

Generally this involves locking the repository, scanning all of its branches,
and then generating a new repository that contains less data.

Least work we can hope to perform
---------------------------------

* Read all branches to get initial references - tips + tags.
* Read through the revision graph to find unreferenced revisions. A cheap HEADS
  list might help here by allowing comparison of the initial references to the
  HEADS - any unreferenced head is garbage.
* Walk out via inventory deltas to get the full set of texts and signatures to preserve.
* Copy the data to preserve to a new repository.
* Bait and switch: swap the new repository into the original's location.
* Remove the old repository.
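
The first steps above can be sketched in Python. This is only an illustration under simplifying assumptions, not the actual implementation: the repository is modelled as plain dictionaries, and the names ``live_revisions``, ``unreferenced_heads``, ``collect``, ``parents`` and ``texts_of`` are hypothetical. The ``texts_of`` mapping stands in for the walk over inventory deltas, and signatures are left out.

```python
def live_revisions(references, parents):
    """Return every revision reachable from the initial references.

    references: revision ids gathered from all branches (tips and tags).
    parents: dict mapping each revision id in the repository to its parent ids.
    """
    live = set()
    pending = list(references)
    while pending:
        rev = pending.pop()
        if rev in live or rev not in parents:
            # Already visited, or not present in this repository.
            continue
        live.add(rev)
        pending.extend(parents[rev])
    return live


def unreferenced_heads(references, parents):
    """Return heads (revisions with no children) that no branch refers to."""
    with_children = {p for ps in parents.values() for p in ps}
    heads = set(parents) - with_children
    return heads - set(references)


def collect(references, parents, texts_of):
    """Return the revisions and texts that a new repository would need."""
    live = live_revisions(references, parents)
    kept_parents = {rev: parents[rev] for rev in live}
    kept_texts = set()
    for rev in live:
        kept_texts.update(texts_of.get(rev, ()))
    return kept_parents, kept_texts


# Toy repository: c -> b -> a is on a branch, x -> a was abandoned.
parents = {"a": [], "b": ["a"], "c": ["b"], "x": ["a"]}
texts_of = {"a": {"t1"}, "b": {"t2"}, "c": {"t2", "t3"}, "x": {"t4"}}
references = ["c"]

kept_parents, kept_texts = collect(references, parents, texts_of)
garbage_heads = unreferenced_heads(references, parents)
```

In this toy model, an empty result from ``unreferenced_heads`` means there is nothing to collect, because every revision is an ancestor of some head. That is the sense in which a cheap HEADS list could let the full graph walk be skipped.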

A possibility to reduce this would be to have a set of grouped 'known garbage
free' data - 'ancient history' which can be preserved in total should its HEADS
be fully referenced - and where the HEADS list is deliberately cheap (e.g. at the

possibly - null data in place without saving size.
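
One way to picture the grouped 'known garbage free' idea is the small sketch below. It rests on assumptions that go beyond the text: each group is represented only by its HEADS, "fully referenced" is read as "every head is in a known set of referenced revisions", and the names ``split_groups``, ``group_heads`` and ``referenced`` are hypothetical.

```python
def split_groups(group_heads, referenced):
    """Separate groups that can be kept whole from those needing a scan.

    group_heads: dict mapping a group name to the set of its HEADS.
    referenced: set of revision ids known to be referenced.
    """
    keep_whole, need_scan = [], []
    for name, heads in group_heads.items():
        if heads <= referenced:
            # Every head is referenced, so the group is preserved in total.
            keep_whole.append(name)
        else:
            need_scan.append(name)
    return keep_whole, need_scan


group_heads = {"ancient": {"r100"}, "recent": {"r250", "r260"}}
referenced = {"r100", "r250"}
keep_whole, need_scan = split_groups(group_heads, referenced)
```

Only the groups in ``need_scan`` would then require the full revision graph walk; the cost of the check itself depends on how cheaply each group's HEADS can be read.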