Garbage Collection
==================

Garbage collection removes data that is no longer referenced from a repository.

Generally this involves locking the repository, scanning all its branches, and
then generating a new repository with less data.

Least work we can hope to perform
---------------------------------

* Read all branches to get initial references - tips + tags.
* Read through the revision graph to find unreferenced revisions. A cheap HEADS
  list might help here by allowing comparison of the initial references to the
  HEADS - any unreferenced head is garbage.
* Walk out via inventory deltas to get the full set of texts and signatures to preserve.
* Copy the preserved data to a new repository.
* Bait and switch the new repository into the place of the original.
* Remove the old repository.
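
The first two steps above can be sketched as a reachability walk over the
revision graph. This is a minimal illustration, not bzrlib code: the
``parents`` mapping, the ``references`` list and all function names are
hypothetical stand-ins for whatever the repository and branch APIs provide.

```python
from collections import deque


def reachable_revisions(parents, references):
    """Return every revision reachable from the initial references.

    parents: hypothetical dict mapping revision id -> tuple of parent ids.
    references: revision ids taken from branch tips and tags.
    """
    live = set()
    pending = deque(r for r in references if r in parents)
    while pending:
        rev = pending.popleft()
        if rev in live:
            continue
        live.add(rev)
        # Queue parents that are present in this repository and not yet seen.
        pending.extend(p for p in parents[rev] if p in parents and p not in live)
    return live


def heads(parents):
    """Return revisions that are not a parent of any other revision."""
    used_as_parent = {p for ps in parents.values() for p in ps}
    return set(parents) - used_as_parent


def garbage_heads(parents, references):
    """Return heads that no branch tip or tag points at."""
    return heads(parents) - set(references)


# A <- B <- C is referenced by a branch tip; D and E are not referenced.
parents = {"A": (), "B": ("A",), "C": ("B",), "D": ("B",), "E": ()}
references = ["C"]
live = reachable_revisions(parents, references)
garbage = set(parents) - live
```

If ``garbage_heads`` returns an empty set, no revision can be unreferenced and
the full graph walk can be skipped, which is why a cheap HEADS list is
attractive.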

One way to reduce this work would be to keep a set of grouped 'known garbage
free' data - 'ancient history' - which can be preserved in its entirety if its
HEADS are fully referenced, and whose HEADS list is deliberately cheap to read
(e.g. stored at the top of some index).
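
As a rough sketch of that idea, assuming each group exposes its HEADS list
cheaply: a group can be kept whole when every one of its heads is live, and
only the remaining groups need a full scan. The ``groups`` mapping and the
function name are hypothetical and only illustrate the check.

```python
def split_groups(groups, live_revisions):
    """Partition groups into those kept whole and those needing a scan.

    groups: hypothetical dict mapping group name -> list of head revision ids.
    live_revisions: set of revision ids known to be referenced.
    """
    keep_whole, must_scan = [], []
    for name, group_heads in groups.items():
        # Every head is referenced, so everything beneath it is referenced too.
        if set(group_heads) <= live_revisions:
            keep_whole.append(name)
        else:
            must_scan.append(name)
    return keep_whole, must_scan


groups = {"ancient": ["B"], "recent": ["C", "D"]}
keep_whole, must_scan = split_groups(groups, {"A", "B", "C"})
```

Here the ``ancient`` group is copied without inspection, while ``recent`` has
an unreferenced head and must be examined revision by revision.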

Possibly: null out data in place, without reducing the repository's size.