2485.4.2 by Robert Collins
Add gc analysis

Garbage Collection
==================

Garbage collection is used to remove data that is no longer referenced from a
repository.

Generally this involves locking the repository and scanning all its branches,
then generating a new repository with less data.

Least work we can hope to perform
---------------------------------
* Read all branches to get the initial references: tips and tags.
* Read through the revision graph to find unreferenced revisions. A cheap HEADS
  list might help here by allowing comparison of the initial references to the
  HEADS: any unreferenced head is garbage.
* Walk out via inventory deltas to get the full set of texts and signatures to
  preserve.
* Copy to a new repository.
* Bait and switch back to the original.
* Remove the old repository.

One way to reduce this work would be to keep a set of grouped 'known garbage
free' data ('ancient history') which can be preserved in total should its HEADS
be fully referenced, and where the HEADS list is deliberately cheap to read
(e.g. at the top of some index).

Another possibility is to null out the data in place, without reducing the
repository size.
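The first three steps of the list above (collect references, find unreferenced
revisions, work out what to preserve) can be illustrated with a minimal sketch.
It does not use any bzrlib API: the revision graph is assumed to be a plain
dict mapping each revision id to its parent ids, ``texts`` is a hypothetical
stand-in for inventory deltas (revision id to the text keys that revision
introduced), and signatures are assumed to be keyed by revision id so that the
kept revision set covers them. All function names are illustrative only.

```python
def find_heads(parents):
    """Return the revisions that no other revision names as a parent."""
    named_as_parent = set()
    for parent_ids in parents.values():
        named_as_parent.update(parent_ids)
    return set(parents) - named_as_parent


def reachable(parents, roots):
    """Return every revision reachable from roots by following parents."""
    seen = set()
    pending = list(roots)
    while pending:
        rev = pending.pop()
        if rev in seen or rev not in parents:
            continue
        seen.add(rev)
        pending.extend(parents[rev])
    return seen


def plan_gc(parents, texts, references):
    """Return (revisions to keep, texts to keep, garbage revisions).

    parents    -- dict: revision id -> list of parent revision ids
    texts      -- dict: revision id -> text keys introduced by it
                  (hypothetical stand-in for inventory deltas)
    references -- branch tips and tags
    """
    references = set(references)
    if find_heads(parents) <= references:
        # Cheap path: every head is referenced, so every revision is an
        # ancestor of a reference and nothing is garbage.
        keep_revs = set(parents)
    else:
        # At least one head is unreferenced: walk the graph from the
        # references to find what is still in use.
        keep_revs = reachable(parents, references)
    keep_texts = set()
    for rev in keep_revs:
        keep_texts.update(texts.get(rev, ()))
    return keep_revs, keep_texts, set(parents) - keep_revs


# Example: "x" is a head that no branch or tag refers to.
parents = {"a": [], "b": ["a"], "c": ["b"], "x": ["a"]}
texts = {"a": ["f1-a"], "b": ["f1-b"], "c": ["f2-c"], "x": ["f3-x"]}
keep_revs, keep_texts, garbage = plan_gc(parents, texts, ["c"])
```

In this example only ``x`` is reported as garbage. Its parent ``a`` is kept
because it is still reachable from ``c``. The HEADS comparison is what allows
the graph walk to be skipped entirely when nothing is garbage, which is the
same property the 'known garbage free' grouping would rely on.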