~bzr-pqm/bzr/bzr.dev

« back to all changes in this revision

Viewing changes to doc/developers/btree_index_request_expansion.rst

Committer: John Arbash Meinel
Date: 2008-10-04 16:11:37 UTC
mto: This revision was merged to the branch mainline in revision 3805.
Revision ID: john@arbash-meinel.com-20081004161137-66rj5cqt4r2dxw32

A bit more info

files modified:
doc/developers/btree_index_request_expansion.rst

Show diffs side-by-side

added added

removed removed

doc/developers/btree_index_request_expansion.rst

213

be too big to read all at once. However, until we've read the root, we don't

214

know the layout, so all we have to go on is the size of the index, though

215

that also gives us the explicit total number of pages.

216

So it doesn't help to read the root page and then decide. However, on the

217

flip side, if we read *before* the split, then we don't gain much, as we are

218

reading pages we aren't likely to be interested in.

219

220

For example:

221

222

We have 100 keys, which fits onto 100 pages, with a single root node. At

223

1,100 keys, it would be 101 leaf pages, which would then cause us to need 2

224

index pages, triggering an extra layer. However, this is very sensitive to

225

the number of keys we fit per-page, which depends on the compression.

226

Although, we could consider 2,000 keys. Which would be 200 leaf nodes, and

227

2 intermediate nodes, and a single root node. It is unlikely that we would

228

ever be able to fit 200 references into a single root node.

229

230

So if we pretend that we split at 1 page, 100 pages, and 10,000 pages. We

231

might be able to say, at 1-5 pages, read all pages, for 5-100 pages, read

232

only the root. At 100 - 500 pages, read 1-5 pages, for 500-10,000 read only

233

the root. At 10,000-50,000 read 1-5 pages again, but above 50,000 read only

234

the root. We could bias this a bit smaller, say at powers of 80, instead of

235

powers of 100, etc. The basic idea is that if we are *close* to a layer

236

split, go ahead and read a small number of extra pages.

216

237

217

238

218

239

Suggested Algorithm

229

250

when a layer has "recently split", so we may want to revisit this.

230

251

231

252

3. Part (2) hints that when reading the root node, we never request any other

232

nodes, as the root is always a layer-unto-itself.

253

nodes, as the root is always a layer-unto-itself. The only exception is when

254

all pages could fit in a single width request.

255

256

4. When expanding, add nodes that are "next" to the node in question, which

257

have not been read yet.

Older »