213
213
be too big to read all at once. However, until we've read the root, we don't
214
214
know the layout, so all we have to go on is the size of the index, though
215
215
that also gives us the explicit total number of pages.
216
So it doesn't help to read the root page and then decide. However, on the
217
flip side, if we read *before* the split, then we don't gain much, as we are
218
reading pages we aren't likely to be interested in.
222
We have 100 keys, which fits onto 100 pages, with a single root node. At
223
1,100 keys, it would be 101 leaf pages, which would then cause us to need 2
224
index pages, triggering an extra layer. However, this is very sensitive to
225
the number of keys we fit per-page, which depends on the compression.
226
Although, we could consider 2,000 keys. Which would be 200 leaf nodes, and
227
2 intermediate nodes, and a single root node. It is unlikely that we would
228
ever be able to fit 200 references into a single root node.
230
So if we pretend that we split at 1 page, 100 pages, and 10,000 pages. We
231
might be able to say, at 1-5 pages, read all pages, for 5-100 pages, read
232
only the root. At 100 - 500 pages, read 1-5 pages, for 500-10,000 read only
233
the root. At 10,000-50,000 read 1-5 pages again, but above 50,000 read only
234
the root. We could bias this a bit smaller, say at powers of 80, instead of
235
powers of 100, etc. The basic idea is that if we are *close* to a layer
236
split, go ahead and read a small number of extra pages.
218
239
Suggested Algorithm
229
250
when a layer has "recently split", so we may want to revisit this.
231
252
3. Part (2) hints that when reading the root node, we never request any other
232
nodes, as the root is always a layer-unto-itself.
253
nodes, as the root is always a layer-unto-itself. The only exception is when
254
all pages could fit in a single width request.
256
4. When expanding, add nodes that are "next" to the node in question, which
257
have not been read yet.