~bzr-pqm/bzr/bzr.dev

Committer: John Arbash Meinel
Date: 2009-03-18 22:45:24 UTC
mto: (3735.2.157 brisbane-core)
mto: This revision was merged to the branch mainline in revision 4280.
Revision ID: john@arbash-meinel.com-20090318224524-ve32it3ddqfzvi6q

Reverted back to the same hash width, and bumped EXTRA_NULLS to 3.
Most entries in a hash bucket are genuinely random, so they don't trigger
extra comparisons. So walking 4-7 nodes is fairly cheap at that level.
My guess is that bumping EXTRA_NULL has a bigger effect when you get the
occassional non-random data, that forces expansion because it gets a
collision.
Data with repetition a multiple of 16 (but not 16) will cause this, as
you can get a large insertion, with lots of dupes.
We filter out when the dupe is exactly a multiple of 16, we may want to
do something similar at larger ranges (or use limit_hash_table on the data
possibly with a much smaller value than 64.)
Most important (next) is to handle the large update case.

files modified:
bzrlib/diff-delta.c

Show diffs side-by-side

added added

removed removed

bzrlib/diff-delta.c

/* The hash map is sized to put 4 entries per bucket, this gives us 2 blank

* spaces.

#define EXTRA_NULLS 1

#define EXTRA_NULLS 3

static const unsigned int T[256] = {

0x00000000, 0xab59b4d1, 0x56b369a2, 0xfdeadd73, 0x063f6795, 0xad66d344,

447

total_num_entries = num_entries + old->num_entries;

448

else

449

total_num_entries = num_entries;

450

hsize = total_num_entries / 2;

450

hsize = total_num_entries / 4;

451

for (i = 4; (1u << i) < hsize && i < 31; i++);

452

hsize = 1 << i;

453

hmask = hsize - 1;

531

first byte to allow for optimizing the Rabin's polynomial

532

initialization in create_delta(). */

533

total_num_entries = num_entries + old_index->num_entries;

534

hsize = total_num_entries / 2;

534

hsize = total_num_entries / 4;

535

for (i = 4; (1u << i) < hsize && i < 31; i++);

536

hsize = 1 << i;

537

if (hsize < old_index->hash_mask) {

Older »