- Committer: John Arbash Meinel
- Date: 2009-04-22 20:49:51 UTC
- mto: This revision was merged to the branch mainline in revision 4301.
- Revision ID: john@arbash-meinel.com-20090422204951-xykrubpy1zehhr9p
Change the pure-python compressor a bit.
Specifically, change how we encode insertions, and factor that code out into
another class.
The primary change is trying to get better line-based alignment for inserts,
subject to the 127-byte insert limit.
The old code would take a long insert, split it into 127-byte chunks, and then
split those chunks into lines.
However, that tends to leave hunks that can't be indexed, because they aren't
complete lines.
So now we iterate over the lines, fitting as many whole lines as possible into
each 127-byte insertion, so we get proper indexing.
Note that this means any line longer than 127 bytes will never be matched,
which is a fairly serious issue in the pure-python matcher, but not worth
fixing, because you can just use the compiled matcher instead.
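
To make the difference concrete, here is a minimal sketch of the two chunking
strategies described above. It is not the actual compressor code: the function
names chunk_then_split and pack_lines are hypothetical, and how an overlong
line is handled (cut into raw 127-byte pieces) is an assumption made only for
illustration.

```python
MAX_INSERT = 127  # insert size limit mentioned in the commit message


def chunk_then_split(data):
    """Old approach (sketch): cut into 127-byte chunks, then split into lines."""
    pieces = []
    for start in range(0, len(data), MAX_INSERT):
        chunk = data[start:start + MAX_INSERT]
        # A chunk boundary can fall mid-line, leaving partial-line pieces.
        pieces.extend(chunk.splitlines(True))
    return pieces


def pack_lines(lines):
    """New approach (sketch): group whole lines into inserts of <= 127 bytes."""
    inserts = []
    current = []
    current_len = 0
    for line in lines:
        if len(line) > MAX_INSERT:
            # Assumption: flush what is pending, then emit the long line in
            # raw chunks. These chunks are not whole lines, so they are the
            # pieces that can never be matched.
            if current:
                inserts.append(current)
                current, current_len = [], 0
            for start in range(0, len(line), MAX_INSERT):
                inserts.append([line[start:start + MAX_INSERT]])
            continue
        if current_len + len(line) > MAX_INSERT:
            # The next line does not fit, so start a new insert.
            inserts.append(current)
            current, current_len = [], 0
        current.append(line)
        current_len += len(line)
    if current:
        inserts.append(current)
    return inserts
```

With three 50-byte lines, chunk_then_split returns four pieces, because the
third line is cut at byte 127 into two fragments that cannot be indexed as
lines. pack_lines returns two inserts, the first holding two whole lines and
the second holding the third.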