Change GroupCompressor.compress() to return the start_point.
Also, mark empty content with start=end=0.
This also gives us a good starting point for handling duplicate entries (if we
find that makes a difference).
From experimentation, using 0,0 for empty entries actually makes a big difference
in the text index, mostly because about half of all entries have no content
(all of the directory records, for example), which allows the compression
to shrink the index a bit.
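
A rough sketch of the intended behaviour, for illustration only. The class
name SketchGroupCompressor, the endpoint attribute and the sample keys are
made up here, and the toy just appends raw bytes rather than doing any real
delta compression, so it is not the actual GroupCompressor code:

```python
class SketchGroupCompressor:
    """Toy stand-in for GroupCompressor (hypothetical, not the real class)."""

    def __init__(self):
        self._chunks = []   # raw bytes added to the group so far
        self.endpoint = 0   # offset just past the last byte added

    def compress(self, key, data):
        """Add data to the group and return (start_point, end_point)."""
        if not data:
            # Empty content takes no space and is marked with start=end=0.
            return 0, 0
        start_point = self.endpoint
        self._chunks.append(data)
        self.endpoint += len(data)
        return start_point, self.endpoint


compressor = SketchGroupCompressor()
index = {}
for key, data in [
    (("file-id", "rev-1"), b"hello\n"),
    (("dir-id", "rev-1"), b""),            # directory record, no content
    (("file-id", "rev-2"), b"hello world\n"),
]:
    index[key] = compressor.compress(key, data)
```

In this sketch a non-empty entry always has end > start, so 0,0 cannot be
confused with real content at the start of the group, and every empty entry
gets the same value in the index no matter where it falls in the group.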