~bzr-pqm/bzr/bzr.dev

0.1.1 by Martin Pool
Check in old existing knit code.
1
#! /usr/bin/python
2
3
# Copyright (C) 2005 Canonical Ltd
4
0.1.33 by Martin Pool
add gpl text
5
# This program is free software; you can redistribute it and/or modify
6
# it under the terms of the GNU General Public License as published by
7
# the Free Software Foundation; either version 2 of the License, or
8
# (at your option) any later version.
9
10
# This program is distributed in the hope that it will be useful,
11
# but WITHOUT ANY WARRANTY; without even the implied warranty of
12
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
13
# GNU General Public License for more details.
14
15
# You should have received a copy of the GNU General Public License
16
# along with this program; if not, write to the Free Software
17
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
0.1.1 by Martin Pool
Check in old existing knit code.
18
19
# Author: Martin Pool <mbp@canonical.com>
20
21
0.1.38 by Martin Pool
Rename knit to weave. (I don't think there's an existing module called weave.)
22
"""Weave - storage of related text file versions"""
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
23
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
24
# TODO: Perhaps have copy and comparison methods of Weave instances?
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
25
0.1.34 by Martin Pool
remove dead code
26
0.1.17 by Martin Pool
Use objects rather than tuples for tracking VerInfo for
27
class VerInfo(object):
0.1.38 by Martin Pool
Rename knit to weave. (I don't think there's an existing module called weave.)
28
    """Information about a version in a Weave."""
0.1.17 by Martin Pool
Use objects rather than tuples for tracking VerInfo for
29
    included = frozenset()
30
    def __init__(self, included=None):
31
        if included:
0.1.25 by Martin Pool
Handle insertion of new weave layers that insert text on top of the basis
32
            self.included = frozenset(included)
0.1.17 by Martin Pool
Use objects rather than tuples for tracking VerInfo for
33
0.1.18 by Martin Pool
Better Knit.dump method
34
    def __repr__(self):
35
        s = self.__class__.__name__ + '('
36
        if self.included:
37
            s += 'included=%r' % (list(self.included))
38
        s += ')'
39
        return s
40
0.1.17 by Martin Pool
Use objects rather than tuples for tracking VerInfo for
41
0.1.38 by Martin Pool
Rename knit to weave. (I don't think there's an existing module called weave.)
42
class Weave(object):
43
    """weave - versioned text file storage.
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
44
    
0.1.38 by Martin Pool
Rename knit to weave. (I don't think there's an existing module called weave.)
45
    A Weave manages versions of line-based text files, keeping track of the
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
46
    originating version for each line.
47
0.1.4 by Martin Pool
Start indexing knits by both integer and version string.
48
    Texts can be identified in either of two ways:
49
50
    * a nonnegative index number.
51
52
    * a version-id string.
53
0.1.38 by Martin Pool
Rename knit to weave. (I don't think there's an existing module called weave.)
54
    Typically the index number will be valid only inside this weave and
0.1.4 by Martin Pool
Start indexing knits by both integer and version string.
55
    the version-id is used to reference it in the larger world.
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
56
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
57
    The weave is represented as a list mixing edit instructions and
58
    literal text.  Each entry in _l can be either a string (or
59
    unicode), or a tuple.  If a string, it means that the given line
60
    should be output in the currently active revisions.
61
62
    If a tuple, it gives a processing instruction saying in which
63
    revisions the enclosed lines are active.  The tuple has the form
64
    (instruction, version).
65
66
    The instruction can be '{' or '}' for an insertion block, and '['
67
    and ']' for a deletion block respectively.  The version is the
68
    integer version index.
69
0.1.41 by Martin Pool
Doc
70
    Constraints/notes:
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
71
72
    * A later version can delete lines that were introduced by any
73
      number of ancestor versions; this implies that deletion
74
      instructions can span insertion blocks without regard to the
75
      insertion block's nesting.
76
0.1.41 by Martin Pool
Doc
77
    * Similarly, deletions need not be properly nested with regard to
78
      each other, because they might have been generated by
79
      independent revisions.
80
81
    * It doesn't seem very useful to have an active insertion
82
      inside an inactive insertion, but it might happen.
83
84
    * Therefore, all instructions are always"considered"; that
85
      is passed onto and off the stack.  An outer inactive block
86
      doesn't disable an inner block.
87
88
    * Lines are enabled if the most recent enclosing insertion is
89
      active and none of the enclosing deletions are active.
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
90
91
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
92
    _l
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
93
        Text of the weave. 
0.1.4 by Martin Pool
Start indexing knits by both integer and version string.
94
95
    _v
0.1.13 by Martin Pool
Knit structure now allows for versions to include the lines present in other
96
        List of versions, indexed by index number.
97
98
        For each version we store the tuple (included_versions), which
99
        lists the previous versions also considered active.
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
100
    """
0.1.4 by Martin Pool
Start indexing knits by both integer and version string.
101
    def __init__(self):
102
        self._l = []
103
        self._v = []
0.1.5 by Martin Pool
Add test for storing two text versions.
104
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
105
        
0.1.26 by Martin Pool
Refactor parameters to add command
106
    def add(self, parents, text):
0.1.4 by Martin Pool
Start indexing knits by both integer and version string.
107
        """Add a single text on top of the weave.
0.1.36 by Martin Pool
doc
108
  
0.1.26 by Martin Pool
Refactor parameters to add command
109
        Returns the index number of the newly added version.
110
111
        parents
112
            List or set of parent version numbers.
113
114
        text
115
            Sequence of lines to be added in the new version."""
0.1.27 by Martin Pool
Check that version numbers passed in are reasonable
116
        self._check_versions(parents)
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
117
        self._check_lines(text)
0.1.27 by Martin Pool
Check that version numbers passed in are reasonable
118
0.1.4 by Martin Pool
Start indexing knits by both integer and version string.
119
        idx = len(self._v)
0.1.5 by Martin Pool
Add test for storing two text versions.
120
0.1.26 by Martin Pool
Refactor parameters to add command
121
        if parents:
122
            parents = frozenset(parents)
123
            delta = self._delta(parents, text)
0.1.25 by Martin Pool
Handle insertion of new weave layers that insert text on top of the basis
124
0.1.31 by Martin Pool
Fix insertion of multiple regions, calculating the right line offset as we go.
125
            # offset gives the number of lines that have been inserted
126
            # into the weave up to the current point; if the original edit instruction
127
            # says to change line A then we actually change (A+offset)
128
            offset = 0
129
0.1.25 by Martin Pool
Handle insertion of new weave layers that insert text on top of the basis
130
            for i1, i2, newlines in delta:
0.1.29 by Martin Pool
Better internal error
131
                assert 0 <= i1
132
                assert i1 <= i2
133
                assert i2 <= len(self._l)
134
                
0.1.25 by Martin Pool
Handle insertion of new weave layers that insert text on top of the basis
135
                if i1 != i2:
0.1.29 by Martin Pool
Better internal error
136
                    raise NotImplementedError("can't handle replacing weave [%d:%d] yet"
0.1.25 by Martin Pool
Handle insertion of new weave layers that insert text on top of the basis
137
                                              % (i1, i2))
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
138
139
                self._l.insert(i1 + offset, ('{', idx))
140
                i = i1 + offset + 1
141
                self._l[i:i] = newlines
142
                self._l.insert(i + 1, ('}', idx))
143
                offset += 2 + len(newlines)
0.1.25 by Martin Pool
Handle insertion of new weave layers that insert text on top of the basis
144
0.1.26 by Martin Pool
Refactor parameters to add command
145
            self._v.append(VerInfo(parents))
0.1.25 by Martin Pool
Handle insertion of new weave layers that insert text on top of the basis
146
        else:
0.1.26 by Martin Pool
Refactor parameters to add command
147
            # special case; adding with no parents revision; can do this
0.1.25 by Martin Pool
Handle insertion of new weave layers that insert text on top of the basis
148
            # more quickly by just appending unconditionally
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
149
            self._l.append(('{', idx))
150
            self._l += text
151
            self._l.append(('}', idx))
0.1.25 by Martin Pool
Handle insertion of new weave layers that insert text on top of the basis
152
153
            self._v.append(VerInfo())
154
            
0.1.4 by Martin Pool
Start indexing knits by both integer and version string.
155
        return idx
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
156
0.1.27 by Martin Pool
Check that version numbers passed in are reasonable
157
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
158
    def _check_lines(self, text):
159
        if not isinstance(text, list):
160
            raise ValueError("text should be a list, not %s" % type(text))
161
162
        for l in text:
163
            if not isinstance(l, basestring):
164
                raise ValueError("text line should be a string or unicode, not %s" % type(l))
165
        
166
167
0.1.27 by Martin Pool
Check that version numbers passed in are reasonable
168
    def _check_versions(self, indexes):
169
        """Check everything in the sequence of indexes is valid"""
170
        for i in indexes:
171
            try:
172
                self._v[i]
173
            except IndexError:
174
                raise IndexError("invalid version number %r" % i)
175
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
176
    
0.1.7 by Martin Pool
Add trivial annotate text
177
    def annotate(self, index):
178
        return list(self.annotate_iter(index))
179
180
181
    def annotate_iter(self, index):
182
        """Yield list of (index-id, line) pairs for the specified version.
183
184
        The index indicates when the line originated in the weave."""
0.1.25 by Martin Pool
Handle insertion of new weave layers that insert text on top of the basis
185
        try:
186
            vi = self._v[index]
187
        except IndexError:
188
            raise IndexError('version index %d out of range' % index)
0.1.20 by Martin Pool
Factor out Knit.extract() method
189
        included = set(vi.included)
190
        included.add(index)
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
191
        for origin, lineno, text in self._extract(included):
192
            yield origin, text
0.1.22 by Martin Pool
Calculate delta for new versions relative to a set of parent versions.
193
194
195
    def _extract(self, included):
0.1.20 by Martin Pool
Factor out Knit.extract() method
196
        """Yield annotation of lines in included set.
197
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
198
        Yields a sequence of tuples (origin, lineno, text), where
199
        origin is the origin version, lineno the index in the weave,
200
        and text the text of the line.
201
0.1.20 by Martin Pool
Factor out Knit.extract() method
202
        The set typically but not necessarily corresponds to a version.
203
        """
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
204
        stack = []
205
        isactive = False
206
        lineno = 0
0.1.20 by Martin Pool
Factor out Knit.extract() method
207
        
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
208
        for l in self._l:
209
            if isinstance(l, tuple):
210
                c, v = l
211
                if c == '{':
212
                    stack.append(l)
213
                    isactive = (v in included)
214
                elif c == '}':
215
                    oldc, oldv = stack.pop()
216
                    assert oldc == '{'
217
                    assert oldv == v
0.1.40 by Martin Pool
Add test for extracting from weave with nested insertions
218
                    isactive = stack and (stack[-1][1] in included)
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
219
                else:
0.1.43 by Martin Pool
More assertions on weave extraction
220
                    raise ValueError("invalid processing instruction %r on line %d"
221
                                     % (l, lineno))
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
222
            else:
223
                assert isinstance(l, basestring)
0.1.43 by Martin Pool
More assertions on weave extraction
224
                if not stack:
225
                    raise ValueError("literal at top level on line %d"
226
                                     % lineno)
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
227
                if isactive:
228
                    origin = stack[-1][1]
229
                    yield origin, lineno, l
230
            lineno += 1
0.1.7 by Martin Pool
Add trivial annotate text
231
0.1.40 by Martin Pool
Add test for extracting from weave with nested insertions
232
        if stack:
233
            raise ValueError("unclosed blocks at end of weave",
234
                             stack)
235
0.1.7 by Martin Pool
Add trivial annotate text
236
0.1.5 by Martin Pool
Add test for storing two text versions.
237
    def getiter(self, index):
238
        """Yield lines for the specified version."""
0.1.8 by Martin Pool
Unify get/annotate code
239
        for origin, line in self.annotate_iter(index):
240
            yield line
0.1.5 by Martin Pool
Add test for storing two text versions.
241
242
0.1.4 by Martin Pool
Start indexing knits by both integer and version string.
243
    def get(self, index):
0.1.5 by Martin Pool
Add test for storing two text versions.
244
        return list(self.getiter(index))
0.1.1 by Martin Pool
Check in old existing knit code.
245
246
0.1.11 by Martin Pool
Add Knit.dump method
247
    def dump(self, to_file):
248
        from pprint import pprint
0.1.38 by Martin Pool
Rename knit to weave. (I don't think there's an existing module called weave.)
249
        print >>to_file, "Weave._l = ",
0.1.11 by Martin Pool
Add Knit.dump method
250
        pprint(self._l, to_file)
0.1.38 by Martin Pool
Rename knit to weave. (I don't think there's an existing module called weave.)
251
        print >>to_file, "Weave._v = ",
0.1.18 by Martin Pool
Better Knit.dump method
252
        pprint(self._v, to_file)
0.1.11 by Martin Pool
Add Knit.dump method
253
254
0.1.13 by Martin Pool
Knit structure now allows for versions to include the lines present in other
255
    def check(self):
256
        for vers_info in self._v:
257
            included = set()
258
            for vi in vers_info[0]:
259
                if vi < 0 or vi >= index:
0.1.16 by Martin Pool
formatting
260
                    raise ValueError("invalid included version %d for index %d"
0.1.13 by Martin Pool
Knit structure now allows for versions to include the lines present in other
261
                                     % (vi, index))
262
                if vi in included:
0.1.16 by Martin Pool
formatting
263
                    raise ValueError("repeated included version %d for index %d"
0.1.13 by Martin Pool
Knit structure now allows for versions to include the lines present in other
264
                                     % (vi, index))
265
                included.add(vi)
0.1.18 by Martin Pool
Better Knit.dump method
266
0.1.13 by Martin Pool
Knit structure now allows for versions to include the lines present in other
267
268
0.1.21 by Martin Pool
Start computing a delta to insert a new revision
269
    def _delta(self, included, lines):
270
        """Return changes from basis to new revision.
271
272
        The old text for comparison is the union of included revisions.
273
274
        This is used in inserting a new text.
0.1.22 by Martin Pool
Calculate delta for new versions relative to a set of parent versions.
275
276
        Delta is returned as a sequence of (line1, line2, newlines),
277
        indicating that line1 through line2 of the old weave should be
278
        replaced by the sequence of lines in newlines.  Note that
279
        these line numbers are positions in the total weave and don't
280
        correspond to the lines in any extracted version, or even the
281
        extracted union of included versions.
282
283
        If line1=line2, this is a pure insert; if newlines=[] this is a
284
        pure delete.  (Similar to difflib.)
0.1.21 by Martin Pool
Start computing a delta to insert a new revision
285
        """
286
0.1.27 by Martin Pool
Check that version numbers passed in are reasonable
287
        self._check_versions(included)
288
0.1.23 by Martin Pool
tidy up
289
        ##from pprint import pprint
0.1.22 by Martin Pool
Calculate delta for new versions relative to a set of parent versions.
290
291
        # first get basis for comparison
292
        # basis holds (lineno, origin, line)
293
        basis = []
294
0.1.23 by Martin Pool
tidy up
295
        ##print 'my lines:'
296
        ##pprint(self._l)
0.1.22 by Martin Pool
Calculate delta for new versions relative to a set of parent versions.
297
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
298
        basis = list(self._extract(included))
0.1.22 by Martin Pool
Calculate delta for new versions relative to a set of parent versions.
299
300
        # now make a parallel list with only the text, to pass to the differ
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
301
        basis_lines = [line for (origin, lineno, line) in basis]
0.1.22 by Martin Pool
Calculate delta for new versions relative to a set of parent versions.
302
303
        # add a sentinal, because we can also match against the final line
304
        basis.append((len(self._l), None))
305
306
        # XXX: which line of the weave should we really consider matches the end of the file?
307
        # the current code says it's the last line of the weave?
308
309
        from difflib import SequenceMatcher
310
        s = SequenceMatcher(None, basis_lines, lines)
311
0.1.23 by Martin Pool
tidy up
312
        ##print 'basis sequence:'
313
        ##pprint(basis)
0.1.22 by Martin Pool
Calculate delta for new versions relative to a set of parent versions.
314
315
        for tag, i1, i2, j1, j2 in s.get_opcodes():
0.1.23 by Martin Pool
tidy up
316
            ##print tag, i1, i2, j1, j2
0.1.22 by Martin Pool
Calculate delta for new versions relative to a set of parent versions.
317
318
            if tag == 'equal':
319
                continue
320
321
            # i1,i2 are given in offsets within basis_lines; we need to map them
322
            # back to offsets within the entire weave
323
            real_i1 = basis[i1][0]
324
            real_i2 = basis[i2][0]
325
0.1.35 by Martin Pool
Clean up Knit._delta method
326
            assert 0 <= j1
327
            assert j1 <= j2
328
            assert j2 <= len(lines)
0.1.22 by Martin Pool
Calculate delta for new versions relative to a set of parent versions.
329
0.1.35 by Martin Pool
Clean up Knit._delta method
330
            yield real_i1, real_i2, lines[j1:j2]
0.1.21 by Martin Pool
Start computing a delta to insert a new revision
331
0.1.1 by Martin Pool
Check in old existing knit code.
332