~bzr-pqm/bzr/bzr.dev

0.1.1 by Martin Pool
Check in old existing knit code.
1
#! /usr/bin/python
2
3
# Copyright (C) 2005 Canonical Ltd
4
0.1.33 by Martin Pool
add gpl text
5
# This program is free software; you can redistribute it and/or modify
6
# it under the terms of the GNU General Public License as published by
7
# the Free Software Foundation; either version 2 of the License, or
8
# (at your option) any later version.
9
10
# This program is distributed in the hope that it will be useful,
11
# but WITHOUT ANY WARRANTY; without even the implied warranty of
12
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
13
# GNU General Public License for more details.
14
15
# You should have received a copy of the GNU General Public License
16
# along with this program; if not, write to the Free Software
17
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
0.1.1 by Martin Pool
Check in old existing knit code.
18
19
# Author: Martin Pool <mbp@canonical.com>
20
21
0.1.38 by Martin Pool
Rename knit to weave. (I don't think there's an existing module called weave.)
22
"""Weave - storage of related text file versions"""
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
23
938 by Martin Pool
- various optimizations to weave add code
24
0.1.58 by Martin Pool
doc
25
# XXX: If we do weaves this way, will a merge still behave the same
26
# way if it's done in a different order?  That's a pretty desirable
27
# property.
28
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
29
# TODO: Nothing here so far assumes the lines are really \n newlines,
30
# rather than being split up in some other way.  We could accomodate
31
# binaries, perhaps by naively splitting on \n or perhaps using
32
# something like a rolling checksum.
33
0.1.85 by Martin Pool
doc
34
# TODO: End marker for each version so we can stop reading?
0.1.69 by Martin Pool
Simple text-based format for storing weaves, cleaner than
35
36
# TODO: Check that no insertion occurs inside a deletion that was
37
# active in the version of the insertion.
38
912 by Martin Pool
- update todos for weave
39
# TODO: In addition to the SHA-1 check, perhaps have some code that
40
# checks structural constraints of the weave: ie that insertions are
41
# properly nested, that there is no text outside of an insertion, that
42
# insertions or deletions are not repeated, etc.
0.1.85 by Martin Pool
doc
43
918 by Martin Pool
- start doing new weave-merge algorithm
44
# TODO: Parallel-extract that passes back each line along with a
45
# description of which revisions include it.  Nice for checking all
1393.1.25 by Martin Pool
- comments in weave code
46
# shas or calculating stats in parallel.
918 by Martin Pool
- start doing new weave-merge algorithm
47
1082 by Martin Pool
- lift imports
48
# TODO: Using a single _extract routine and then processing the output
49
# is probably inefficient.  It's simple enough that we can afford to
50
# have slight specializations for different ways its used: annotate,
51
# basis for add, get, etc.
52
1185.16.13 by Martin Pool
todo
53
# TODO: Probably the API should work only in names to hide the integer
54
# indexes from the user.
1083 by Martin Pool
- add space to store revision-id in weave files
55
1328 by Martin Pool
- slight refactoring of weave code
56
# TODO: Is there any potential performance win by having an add()
57
# variant that is passed a pre-cooked version of the single basis
58
# version?
59
1185.16.13 by Martin Pool
todo
60
# TODO: Reweave can possibly be made faster by remembering diffs
61
# where the basis and destination are unchanged.
62
63
# FIXME: Sometimes we will be given a parents list for a revision
64
# that includes some redundant parents (i.e. already a parent of 
65
# something in the list.)  We should eliminate them.  This can 
66
# be done fairly efficiently because the sequence numbers constrain
67
# the possible relationships.
1083 by Martin Pool
- add space to store revision-id in weave files
68
1082 by Martin Pool
- lift imports
69
1185.50.28 by John Arbash Meinel
Lots of updates for 'bzr check'
70
import os
1082 by Martin Pool
- lift imports
71
import sha
1322 by Martin Pool
- tidy up import
72
from difflib import SequenceMatcher
1237 by Martin Pool
- allow the same version to be repeatedly added to a weave
73
1393.1.68 by Martin Pool
- add reweave that joins with ghosts
74
from bzrlib.trace import mutter
1185.50.23 by John Arbash Meinel
Adding sha1 check when weave extracts a text.
75
from bzrlib.errors import (WeaveError, WeaveFormatError, WeaveParentMismatch,
76
        WeaveRevisionNotPresent, WeaveRevisionAlreadyPresent)
77
import bzrlib.errors as errors
1185.16.117 by mbp at sourcefrog
Reweave in topological-sort order.
78
from bzrlib.tsort import topo_sort
1185.16.116 by mbp at sourcefrog
Move weave errors into bzrlib.errors and make consistent with new system.
79
0.1.47 by Martin Pool
New WeaveError and WeaveFormatError rather than assertions.
80
0.1.38 by Martin Pool
Rename knit to weave. (I don't think there's an existing module called weave.)
81
class Weave(object):
82
    """weave - versioned text file storage.
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
83
    
0.1.72 by Martin Pool
Go back to weave lines normally having newlines at the end.
84
    A Weave manages versions of line-based text files, keeping track
85
    of the originating version for each line.
86
87
    To clients the "lines" of the file are represented as a list of strings.
88
    These strings  will typically have terminal newline characters, but
89
    this is not required.  In particular files commonly do not have a newline
90
    at the end of the file.
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
91
0.1.4 by Martin Pool
Start indexing knits by both integer and version string.
92
    Texts can be identified in either of two ways:
93
94
    * a nonnegative index number.
95
1324 by Martin Pool
- doc
96
    * a version-id string.
0.1.4 by Martin Pool
Start indexing knits by both integer and version string.
97
0.1.38 by Martin Pool
Rename knit to weave. (I don't think there's an existing module called weave.)
98
    Typically the index number will be valid only inside this weave and
0.1.4 by Martin Pool
Start indexing knits by both integer and version string.
99
    the version-id is used to reference it in the larger world.
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
100
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
101
    The weave is represented as a list mixing edit instructions and
944 by Martin Pool
- refactor member names in Weave code
102
    literal text.  Each entry in _weave can be either a string (or
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
103
    unicode), or a tuple.  If a string, it means that the given line
104
    should be output in the currently active revisions.
105
106
    If a tuple, it gives a processing instruction saying in which
107
    revisions the enclosed lines are active.  The tuple has the form
108
    (instruction, version).
109
110
    The instruction can be '{' or '}' for an insertion block, and '['
111
    and ']' for a deletion block respectively.  The version is the
0.1.45 by Martin Pool
doc
112
    integer version index.  There is no replace operator, only deletes
1075 by Martin Pool
- don't store redundant version number at end of insert blocks
113
    and inserts.  For '}', the end of an insertion, there is no
114
    version parameter because it always closes the most recently
115
    opened insertion.
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
116
0.1.41 by Martin Pool
Doc
117
    Constraints/notes:
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
118
119
    * A later version can delete lines that were introduced by any
120
      number of ancestor versions; this implies that deletion
121
      instructions can span insertion blocks without regard to the
122
      insertion block's nesting.
123
0.1.41 by Martin Pool
Doc
124
    * Similarly, deletions need not be properly nested with regard to
125
      each other, because they might have been generated by
126
      independent revisions.
127
0.1.45 by Martin Pool
doc
128
    * Insertions are always made by inserting a new bracketed block
129
      into a single point in the previous weave.  This implies they
130
      can nest but not overlap, and the nesting must always have later
131
      insertions on the inside.
132
0.1.41 by Martin Pool
Doc
133
    * It doesn't seem very useful to have an active insertion
134
      inside an inactive insertion, but it might happen.
0.1.45 by Martin Pool
doc
135
      
0.1.41 by Martin Pool
Doc
136
    * Therefore, all instructions are always"considered"; that
137
      is passed onto and off the stack.  An outer inactive block
138
      doesn't disable an inner block.
139
140
    * Lines are enabled if the most recent enclosing insertion is
141
      active and none of the enclosing deletions are active.
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
142
0.1.49 by Martin Pool
Add another constraint: revisions should not delete text that they
143
    * There is no point having a deletion directly inside its own
144
      insertion; you might as well just not write it.  And there
145
      should be no way to get an earlier version deleting a later
146
      version.
147
944 by Martin Pool
- refactor member names in Weave code
148
    _weave
149
        Text of the weave; list of control instruction tuples and strings.
0.1.4 by Martin Pool
Start indexing knits by both integer and version string.
150
944 by Martin Pool
- refactor member names in Weave code
151
    _parents
892 by Martin Pool
- weave stores only direct parents, and calculates and memoizes expansion as needed
152
        List of parents, indexed by version number.
153
        It is only necessary to store the minimal set of parents for
154
        each version; the parent's parents are implied.
0.1.13 by Martin Pool
Knit structure now allows for versions to include the lines present in other
155
0.1.89 by Martin Pool
Store SHA1 in weave file for later verification
156
    _sha1s
1083 by Martin Pool
- add space to store revision-id in weave files
157
        List of hex SHA-1 of each version.
158
159
    _names
160
        List of symbolic names for each version.  Each should be unique.
161
162
    _name_map
163
        For each name, the version number.
1209 by Martin Pool
- Add Weave._weave_name for debugging purposes
164
165
    _weave_name
166
        Descriptive name of this weave; typically the filename if known.
167
        Set by read_weave.
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
168
    """
938 by Martin Pool
- various optimizations to weave add code
169
1209 by Martin Pool
- Add Weave._weave_name for debugging purposes
170
    __slots__ = ['_weave', '_parents', '_sha1s', '_names', '_name_map',
171
                 '_weave_name']
938 by Martin Pool
- various optimizations to weave add code
172
    
1209 by Martin Pool
- Add Weave._weave_name for debugging purposes
173
    def __init__(self, weave_name=None):
944 by Martin Pool
- refactor member names in Weave code
174
        self._weave = []
175
        self._parents = []
0.1.89 by Martin Pool
Store SHA1 in weave file for later verification
176
        self._sha1s = []
1083 by Martin Pool
- add space to store revision-id in weave files
177
        self._names = []
178
        self._name_map = {}
1209 by Martin Pool
- Add Weave._weave_name for debugging purposes
179
        self._weave_name = weave_name
0.1.60 by Martin Pool
Weave eq and ne methods
180
1185.35.41 by Aaron Bentley
Added repr for Weave
181
    def __repr__(self):
182
        return "Weave(%r)" % self._weave_name
183
0.1.60 by Martin Pool
Weave eq and ne methods
184
1393.1.51 by Martin Pool
- new Weave.copy()
185
    def copy(self):
186
        """Return a deep copy of self.
187
        
188
        The copy can be modified without affecting the original weave."""
189
        other = Weave()
190
        other._weave = self._weave[:]
191
        other._parents = self._parents[:]
192
        other._sha1s = self._sha1s[:]
193
        other._names = self._names[:]
194
        other._name_map = self._name_map.copy()
195
        other._weave_name = self._weave_name
196
        return other
197
0.1.60 by Martin Pool
Weave eq and ne methods
198
    def __eq__(self, other):
199
        if not isinstance(other, Weave):
200
            return False
944 by Martin Pool
- refactor member names in Weave code
201
        return self._parents == other._parents \
1083 by Martin Pool
- add space to store revision-id in weave files
202
               and self._weave == other._weave \
203
               and self._sha1s == other._sha1s 
204
0.1.60 by Martin Pool
Weave eq and ne methods
205
    
206
    def __ne__(self, other):
207
        return not self.__eq__(other)
208
1185.16.124 by Martin Pool
Add Weave.__contains__.
209
    def __contains__(self, name):
210
        return self._name_map.has_key(name)
1083 by Martin Pool
- add space to store revision-id in weave files
211
1269 by Martin Pool
- some weave operations automatically look up symbolic names if supplied
212
    def maybe_lookup(self, name_or_index):
213
        """Convert possible symbolic name to index, or pass through indexes."""
214
        if isinstance(name_or_index, (int, long)):
215
            return name_or_index
216
        else:
217
            return self.lookup(name_or_index)
218
219
        
1083 by Martin Pool
- add space to store revision-id in weave files
220
    def lookup(self, name):
1269 by Martin Pool
- some weave operations automatically look up symbolic names if supplied
221
        """Convert symbolic version name to index."""
1083 by Martin Pool
- add space to store revision-id in weave files
222
        try:
223
            return self._name_map[name]
224
        except KeyError:
1185.16.116 by mbp at sourcefrog
Move weave errors into bzrlib.errors and make consistent with new system.
225
            raise WeaveRevisionNotPresent(name, self)
1083 by Martin Pool
- add space to store revision-id in weave files
226
1185.16.104 by mbp at sourcefrog
Add Weave.names()
227
    def names(self):
1185.16.105 by mbp at sourcefrog
- fix Weave.names()
228
        return self._names[:]
1229 by Martin Pool
- new Weave.idx_to_name and .parents methods
229
1393.1.49 by Martin Pool
- add Weave._check_consistent_with
230
    def iter_names(self):
231
        """Yield a list of all names in this weave."""
232
        return iter(self._names)
233
1229 by Martin Pool
- new Weave.idx_to_name and .parents methods
234
    def idx_to_name(self, version):
235
        return self._names[version]
236
1323 by Martin Pool
- caller can pass SHA-1 to Weave.add for efficiency
237
    def _check_repeated_add(self, name, parents, text, sha1):
1237 by Martin Pool
- allow the same version to be repeatedly added to a weave
238
        """Check that a duplicated add is OK.
239
240
        If it is, return the (old) index; otherwise raise an exception.
241
        """
242
        idx = self.lookup(name)
1185.16.116 by mbp at sourcefrog
Move weave errors into bzrlib.errors and make consistent with new system.
243
        if sorted(self._parents[idx]) != sorted(parents) \
244
            or sha1 != self._sha1s[idx]:
245
            raise WeaveRevisionAlreadyPresent(name, self)
1237 by Martin Pool
- allow the same version to be repeatedly added to a weave
246
        return idx
247
        
1323 by Martin Pool
- caller can pass SHA-1 to Weave.add for efficiency
248
    def add(self, name, parents, text, sha1=None):
0.1.4 by Martin Pool
Start indexing knits by both integer and version string.
249
        """Add a single text on top of the weave.
0.1.36 by Martin Pool
doc
250
  
0.1.26 by Martin Pool
Refactor parameters to add command
251
        Returns the index number of the newly added version.
252
1083 by Martin Pool
- add space to store revision-id in weave files
253
        name
254
            Symbolic name for this version.
255
            (Typically the revision-id of the revision that added it.)
256
0.1.26 by Martin Pool
Refactor parameters to add command
257
        parents
892 by Martin Pool
- weave stores only direct parents, and calculates and memoizes expansion as needed
258
            List or set of direct parent version numbers.
259
            
0.1.26 by Martin Pool
Refactor parameters to add command
260
        text
1323 by Martin Pool
- caller can pass SHA-1 to Weave.add for efficiency
261
            Sequence of lines to be added in the new version.
262
263
        sha -- SHA-1 of the file, if known.  This is trusted to be
264
            correct if supplied.
265
        """
1393.1.4 by Martin Pool
- weave tool now finds bzrlib if it's not on the default path
266
        from bzrlib.osutils import sha_strings
938 by Martin Pool
- various optimizations to weave add code
267
1083 by Martin Pool
- add space to store revision-id in weave files
268
        assert isinstance(name, basestring)
1323 by Martin Pool
- caller can pass SHA-1 to Weave.add for efficiency
269
        if sha1 is None:
270
            sha1 = sha_strings(text)
1083 by Martin Pool
- add space to store revision-id in weave files
271
        if name in self._name_map:
1323 by Martin Pool
- caller can pass SHA-1 to Weave.add for efficiency
272
            return self._check_repeated_add(name, parents, text, sha1)
1269 by Martin Pool
- some weave operations automatically look up symbolic names if supplied
273
274
        parents = map(self.maybe_lookup, parents)
938 by Martin Pool
- various optimizations to weave add code
275
        self._check_versions(parents)
0.1.82 by Martin Pool
Small weave optimizations
276
        ## self._check_lines(text)
944 by Martin Pool
- refactor member names in Weave code
277
        new_version = len(self._parents)
0.1.5 by Martin Pool
Add test for storing two text versions.
278
0.1.89 by Martin Pool
Store SHA1 in weave file for later verification
279
1083 by Martin Pool
- add space to store revision-id in weave files
280
        # if we abort after here the (in-memory) weave will be corrupt because only
281
        # some fields are updated
1570.1.12 by Robert Collins
Add a FIXME for weave.add and remove some extraneous whitespace.
282
        # XXX: FIXME implement a succeed-or-fail of the rest of this routine.
283
        #      - Robert Collins 20060226
1083 by Martin Pool
- add space to store revision-id in weave files
284
        self._parents.append(parents[:])
0.1.89 by Martin Pool
Store SHA1 in weave file for later verification
285
        self._sha1s.append(sha1)
1083 by Martin Pool
- add space to store revision-id in weave files
286
        self._names.append(name)
287
        self._name_map[name] = new_version
938 by Martin Pool
- various optimizations to weave add code
288
289
            
290
        if not parents:
291
            # special case; adding with no parents revision; can do
292
            # this more quickly by just appending unconditionally.
293
            # even more specially, if we're adding an empty text we
294
            # need do nothing at all.
295
            if text:
944 by Martin Pool
- refactor member names in Weave code
296
                self._weave.append(('{', new_version))
297
                self._weave.extend(text)
1075 by Martin Pool
- don't store redundant version number at end of insert blocks
298
                self._weave.append(('}', None))
938 by Martin Pool
- various optimizations to weave add code
299
        
300
            return new_version
301
941 by Martin Pool
- allow for parents specified to Weave.add to be a set
302
        if len(parents) == 1:
303
            pv = list(parents)[0]
304
            if sha1 == self._sha1s[pv]:
305
                # special case: same as the single parent
306
                return new_version
938 by Martin Pool
- various optimizations to weave add code
307
            
308
309
        ancestors = self.inclusions(parents)
310
944 by Martin Pool
- refactor member names in Weave code
311
        l = self._weave
938 by Martin Pool
- various optimizations to weave add code
312
313
        # basis a list of (origin, lineno, line)
314
        basis_lineno = []
315
        basis_lines = []
316
        for origin, lineno, line in self._extract(ancestors):
317
            basis_lineno.append(lineno)
318
            basis_lines.append(line)
319
974.1.26 by aaron.bentley at utoronto
merged mbp@sourcefrog.net-20050817233101-0939da1cf91f2472
320
        # another small special case: a merge, producing the same text
321
        # as auto-merge
938 by Martin Pool
- various optimizations to weave add code
322
        if text == basis_lines:
1570.1.12 by Robert Collins
Add a FIXME for weave.add and remove some extraneous whitespace.
323
            return new_version
938 by Martin Pool
- various optimizations to weave add code
324
325
        # add a sentinal, because we can also match against the final line
944 by Martin Pool
- refactor member names in Weave code
326
        basis_lineno.append(len(self._weave))
938 by Martin Pool
- various optimizations to weave add code
327
328
        # XXX: which line of the weave should we really consider
329
        # matches the end of the file?  the current code says it's the
330
        # last line of the weave?
331
332
        #print 'basis_lines:', basis_lines
333
        #print 'new_lines:  ', lines
334
335
        s = SequenceMatcher(None, basis_lines, text)
336
337
        # offset gives the number of lines that have been inserted
338
        # into the weave up to the current point; if the original edit instruction
339
        # says to change line A then we actually change (A+offset)
340
        offset = 0
341
342
        for tag, i1, i2, j1, j2 in s.get_opcodes():
343
            # i1,i2 are given in offsets within basis_lines; we need to map them
344
            # back to offsets within the entire weave
345
            #print 'raw match', tag, i1, i2, j1, j2
346
            if tag == 'equal':
347
                continue
348
349
            i1 = basis_lineno[i1]
350
            i2 = basis_lineno[i2]
351
352
            assert 0 <= j1 <= j2 <= len(text)
353
354
            #print tag, i1, i2, j1, j2
355
356
            # the deletion and insertion are handled separately.
357
            # first delete the region.
358
            if i1 != i2:
944 by Martin Pool
- refactor member names in Weave code
359
                self._weave.insert(i1+offset, ('[', new_version))
360
                self._weave.insert(i2+offset+1, (']', new_version))
938 by Martin Pool
- various optimizations to weave add code
361
                offset += 2
362
363
            if j1 != j2:
364
                # there may have been a deletion spanning up to
365
                # i2; we want to insert after this region to make sure
366
                # we don't destroy ourselves
367
                i = i2 + offset
944 by Martin Pool
- refactor member names in Weave code
368
                self._weave[i:i] = ([('{', new_version)] 
1075 by Martin Pool
- don't store redundant version number at end of insert blocks
369
                                    + text[j1:j2] 
370
                                    + [('}', None)])
938 by Martin Pool
- various optimizations to weave add code
371
                offset += 2 + (j2 - j1)
372
373
        return new_version
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
374
1092.2.22 by Robert Collins
text_version and name_version unification looking reasonable
375
    def add_identical(self, old_rev_id, new_rev_id, parents):
376
        """Add an identical text to old_rev_id as new_rev_id."""
377
        old_lines = self.get(self.lookup(old_rev_id))
378
        self.add(new_rev_id, parents, old_lines)
0.1.27 by Martin Pool
Check that version numbers passed in are reasonable
379
0.1.78 by Martin Pool
Rename Weave.get_included to inclusions and getiter to get_iter
380
    def inclusions(self, versions):
893 by Martin Pool
- Refactor weave calculation of inclusions
381
        """Return set of all ancestors of given version(s)."""
928 by Martin Pool
- go back to using plain builtin set()
382
        i = set(versions)
1328 by Martin Pool
- slight refactoring of weave code
383
        for v in xrange(max(versions), 0, -1):
384
            if v in i:
385
                # include all its parents
386
                i.update(self._parents[v])
387
        return i
388
        ## except IndexError:
389
        ##     raise ValueError("version %d not present in weave" % v)
0.1.77 by Martin Pool
New Weave.get_included() does transitive expansion
390
391
1229 by Martin Pool
- new Weave.idx_to_name and .parents methods
392
    def parents(self, version):
393
        return self._parents[version]
394
395
1393.1.69 by Martin Pool
- new Weave.parent_names
396
    def parent_names(self, version):
397
        """Return version names for parents of a version."""
398
        return map(self.idx_to_name, self._parents[self.lookup(version)])
399
400
890 by Martin Pool
- weave info should show minimal expression of parents
401
    def minimal_parents(self, version):
402
        """Find the minimal set of parents for the version."""
944 by Martin Pool
- refactor member names in Weave code
403
        included = self._parents[version]
890 by Martin Pool
- weave info should show minimal expression of parents
404
        if not included:
405
            return []
406
        
407
        li = list(included)
893 by Martin Pool
- Refactor weave calculation of inclusions
408
        li.sort(reverse=True)
890 by Martin Pool
- weave info should show minimal expression of parents
409
410
        mininc = []
928 by Martin Pool
- go back to using plain builtin set()
411
        gotit = set()
890 by Martin Pool
- weave info should show minimal expression of parents
412
413
        for pv in li:
414
            if pv not in gotit:
415
                mininc.append(pv)
893 by Martin Pool
- Refactor weave calculation of inclusions
416
                gotit.update(self.inclusions(pv))
890 by Martin Pool
- weave info should show minimal expression of parents
417
418
        assert mininc[0] >= 0
419
        assert mininc[-1] < version
420
        return mininc
421
422
0.1.75 by Martin Pool
Remove VerInfo class; just store sets directly in the list of
423
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
424
    def _check_lines(self, text):
425
        if not isinstance(text, list):
426
            raise ValueError("text should be a list, not %s" % type(text))
427
428
        for l in text:
429
            if not isinstance(l, basestring):
869 by Martin Pool
- more weave.py command line options
430
                raise ValueError("text line should be a string or unicode, not %s"
431
                                 % type(l))
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
432
        
433
434
0.1.27 by Martin Pool
Check that version numbers passed in are reasonable
435
    def _check_versions(self, indexes):
436
        """Check everything in the sequence of indexes is valid"""
437
        for i in indexes:
438
            try:
944 by Martin Pool
- refactor member names in Weave code
439
                self._parents[i]
0.1.27 by Martin Pool
Check that version numbers passed in are reasonable
440
            except IndexError:
441
                raise IndexError("invalid version number %r" % i)
442
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
443
    
1269 by Martin Pool
- some weave operations automatically look up symbolic names if supplied
444
    def annotate(self, name_or_index):
445
        return list(self.annotate_iter(name_or_index))
446
447
448
    def annotate_iter(self, name_or_index):
0.1.7 by Martin Pool
Add trivial annotate text
449
        """Yield list of (index-id, line) pairs for the specified version.
450
451
        The index indicates when the line originated in the weave."""
1269 by Martin Pool
- some weave operations automatically look up symbolic names if supplied
452
        incls = [self.maybe_lookup(name_or_index)]
453
        for origin, lineno, text in self._extract(incls):
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
454
            yield origin, text
0.1.22 by Martin Pool
Calculate delta for new versions relative to a set of parent versions.
455
918 by Martin Pool
- start doing new weave-merge algorithm
456
    def _walk(self):
457
        """Walk the weave.
458
459
        Yields sequence of
460
        (lineno, insert, deletes, text)
461
        for each literal line.
462
        """
463
        
464
        istack = []
928 by Martin Pool
- go back to using plain builtin set()
465
        dset = set()
918 by Martin Pool
- start doing new weave-merge algorithm
466
467
        lineno = 0         # line of weave, 0-based
468
944 by Martin Pool
- refactor member names in Weave code
469
        for l in self._weave:
918 by Martin Pool
- start doing new weave-merge algorithm
470
            if isinstance(l, tuple):
471
                c, v = l
472
                isactive = None
473
                if c == '{':
474
                    istack.append(v)
475
                elif c == '}':
1075 by Martin Pool
- don't store redundant version number at end of insert blocks
476
                    istack.pop()
918 by Martin Pool
- start doing new weave-merge algorithm
477
                elif c == '[':
926 by Martin Pool
- update more weave code to use intsets
478
                    assert v not in dset
479
                    dset.add(v)
918 by Martin Pool
- start doing new weave-merge algorithm
480
                elif c == ']':
926 by Martin Pool
- update more weave code to use intsets
481
                    dset.remove(v)
918 by Martin Pool
- start doing new weave-merge algorithm
482
                else:
1185.16.116 by mbp at sourcefrog
Move weave errors into bzrlib.errors and make consistent with new system.
483
                    raise WeaveFormatError('unexpected instruction %r' % v)
918 by Martin Pool
- start doing new weave-merge algorithm
484
            else:
485
                assert isinstance(l, basestring)
486
                assert istack
487
                yield lineno, istack[-1], dset, l
488
            lineno += 1
489
1185.50.27 by John Arbash Meinel
Cleaned up the check code by using self._walk() instead of manually writing the loop.
490
        if istack:
491
            raise WeaveFormatError("unclosed insertion blocks "
492
                    "at end of weave: %s" % istack)
493
        if dset:
494
            raise WeaveFormatError("unclosed deletion blocks at end of weave: %s"
495
                                   % dset)
918 by Martin Pool
- start doing new weave-merge algorithm
496
893 by Martin Pool
- Refactor weave calculation of inclusions
497
    def _extract(self, versions):
0.1.20 by Martin Pool
Factor out Knit.extract() method
498
        """Yield annotation of lines in included set.
499
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
500
        Yields a sequence of tuples (origin, lineno, text), where
501
        origin is the origin version, lineno the index in the weave,
502
        and text the text of the line.
503
0.1.20 by Martin Pool
Factor out Knit.extract() method
504
        The set typically but not necessarily corresponds to a version.
505
        """
1196 by Martin Pool
- [WIP] retrieve historical texts from weaves
506
        for i in versions:
507
            if not isinstance(i, int):
508
                raise ValueError(i)
509
            
893 by Martin Pool
- Refactor weave calculation of inclusions
510
        included = self.inclusions(versions)
881 by Martin Pool
- faster weave extraction
511
512
        istack = []
928 by Martin Pool
- go back to using plain builtin set()
513
        dset = set()
0.1.48 by Martin Pool
Basic parsing of delete instructions.
514
515
        lineno = 0         # line of weave, 0-based
891 by Martin Pool
- fix up refactoring of weave
516
894 by Martin Pool
- small optimization for weave extract
517
        isactive = None
0.1.85 by Martin Pool
doc
518
931 by Martin Pool
- experiment with making Weave._extract() return a list, not a generator - slightly faster
519
        result = []
520
0.1.63 by Martin Pool
Abbreviate WeaveFormatError in some code
521
        WFE = WeaveFormatError
0.1.95 by Martin Pool
- preliminary merge conflict detection
522
944 by Martin Pool
- refactor member names in Weave code
523
        for l in self._weave:
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
524
            if isinstance(l, tuple):
525
                c, v = l
894 by Martin Pool
- small optimization for weave extract
526
                isactive = None
891 by Martin Pool
- fix up refactoring of weave
527
                if c == '{':
528
                    assert v not in istack
529
                    istack.append(v)
530
                elif c == '}':
1075 by Martin Pool
- don't store redundant version number at end of insert blocks
531
                    istack.pop()
891 by Martin Pool
- fix up refactoring of weave
532
                elif c == '[':
533
                    if v in included:
881 by Martin Pool
- faster weave extraction
534
                        assert v not in dset
0.1.48 by Martin Pool
Basic parsing of delete instructions.
535
                        dset.add(v)
891 by Martin Pool
- fix up refactoring of weave
536
                else:
537
                    assert c == ']'
538
                    if v in included:
881 by Martin Pool
- faster weave extraction
539
                        assert v in dset
0.1.48 by Martin Pool
Basic parsing of delete instructions.
540
                        dset.remove(v)
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
541
            else:
542
                assert isinstance(l, basestring)
894 by Martin Pool
- small optimization for weave extract
543
                if isactive is None:
544
                    isactive = (not dset) and istack and (istack[-1] in included)
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
545
                if isactive:
931 by Martin Pool
- experiment with making Weave._extract() return a list, not a generator - slightly faster
546
                    result.append((istack[-1], lineno, l))
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
547
            lineno += 1
0.1.46 by Martin Pool
More constraints on structure of weave, and checks that they work
548
        if istack:
1185.16.116 by mbp at sourcefrog
Move weave errors into bzrlib.errors and make consistent with new system.
549
            raise WeaveFormatError("unclosed insertion blocks "
550
                    "at end of weave: %s" % istack)
0.1.48 by Martin Pool
Basic parsing of delete instructions.
551
        if dset:
1185.16.116 by mbp at sourcefrog
Move weave errors into bzrlib.errors and make consistent with new system.
552
            raise WeaveFormatError("unclosed deletion blocks at end of weave: %s"
553
                                   % dset)
931 by Martin Pool
- experiment with making Weave._extract() return a list, not a generator - slightly faster
554
        return result
555
0.1.7 by Martin Pool
Add trivial annotate text
556
1269 by Martin Pool
- some weave operations automatically look up symbolic names if supplied
557
    def get_iter(self, name_or_index):
0.1.5 by Martin Pool
Add test for storing two text versions.
558
        """Yield lines for the specified version."""
1269 by Martin Pool
- some weave operations automatically look up symbolic names if supplied
559
        incls = [self.maybe_lookup(name_or_index)]
1185.50.23 by John Arbash Meinel
Adding sha1 check when weave extracts a text.
560
        if len(incls) == 1:
1185.50.29 by John Arbash Meinel
Whitespace and other formatting cleanups suggested by Robert.
561
            index = incls[0]
562
            cur_sha = sha.new()
1185.50.23 by John Arbash Meinel
Adding sha1 check when weave extracts a text.
563
        else:
564
            # We don't have sha1 sums for multiple entries
1185.50.29 by John Arbash Meinel
Whitespace and other formatting cleanups suggested by Robert.
565
            cur_sha = None
1269 by Martin Pool
- some weave operations automatically look up symbolic names if supplied
566
        for origin, lineno, line in self._extract(incls):
1185.50.29 by John Arbash Meinel
Whitespace and other formatting cleanups suggested by Robert.
567
            if cur_sha:
568
                cur_sha.update(line)
0.1.8 by Martin Pool
Unify get/annotate code
569
            yield line
1185.50.29 by John Arbash Meinel
Whitespace and other formatting cleanups suggested by Robert.
570
        if cur_sha:
1185.50.23 by John Arbash Meinel
Adding sha1 check when weave extracts a text.
571
            expected_sha1 = self._sha1s[index]
1185.50.29 by John Arbash Meinel
Whitespace and other formatting cleanups suggested by Robert.
572
            measured_sha1 = cur_sha.hexdigest() 
1185.50.23 by John Arbash Meinel
Adding sha1 check when weave extracts a text.
573
            if measured_sha1 != expected_sha1:
574
                raise errors.WeaveInvalidChecksum(
575
                        'file %s, revision %s, expected: %s, measured %s' 
576
                        % (self._weave_name, self._names[index],
577
                           expected_sha1, measured_sha1))
0.1.5 by Martin Pool
Add test for storing two text versions.
578
579
1369 by Martin Pool
- try to avoid redundant conversion of strings when retrieving from weaves
580
    def get_text(self, name_or_index):
581
        return ''.join(self.get_iter(name_or_index))
582
        assert isinstance(version, int)
583
584
585
    def get_lines(self, name_or_index):
1269 by Martin Pool
- some weave operations automatically look up symbolic names if supplied
586
        return list(self.get_iter(name_or_index))
0.1.1 by Martin Pool
Check in old existing knit code.
587
588
1369 by Martin Pool
- try to avoid redundant conversion of strings when retrieving from weaves
589
    get = get_lines
590
591
1185.50.28 by John Arbash Meinel
Lots of updates for 'bzr check'
592
    def get_sha1(self, name):
593
        """Get the stored sha1 sum for the given revision.
594
        
595
        :param name: The name of the version to lookup
596
        """
597
        return self._sha1s[self.lookup(name)]
598
0.1.95 by Martin Pool
- preliminary merge conflict detection
599
    def mash_iter(self, included):
0.1.65 by Martin Pool
Add Weave.merge_iter to get automerged lines
600
        """Return composed version of multiple included versions."""
1338 by Martin Pool
- Weave.mash_iter optionally takes names rather than indexes
601
        included = map(self.maybe_lookup, included)
893 by Martin Pool
- Refactor weave calculation of inclusions
602
        for origin, lineno, text in self._extract(included):
0.1.65 by Martin Pool
Add Weave.merge_iter to get automerged lines
603
            yield text
604
605
0.1.11 by Martin Pool
Add Knit.dump method
606
    def dump(self, to_file):
607
        from pprint import pprint
944 by Martin Pool
- refactor member names in Weave code
608
        print >>to_file, "Weave._weave = ",
609
        pprint(self._weave, to_file)
610
        print >>to_file, "Weave._parents = ",
611
        pprint(self._parents, to_file)
0.1.11 by Martin Pool
Add Knit.dump method
612
613
0.1.91 by Martin Pool
Update Weave.check
614
615
    def numversions(self):
944 by Martin Pool
- refactor member names in Weave code
616
        l = len(self._parents)
0.1.91 by Martin Pool
Update Weave.check
617
        assert l == len(self._sha1s)
618
        return l
619
620
946 by Martin Pool
- weave info only shows the weave headers, doesn't extract every version:
621
    def __len__(self):
622
        return self.numversions()
623
894 by Martin Pool
- small optimization for weave extract
624
    def check(self, progress_bar=None):
0.1.91 by Martin Pool
Update Weave.check
625
        # check no circular inclusions
626
        for version in range(self.numversions()):
944 by Martin Pool
- refactor member names in Weave code
627
            inclusions = list(self._parents[version])
0.1.91 by Martin Pool
Update Weave.check
628
            if inclusions:
629
                inclusions.sort()
630
                if inclusions[-1] >= version:
0.1.47 by Martin Pool
New WeaveError and WeaveFormatError rather than assertions.
631
                    raise WeaveFormatError("invalid included version %d for index %d"
0.1.91 by Martin Pool
Update Weave.check
632
                                           % (inclusions[-1], version))
633
1185.50.26 by John Arbash Meinel
Change Weave.check() so that it checks all revisions in parallel.
634
        # try extracting all versions; parallel extraction is used
894 by Martin Pool
- small optimization for weave extract
635
        nv = self.numversions()
1185.50.26 by John Arbash Meinel
Change Weave.check() so that it checks all revisions in parallel.
636
        sha1s = [sha.new() for i in range(nv)]
637
        texts = [[] for i in range(nv)]
638
        inclusions = []
639
        for i in range(nv):
640
            # For creating the ancestry, IntSet is much faster (3.7s vs 0.17s)
641
            # The problem is that set membership is much more expensive
642
            new_inc = set([i])
643
            for p in self._parents[i]:
644
                new_inc.update(inclusions[p])
645
646
            #assert set(new_inc) == self.inclusions([i]), 'failed %s != %s' % (new_inc, self.inclusions([i]))
647
            inclusions.append(new_inc)
648
649
        nlines = len(self._weave)
1185.50.28 by John Arbash Meinel
Lots of updates for 'bzr check'
650
651
        update_text = 'checking weave'
652
        if self._weave_name:
653
            short_name = os.path.basename(self._weave_name)
654
            update_text = 'checking %s' % (short_name,)
655
            update_text = update_text[:25]
656
1185.50.27 by John Arbash Meinel
Cleaned up the check code by using self._walk() instead of manually writing the loop.
657
        for lineno, insert, deleteset, line in self._walk():
1185.50.26 by John Arbash Meinel
Change Weave.check() so that it checks all revisions in parallel.
658
            if progress_bar:
1185.50.28 by John Arbash Meinel
Lots of updates for 'bzr check'
659
                progress_bar.update(update_text, lineno, nlines)
1185.50.27 by John Arbash Meinel
Cleaned up the check code by using self._walk() instead of manually writing the loop.
660
661
            for j, j_inc in enumerate(inclusions):
662
                # The active inclusion must be an ancestor,
663
                # and no ancestors must have deleted this line,
664
                # because we don't support resurrection.
665
                if (insert in j_inc) and not (deleteset & j_inc):
666
                    sha1s[j].update(line)
1185.50.26 by John Arbash Meinel
Change Weave.check() so that it checks all revisions in parallel.
667
894 by Martin Pool
- small optimization for weave extract
668
        for version in range(nv):
1185.50.26 by John Arbash Meinel
Change Weave.check() so that it checks all revisions in parallel.
669
            hd = sha1s[version].hexdigest()
0.1.91 by Martin Pool
Update Weave.check
670
            expected = self._sha1s[version]
671
            if hd != expected:
1185.50.26 by John Arbash Meinel
Change Weave.check() so that it checks all revisions in parallel.
672
                raise errors.WeaveInvalidChecksum(
673
                        "mismatched sha1 for version %s: "
674
                        "got %s, expected %s"
675
                        % (self._names[version], hd, expected))
0.1.18 by Martin Pool
Better Knit.dump method
676
881 by Martin Pool
- faster weave extraction
677
        # TODO: check insertions are properly nested, that there are
678
        # no lines outside of insertion blocks, that deletions are
679
        # properly paired, etc.
680
0.1.21 by Martin Pool
Start computing a delta to insert a new revision
681
    def _delta(self, included, lines):
682
        """Return changes from basis to new revision.
683
684
        The old text for comparison is the union of included revisions.
685
686
        This is used in inserting a new text.
0.1.22 by Martin Pool
Calculate delta for new versions relative to a set of parent versions.
687
0.1.55 by Martin Pool
doc
688
        Delta is returned as a sequence of
689
        (weave1, weave2, newlines).
690
691
        This indicates that weave1:weave2 of the old weave should be
0.1.22 by Martin Pool
Calculate delta for new versions relative to a set of parent versions.
692
        replaced by the sequence of lines in newlines.  Note that
693
        these line numbers are positions in the total weave and don't
694
        correspond to the lines in any extracted version, or even the
695
        extracted union of included versions.
696
697
        If line1=line2, this is a pure insert; if newlines=[] this is a
698
        pure delete.  (Similar to difflib.)
0.1.21 by Martin Pool
Start computing a delta to insert a new revision
699
        """
1393.1.47 by Martin Pool
- Weave._delta is not implemented at the moment
700
        raise NotImplementedError()
0.1.1 by Martin Pool
Check in old existing knit code.
701
918 by Martin Pool
- start doing new weave-merge algorithm
702
            
703
    def plan_merge(self, ver_a, ver_b):
704
        """Return pseudo-annotation indicating how the two versions merge.
705
706
        This is computed between versions a and b and their common
707
        base.
708
709
        Weave lines present in none of them are skipped entirely.
710
        """
926 by Martin Pool
- update more weave code to use intsets
711
        inc_a = self.inclusions([ver_a])
712
        inc_b = self.inclusions([ver_b])
918 by Martin Pool
- start doing new weave-merge algorithm
713
        inc_c = inc_a & inc_b
714
715
        for lineno, insert, deleteset, line in self._walk():
716
            if deleteset & inc_c:
717
                # killed in parent; can't be in either a or b
718
                # not relevant to our work
719
                yield 'killed-base', line
926 by Martin Pool
- update more weave code to use intsets
720
            elif insert in inc_c:
918 by Martin Pool
- start doing new weave-merge algorithm
721
                # was inserted in base
722
                killed_a = bool(deleteset & inc_a)
723
                killed_b = bool(deleteset & inc_b)
724
                if killed_a and killed_b:
725
                    yield 'killed-both', line
726
                elif killed_a:
727
                    yield 'killed-a', line
728
                elif killed_b:
729
                    yield 'killed-b', line
730
                else:
731
                    yield 'unchanged', line
926 by Martin Pool
- update more weave code to use intsets
732
            elif insert in inc_a:
918 by Martin Pool
- start doing new weave-merge algorithm
733
                if deleteset & inc_a:
734
                    yield 'ghost-a', line
735
                else:
736
                    # new in A; not in B
737
                    yield 'new-a', line
926 by Martin Pool
- update more weave code to use intsets
738
            elif insert in inc_b:
918 by Martin Pool
- start doing new weave-merge algorithm
739
                if deleteset & inc_b:
740
                    yield 'ghost-b', line
741
                else:
742
                    yield 'new-b', line
743
            else:
744
                # not in either revision
745
                yield 'irrelevant', line
746
919 by Martin Pool
- more development of weave-merge
747
        yield 'unchanged', ''           # terminator
748
749
750
751
    def weave_merge(self, plan):
752
        lines_a = []
753
        lines_b = []
754
        ch_a = ch_b = False
1185.33.93 by Martin Pool
[patch] Speed improvement for Weave.join() (Goffredo Baroncelli)
755
        # TODO: Return a structured form of the conflicts (e.g. 2-tuples for
756
        # conflicted regions), rather than just inserting the markers.
757
        # 
758
        # TODO: Show some version information (e.g. author, date) on 
759
        # conflicted regions.
919 by Martin Pool
- more development of weave-merge
760
        for state, line in plan:
761
            if state == 'unchanged' or state == 'killed-both':
762
                # resync and flush queued conflicts changes if any
763
                if not lines_a and not lines_b:
764
                    pass
765
                elif ch_a and not ch_b:
766
                    # one-sided change:                    
767
                    for l in lines_a: yield l
768
                elif ch_b and not ch_a:
769
                    for l in lines_b: yield l
770
                elif lines_a == lines_b:
771
                    for l in lines_a: yield l
772
                else:
1185.12.86 by Aaron Bentley
Switched to standard 7-char conflict markers
773
                    yield '<<<<<<<\n'
919 by Martin Pool
- more development of weave-merge
774
                    for l in lines_a: yield l
1185.12.86 by Aaron Bentley
Switched to standard 7-char conflict markers
775
                    yield '=======\n'
919 by Martin Pool
- more development of weave-merge
776
                    for l in lines_b: yield l
1185.12.86 by Aaron Bentley
Switched to standard 7-char conflict markers
777
                    yield '>>>>>>>\n'
919 by Martin Pool
- more development of weave-merge
778
779
                del lines_a[:]
780
                del lines_b[:]
781
                ch_a = ch_b = False
782
                
783
            if state == 'unchanged':
784
                if line:
785
                    yield line
786
            elif state == 'killed-a':
787
                ch_a = True
788
                lines_b.append(line)
789
            elif state == 'killed-b':
790
                ch_b = True
791
                lines_a.append(line)
792
            elif state == 'new-a':
793
                ch_a = True
794
                lines_a.append(line)
795
            elif state == 'new-b':
796
                ch_b = True
797
                lines_b.append(line)
798
            else:
920 by Martin Pool
- add more test cases for weave_merge
799
                assert state in ('irrelevant', 'ghost-a', 'ghost-b', 'killed-base',
800
                                 'killed-both'), \
919 by Martin Pool
- more development of weave-merge
801
                       state
802
1185.33.93 by Martin Pool
[patch] Speed improvement for Weave.join() (Goffredo Baroncelli)
803
1551.3.11 by Aaron Bentley
Merge from Robert
804
    def join(self, other, pb=None, msg=None):
1185.33.93 by Martin Pool
[patch] Speed improvement for Weave.join() (Goffredo Baroncelli)
805
        import sys, time
1393.1.48 by Martin Pool
- Add stub Weave.join() method
806
        """Integrate versions from other into this weave.
807
1393.1.49 by Martin Pool
- add Weave._check_consistent_with
808
        The resulting weave contains all the history of both weaves; 
809
        any version you could retrieve from either self or other can be 
810
        retrieved from self after this call.
1393.1.48 by Martin Pool
- Add stub Weave.join() method
811
1393.1.49 by Martin Pool
- add Weave._check_consistent_with
812
        It is illegal for the two weaves to contain different values 
1393.1.68 by Martin Pool
- add reweave that joins with ghosts
813
        or different parents for any version.  See also reweave().
1551.3.11 by Aaron Bentley
Merge from Robert
814
815
        :param other: The other weave to pull into this one
816
        :param pb: An optional progress bar
817
        :param msg: An optional message to display for progress
1393.1.48 by Martin Pool
- Add stub Weave.join() method
818
        """
819
        if other.numversions() == 0:
820
            return          # nothing to update, easy
1185.13.4 by Robert Collins
make reweave visible as a weave method, and quickly integrate into fetch
821
        # two loops so that we do not change ourselves before verifying it
822
        # will be ok
1393.1.50 by Martin Pool
- more development of Weave.join()
823
        # work through in index order to make sure we get all dependencies
1185.77.1 by John Arbash Meinel
Simplifying what bzr displays when merging
824
        names_to_join = []
825
        processed = 0
1393.1.50 by Martin Pool
- more development of Weave.join()
826
        for other_idx, name in enumerate(other._names):
1185.33.93 by Martin Pool
[patch] Speed improvement for Weave.join() (Goffredo Baroncelli)
827
            self._check_version_consistent(other, other_idx, name)
1185.77.1 by John Arbash Meinel
Simplifying what bzr displays when merging
828
            sha1 = other._sha1s[other_idx]
829
830
            processed += 1
831
832
            if name in self._name_map:
833
                idx = self.lookup(name)
834
                n1 = set(map(other.idx_to_name, other._parents[other_idx]))
835
                n2 = set(map(self.idx_to_name, self._parents[idx]))
836
                if sha1 ==  self._sha1s[idx] and n1 == n2:
837
                        continue
838
839
            names_to_join.append((other_idx, name))
1185.33.93 by Martin Pool
[patch] Speed improvement for Weave.join() (Goffredo Baroncelli)
840
1551.3.11 by Aaron Bentley
Merge from Robert
841
        if pb and not msg:
842
            msg = 'weave join'
843
1185.33.93 by Martin Pool
[patch] Speed improvement for Weave.join() (Goffredo Baroncelli)
844
        merged = 0
845
        time0 = time.time( )
1185.77.1 by John Arbash Meinel
Simplifying what bzr displays when merging
846
        for other_idx, name in names_to_join:
1185.33.93 by Martin Pool
[patch] Speed improvement for Weave.join() (Goffredo Baroncelli)
847
            # TODO: If all the parents of the other version are already
1393.1.50 by Martin Pool
- more development of Weave.join()
848
            # present then we can avoid some work by just taking the delta
849
            # and adjusting the offsets.
1393.1.51 by Martin Pool
- new Weave.copy()
850
            new_parents = self._imported_parents(other, other_idx)
1185.33.93 by Martin Pool
[patch] Speed improvement for Weave.join() (Goffredo Baroncelli)
851
            sha1 = other._sha1s[other_idx]
852
1185.77.1 by John Arbash Meinel
Simplifying what bzr displays when merging
853
            merged += 1
1185.33.93 by Martin Pool
[patch] Speed improvement for Weave.join() (Goffredo Baroncelli)
854
1551.3.11 by Aaron Bentley
Merge from Robert
855
            if pb:
1185.77.1 by John Arbash Meinel
Simplifying what bzr displays when merging
856
                pb.update(msg, merged, len(names_to_join))
1551.3.11 by Aaron Bentley
Merge from Robert
857
           
1393.1.50 by Martin Pool
- more development of Weave.join()
858
            lines = other.get_lines(other_idx)
859
            self.add(name, new_parents, lines, sha1)
860
1185.33.93 by Martin Pool
[patch] Speed improvement for Weave.join() (Goffredo Baroncelli)
861
        mutter("merged = %d, processed = %d, file_id=%s; deltat=%d"%(
1185.77.1 by John Arbash Meinel
Simplifying what bzr displays when merging
862
                merged, processed, self._weave_name, time.time( )-time0))
1185.33.93 by Martin Pool
[patch] Speed improvement for Weave.join() (Goffredo Baroncelli)
863
 
1393.1.51 by Martin Pool
- new Weave.copy()
864
    def _imported_parents(self, other, other_idx):
865
        """Return list of parents in self corresponding to indexes in other."""
866
        new_parents = []
867
        for parent_idx in other._parents[other_idx]:
868
            parent_name = other._names[parent_idx]
869
            if parent_name not in self._names:
870
                # should not be possible
871
                raise WeaveError("missing parent {%s} of {%s} in %r" 
872
                                 % (parent_name, other._name_map[other_idx], self))
873
            new_parents.append(self._name_map[parent_name])
874
        return new_parents
875
1393.1.50 by Martin Pool
- more development of Weave.join()
876
    def _check_version_consistent(self, other, other_idx, name):
877
        """Check if a version in consistent in this and other.
1393.1.66 by Martin Pool
- fix join of weaves where parents occur at different offsets
878
879
        To be consistent it must have:
880
881
         * the same text
882
         * the same direct parents (by name, not index, and disregarding
883
           order)
1393.1.50 by Martin Pool
- more development of Weave.join()
884
        
885
        If present & correct return True;
886
        if not present in self return False; 
887
        if inconsistent raise error."""
888
        this_idx = self._name_map.get(name, -1)
889
        if this_idx != -1:
890
            if self._sha1s[this_idx] != other._sha1s[other_idx]:
1393.1.66 by Martin Pool
- fix join of weaves where parents occur at different offsets
891
                raise WeaveError("inconsistent texts for version {%s} "
892
                                 "when joining weaves"
893
                                 % (name))
894
            self_parents = self._parents[this_idx]
895
            other_parents = other._parents[other_idx]
1551.3.11 by Aaron Bentley
Merge from Robert
896
            n1 = set([self._names[i] for i in self_parents])
897
            n2 = set([other._names[i] for i in other_parents])
1393.1.66 by Martin Pool
- fix join of weaves where parents occur at different offsets
898
            if n1 != n2:
1185.13.4 by Robert Collins
make reweave visible as a weave method, and quickly integrate into fetch
899
                raise WeaveParentMismatch("inconsistent parents "
900
                    "for version {%s}: %s vs %s" % (name, n1, n2))
1393.1.50 by Martin Pool
- more development of Weave.join()
901
            else:
902
                return True         # ok!
903
        else:
904
            return False
918 by Martin Pool
- start doing new weave-merge algorithm
905
1551.3.11 by Aaron Bentley
Merge from Robert
906
    def reweave(self, other, pb=None, msg=None):
907
        """Reweave self with other.
908
909
        :param other: The other weave to merge
910
        :param pb: An optional progress bar, indicating how far done we are
911
        :param msg: An optional message for the progress
912
        """
913
        new_weave = reweave(self, other, pb=pb, msg=msg)
1185.13.4 by Robert Collins
make reweave visible as a weave method, and quickly integrate into fetch
914
        for attr in self.__slots__:
915
            setattr(self, attr, getattr(new_weave, attr))
916
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
917
1551.3.11 by Aaron Bentley
Merge from Robert
918
def reweave(wa, wb, pb=None, msg=None):
1393.1.68 by Martin Pool
- add reweave that joins with ghosts
919
    """Combine two weaves and return the result.
920
921
    This works even if a revision R has different parents in 
922
    wa and wb.  In the resulting weave all the parents are given.
923
924
    This is done by just building up a new weave, maintaining ordering 
925
    of the versions in the two inputs.  More efficient approaches
926
    might be possible but it should only be necessary to do 
927
    this operation rarely, when a new previously ghost version is 
928
    inserted.
1551.3.11 by Aaron Bentley
Merge from Robert
929
930
    :param pb: An optional progress bar, indicating how far done we are
931
    :param msg: An optional message for the progress
1393.1.68 by Martin Pool
- add reweave that joins with ghosts
932
    """
933
    wr = Weave()
934
    ia = ib = 0
935
    queue_a = range(wa.numversions())
936
    queue_b = range(wb.numversions())
937
    # first determine combined parents of all versions
938
    # map from version name -> all parent names
939
    combined_parents = _reweave_parent_graphs(wa, wb)
940
    mutter("combined parents: %r", combined_parents)
1185.16.117 by mbp at sourcefrog
Reweave in topological-sort order.
941
    order = topo_sort(combined_parents.iteritems())
1393.1.68 by Martin Pool
- add reweave that joins with ghosts
942
    mutter("order to reweave: %r", order)
1551.3.11 by Aaron Bentley
Merge from Robert
943
944
    if pb and not msg:
945
        msg = 'reweave'
946
947
    for idx, name in enumerate(order):
948
        if pb:
949
            pb.update(msg, idx, len(order))
1393.1.68 by Martin Pool
- add reweave that joins with ghosts
950
        if name in wa._name_map:
951
            lines = wa.get_lines(name)
952
            if name in wb._name_map:
1551.3.11 by Aaron Bentley
Merge from Robert
953
                lines_b = wb.get_lines(name)
954
                if lines != lines_b:
955
                    mutter('Weaves differ on content. rev_id {%s}', name)
956
                    mutter('weaves: %s, %s', wa._weave_name, wb._weave_name)
957
                    import difflib
958
                    lines = list(difflib.unified_diff(lines, lines_b,
959
                            wa._weave_name, wb._weave_name))
960
                    mutter('lines:\n%s', ''.join(lines))
961
                    raise errors.WeaveTextDiffers(name, wa, wb)
1393.1.68 by Martin Pool
- add reweave that joins with ghosts
962
        else:
963
            lines = wb.get_lines(name)
964
        wr.add(name, combined_parents[name], lines)
965
    return wr
966
967
968
def _reweave_parent_graphs(wa, wb):
969
    """Return combined parent ancestry for two weaves.
970
    
971
    Returned as a list of (version_name, set(parent_names))"""
972
    combined = {}
973
    for weave in [wa, wb]:
974
        for idx, name in enumerate(weave._names):
975
            p = combined.setdefault(name, set())
976
            p.update(map(weave.idx_to_name, weave._parents[idx]))
977
    return combined
978
979
1081 by Martin Pool
- if weave tool is invoked with no arguments, show help
980
def weave_toc(w):
981
    """Show the weave's table-of-contents"""
1083 by Martin Pool
- add space to store revision-id in weave files
982
    print '%6s %50s %10s %10s' % ('ver', 'name', 'sha1', 'parents')
983
    for i in (6, 50, 10, 10):
870 by Martin Pool
- better weave info display
984
        print '-' * i,
985
    print
946 by Martin Pool
- weave info only shows the weave headers, doesn't extract every version:
986
    for i in range(w.numversions()):
0.1.91 by Martin Pool
Update Weave.check
987
        sha1 = w._sha1s[i]
1083 by Martin Pool
- add space to store revision-id in weave files
988
        name = w._names[i]
989
        parent_str = ' '.join(map(str, w._parents[i]))
990
        print '%6d %-50.50s %10.10s %s' % (i, name, sha1, parent_str)
0.1.88 by Martin Pool
Add weave info command.
991
869 by Martin Pool
- more weave.py command line options
992
993
1393.1.4 by Martin Pool
- weave tool now finds bzrlib if it's not on the default path
994
def weave_stats(weave_file, pb):
947 by Martin Pool
- new 'weave stats' command
995
    from bzrlib.weavefile import read_weave
996
997
    wf = file(weave_file, 'rb')
998
    w = read_weave(wf)
999
    # FIXME: doesn't work on pipes
1000
    weave_size = wf.tell()
1001
1002
    total = 0
1003
    vers = len(w)
1004
    for i in range(vers):
1005
        pb.update('checking sizes', i, vers)
1393.1.24 by Martin Pool
- slight speedup for weave stats
1006
        for origin, lineno, line in w._extract([i]):
947 by Martin Pool
- new 'weave stats' command
1007
            total += len(line)
1008
1009
    pb.clear()
1010
1011
    print 'versions          %9d' % vers
1012
    print 'weave file        %9d bytes' % weave_size
1013
    print 'total contents    %9d bytes' % total
1014
    print 'compression ratio %9.2fx' % (float(total) / float(weave_size))
974.1.26 by aaron.bentley at utoronto
merged mbp@sourcefrog.net-20050817233101-0939da1cf91f2472
1015
    if vers:
1016
        avg = total/vers
1017
        print 'average size      %9d bytes' % avg
1018
        print 'relative size     %9.2fx' % (float(weave_size) / float(avg))
947 by Martin Pool
- new 'weave stats' command
1019
1020
869 by Martin Pool
- more weave.py command line options
1021
def usage():
871 by Martin Pool
- add command for merge-based weave
1022
    print """bzr weave tool
1023
1024
Experimental tool for weave algorithm.
1025
869 by Martin Pool
- more weave.py command line options
1026
usage:
1027
    weave init WEAVEFILE
1028
        Create an empty weave file
1029
    weave get WEAVEFILE VERSION
1030
        Write out specified version.
1031
    weave check WEAVEFILE
1032
        Check consistency of all versions.
1081 by Martin Pool
- if weave tool is invoked with no arguments, show help
1033
    weave toc WEAVEFILE
869 by Martin Pool
- more weave.py command line options
1034
        Display table of contents.
1084 by Martin Pool
- weave add command needs to take a symbolic name too
1035
    weave add WEAVEFILE NAME [BASE...] < NEWTEXT
869 by Martin Pool
- more weave.py command line options
1036
        Add NEWTEXT, with specified parent versions.
1037
    weave annotate WEAVEFILE VERSION
1038
        Display origin of each line.
1039
    weave mash WEAVEFILE VERSION...
1040
        Display composite of all selected versions.
1041
    weave merge WEAVEFILE VERSION1 VERSION2 > OUT
1042
        Auto-merge two versions and display conflicts.
1329 by Martin Pool
- add diff command to weave utility
1043
    weave diff WEAVEFILE VERSION1 VERSION2 
1044
        Show differences between two versions.
871 by Martin Pool
- add command for merge-based weave
1045
1046
example:
1047
1048
    % weave init foo.weave
1049
    % vi foo.txt
1084 by Martin Pool
- weave add command needs to take a symbolic name too
1050
    % weave add foo.weave ver0 < foo.txt
871 by Martin Pool
- add command for merge-based weave
1051
    added version 0
1052
1053
    (create updated version)
1054
    % vi foo.txt
1055
    % weave get foo.weave 0 | diff -u - foo.txt
1084 by Martin Pool
- weave add command needs to take a symbolic name too
1056
    % weave add foo.weave ver1 0 < foo.txt
871 by Martin Pool
- add command for merge-based weave
1057
    added version 1
1058
1059
    % weave get foo.weave 0 > foo.txt       (create forked version)
1060
    % vi foo.txt
1084 by Martin Pool
- weave add command needs to take a symbolic name too
1061
    % weave add foo.weave ver2 0 < foo.txt
871 by Martin Pool
- add command for merge-based weave
1062
    added version 2
1063
1064
    % weave merge foo.weave 1 2 > foo.txt   (merge them)
1065
    % vi foo.txt                            (resolve conflicts)
1084 by Martin Pool
- weave add command needs to take a symbolic name too
1066
    % weave add foo.weave merged 1 2 < foo.txt     (commit merged version)     
871 by Martin Pool
- add command for merge-based weave
1067
    
869 by Martin Pool
- more weave.py command line options
1068
"""
0.1.88 by Martin Pool
Add weave info command.
1069
    
1070
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
1071
1072
def main(argv):
1073
    import sys
1074
    import os
1393.1.4 by Martin Pool
- weave tool now finds bzrlib if it's not on the default path
1075
    try:
1076
        import bzrlib
1077
    except ImportError:
1078
        # in case we're run directly from the subdirectory
1079
        sys.path.append('..')
1080
        import bzrlib
1336 by Martin Pool
- tidy up import
1081
    from bzrlib.weavefile import write_weave, read_weave
894 by Martin Pool
- small optimization for weave extract
1082
    from bzrlib.progress import ProgressBar
1083
1078 by Martin Pool
- use psyco for weave if possible
1084
    try:
1085
        import psyco
1086
        psyco.full()
1087
    except ImportError:
1088
        pass
894 by Martin Pool
- small optimization for weave extract
1089
1081 by Martin Pool
- if weave tool is invoked with no arguments, show help
1090
    if len(argv) < 2:
1091
        usage()
1092
        return 0
1093
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
1094
    cmd = argv[1]
869 by Martin Pool
- more weave.py command line options
1095
1096
    def readit():
1097
        return read_weave(file(argv[2], 'rb'))
1098
    
1099
    if cmd == 'help':
1100
        usage()
1101
    elif cmd == 'add':
1102
        w = readit()
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
1103
        # at the moment, based on everything in the file
1084 by Martin Pool
- weave add command needs to take a symbolic name too
1104
        name = argv[3]
1105
        parents = map(int, argv[4:])
0.1.72 by Martin Pool
Go back to weave lines normally having newlines at the end.
1106
        lines = sys.stdin.readlines()
1084 by Martin Pool
- weave add command needs to take a symbolic name too
1107
        ver = w.add(name, parents, lines)
869 by Martin Pool
- more weave.py command line options
1108
        write_weave(w, file(argv[2], 'wb'))
1084 by Martin Pool
- weave add command needs to take a symbolic name too
1109
        print 'added version %r %d' % (name, ver)
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
1110
    elif cmd == 'init':
1111
        fn = argv[2]
1112
        if os.path.exists(fn):
1113
            raise IOError("file exists")
1114
        w = Weave()
869 by Martin Pool
- more weave.py command line options
1115
        write_weave(w, file(fn, 'wb'))
1116
    elif cmd == 'get': # get one version
1117
        w = readit()
0.1.94 by Martin Pool
Fix get_iter call
1118
        sys.stdout.writelines(w.get_iter(int(argv[3])))
869 by Martin Pool
- more weave.py command line options
1119
        
1120
    elif cmd == 'mash': # get composite
1121
        w = readit()
1122
        sys.stdout.writelines(w.mash_iter(map(int, argv[3:])))
1123
1329 by Martin Pool
- add diff command to weave utility
1124
    elif cmd == 'diff':
1125
        from difflib import unified_diff
1126
        w = readit()
1127
        fn = argv[2]
1128
        v1, v2 = map(int, argv[3:5])
1129
        lines1 = w.get(v1)
1130
        lines2 = w.get(v2)
1131
        diff_gen = unified_diff(lines1, lines2,
1132
                                '%s version %d' % (fn, v1),
1133
                                '%s version %d' % (fn, v2))
1134
        sys.stdout.writelines(diff_gen)
1135
            
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
1136
    elif cmd == 'annotate':
869 by Martin Pool
- more weave.py command line options
1137
        w = readit()
0.1.72 by Martin Pool
Go back to weave lines normally having newlines at the end.
1138
        # newline is added to all lines regardless; too hard to get
1139
        # reasonable formatting otherwise
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
1140
        lasto = None
1141
        for origin, text in w.annotate(int(argv[3])):
0.1.72 by Martin Pool
Go back to weave lines normally having newlines at the end.
1142
            text = text.rstrip('\r\n')
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
1143
            if origin == lasto:
1144
                print '      | %s' % (text)
1145
            else:
1146
                print '%5d | %s' % (origin, text)
1147
                lasto = origin
871 by Martin Pool
- add command for merge-based weave
1148
                
1081 by Martin Pool
- if weave tool is invoked with no arguments, show help
1149
    elif cmd == 'toc':
1150
        weave_toc(readit())
947 by Martin Pool
- new 'weave stats' command
1151
1152
    elif cmd == 'stats':
1393.1.4 by Martin Pool
- weave tool now finds bzrlib if it's not on the default path
1153
        weave_stats(argv[2], ProgressBar())
871 by Martin Pool
- add command for merge-based weave
1154
        
0.1.91 by Martin Pool
Update Weave.check
1155
    elif cmd == 'check':
869 by Martin Pool
- more weave.py command line options
1156
        w = readit()
894 by Martin Pool
- small optimization for weave extract
1157
        pb = ProgressBar()
1158
        w.check(pb)
1159
        pb.clear()
938 by Martin Pool
- various optimizations to weave add code
1160
        print '%d versions ok' % w.numversions()
871 by Martin Pool
- add command for merge-based weave
1161
892 by Martin Pool
- weave stores only direct parents, and calculates and memoizes expansion as needed
1162
    elif cmd == 'inclusions':
1163
        w = readit()
1164
        print ' '.join(map(str, w.inclusions([int(argv[3])])))
1165
1166
    elif cmd == 'parents':
1167
        w = readit()
944 by Martin Pool
- refactor member names in Weave code
1168
        print ' '.join(map(str, w._parents[int(argv[3])]))
892 by Martin Pool
- weave stores only direct parents, and calculates and memoizes expansion as needed
1169
918 by Martin Pool
- start doing new weave-merge algorithm
1170
    elif cmd == 'plan-merge':
1171
        w = readit()
1172
        for state, line in w.plan_merge(int(argv[3]), int(argv[4])):
919 by Martin Pool
- more development of weave-merge
1173
            if line:
1174
                print '%14s | %s' % (state, line),
918 by Martin Pool
- start doing new weave-merge algorithm
1175
871 by Martin Pool
- add command for merge-based weave
1176
    elif cmd == 'merge':
919 by Martin Pool
- more development of weave-merge
1177
        w = readit()
1178
        p = w.plan_merge(int(argv[3]), int(argv[4]))
1179
        sys.stdout.writelines(w.weave_merge(p))
1180
            
1181
    elif cmd == 'mash-merge':
871 by Martin Pool
- add command for merge-based weave
1182
        if len(argv) != 5:
1183
            usage()
1184
            return 1
1185
1186
        w = readit()
1187
        v1, v2 = map(int, argv[3:5])
1188
1189
        basis = w.inclusions([v1]).intersection(w.inclusions([v2]))
1190
1191
        base_lines = list(w.mash_iter(basis))
1192
        a_lines = list(w.get(v1))
1193
        b_lines = list(w.get(v2))
1194
1195
        from bzrlib.merge3 import Merge3
1196
        m3 = Merge3(base_lines, a_lines, b_lines)
1197
1198
        name_a = 'version %d' % v1
1199
        name_b = 'version %d' % v2
1200
        sys.stdout.writelines(m3.merge_lines(name_a=name_a, name_b=name_b))
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
1201
    else:
1202
        raise ValueError('unknown command %r' % cmd)
1203
    
1204
1076 by Martin Pool
- add code to run weave utility under profiler
1205
1206
def profile_main(argv): 
1207
    import tempfile, hotshot, hotshot.stats
1208
1209
    prof_f = tempfile.NamedTemporaryFile()
1210
1211
    prof = hotshot.Profile(prof_f.name)
1212
1213
    ret = prof.runcall(main, argv)
1214
    prof.close()
1215
1216
    stats = hotshot.stats.load(prof_f.name)
1217
    #stats.strip_dirs()
1079 by Martin Pool
- weavefile can just use lists for read-in ancestry, not frozensets
1218
    stats.sort_stats('cumulative')
1076 by Martin Pool
- add code to run weave utility under profiler
1219
    ## XXX: Might like to write to stderr or the trace file instead but
1220
    ## print_stats seems hardcoded to stdout
1221
    stats.print_stats(20)
1222
            
1223
    return ret
1224
1225
1185.33.85 by Martin Pool
New --lsprof option from Denys Duchier
1226
def lsprofile_main(argv): 
1227
    from bzrlib.lsprof import profile
1228
    ret,stats = profile(main, argv)
1229
    stats.sort()
1230
    stats.pprint()
1231
    return ret
1232
1233
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
1234
if __name__ == '__main__':
1235
    import sys
1081 by Martin Pool
- if weave tool is invoked with no arguments, show help
1236
    if '--profile' in sys.argv:
1078 by Martin Pool
- use psyco for weave if possible
1237
        args = sys.argv[:]
1081 by Martin Pool
- if weave tool is invoked with no arguments, show help
1238
        args.remove('--profile')
1078 by Martin Pool
- use psyco for weave if possible
1239
        sys.exit(profile_main(args))
1185.33.85 by Martin Pool
New --lsprof option from Denys Duchier
1240
    elif '--lsprof' in sys.argv:
1241
        args = sys.argv[:]
1242
        args.remove('--lsprof')
1243
        sys.exit(lsprofile_main(args))
1078 by Martin Pool
- use psyco for weave if possible
1244
    else:
1245
        sys.exit(main(sys.argv))
1076 by Martin Pool
- add code to run weave utility under profiler
1246