lines by NL. The field delimiters are omitted in the grammar, line delimiters
are not - this is done for clarity of reading. All string data is in utf8.

::

    MINIKIND = "f" | "d" | "l" | "a" | "r" | "t";
    WHOLE_NUMBER = {digit}, digit;
    REVISION_ID = a non-empty utf8 string;

    dirstate format = header line, full checksum, row count, parent details,
        ghost_details, entries;
    header line = "#bazaar dirstate flat format 3", NL;
    full checksum = "crc32: ", ["-"], WHOLE_NUMBER, NL;
    row count = "num_entries: ", WHOLE_NUMBER, NL;
    parent_details = WHOLE_NUMBER, {REVISION_ID}*, NL;
    ghost_details = WHOLE_NUMBER, {REVISION_ID}*, NL;

    entry = entry_key, current_entry_details, {parent_entry_details};
    entry_key = dirname, basename, fileid;
    current_entry_details = common_entry_details, working_entry_details;
    parent_entry_details = common_entry_details, history_entry_details;
    common_entry_details = MINIKIND, fingerprint, size, executable;
    working_entry_details = packed_stat;
    history_entry_details = REVISION_ID;

    fingerprint = a nonempty utf8 sequence with meaning defined by minikind.

Given this definition, the following is useful to know::

    entry (aka row) - all the data for a given key.
    entry[0]: The key (dirname, basename, fileid)
    entry[1]: The tree(s) data for this path and id combination.
    entry[1][0]: The current tree
    entry[1][1]: The second tree

For an entry for a tree, we have (using tree 0 - current tree) to demonstrate::

    entry[1][0][0]: minikind
    entry[1][0][1]: fingerprint
    entry[1][0][2]: size
    entry[1][0][3]: executable
    entry[1][0][4]: packed_stat

OR (for non tree-0)::

    entry[1][1][4]: revision_id
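The nested layout above can be sketched directly in Python. This is an illustrative model only - the values below are hypothetical, not drawn from a real dirstate file:

```python
# A hypothetical dirstate entry for 'src/foo.py' (all values invented).
entry = (
    # entry[0]: the key (dirname, basename, fileid)
    ('src', 'foo.py', 'foo-file-id'),
    # entry[1]: per-tree details
    [
        # entry[1][0]: current tree -
        # (minikind, fingerprint, size, executable, packed_stat)
        ('f', 'aaaa1111', 42, False, 'AAAABBBBCCCC'),
        # entry[1][1]: second tree - slot 4 holds a revision_id instead
        ('f', 'aaaa1111', 42, False, 'rev-id-1'),
    ],
)

print(entry[0][1])     # basename: 'foo.py'
print(entry[1][0][0])  # minikind in the current tree: 'f'
print(entry[1][1][4])  # revision_id in the second tree: 'rev-id-1'
```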

There may be multiple rows at the root, one per id present in the root, so the
in memory root row is now::

    self._dirblocks[0] -> ('', [entry ...]),

and the entries in there are::

    entries[0][2]: file_id
    entries[1][0]: The tree data for the current tree for this fileid at /

'r' is a relocated entry: This path is not present in this tree with this
    id, but the id can be found at another location. The fingerprint is
    used to point to the target location.
'a' is an absent entry: In that tree the id is not present at this path.
'd' is a directory entry: This path in this tree is a directory with the
    current file id. There is no fingerprint for directories.
'f' is a file entry: As for directory, but it's a file. The fingerprint is
    the sha1 value of the file's canonical form, i.e. after any read
    filters have been applied to the convenience form stored in the working
    tree.
'l' is a symlink entry: As for directory, but a symlink. The fingerprint is
    the link target.
't' is a reference to a nested subtree; the fingerprint is the referenced
    revision.

The entries on disk and in memory are ordered according to the following keys::

    directory, as a list of components
    filename
    file-id

--- Format 1 had the following different definition: ---

::

    rows = dirname, NULL, basename, NULL, MINIKIND, NULL, fileid_utf8, NULL,
        WHOLE_NUMBER (* size *), NULL, packed stat, NULL, sha1|symlink target,
        {PARENT ROW}
    PARENT ROW = NULL, revision_utf8, NULL, MINIKIND, NULL, dirname, NULL,
        basename, NULL, WHOLE_NUMBER (* size *), NULL, "y" | "n", NULL,
        SHA1

PARENT ROWs are emitted for every parent that is not in the ghosts details
line. That is, if the parents are foo, bar, baz, and the ghosts are bar, then
each row will have a PARENT ROW for foo and baz, but not for bar.
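The ordering rule above (directory as a list of components, then filename) differs from plain lexicographic order on the full path string. A minimal sketch, using a hypothetical `dirstate_sort_key` helper:

```python
def dirstate_sort_key(path):
    """Hypothetical helper: sort by the directory as a list of
    components, then by the filename."""
    dirname, _, basename = path.rpartition('/')
    components = dirname.split('/') if dirname else []
    return (components, basename)

paths = ['a/b', 'a~b', 'a/b/c', 'a']
# A plain string sort would place 'a~b' after 'a/b/c' ('~' > '/');
# component order keeps all root entries before directory contents.
print(sorted(paths, key=dirstate_sort_key))
```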

# return '%X.%X' % (int(st.st_mtime), st.st_mode)


def _unpack_stat(packed_stat):
    """Turn a packed_stat back into the stat fields.

    This is meant as a debugging tool, should not be used in real code.
    """
    (st_size, st_mtime, st_ctime, st_dev, st_ino,
     st_mode) = struct.unpack('>LLLLLL', binascii.a2b_base64(packed_stat))
    return dict(st_size=st_size, st_mtime=st_mtime, st_ctime=st_ctime,
        st_dev=st_dev, st_ino=st_ino, st_mode=st_mode)
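A counterpart packer for the `'>LLLLLL'` layout that `_unpack_stat` decodes can be sketched as follows. This is a simplified stand-in for the real `pack_stat` (which masks and coerces live `os.stat` fields), shown here only to demonstrate the round trip:

```python
import binascii
import struct

def pack_stat_values(st_size, st_mtime, st_ctime, st_dev, st_ino, st_mode):
    """Hypothetical packer for the '>LLLLLL' layout _unpack_stat decodes:
    six big-endian 32-bit fields, base64 encoded."""
    packed = struct.pack('>LLLLLL', st_size, st_mtime, st_ctime,
                         st_dev, st_ino, st_mode)
    # b2a_base64 appends a newline; drop it.
    return binascii.b2a_base64(packed)[:-1]

def unpack_stat(packed_stat):
    (st_size, st_mtime, st_ctime, st_dev, st_ino,
     st_mode) = struct.unpack('>LLLLLL', binascii.a2b_base64(packed_stat))
    return dict(st_size=st_size, st_mtime=st_mtime, st_ctime=st_ctime,
                st_dev=st_dev, st_ino=st_ino, st_mode=st_mode)

packed = pack_stat_values(42, 1000, 2000, 3, 4, 0o100644)
print(unpack_stat(packed))
```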


class SHA1Provider(object):
    """An interface for getting sha1s of a file."""

    def sha1(self, abspath):
        """Return the sha1 of a file given its absolute path.

        :param abspath: May be a filesystem encoded absolute path.
        """
        raise NotImplementedError(self.sha1)

    def stat_and_sha1(self, abspath):
        """Return the stat and sha1 of a file given its absolute path.

        :param abspath: May be a filesystem encoded absolute path.

        Note: the stat should be the stat of the physical file
        while the sha may be the sha of its canonical content.
        """
        raise NotImplementedError(self.stat_and_sha1)


class DefaultSHA1Provider(SHA1Provider):
    """A SHA1Provider that reads directly from the filesystem."""

    def sha1(self, abspath):
        """Return the sha1 of a file given its absolute path."""
        return osutils.sha_file_by_name(abspath)

    def stat_and_sha1(self, abspath):
        """Return the stat and sha1 of a file given its absolute path."""
        file_obj = file(abspath, 'rb')
        try:
            statvalue = os.fstat(file_obj.fileno())
            sha1 = osutils.sha_file(file_obj)
        finally:
            file_obj.close()
        return statvalue, sha1
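For illustration, the same interface can be satisfied with only the standard library; `HashlibSHA1Provider` below is a hypothetical stand-in for `DefaultSHA1Provider` that avoids the `osutils` helpers:

```python
import hashlib
import os
import tempfile

class HashlibSHA1Provider:
    """Hypothetical SHA1Provider built on hashlib alone."""

    def sha1(self, abspath):
        """Return the hex sha1 of the file at abspath."""
        digest = hashlib.sha1()
        with open(abspath, 'rb') as f:
            for chunk in iter(lambda: f.read(1 << 16), b''):
                digest.update(chunk)
        return digest.hexdigest()

    def stat_and_sha1(self, abspath):
        """Return (stat_result, hex sha1) for the file at abspath."""
        with open(abspath, 'rb') as f:
            statvalue = os.fstat(f.fileno())
            digest = hashlib.sha1()
            for chunk in iter(lambda: f.read(1 << 16), b''):
                digest.update(chunk)
        return statvalue, digest.hexdigest()

# Demonstrate on a throwaway file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b'hello')
    name = tmp.name
provider = HashlibSHA1Provider()
digest = provider.sha1(name)
statvalue, digest2 = provider.stat_and_sha1(name)
os.remove(name)
print(digest)
```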


class DirState(object):
    """Record directory and metadata state for fast access."""

    def __init__(self, path, sha1_provider, worth_saving_limit=0):
        self._cutoff_time = None
        self._split_path_cache = {}
        self._bisect_page_size = DirState.BISECT_PAGE_SIZE
        self._sha1_provider = sha1_provider
        if 'hashcache' in debug.debug_flags:
            self._sha1_file = self._sha1_file_and_mutter
        else:
            self._sha1_file = self._sha1_provider.sha1
        # These two attributes provide a simple cache for lookups into the
        # dirstate in-memory vectors. By probing respectively for the last
        # block, and for the next entry, we save nearly 2 bisections per path
        # during commit.
        self._last_block_index = None
        self._last_entry_index = None
        # The set of known hash changes
        self._known_hash_changes = set()
        # How many hash changed entries can we have without saving
        self._worth_saving_limit = worth_saving_limit
        self._config_stack = config.LocationStack(urlutils.local_path_to_url(
            path))

    def __repr__(self):
        return "%s(%r)" % \
            (self.__class__.__name__, self._filename)

    def _mark_modified(self, hash_changed_entries=None, header_modified=False):
        """Mark this dirstate as modified.

        :param hash_changed_entries: if non-None, mark just these entries as
            having their hash modified.
        :param header_modified: mark the header modified as well, not just the
            dirblocks.
        """
        #trace.mutter_callsite(3, "modified hash entries: %s", hash_changed_entries)
        if hash_changed_entries:
            self._known_hash_changes.update([e[0] for e in hash_changed_entries])
            if self._dirblock_state in (DirState.NOT_IN_MEMORY,
                                        DirState.IN_MEMORY_UNMODIFIED):
                # If the dirstate is already marked as IN_MEMORY_MODIFIED, then
                # that takes precedence.
                self._dirblock_state = DirState.IN_MEMORY_HASH_MODIFIED
        else:
            # TODO: Since we now have a IN_MEMORY_HASH_MODIFIED state, we
            #       should fail noisily if someone tries to set
            #       IN_MEMORY_MODIFIED but we don't have a write-lock!
            # We don't know exactly what changed so disable smart saving
            self._dirblock_state = DirState.IN_MEMORY_MODIFIED
        if header_modified:
            self._header_state = DirState.IN_MEMORY_MODIFIED

    def _mark_unmodified(self):
        """Mark this dirstate as unmodified."""
        self._header_state = DirState.IN_MEMORY_UNMODIFIED
        self._dirblock_state = DirState.IN_MEMORY_UNMODIFIED
        self._known_hash_changes = set()
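The state transitions `_mark_modified` applies to `_dirblock_state` can be sketched in isolation. The names below mirror the `DirState` constants, but the function is a hypothetical model, not the real method:

```python
# Hypothetical stand-ins for the DirState state constants.
NOT_IN_MEMORY = 'not-in-memory'
IN_MEMORY_UNMODIFIED = 'in-memory-unmodified'
IN_MEMORY_HASH_MODIFIED = 'in-memory-hash-modified'
IN_MEMORY_MODIFIED = 'in-memory-modified'

def next_dirblock_state(current, hash_changed_only):
    """Model of the transition: hash-only changes upgrade an unmodified
    state to HASH_MODIFIED; an existing full MODIFIED state takes
    precedence, and any non-hash change disables smart saving entirely."""
    if hash_changed_only:
        if current in (NOT_IN_MEMORY, IN_MEMORY_UNMODIFIED):
            return IN_MEMORY_HASH_MODIFIED
        return current
    return IN_MEMORY_MODIFIED

print(next_dirblock_state(IN_MEMORY_UNMODIFIED, True))
```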

    def add(self, path, file_id, kind, stat, fingerprint):
        """Add a path to be tracked.

        :param path: The path within the dirstate - '' is the root, 'foo' is the
            path foo within the root, 'foo/bar' is the path bar within foo
            within the root.
        :param file_id: The file id of the path being added.
        :param kind: The kind of the path, as a string like 'file',
            'directory', etc.
        :param stat: The output of os.lstat for the path.
        :param fingerprint: The sha value of the file's canonical form (i.e.
            after any read filters have been applied),
            or the target of a symlink,
            or the referenced revision id for tree-references,
            or '' for directories.
        """
        # find the block its in.
        # find the location in the block.
        # check its not there
        # add it.
        dirname, basename = osutils.split(path)
        utf8path = (dirname + '/' + basename).strip('/')
        if not isinstance(file_id, str):
            raise AssertionError(
                "must be a utf8 file_id not %s" % (type(file_id), ))
        # Make sure the file_id does not exist in this tree
        rename_from = None
        file_id_entry = self._get_entry(0, fileid_utf8=file_id, include_deleted=True)
        if file_id_entry != (None, None):
            if file_id_entry[1][0][0] == 'a':
                if file_id_entry[0] != (dirname, basename, file_id):
                    # set the old name's current operation to rename
                    self.update_minimal(file_id_entry[0],
                        'r',
                        path_utf8='',
                        packed_stat='',
                        fingerprint=utf8path
                    )
                    rename_from = file_id_entry[0][0:2]
            else:
                path = osutils.pathjoin(file_id_entry[0][0], file_id_entry[0][1])
                kind = DirState._minikind_to_kind[file_id_entry[1][0][0]]
                info = '%s:%s' % (kind, path)
                raise errors.DuplicateFileId(file_id, info)
        first_key = (dirname, basename, '')
        block_index, present = self._find_block_index_from_key(first_key)
        if present:
            # check the path is not in the tree
            block = self._dirblocks[block_index][1]
            entry_index, _ = self._find_entry_index(first_key, block)
            while (entry_index < len(block) and
                block[entry_index][0][0:2] == first_key[0:2]):
                if block[entry_index][1][0][0] not in 'ar':
                    # this path is in the dirstate in the current tree.

    def _check_delta_is_valid(self, delta):
        return list(inventory._check_delta_unique_ids(
                    inventory._check_delta_unique_old_paths(
                    inventory._check_delta_unique_new_paths(
                    inventory._check_delta_ids_match_entry(
                    inventory._check_delta_ids_are_valid(
                    inventory._check_delta_new_path_entry_both_or_None(delta)))))))
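The validators chained in `_check_delta_is_valid` are generators that re-yield delta items and raise on a violation; draining the outermost one runs every check. A self-contained sketch with two hypothetical checkers:

```python
def check_unique_ids(delta):
    """Re-yield delta items, raising on a repeated file_id (a simplified
    analogue of inventory._check_delta_unique_ids)."""
    seen = set()
    for item in delta:
        old_path, new_path, file_id, entry = item
        if file_id in seen:
            raise ValueError('repeated file_id %r' % file_id)
        seen.add(file_id)
        yield item

def check_new_path_has_entry(delta):
    """Re-yield delta items, raising when new_path is set with no entry."""
    for item in delta:
        old_path, new_path, file_id, entry = item
        if new_path is not None and entry is None:
            raise ValueError('new_path with no entry: %r' % (new_path,))
        yield item

def check_delta(delta):
    # list() drains the generator pipeline so every check actually runs.
    return list(check_unique_ids(check_new_path_has_entry(delta)))

good = [('a', 'b', 'id-1', 'entry-1'), (None, 'c', 'id-2', 'entry-2')]
print(check_delta(good))
```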

    def update_by_delta(self, delta):
        """Apply an inventory delta to the dirstate for tree 0

        This is the workhorse for apply_inventory_delta in dirstate based
        trees.

        :param delta: An inventory delta. See Inventory.apply_delta for
            details.
        """
        self._read_dirblocks_if_needed()
        encode = cache_utf8.encode
        insertions = {}
        removals = {}
        # Accumulate parent references (path_utf8, id), to check for parentless
        # items or items placed under files/links/tree-references. We get
        # references from every item in the delta that is not a deletion and
        # is not itself the root.
        parents = set()
        # Added ids must not be in the dirstate already. This set holds those
        # ids.
        new_ids = set()
        # This loop transforms the delta to single atomic operations that can
        # be executed and validated.
        delta = sorted(self._check_delta_is_valid(delta), reverse=True)
        for old_path, new_path, file_id, inv_entry in delta:
            if (file_id in insertions) or (file_id in removals):
                self._raise_invalid(old_path or new_path, file_id,
                    "repeated file_id")
            if old_path is not None:
                old_path = old_path.encode('utf-8')
                removals[file_id] = old_path
            else:
                new_ids.add(file_id)
            if new_path is not None:
                if inv_entry is None:
                    self._raise_invalid(new_path, file_id,
                        "new_path with no entry")
                new_path = new_path.encode('utf-8')
                dirname_utf8, basename = osutils.split(new_path)
                if basename:
                    parents.add((dirname_utf8, inv_entry.parent_id))
                key = (dirname_utf8, basename, file_id)
                minikind = DirState._kind_to_minikind[inv_entry.kind]
                if minikind == 't':
                    fingerprint = inv_entry.reference_revision or ''
                else:
                    fingerprint = ''
                insertions[file_id] = (key, minikind, inv_entry.executable,
                                       fingerprint, new_path)
            # Transform moves into delete+add pairs
            if None not in (old_path, new_path):
                for child in self._iter_child_entries(0, old_path):
                    if child[0][2] in insertions or child[0][2] in removals:
                        continue
                    child_dirname = child[0][0]
                    child_basename = child[0][1]
                    minikind = child[1][0][0]
                    fingerprint = child[1][0][4]
                    executable = child[1][0][3]
                    old_child_path = osutils.pathjoin(child_dirname,
                                                      child_basename)
                    removals[child[0][2]] = old_child_path
                    child_suffix = child_dirname[len(old_path):]
                    new_child_dirname = (new_path + child_suffix)
                    key = (new_child_dirname, child_basename, child[0][2])
                    new_child_path = osutils.pathjoin(new_child_dirname,
                                                      child_basename)
                    insertions[child[0][2]] = (key, minikind, executable,
                                               fingerprint, new_child_path)
        self._check_delta_ids_absent(new_ids, delta, 0)
        try:
            self._apply_removals(removals.iteritems())
            self._apply_insertions(insertions.values())
            # Validate parents
            self._after_delta_check_parents(parents, 0)
        except errors.BzrError, e:
            self._changes_aborted = True
            if 'integrity error' not in str(e):
                raise
            # _get_entry raises BzrError when a request is inconsistent; we
            # want such errors to be shown as InconsistentDelta - and that
            # fits the behaviour we trigger.
            raise errors.InconsistentDeltaDelta(delta,
                "error from _get_entry. %s" % (e,))

    def _apply_removals(self, removals):
        for file_id, path in sorted(removals, reverse=True,
            key=operator.itemgetter(1)):
            dirname, basename = osutils.split(path)
            block_i, entry_i, d_present, f_present = \
                self._get_block_entry_index(dirname, basename, 0)
            try:
                entry = self._dirblocks[block_i][1][entry_i]
            except IndexError:
                self._raise_invalid(path, file_id,
                    "Wrong path for old path.")
            if not f_present or entry[1][0][0] in 'ar':
                self._raise_invalid(path, file_id,
                    "Wrong path for old path.")
            if file_id != entry[0][2]:
                self._raise_invalid(path, file_id,
                    "Attempt to remove path has wrong id - found %r."
                    % entry[0][2])
            self._make_absent(entry)
            # See if we have a malformed delta: deleting a directory must not
            # leave crud behind. This increases the number of bisects needed
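Sorting removals in reverse path order, as `_apply_removals` does, guarantees that the contents of a directory are processed before the directory itself. A minimal sketch:

```python
import operator

# Hypothetical removals mapping: file_id -> utf8 path.
removals = {'dir-id': 'a', 'file-id': 'a/b', 'deep-id': 'a/b/c'}
# Reverse path order handles 'a/b/c' before 'a/b' before 'a', so a
# directory is only removed once its contents are gone.
ordered = sorted(removals.items(), reverse=True,
                 key=operator.itemgetter(1))
print([path for _, path in ordered])
```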

        # At the same time, to reduce interface friction we convert the input
        # inventory entries to dirstate.
        root_only = ('', '')
        # Accumulate parent references (path_utf8, id), to check for parentless
        # items or items placed under files/links/tree-references. We get
        # references from every item in the delta that is not a deletion and
        # is not itself the root.
        parents = set()
        # Added ids must not be in the dirstate already. This set holds those
        # ids.
        new_ids = set()
        for old_path, new_path, file_id, inv_entry in delta:
            if inv_entry is not None and file_id != inv_entry.file_id:
                self._raise_invalid(new_path, file_id,
                    "mismatched entry file_id %r" % inv_entry)
            if new_path is None:
                new_path_utf8 = None
            else:
                if inv_entry is None:
                    self._raise_invalid(new_path, file_id,
                        "new_path with no entry")
                new_path_utf8 = encode(new_path)
                # note the parent for validation
                dirname_utf8, basename_utf8 = osutils.split(new_path_utf8)
                if basename_utf8:
                    parents.add((dirname_utf8, inv_entry.parent_id))
            if old_path is None:
                old_path_utf8 = None
            else:
                old_path_utf8 = encode(old_path)
            if old_path is None:
                adds.append((None, new_path_utf8, file_id,
                    inv_to_entry(inv_entry), True))
                new_ids.add(file_id)
            elif new_path is None:
                deletes.append((old_path_utf8, None, file_id, None, True))
            elif (old_path, new_path) == root_only:
                # change things in-place
                # Note: the case of a parent directory changing its file_id
                #       tends to break optimizations here, because officially
                #       the file has actually been moved, it just happens to
                #       end up at the same path. If we can figure out how to
                #       handle that case, we can avoid a lot of add+delete
                #       pairs for objects that stay put.
                # elif old_path == new_path:
                changes.append((old_path_utf8, new_path_utf8, file_id,
                    inv_to_entry(inv_entry)))
            else:
                # Renames.
                # Because renames must preserve their children we must have
                # processed all relocations and removes before hand; deletes
                # are therefore applied first, checking the dirstate
                # for 'r' items on every pass.
                self._update_basis_apply_deletes(deletes)
                deletes = []
                # Split into an add/delete pair recursively.
                adds.append((old_path_utf8, new_path_utf8, file_id,
                    inv_to_entry(inv_entry), False))
                # Expunge deletes that we've seen so that deleted/renamed
                # children of a rename directory are handled correctly.
                new_deletes = reversed(list(
                    self._iter_child_entries(1, old_path_utf8)))
                # Remove the current contents of the tree at orig_path, and
                # reinsert at the correct new path.
                for entry in new_deletes:
                    child_dirname, child_basename, child_file_id = entry[0]
                    if child_dirname:
                        source_path = child_dirname + '/' + child_basename
                    else:
                        source_path = child_basename
                    if new_path_utf8:
                        target_path = \
                            new_path_utf8 + source_path[len(old_path):]
                    else:
                        if old_path == '':
                            raise AssertionError("cannot rename directory to"
                                " itself")
                        target_path = source_path[len(old_path) + 1:]
                    adds.append((None, target_path, entry[0][2], entry[1][1], False))
                    deletes.append(
                        (source_path, target_path, entry[0][2], None, False))
                deletes.append((old_path_utf8, new_path, file_id, None, False))

        self._check_delta_ids_absent(new_ids, delta, 1)
        try:
            # Finish expunging deletes/first half of renames.
            self._update_basis_apply_deletes(deletes)
            # Reinstate second half of renames and new paths.
            self._update_basis_apply_adds(adds)
            # Apply in-situ changes.
            self._update_basis_apply_changes(changes)
            # Validate parents
            self._after_delta_check_parents(parents, 1)
        except errors.BzrError, e:
            self._changes_aborted = True
            if 'integrity error' not in str(e):
                raise
            # _get_entry raises BzrError when a request is inconsistent; we
            # want such errors to be shown as InconsistentDelta - and that
            # fits the behaviour we trigger.
            raise errors.InconsistentDeltaDelta(delta,
                "error from _get_entry. %s" % (e,))
        self._mark_modified(header_modified=True)
        self._id_index = None

    def _check_delta_ids_absent(self, new_ids, delta, tree_index):
        """Check that none of the file_ids in new_ids are present in a tree."""
        if not new_ids:
            return
        id_index = self._get_id_index()
        for file_id in new_ids:
            for key in id_index.get(file_id, ()):
                block_i, entry_i, d_present, f_present = \
                    self._get_block_entry_index(key[0], key[1], tree_index)
                if not f_present:
                    # In a different tree
                    continue
                entry = self._dirblocks[block_i][1][entry_i]
                if entry[0][2] != file_id:
                    # Different file_id, so not what we want.
                    continue
                self._raise_invalid(("%s/%s" % key[0:2]).decode('utf8'), file_id,
                    "This file_id is new in the delta but already present in "
                    "(at least) one tree")
def _raise_invalid(self, path, file_id, reason):
1663
self._changes_aborted = True
1664
raise errors.InconsistentDelta(path, file_id, reason)

    def _update_basis_apply_adds(self, adds):
        """Apply a sequence of adds to tree 1 during update_basis_by_delta.

        :param adds: A sequence of adds. Each add is a tuple:
            (None, new_path_utf8, file_id, (entry details), real_add). real_add
            is False when the add is the second half of a remove-and-reinsert
            pair created to handle renames and deletes.
        """
        # Adds are accumulated partly from renames, so can be in any input
        # order - sort it.
        # TODO: we may want to sort in dirblocks order. That way each entry
        #       will end up in the same directory, allowing the _get_entry
        #       fast-path for looking up 2 items in the same dir work.
        adds.sort(key=lambda x: x[1])
        # adds is now in lexographic order, which places all parents before
        # their children, so we can process it linearly.
        st = static_tuple.StaticTuple
        for old_path, new_path, file_id, new_details, real_add in adds:
            dirname, basename = osutils.split(new_path)
            entry_key = st(dirname, basename, file_id)
            block_index, present = self._find_block_index_from_key(entry_key)
            if not present:
                self._raise_invalid(new_path, file_id,
                    "Unable to find block for this record."
                    " Was the parent added?")
            block = self._dirblocks[block_index][1]
            entry_index, present = self._find_entry_index(entry_key, block)
            if real_add:
                if old_path is not None:
                    self._raise_invalid(new_path, file_id,
                        'considered a real add but still had old_path at %s'
                        % (old_path,))
            if present:
                entry = block[entry_index]
                basis_kind = entry[1][1][0]
                if basis_kind == 'a':
                    entry[1][1] = new_details
                elif basis_kind == 'r':
                    raise NotImplementedError()
                else:
                    self._raise_invalid(new_path, file_id,
                        "An entry was marked as a new add"
                        " but the basis target already existed")
            else:
                # The exact key was not found in the block. However, we need to
                # check if there is a key next to us that would have matched.
                # We only need to check the two neighbouring locations.
                for maybe_index in range(entry_index-1, entry_index+1):
                    if maybe_index < 0 or maybe_index >= len(block):
                        continue
                    maybe_entry = block[maybe_index]
                    if maybe_entry[0][:2] != (dirname, basename):
                        # Just a random neighbor
                        continue
                    if maybe_entry[0][2] == file_id:
                        raise AssertionError(
                            '_find_entry_index didnt find a key match'
                            ' but walking the data did, for %s'
                            % (entry_key,))
                    basis_kind = maybe_entry[1][1][0]
                    if basis_kind not in 'ar':
                        self._raise_invalid(new_path, file_id,
                            "we have an add record for path, but the path"
                            " is already present with another file_id %s"
                            % (maybe_entry[0][2],))
                entry = (entry_key, [DirState.NULL_PARENT_DETAILS,
                                     new_details])
                block.insert(entry_index, entry)
            active_kind = entry[1][0][0]
            if active_kind == 'a':
                # The active record shows up as absent, this could be genuine,
                # or it could be present at some other location. We need to
                # verify.
                id_index = self._get_id_index()
                # The id_index may not be perfectly accurate for tree1, because
                # we haven't been keeping it updated. However, it should be
                # fine for tree0, and that gives us enough info for what we
                # need.
                keys = id_index.get(file_id, ())
                for key in keys:
                    block_i, entry_i, d_present, f_present = \
                        self._get_block_entry_index(key[0], key[1], 0)
                    if not f_present:
                        continue
                    active_entry = self._dirblocks[block_i][1][entry_i]
                    if (active_entry[0][2] != file_id):
                        # Some other file is at this path, we don't need to
                        # link it.
                        continue
                    real_active_kind = active_entry[1][0][0]
                    if real_active_kind in 'ar':
                        # We found a record, which was not *this* record,
                        # which matches the file_id, but is not actually
                        # present. Something seems *really* wrong.
                        self._raise_invalid(new_path, file_id,
                            "We found a tree0 entry that doesnt make sense")
                    # Now, we've found a tree0 entry which matches the file_id
                    # but is at a different location. So update them to be
                    # rename records.
                    active_dir, active_name = active_entry[0][:2]
                    if active_dir:
                        active_path = active_dir + '/' + active_name
                    else:
                        active_path = active_name
                    active_entry[1][1] = st('r', new_path, 0, False, '')
                    entry[1][0] = st('r', active_path, 0, False, '')
            elif active_kind == 'r':
                raise NotImplementedError()
            new_kind = new_details[0]
            if new_kind == 'd':
                self._ensure_block(block_index, entry_index, new_path)
1786
def _update_basis_apply_changes(self, changes):
1437
1787
"""Apply a sequence of changes to tree 1 during update_basis_by_delta.
1469
1813
null = DirState.NULL_PARENT_DETAILS
1470
1814
for old_path, new_path, file_id, _, real_delete in deletes:
1471
1815
if real_delete != (new_path is None):
1472
raise AssertionError("bad delete delta")
1816
self._raise_invalid(old_path, file_id, "bad delete delta")
1473
1817
# the entry for this file_id must be in tree 1.
1474
1818
dirname, basename = osutils.split(old_path)
1475
1819
block_index, entry_index, dir_present, file_present = \
1476
1820
self._get_block_entry_index(dirname, basename, 1)
1477
1821
if not file_present:
1478
self._changes_aborted = True
1479
raise errors.InconsistentDelta(old_path, file_id,
1822
self._raise_invalid(old_path, file_id,
1480
1823
'basis tree does not contain removed entry')
1481
1824
entry = self._dirblocks[block_index][1][entry_index]
1825
# The state of the entry in the 'active' WT
1826
active_kind = entry[1][0][0]
1482
1827
if entry[0][2] != file_id:
1483
self._changes_aborted = True
1484
raise errors.InconsistentDelta(old_path, file_id,
1828
self._raise_invalid(old_path, file_id,
1485
1829
'mismatched file_id in tree 1')
1487
if entry[1][0][0] != 'a':
1488
self._changes_aborted = True
1489
raise errors.InconsistentDelta(old_path, file_id,
1490
'This was marked as a real delete, but the WT state'
1491
' claims that it still exists and is versioned.')
1831
old_kind = entry[1][1][0]
1832
if active_kind in 'ar':
1833
# The active tree doesn't have this file_id.
1834
# The basis tree is changing this record. If this is a
1835
# rename, then we don't want the record here at all
1836
# anymore. If it is just an in-place change, we want the
1837
# record here, but we'll add it if we need to. So we just
1839
if active_kind == 'r':
1840
active_path = entry[1][0][1]
1841
active_entry = self._get_entry(0, file_id, active_path)
1842
if active_entry[1][1][0] != 'r':
1843
self._raise_invalid(old_path, file_id,
1844
"Dirstate did not have matching rename entries")
1845
elif active_entry[1][0][0] in 'ar':
1846
self._raise_invalid(old_path, file_id,
1847
"Dirstate had a rename pointing at an inactive"
1849
active_entry[1][1] = null
1492
1850
del self._dirblocks[block_index][1][entry_index]
1852
# This was a directory, and the active tree says it
1853
# doesn't exist, and now the basis tree says it doesn't
1854
# exist. Remove its dirblock if present
1856
present) = self._find_block_index_from_key(
1859
dir_block = self._dirblocks[dir_block_index][1]
1861
# This entry is empty, go ahead and just remove it
1862
del self._dirblocks[dir_block_index]
1494
if entry[1][0][0] == 'a':
1495
self._changes_aborted = True
1496
raise errors.InconsistentDelta(old_path, file_id,
1497
'The entry was considered a rename, but the source path'
1498
' is marked as absent.')
1499
# For whatever reason, we were asked to rename an entry
1500
# that was originally marked as deleted. This could be
1501
# because we are renaming the parent directory, and the WT
1502
# current state has the file marked as deleted.
1503
elif entry[1][0][0] == 'r':
1504
# implement the rename
1505
del self._dirblocks[block_index][1][entry_index]
1507
# it is being resurrected here, so blank it out temporarily.
1508
self._dirblocks[block_index][1][entry_index][1][1] = null
1864
# There is still an active record, so just mark this
1867
block_i, entry_i, d_present, f_present = \
1868
self._get_block_entry_index(old_path, '', 1)
1870
dir_block = self._dirblocks[block_i][1]
1871
for child_entry in dir_block:
1872
child_basis_kind = child_entry[1][1][0]
1873
if child_basis_kind not in 'ar':
1874
self._raise_invalid(old_path, file_id,
1875
"The file id was deleted but its children were "
1878
def _after_delta_check_parents(self, parents, index):
1879
"""Check that parents required by the delta are all intact.
1881
:param parents: An iterable of (path_utf8, file_id) tuples which are
1882
required to be present in tree 'index' at path_utf8 with id file_id
1884
:param index: The column in the dirstate to check for parents in.
1886
for dirname_utf8, file_id in parents:
1887
# Get the entry - the ensures that file_id, dirname_utf8 exists and
1888
# has the right file id.
1889
entry = self._get_entry(index, file_id, dirname_utf8)
1890
if entry[1] is None:
1891
self._raise_invalid(dirname_utf8.decode('utf8'),
1892
file_id, "This parent is not present.")
1893
# Parents of things must be directories
1894
if entry[1][index][0] != 'd':
1895
self._raise_invalid(dirname_utf8.decode('utf8'),
1896
file_id, "This parent is not a directory.")
1510
1898
def _observed_sha1(self, entry, sha1, stat_value,
1511
1899
_stat_to_minikind=_stat_to_minikind, _pack_stat=pack_stat):
1914
2331
def _get_id_index(self):
1915
"""Get an id index of self._dirblocks."""
2332
"""Get an id index of self._dirblocks.
2334
This maps from file_id => [(directory, name, file_id)] entries where
2335
that file_id appears in one of the trees.
1916
2337
if self._id_index is None:
1918
2339
for key, tree_details in self._iter_entries():
1919
id_index.setdefault(key[2], set()).add(key)
2340
self._add_to_id_index(id_index, key)
1920
2341
self._id_index = id_index
1921
2342
return self._id_index
def _add_to_id_index(self, id_index, entry_key):
    """Add this entry to the _id_index mapping."""
    # This code used to use a set for every entry in the id_index. However,
    # it is *rare* to have more than one entry. So a set is a large
    # overkill. And even when we do, we won't ever have more than the
    # number of parent trees. Which is still a small number (rarely >2). As
    # such, we use a simple tuple, and do our own uniqueness checks. While
    # the 'in' check is O(N) since N is nicely bounded it shouldn't ever
    # cause quadratic failure.
    file_id = entry_key[2]
    entry_key = static_tuple.StaticTuple.from_sequence(entry_key)
    if file_id not in id_index:
        id_index[file_id] = static_tuple.StaticTuple(entry_key,)
    else:
        entry_keys = id_index[file_id]
        if entry_key not in entry_keys:
            id_index[file_id] = entry_keys + (entry_key,)

def _remove_from_id_index(self, id_index, entry_key):
    """Remove this entry from the _id_index mapping.

    It is a programming error to call this when the entry_key is not
    present.
    """
    file_id = entry_key[2]
    entry_keys = list(id_index[file_id])
    entry_keys.remove(entry_key)
    id_index[file_id] = static_tuple.StaticTuple.from_sequence(entry_keys)
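The tuple-instead-of-set trade-off described in the comment above can be shown standalone. This is a minimal sketch, using plain tuples in place of bzrlib's StaticTuple; the function names here are mine, not bzrlib's:

```python
# Minimal sketch of the _add_to_id_index uniqueness pattern: most file_ids
# map to exactly one entry key, so a plain tuple plus an O(N) 'in' check
# (N rarely > 2) beats allocating a set per file_id.
def add_to_id_index(id_index, entry_key):
    """Map entry_key[2] (the file_id) to a tuple of entry keys."""
    file_id = entry_key[2]
    if file_id not in id_index:
        # Common case: first (and usually only) key for this file_id.
        id_index[file_id] = (entry_key,)
    else:
        entry_keys = id_index[file_id]
        if entry_key not in entry_keys:  # O(N), but N is nicely bounded
            id_index[file_id] = entry_keys + (entry_key,)

def remove_from_id_index(id_index, entry_key):
    """Inverse operation; entry_key must already be present."""
    file_id = entry_key[2]
    entry_keys = list(id_index[file_id])
    entry_keys.remove(entry_key)
    id_index[file_id] = tuple(entry_keys)
```

Duplicate additions are silently ignored, mirroring the set semantics the tuple replaced.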
def _get_output_lines(self, lines):
    """Format lines for final output.
        trace.mutter('Not saving DirState because '
                     '_changes_aborted is set.')
    # TODO: Since we now distinguish IN_MEMORY_MODIFIED from
    # IN_MEMORY_HASH_MODIFIED, we should only fail quietly if we fail
    # to save an IN_MEMORY_HASH_MODIFIED, and fail *noisily* if we
    # fail to save IN_MEMORY_MODIFIED
    if not self._worth_saving():
        return
    grabbed_write_lock = False
    if self._lock_state != 'w':
        grabbed_write_lock, new_lock = self._lock_token.temporary_write_lock()
        # Switch over to the new lock, as the old one may be closed.
        # TODO: jam 20070315 We should validate the disk file has
        #       not changed contents, since temporary_write_lock may
        #       not be an atomic operation.
        self._lock_token = new_lock
        self._state_file = new_lock.f
        if not grabbed_write_lock:
            # We couldn't grab a write lock, so we switch back to a read one
            return
    lines = self.get_lines()
    self._state_file.seek(0)
    self._state_file.writelines(lines)
    self._state_file.truncate()
    self._state_file.flush()
    self._maybe_fdatasync()
    self._mark_unmodified()
    if grabbed_write_lock:
        self._lock_token = self._lock_token.restore_read_lock()
        self._state_file = self._lock_token.f
        # TODO: jam 20070315 We should validate the disk file has
        #       not changed contents. Since restore_read_lock may
        #       not be an atomic operation.
def _maybe_fdatasync(self):
    """Flush to disk if possible and if not configured off."""
    if self._config_stack.get('dirstate.fdatasync'):
        osutils.fdatasync(self._state_file.fileno())
def _worth_saving(self):
    """Is it worth saving the dirstate or not?"""
    if (self._header_state == DirState.IN_MEMORY_MODIFIED
        or self._dirblock_state == DirState.IN_MEMORY_MODIFIED):
        return True
    if self._dirblock_state == DirState.IN_MEMORY_HASH_MODIFIED:
        if self._worth_saving_limit == -1:
            # We never save hash changes when the limit is -1
            return False
        # If we're using smart saving and only a small number of
        # entries have changed their hash, don't bother saving. John has
        # suggested using a heuristic here based on the size of the
        # changed files and/or tree. For now, we go with a configurable
        # number of changes, keeping the calculation time
        # as low overhead as possible. (This also keeps all existing
        # tests passing as the default is 0, i.e. always save.)
        if len(self._known_hash_changes) >= self._worth_saving_limit:
            return True
    return False
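The decision above can be condensed into a standalone sketch; the constants and parameters below are simplified stand-ins for DirState's attributes, not bzrlib API:

```python
# Hedged sketch of the _worth_saving logic: structural changes always save;
# hash-only changes save once enough of them have accumulated, and never
# when the configured limit is -1.
IN_MEMORY_UNMODIFIED = 'unmodified'
IN_MEMORY_MODIFIED = 'modified'
IN_MEMORY_HASH_MODIFIED = 'hash-modified'

def worth_saving(header_state, dirblock_state, known_hash_changes,
                 worth_saving_limit):
    """Return True if the dirstate should be written back to disk."""
    if (header_state == IN_MEMORY_MODIFIED
            or dirblock_state == IN_MEMORY_MODIFIED):
        return True
    if dirblock_state == IN_MEMORY_HASH_MODIFIED:
        if worth_saving_limit == -1:
            return False  # hash-only changes are never saved at limit -1
        return len(known_hash_changes) >= worth_saving_limit
    return False
```

With the default limit of 0 every hash change is saved, which matches the "always save" behaviour the comment describes.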
def _set_data(self, parent_ids, dirblocks):
    """Set the full dirstate data in memory.
        self._make_absent(entry)
        self.update_minimal(('', '', new_id), 'd',
            path_utf8='', packed_stat=entry[1][0][4])
        self._mark_modified()
        # XXX: This was added by Ian, we need to make sure there
        # are tests for it, because it isn't in bzr.dev TRUNK
        # It looks like the only place it is called is in setting the root
        # id of the tree. So probably we never had an _id_index when we
        # don't even have a root yet.
        if self._id_index is not None:
            self._add_to_id_index(self._id_index, entry[0])
def set_parent_trees(self, trees, ghosts):
    """Set the parent trees for the dirstate.

    :param trees: A list of revision_id, tree tuples. tree must be provided
        even if the revision_id refers to a ghost: supply an empty tree in
        that case.
    :param ghosts: A list of the revision_ids that are ghosts at the time
        of committing.
    """
    # TODO: generate a list of parent indexes to preserve to save
    # processing specific parent trees. In the common case one tree will
    # be preserved - the left most parent.
    # TODO: if the parent tree is a dirstate, we might want to walk them
        new_details = []
        for lookup_index in xrange(tree_index):
            # boundary case: this is the first occurrence of file_id
            # so there are no id_indexes, possibly take this out of
            if not len(entry_keys):
                new_details.append(DirState.NULL_PARENT_DETAILS)
            else:
                # grab any one entry, use it to find the right path.
                # TODO: optimise this to reduce memory use in highly
                # fragmented situations by reusing the relocation
                a_key = iter(entry_keys).next()
                if by_path[a_key][lookup_index][0] in ('r', 'a'):
                    # it's a pointer or missing statement, use it as is.
                    new_details.append(by_path[a_key][lookup_index])
                else:
                    # we have the right key, make a pointer to it.
                    real_path = ('/'.join(a_key[0:2])).strip('/')
                    new_details.append(st('r', real_path, 0, False, ''))
        new_details.append(self._inv_entry_to_details(entry))
        new_details.extend(new_location_suffix)
        by_path[new_entry_key] = new_details
        self._add_to_id_index(id_index, new_entry_key)
# --- end generation of full tree mappings
    # sort and output all the entries

            and new_entry_key[1:] < current_old[0][1:])):
            # new comes before:
            # add an entry for this and advance new
            trace.mutter("Inserting from new '%s'.",
                new_path_utf8.decode('utf8'))
            self.update_minimal(new_entry_key, current_new_minikind,
                executable=current_new[1].executable,
                path_utf8=new_path_utf8, fingerprint=fingerprint,
                fullscan=True)
            current_new = advance(new_iterator)
        else:
            # we've advanced past the place where the old key would be,
            # without seeing it in the new list. so it must be gone.
            trace.mutter("Deleting from old '%s/%s'.",
                current_old[0][0].decode('utf8'),
                current_old[0][1].decode('utf8'))
            self._make_absent(current_old)
            current_old = advance(old_iterator)
    self._mark_modified()
    self._id_index = None
    self._packed_stat_index = None
    trace.mutter("set_state_from_inventory complete.")
def set_state_from_scratch(self, working_inv, parent_trees, parent_ghosts):
    """Wipe the currently stored state and set it to something new.

    This is a hard-reset for the data we are working with.
    """
    # Technically, we really want a write lock, but until we write, we
    # don't really need it.
    self._requires_lock()
    # root dir and root dir contents with no children. We have to have a
    # root for set_state_from_inventory to work correctly.
    empty_root = (('', '', inventory.ROOT_ID),
                  [('d', '', 0, False, DirState.NULLSTAT)])
    empty_tree_dirblocks = [('', [empty_root]), ('', [])]
    self._set_data([], empty_tree_dirblocks)
    self.set_state_from_inventory(working_inv)
    self.set_parent_trees(parent_trees, parent_ghosts)
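The empty_root row built here follows the entry shape documented in the file header: entry[0] is the key, entry[1] a list of per-tree details. A minimal standalone illustration, using plain strings in place of inventory.ROOT_ID and DirState.NULLSTAT (both are hypothetical stand-ins here):

```python
# Hypothetical stand-ins; the real values come from bzrlib.
ROOT_ID = 'TREE_ROOT'
NULLSTAT = 'x' * 32

# entry[0] is (dirname, basename, fileid); entry[1][0] is the current
# tree's details: (minikind, fingerprint, size, executable, packed_stat).
empty_root = (('', '', ROOT_ID),
              [('d', '', 0, False, NULLSTAT)])

# dirblocks pair a directory path with its list of rows; the root block
# comes first, followed by an (initially empty) block for its children.
empty_tree_dirblocks = [('', [empty_root]), ('', [])]

key, tree_details = empty_root
assert key[2] == ROOT_ID
assert tree_details[0][0] == 'd'  # minikind 'd': a directory
```

The same indexing (entry[1][tree][0] for minikind, with 'r' marking a relocation pointer and 'a' an absent record) is what the surrounding methods test against throughout.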
def _make_absent(self, current_old):
    """Mark current_old - an entry - as absent for tree 0.
        # grab one of them and use it to generate parent
        # relocation/absent entries.
        new_entry = key, [new_details]
        # existing_keys can be changed as we iterate.
        for other_key in tuple(existing_keys):
            # change the record at other to be a pointer to this new
            # record. The loop looks similar to the change to
            # relocations when updating an existing record but it's not:
            # the test for existing kinds is different: this can be
            # factored out to a helper though.
            other_block_index, present = self._find_block_index_from_key(
                other_key)
            if not present:
                raise AssertionError('could not find block for %s' % (
                    other_key,))
            other_block = self._dirblocks[other_block_index][1]
            other_entry_index, present = self._find_entry_index(
                other_key, other_block)
            if not present:
                raise AssertionError(
                    'update_minimal: could not find other entry for %s'
                    % (other_key,))
            if path_utf8 is None:
                raise AssertionError('no path')
            # Turn this other location into a reference to the new
            # location. This also updates the aliased iterator
            # (current_old in set_state_from_inventory) so that the old
            # entry, if not already examined, is skipped over by that
            # loop.
            other_entry = other_block[other_entry_index]
            other_entry[1][0] = ('r', path_utf8, 0, False, '')
            if self._maybe_remove_row(other_block, other_entry_index,
                                      id_index):
                # If the row holding this was removed, we need to
                # recompute where this entry goes
                entry_index, _ = self._find_entry_index(key, block)

        # adds a tuple to the new details for each column
        # - either by copying an existing relocation pointer inside that column
        # - or by creating a new pointer to the right row inside that column
        num_present_parents = self._num_present_parents()
        if num_present_parents:
            # TODO: This re-evaluates the existing_keys set, do we need
            # to do that ourselves?
            other_key = list(existing_keys)[0]
        for lookup_index in xrange(1, num_present_parents + 1):
            # grab any one entry, use it to find the right path.
            # TODO: optimise this to reduce memory use in highly
            # fragmented situations by reusing the relocation
            update_block_index, present = \
        entry[1][0] = ('l', '', stat_value.st_size,
                       False, DirState.NULLSTAT)
    state._mark_modified([entry])
    return link_or_sha1

update_entry = py_update_entry
class ProcessEntryPython(object):

    __slots__ = ["old_dirname_to_file_id", "new_dirname_to_file_id",
        "last_source_parent", "last_target_parent", "include_unchanged",
        "partial", "use_filesystem_for_exec", "utf8_decode",
        "searched_specific_files", "search_specific_files",
        "searched_exact_paths", "search_specific_file_parents", "seen_ids",
        "state", "source_index", "target_index", "want_unversioned", "tree"]

    def __init__(self, include_unchanged, use_filesystem_for_exec,
            search_specific_files, state, source_index, target_index,
            want_unversioned, tree):
        self.old_dirname_to_file_id = {}
        self.new_dirname_to_file_id = {}
        # Are we doing a partial iter_changes?
        self.partial = search_specific_files != set([''])
        # Using a list so that we can access the values and change them in
        # nested scope. Each one is [path, file_id, entry]
        self.last_source_parent = [None, None]
            raise AssertionError("don't know how to compare "
                "source_minikind=%r, target_minikind=%r"
                % (source_minikind, target_minikind))
    def __iter__(self):

    def _gather_result_for_consistency(self, result):
        """Check a result we will yield to make sure we are consistent later.

        This gathers result's parents into a set to output later.

        :param result: A result tuple.
        """
        if not self.partial or not result[0]:
            return
        self.seen_ids.add(result[0])
        new_path = result[1][1]
        if new_path:
            # Not the root and not a delete: queue up the parents of the path.
            self.search_specific_file_parents.update(
                osutils.parent_directories(new_path.encode('utf8')))
            # Add the root directory which parent_directories does not
            # provide.
            self.search_specific_file_parents.add('')
    def iter_changes(self):
        """Iterate over the changes."""
        utf8_decode = cache_utf8._utf8_decode
        _cmp_by_dirs = cmp_by_dirs
        _process_entry = self._process_entry
        search_specific_files = self.search_specific_files
        searched_specific_files = self.searched_specific_files
        splitpath = osutils.splitpath
        # compare source_index and target_index at or under each element of
        # search_specific_files, following the comparison table below. Note
        # that we only want to do diff operations when the target is fdl,
        # because that's when the walkdirs logic will have exposed the
        # pathinfo for the target.
        # Source | Target | disk | action
        #   r    | fdlt   |      | add source to search, add id path move and
        #        |        |      | perform diff check on source-target
        #   r    | fdlt   |  a   | dangling file that was present in the basis.
        #   r    |  a     |      | add source to search
        #   r    |  r     |      | this path is present in a non-examined tree,
        #        |        |      | skip.
        #   r    |  r     |  a   | this path is present in a non-examined tree,
        #        |        |      | skip.
        #   a    | fdlt   |      | add new id
            try:
                current_dir_info = dir_iterator.next()
            except StopIteration:
                current_dir_info = None
        for result in self._iter_specific_file_parents():
            yield result
    def _iter_specific_file_parents(self):
        """Iter over the specific file parents."""
        while self.search_specific_file_parents:
            # Process the parent directories for the paths we were iterating.
            # Even in extremely large trees this should be modest, so currently
            # no attempt is made to optimise.
            path_utf8 = self.search_specific_file_parents.pop()
            if osutils.is_inside_any(self.searched_specific_files, path_utf8):
                # We've examined this path.
                continue
            if path_utf8 in self.searched_exact_paths:
                # We've examined this path.
                continue
            path_entries = self.state._entries_for_path(path_utf8)
            # We need either one or two entries. If the path in
            # self.target_index has moved (so the entry in source_index is in
            # 'ar') then we need to also look for the entry for this path in
            # self.source_index, to output the appropriate delete-or-rename.
            selected_entries = []
            found_item = False
            for candidate_entry in path_entries:
                # Find entries present in target at this path:
                if candidate_entry[1][self.target_index][0] not in 'ar':
                    found_item = True
                    selected_entries.append(candidate_entry)
                # Find entries present in source at this path:
                elif (self.source_index is not None and
                    candidate_entry[1][self.source_index][0] not in 'ar'):
                    found_item = True
                    if candidate_entry[1][self.target_index][0] == 'a':
                        # Deleted, emit it here.
                        selected_entries.append(candidate_entry)
                    else:
                        # renamed, emit it when we process the directory it
                        # ended up at.
                        self.search_specific_file_parents.add(
                            candidate_entry[1][self.target_index][1])
            if not found_item:
                raise AssertionError(
                    "Missing entry for specific path parent %r, %r" % (
                    path_utf8, path_entries))
            path_info = self._path_info(path_utf8, path_utf8.decode('utf8'))
            for entry in selected_entries:
                if entry[0][2] in self.seen_ids:
                    continue
                result, changed = self._process_entry(entry, path_info)
                if result is None:
                    raise AssertionError(
                        "Got entry<->path mismatch for specific path "
                        "%r entry %r path_info %r " % (
                        path_utf8, entry, path_info))
                # Only include changes - we're outside the user's requested
                # expansion.
                if changed:
                    self._gather_result_for_consistency(result)
                    if (result[6][0] == 'directory' and
                            result[6][1] != 'directory'):
                        # This stopped being a directory, the old children have
                        # to be included.
                        if entry[1][self.source_index][0] == 'r':
                            # renamed, take the source path
                            entry_path_utf8 = entry[1][self.source_index][1]
                        else:
                            entry_path_utf8 = path_utf8
                        initial_key = (entry_path_utf8, '', '')
                        block_index, _ = self.state._find_block_index_from_key(
                            initial_key)
                        if block_index == 0:
                            # The children of the root are in block index 1.
                            block_index = block_index + 1
                        current_block = None
                        if block_index < len(self.state._dirblocks):
                            current_block = self.state._dirblocks[block_index]
                            if not osutils.is_inside(
                                entry_path_utf8, current_block[0]):
                                # No entries for this directory at all.
                                current_block = None
                        if current_block is not None:
                            for entry in current_block[1]:
                                if entry[1][self.source_index][0] in 'ar':
                                    # Not in the source tree, so doesn't have to be
                                    # included.
                                    continue
                                # Path of the entry itself.
                                self.search_specific_file_parents.add(
                                    osutils.pathjoin(*entry[0][:2]))
                if changed or self.include_unchanged:
                    yield result
            self.searched_exact_paths.add(path_utf8)
    def _path_info(self, utf8_path, unicode_path):
        """Generate path_info for unicode_path.

        :return: None if unicode_path does not exist, or a path_info tuple.
        """
        abspath = self.tree.abspath(unicode_path)
        try:
            stat = os.lstat(abspath)
        except OSError, e:
            if e.errno == errno.ENOENT:
                # the path does not exist.
                return None
            else:
                raise
        utf8_basename = utf8_path.rsplit('/', 1)[-1]
        dir_info = (utf8_path, utf8_basename,
            osutils.file_kind_from_stat_mode(stat.st_mode), stat,
            abspath)
        if dir_info[2] == 'directory':
            if self.tree._directory_is_tree_reference(
                unicode_path):
                self.root_dir_info = self.root_dir_info[:2] + \
                    ('tree-reference',) + self.root_dir_info[3:]
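The lstat-and-classify step of _path_info can be sketched independently. This is a hedged, simplified version: file_kind_from_stat_mode is my stand-in for the osutils helper, and the returned tuple omits fields the real method carries:

```python
import errno
import os
import stat as stat_mod

def file_kind_from_stat_mode(mode):
    # Simplified stand-in for osutils.file_kind_from_stat_mode.
    if stat_mod.S_ISDIR(mode):
        return 'directory'
    if stat_mod.S_ISLNK(mode):
        return 'symlink'
    return 'file'

def path_info(utf8_path, abspath):
    """Return None if abspath does not exist, else (path, basename, kind, stat)."""
    try:
        st = os.lstat(abspath)
    except OSError as e:
        if e.errno == errno.ENOENT:
            return None  # the path does not exist
        raise
    basename = utf8_path.rsplit('/', 1)[-1]
    return (utf8_path, basename, file_kind_from_stat_mode(st.st_mode), st)
```

Using lstat rather than stat is what lets the method classify a symlink itself instead of its target.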
# Try to load the compiled form if possible
try:
    from bzrlib._dirstate_helpers_pyx import (
        ProcessEntryC as _process_entry,
        update_entry as update_entry,
        )
except ImportError, e:
    osutils.failed_to_load_extension(e)
    from bzrlib._dirstate_helpers_py import (
        _read_dirblocks_py as _read_dirblocks,
        bisect_dirblock_py as bisect_dirblock,
        _bisect_path_left_py as _bisect_path_left,
        _bisect_path_right_py as _bisect_path_right,
        cmp_by_dirs_py as cmp_by_dirs,
        )
    # FIXME: It would be nice to be able to track moved lines so that the
    # corresponding python code can be moved to the _dirstate_helpers_py
    # module. I don't want to break the history for this important piece of
    # code so I left the code here -- vila 20090622
    update_entry = py_update_entry
    _process_entry = ProcessEntryPython
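The compiled-extension fallback above is an instance of a common pattern: attempt to import the fast implementation, and on ImportError install pure-Python equivalents under the same names. A self-contained sketch (the module name is hypothetical, so the fallback branch is what actually runs here, and the fallback is only an approximation of the real helper's ordering):

```python
try:
    # Compiled helpers, if the extension was built.
    from _dirstate_helpers_pyx import cmp_by_dirs
except ImportError:
    def cmp_by_dirs(path1, path2):
        # Pure-Python fallback: compare paths one directory segment at a
        # time, returning -1, 0 or 1.
        a, b = path1.split('/'), path2.split('/')
        return (a > b) - (a < b)
```

Callers import cmp_by_dirs once and never need to know which implementation they got, which is exactly how the module-level names _read_dirblocks, update_entry and _process_entry are wired up above.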