20
20
lines by NL. The field delimiters are omitted in the grammar, line delimiters
21
21
are not - this is done for clarity of reading. All string data is in utf8.
23
MINIKIND = "f" | "d" | "l" | "a" | "r" | "t";
26
WHOLE_NUMBER = {digit}, digit;
28
REVISION_ID = a non-empty utf8 string;
30
dirstate format = header line, full checksum, row count, parent details,
31
ghost_details, entries;
32
header line = "#bazaar dirstate flat format 2", NL;
33
full checksum = "crc32: ", ["-"], WHOLE_NUMBER, NL;
34
row count = "num_entries: ", digit, NL;
35
parent_details = WHOLE NUMBER, {REVISION_ID}* NL;
36
ghost_details = WHOLE NUMBER, {REVISION_ID}*, NL;
38
entry = entry_key, current_entry_details, {parent_entry_details};
39
entry_key = dirname, basename, fileid;
40
current_entry_details = common_entry_details, working_entry_details;
41
parent_entry_details = common_entry_details, history_entry_details;
42
common_entry_details = MINIKIND, fingerprint, size, executable
43
working_entry_details = packed_stat
44
history_entry_details = REVISION_ID;
47
fingerprint = a nonempty utf8 sequence with meaning defined by minikind.
49
Given this definition, the following is useful to know:
50
entry (aka row) - all the data for a given key.
51
entry[0]: The key (dirname, basename, fileid)
55
entry[1]: The tree(s) data for this path and id combination.
56
entry[1][0]: The current tree
57
entry[1][1]: The second tree
59
For an entry for a tree, we have (using tree 0 - current tree) to demonstrate:
60
entry[1][0][0]: minikind
61
entry[1][0][1]: fingerprint
63
entry[1][0][3]: executable
64
entry[1][0][4]: packed_stat
66
entry[1][1][4]: revision_id
25
MINIKIND = "f" | "d" | "l" | "a" | "r" | "t";
28
WHOLE_NUMBER = {digit}, digit;
30
REVISION_ID = a non-empty utf8 string;
32
dirstate format = header line, full checksum, row count, parent_details,
33
ghost_details, entries;
34
header line = "#bazaar dirstate flat format 3", NL;
35
full checksum = "crc32: ", ["-"], WHOLE_NUMBER, NL;
36
row count = "num_entries: ", WHOLE_NUMBER, NL;
37
parent_details = WHOLE_NUMBER, {REVISION_ID}*, NL;
38
ghost_details = WHOLE_NUMBER, {REVISION_ID}*, NL;
40
entry = entry_key, current_entry_details, {parent_entry_details};
41
entry_key = dirname, basename, fileid;
42
current_entry_details = common_entry_details, working_entry_details;
43
parent_entry_details = common_entry_details, history_entry_details;
44
common_entry_details = MINIKIND, fingerprint, size, executable;
45
working_entry_details = packed_stat;
46
history_entry_details = REVISION_ID;
49
fingerprint = a nonempty utf8 sequence with meaning defined by minikind.
51
Given this definition, the following is useful to know::
53
entry (aka row) - all the data for a given key.
54
entry[0]: The key (dirname, basename, fileid)
58
entry[1]: The tree(s) data for this path and id combination.
59
entry[1][0]: The current tree
60
entry[1][1]: The second tree
62
For an entry for a tree, we have (using tree 0 - current tree) to demonstrate::
64
entry[1][0][0]: minikind
65
entry[1][0][1]: fingerprint
67
entry[1][0][3]: executable
68
entry[1][0][4]: packed_stat
72
entry[1][1][4]: revision_id
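The indexing scheme above can be sketched as plain nested tuples and lists. This is only an illustration of the layout: the `TreeDetails` name, its field names, and the sample values are assumptions made for readability, not part of the dirstate code, which uses bare tuples.

```python
from collections import namedtuple

# Hypothetical field names; the last slot holds packed_stat for tree 0
# and revision_id for parent trees, as described in the grammar.
TreeDetails = namedtuple(
    "TreeDetails", "minikind fingerprint size executable tail")


def make_entry(dirname, basename, file_id, *trees):
    """Build (key, [details, ...]) in the shape the docstring describes."""
    key = (dirname, basename, file_id)
    return (key, [TreeDetails(*tree) for tree in trees])


entry = make_entry(
    "docs", "README", "readme-id",
    ("f", "0" * 40, 120, False, "packed-stat-bytes"),  # tree 0 (current)
    ("f", "1" * 40, 100, False, "rev-1"),              # tree 1 (parent)
)

print(entry[0])        # the key
print(entry[1][0][0])  # minikind of the current tree
print(entry[1][1][4])  # revision_id of the second tree
```

Because namedtuples still index like tuples, `entry[1][0][3]` and `entry[1][0].executable` refer to the same field.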
68
74
There may be multiple rows at the root, one per id present in the root, so the
69
in memory root row is now:
70
self._dirblocks[0] -> ('', [entry ...]),
71
and the entries in there are
74
entries[0][2]: file_id
75
entries[1][0]: The tree data for the current tree for this fileid at /
79
'r' is a relocated entry: This path is not present in this tree with this id,
80
but the id can be found at another location. The fingerprint is used to
81
point to the target location.
82
'a' is an absent entry: In that tree the id is not present at this path.
83
'd' is a directory entry: This path in this tree is a directory with the
84
current file id. There is no fingerprint for directories.
85
'f' is a file entry: As for directory, but its a file. The fingerprint is a
87
'l' is a symlink entry: As for directory, but a symlink. The fingerprint is the
89
't' is a reference to a nested subtree; the fingerprint is the referenced
75
in memory root row is now::
77
self._dirblocks[0] -> ('', [entry ...]),
79
and the entries in there are::
83
entries[0][2]: file_id
84
entries[1][0]: The tree data for the current tree for this fileid at /
89
'r' is a relocated entry: This path is not present in this tree with this
90
id, but the id can be found at another location. The fingerprint is
91
used to point to the target location.
92
'a' is an absent entry: In that tree the id is not present at this path.
93
'd' is a directory entry: This path in this tree is a directory with the
94
current file id. There is no fingerprint for directories.
95
'f' is a file entry: As for directory, but it's a file. The fingerprint is
96
the sha1 value of the file's canonical form, i.e. after any read
97
filters have been applied to the convenience form stored in the working
99
'l' is a symlink entry: As for directory, but a symlink. The fingerprint is
101
't' is a reference to a nested subtree; the fingerprint is the referenced
94
The entries on disk and in memory are ordered according to the following keys:
106
The entries on disk and in memory are ordered according to the following keys::
96
108
directory, as a list of components
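Comparing the directory as a list of components differs from comparing it as a raw string. The sketch below shows the difference; `entry_sort_key` is a hypothetical helper, and breaking ties by basename and then file id is an assumption based on the shape of the entry key.

```python
def entry_sort_key(key):
    """Sort key for (dirname, basename, file_id) tuples.

    The directory is split on '/' so that comparison happens one path
    component at a time instead of one character at a time.
    """
    dirname, basename, file_id = key
    return (dirname.split("/"), basename, file_id)


keys = [
    ("a-b", "z", "id-3"),
    ("a/b", "y", "id-2"),
    ("a", "x", "id-1"),
    ("", "a", "id-0"),
]

by_components = sorted(keys, key=entry_sort_key)
by_raw_string = sorted(keys)

print([k[0] for k in by_components])
print([k[0] for k in by_raw_string])
```

With component-wise comparison the subdirectory `a/b` stays next to its parent `a`, whereas raw string comparison puts `a-b` between them because `-` sorts before `/`.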
100
112
--- Format 1 had the following different definition: ---
101
rows = dirname, NULL, basename, NULL, MINIKIND, NULL, fileid_utf8, NULL,
102
WHOLE NUMBER (* size *), NULL, packed stat, NULL, sha1|symlink target,
104
PARENT ROW = NULL, revision_utf8, NULL, MINIKIND, NULL, dirname, NULL,
105
basename, NULL, WHOLE NUMBER (* size *), NULL, "y" | "n", NULL,
116
rows = dirname, NULL, basename, NULL, MINIKIND, NULL, fileid_utf8, NULL,
117
WHOLE NUMBER (* size *), NULL, packed stat, NULL, sha1|symlink target,
119
PARENT ROW = NULL, revision_utf8, NULL, MINIKIND, NULL, dirname, NULL,
120
basename, NULL, WHOLE NUMBER (* size *), NULL, "y" | "n", NULL,
108
123
PARENT ROW's are emitted for every parent that is not in the ghosts details
109
124
line. That is, if the parents are foo, bar, baz, and the ghosts are bar, then
1091
def update_entry(self, entry, abspath, stat_value,
1092
_stat_to_minikind=_stat_to_minikind,
1093
_pack_stat=pack_stat):
1094
"""Update the entry based on what is actually on disk.
1096
:param entry: This is the dirblock entry for the file in question.
1097
:param abspath: The path on disk for this file.
1098
:param stat_value: (optional) if we already have done a stat on the
1100
:return: The sha1 hexdigest of the file (40 bytes) or link target of a
1307
def _check_delta_is_valid(self, delta):
1308
return list(inventory._check_delta_unique_ids(
1309
inventory._check_delta_unique_old_paths(
1310
inventory._check_delta_unique_new_paths(
1311
inventory._check_delta_ids_match_entry(
1312
inventory._check_delta_ids_are_valid(
1313
inventory._check_delta_new_path_entry_both_or_None(delta)))))))
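The nesting in `_check_delta_is_valid` reads as a pipeline of pass-through generators: each check yields items unchanged and raises when it sees a problem, and `list()` drives the whole chain. A minimal sketch of that pattern follows; the two checker functions are simplified stand-ins written for this example, not the `inventory` module's implementations.

```python
def check_unique_ids(delta):
    """Yield each item unchanged, raising if a file_id repeats."""
    seen = set()
    for item in delta:
        file_id = item[2]
        if file_id in seen:
            raise ValueError("repeated file_id %r" % (file_id,))
        seen.add(file_id)
        yield item


def check_new_path_entry_both_or_none(delta):
    """Yield each item unchanged, raising on a new_path/entry mismatch."""
    for item in delta:
        new_path, entry = item[1], item[3]
        if (new_path is None) != (entry is None):
            raise ValueError("new_path and entry must both be set or both None")
        yield item


def check_delta_is_valid(delta):
    # list() consumes the chain, so every check sees every item.
    return list(check_unique_ids(check_new_path_entry_both_or_none(delta)))


good = [(None, "a", "id-a", "entry-a"), ("b", None, "id-b", None)]
print(check_delta_is_valid(iter(good)))
```

Since the checks are lazy, nothing is validated until the result is materialised, which is why the method wraps the chain in `list()`.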
1315
def update_by_delta(self, delta):
1316
"""Apply an inventory delta to the dirstate for tree 0
1318
This is the workhorse for apply_inventory_delta in dirstate based
1321
:param delta: An inventory delta. See Inventory.apply_delta for
1324
self._read_dirblocks_if_needed()
1325
encode = cache_utf8.encode
1328
# Accumulate parent references (path_utf8, id), to check for parentless
1329
# items or items placed under files/links/tree-references. We get
1330
# references from every item in the delta that is not a deletion and
1331
# is not itself the root.
1333
# Added ids must not be in the dirstate already. This set holds those
1336
# This loop transforms the delta to single atomic operations that can
1337
# be executed and validated.
1338
delta = sorted(self._check_delta_is_valid(delta), reverse=True)
1339
for old_path, new_path, file_id, inv_entry in delta:
1340
if (file_id in insertions) or (file_id in removals):
1341
self._raise_invalid(old_path or new_path, file_id,
1343
if old_path is not None:
1344
old_path = old_path.encode('utf-8')
1345
removals[file_id] = old_path
1347
new_ids.add(file_id)
1348
if new_path is not None:
1349
if inv_entry is None:
1350
self._raise_invalid(new_path, file_id,
1351
"new_path with no entry")
1352
new_path = new_path.encode('utf-8')
1353
dirname_utf8, basename = osutils.split(new_path)
1355
parents.add((dirname_utf8, inv_entry.parent_id))
1356
key = (dirname_utf8, basename, file_id)
1357
minikind = DirState._kind_to_minikind[inv_entry.kind]
1359
fingerprint = inv_entry.reference_revision or ''
1362
insertions[file_id] = (key, minikind, inv_entry.executable,
1363
fingerprint, new_path)
1364
# Transform moves into delete+add pairs
1365
if None not in (old_path, new_path):
1366
for child in self._iter_child_entries(0, old_path):
1367
if child[0][2] in insertions or child[0][2] in removals:
1369
child_dirname = child[0][0]
1370
child_basename = child[0][1]
1371
minikind = child[1][0][0]
1372
fingerprint = child[1][0][4]
1373
executable = child[1][0][3]
1374
old_child_path = osutils.pathjoin(child_dirname,
1376
removals[child[0][2]] = old_child_path
1377
child_suffix = child_dirname[len(old_path):]
1378
new_child_dirname = (new_path + child_suffix)
1379
key = (new_child_dirname, child_basename, child[0][2])
1380
new_child_path = osutils.pathjoin(new_child_dirname,
1382
insertions[child[0][2]] = (key, minikind, executable,
1383
fingerprint, new_child_path)
1384
self._check_delta_ids_absent(new_ids, delta, 0)
1386
self._apply_removals(removals.iteritems())
1387
self._apply_insertions(insertions.values())
1389
self._after_delta_check_parents(parents, 0)
1390
except errors.BzrError, e:
1391
self._changes_aborted = True
1392
if 'integrity error' not in str(e):
1394
# _get_entry raises BzrError when a request is inconsistent; we
1395
# want such errors to be shown as InconsistentDelta - and that
1396
# fits the behaviour we trigger.
1397
raise errors.InconsistentDeltaDelta(delta,
1398
"error from _get_entry. %s" % (e,))
1400
def _apply_removals(self, removals):
1401
for file_id, path in sorted(removals, reverse=True,
1402
key=operator.itemgetter(1)):
1403
dirname, basename = osutils.split(path)
1404
block_i, entry_i, d_present, f_present = \
1405
self._get_block_entry_index(dirname, basename, 0)
1407
entry = self._dirblocks[block_i][1][entry_i]
1409
self._raise_invalid(path, file_id,
1410
"Wrong path for old path.")
1411
if not f_present or entry[1][0][0] in 'ar':
1412
self._raise_invalid(path, file_id,
1413
"Wrong path for old path.")
1414
if file_id != entry[0][2]:
1415
self._raise_invalid(path, file_id,
1416
"Attempt to remove path has wrong id - found %r."
1418
self._make_absent(entry)
1419
# See if we have a malformed delta: deleting a directory must not
1420
# leave crud behind. This increases the number of bisects needed
1421
# substantially, but deletion or renames of large numbers of paths
1422
# is rare enough it shouldn't be an issue (famous last words?) RBC
1424
block_i, entry_i, d_present, f_present = \
1425
self._get_block_entry_index(path, '', 0)
1427
# The dir block is still present in the dirstate; this could
1428
# be due to it being in a parent tree, or a corrupt delta.
1429
for child_entry in self._dirblocks[block_i][1]:
1430
if child_entry[1][0][0] not in ('r', 'a'):
1431
self._raise_invalid(path, entry[0][2],
1432
"The file id was deleted but its children were "
1435
def _apply_insertions(self, adds):
1437
for key, minikind, executable, fingerprint, path_utf8 in sorted(adds):
1438
self.update_minimal(key, minikind, executable, fingerprint,
1439
path_utf8=path_utf8)
1440
except errors.NotVersionedError:
1441
self._raise_invalid(path_utf8.decode('utf8'), key[2],
1444
def update_basis_by_delta(self, delta, new_revid):
1445
"""Update the parents of this tree after a commit.
1447
This gives the tree one parent, with revision id new_revid. The
1448
inventory delta is applied to the current basis tree to generate the
1449
inventory for the parent new_revid, and all other parent trees are
1452
Note that an exception during the operation of this method will leave
1453
the dirstate in a corrupt state where it should not be saved.
1455
:param new_revid: The new revision id for the tree's parent.
1456
:param delta: An inventory delta (see apply_inventory_delta) describing
1457
the changes from the current left most parent revision to new_revid.
1459
self._read_dirblocks_if_needed()
1460
self._discard_merge_parents()
1461
if self._ghosts != []:
1462
raise NotImplementedError(self.update_basis_by_delta)
1463
if len(self._parents) == 0:
1464
# Set up a blank tree in the simplest way.
1465
empty_parent = DirState.NULL_PARENT_DETAILS
1466
for entry in self._iter_entries():
1467
entry[1].append(empty_parent)
1468
self._parents.append(new_revid)
1470
self._parents[0] = new_revid
1472
delta = sorted(self._check_delta_is_valid(delta), reverse=True)
1476
# The paths this function accepts are unicode and must be encoded as we
1478
encode = cache_utf8.encode
1479
inv_to_entry = self._inv_entry_to_details
1480
# delta is now (deletes, changes), (adds) in reverse lexicographical
1482
# deletes in reverse lexicographic order are safe to process in situ.
1483
# renames are not, as a rename from any path could go to a path
1484
# lexicographically lower, so we transform renames into delete, add pairs,
1485
# expanding them recursively as needed.
1486
# At the same time, to reduce interface friction we convert the input
1487
# inventory entries to dirstate.
1488
root_only = ('', '')
1489
# Accumulate parent references (path_utf8, id), to check for parentless
1490
# items or items placed under files/links/tree-references. We get
1491
# references from every item in the delta that is not a deletion and
1492
# is not itself the root.
1494
# Added ids must not be in the dirstate already. This set holds those
1497
for old_path, new_path, file_id, inv_entry in delta:
1498
if inv_entry is not None and file_id != inv_entry.file_id:
1499
self._raise_invalid(new_path, file_id,
1500
"mismatched entry file_id %r" % inv_entry)
1501
if new_path is None:
1502
new_path_utf8 = None
1504
if inv_entry is None:
1505
self._raise_invalid(new_path, file_id,
1506
"new_path with no entry")
1507
new_path_utf8 = encode(new_path)
1508
# note the parent for validation
1509
dirname_utf8, basename_utf8 = osutils.split(new_path_utf8)
1511
parents.add((dirname_utf8, inv_entry.parent_id))
1512
if old_path is None:
1513
old_path_utf8 = None
1515
old_path_utf8 = encode(old_path)
1516
if old_path is None:
1517
adds.append((None, new_path_utf8, file_id,
1518
inv_to_entry(inv_entry), True))
1519
new_ids.add(file_id)
1520
elif new_path is None:
1521
deletes.append((old_path_utf8, None, file_id, None, True))
1522
elif (old_path, new_path) == root_only:
1523
# change things in-place
1524
# Note: the case of a parent directory changing its file_id
1525
# tends to break optimizations here, because officially
1526
# the file has actually been moved, it just happens to
1527
# end up at the same path. If we can figure out how to
1528
# handle that case, we can avoid a lot of add+delete
1529
# pairs for objects that stay put.
1530
# elif old_path == new_path:
1531
changes.append((old_path_utf8, new_path_utf8, file_id,
1532
inv_to_entry(inv_entry)))
1535
# Because renames must preserve their children we must have
1536
# processed all relocations and removes beforehand. The sort
1537
# order ensures we've examined the child paths, but we also
1538
# have to execute the removals, or the split to an add/delete
1539
# pair will result in the deleted item being reinserted, or
1540
# renamed items being reinserted twice - and possibly at the
1541
# wrong place. Splitting into a delete/add pair also simplifies
1542
# the handling of entries with ('f', ...), ('r' ...) because
1543
# the target of the 'r' is old_path here, and we add that to
1544
# deletes, meaning that the add handler does not need to check
1545
# for 'r' items on every pass.
1546
self._update_basis_apply_deletes(deletes)
1548
# Split into an add/delete pair recursively.
1549
adds.append((old_path_utf8, new_path_utf8, file_id,
1550
inv_to_entry(inv_entry), False))
1551
# Expunge deletes that we've seen so that deleted/renamed
1552
# children of a rename directory are handled correctly.
1553
new_deletes = reversed(list(
1554
self._iter_child_entries(1, old_path_utf8)))
1555
# Remove the current contents of the tree at orig_path, and
1556
# reinsert at the correct new path.
1557
for entry in new_deletes:
1558
child_dirname, child_basename, child_file_id = entry[0]
1560
source_path = child_dirname + '/' + child_basename
1562
source_path = child_basename
1565
new_path_utf8 + source_path[len(old_path_utf8):]
1567
if old_path_utf8 == '':
1568
raise AssertionError("cannot rename directory to"
1570
target_path = source_path[len(old_path_utf8) + 1:]
1571
adds.append((None, target_path, entry[0][2], entry[1][1], False))
1573
(source_path, target_path, entry[0][2], None, False))
1575
(old_path_utf8, new_path_utf8, file_id, None, False))
1577
self._check_delta_ids_absent(new_ids, delta, 1)
1579
# Finish expunging deletes/first half of renames.
1580
self._update_basis_apply_deletes(deletes)
1581
# Reinstate second half of renames and new paths.
1582
self._update_basis_apply_adds(adds)
1583
# Apply in-situ changes.
1584
self._update_basis_apply_changes(changes)
1586
self._after_delta_check_parents(parents, 1)
1587
except errors.BzrError, e:
1588
self._changes_aborted = True
1589
if 'integrity error' not in str(e):
1591
# _get_entry raises BzrError when a request is inconsistent; we
1592
# want such errors to be shown as InconsistentDelta - and that
1593
# fits the behaviour we trigger.
1594
raise errors.InconsistentDeltaDelta(delta,
1595
"error from _get_entry. %s" % (e,))
1597
self._mark_modified(header_modified=True)
1598
self._id_index = None
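The classification performed by `update_basis_by_delta` can be sketched in isolation. The version below is an assumption-laden simplification: it works on plain strings, skips the recursive expansion of a renamed directory's children, and uses hypothetical function names. It does keep the tuple shapes and the `real_add` / `real_delete` flags shown in the diff.

```python
ROOT_ONLY = ("", "")


def classify(delta):
    """Split a delta into (adds, deletes, changes) lists."""
    adds, deletes, changes = [], [], []
    for old_path, new_path, file_id, details in delta:
        if old_path is None:
            adds.append((None, new_path, file_id, details, True))
        elif new_path is None:
            deletes.append((old_path, None, file_id, None, True))
        elif (old_path, new_path) == ROOT_ONLY:
            # Only the root is changed in place.
            changes.append((old_path, new_path, file_id, details))
        else:
            # Everything else becomes a delete plus an add.
            adds.append((old_path, new_path, file_id, details, False))
            deletes.append((old_path, new_path, file_id, None, False))
    return adds, deletes, changes


def processing_order(adds, deletes):
    """Deletes go children-first, adds go parents-first."""
    ordered_deletes = sorted(deletes, key=lambda d: d[0], reverse=True)
    ordered_adds = sorted(adds, key=lambda a: a[1])
    return ordered_adds, ordered_deletes


delta = [
    (None, "new", "n-id", "details-n"),
    ("old", None, "o-id", None),
    ("", "", "root-id", "details-root"),
    ("a", "b", "m-id", "details-m"),
]
adds, deletes, changes = classify(delta)
ordered_adds, ordered_deletes = processing_order(adds, deletes)
print(ordered_deletes)
print(ordered_adds)
print(changes)
```

Sorting deletes in reverse path order removes children before their parent directory, and sorting adds by new path creates a parent before anything is placed under it.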
1601
def _check_delta_ids_absent(self, new_ids, delta, tree_index):
1602
"""Check that none of the file_ids in new_ids are present in a tree."""
1605
id_index = self._get_id_index()
1606
for file_id in new_ids:
1607
for key in id_index.get(file_id, ()):
1608
block_i, entry_i, d_present, f_present = \
1609
self._get_block_entry_index(key[0], key[1], tree_index)
1611
# In a different tree
1613
entry = self._dirblocks[block_i][1][entry_i]
1614
if entry[0][2] != file_id:
1615
# Different file_id, so not what we want.
1617
self._raise_invalid(("%s/%s" % key[0:2]).decode('utf8'), file_id,
1618
"This file_id is new in the delta but already present in "
1621
def _raise_invalid(self, path, file_id, reason):
1622
self._changes_aborted = True
1623
raise errors.InconsistentDelta(path, file_id, reason)
1625
def _update_basis_apply_adds(self, adds):
1626
"""Apply a sequence of adds to tree 1 during update_basis_by_delta.
1628
They may be adds, or renames that have been split into add/delete
1631
:param adds: A sequence of adds. Each add is a tuple:
1632
(None, new_path_utf8, file_id, (entry_details), real_add). real_add
1633
is False when the add is the second half of a remove-and-reinsert
1634
pair created to handle renames and deletes.
1636
# Adds are accumulated partly from renames, so can be in any input
1638
# TODO: we may want to sort in dirblocks order. That way each entry
1639
# will end up in the same directory, allowing the _get_entry
1640
# fast-path for looking up 2 items in the same dir to work.
1641
adds.sort(key=lambda x: x[1])
1642
# adds is now in lexicographic order, which places all parents before
1643
# their children, so we can process it linearly.
1645
st = static_tuple.StaticTuple
1646
for old_path, new_path, file_id, new_details, real_add in adds:
1647
dirname, basename = osutils.split(new_path)
1648
entry_key = st(dirname, basename, file_id)
1649
block_index, present = self._find_block_index_from_key(entry_key)
1651
self._raise_invalid(new_path, file_id,
1652
"Unable to find block for this record."
1653
" Was the parent added?")
1654
block = self._dirblocks[block_index][1]
1655
entry_index, present = self._find_entry_index(entry_key, block)
1657
if old_path is not None:
1658
self._raise_invalid(new_path, file_id,
1659
'considered a real add but still had old_path at %s'
1662
entry = block[entry_index]
1663
basis_kind = entry[1][1][0]
1664
if basis_kind == 'a':
1665
entry[1][1] = new_details
1666
elif basis_kind == 'r':
1667
raise NotImplementedError()
1669
self._raise_invalid(new_path, file_id,
1670
"An entry was marked as a new add"
1671
" but the basis target already existed")
1673
# The exact key was not found in the block. However, we need to
1674
# check if there is a key next to us that would have matched.
1675
# We only need to check 2 locations, because there are only 2
1677
for maybe_index in range(entry_index-1, entry_index+1):
1678
if maybe_index < 0 or maybe_index >= len(block):
1680
maybe_entry = block[maybe_index]
1681
if maybe_entry[0][:2] != (dirname, basename):
1682
# Just a random neighbor
1684
if maybe_entry[0][2] == file_id:
1685
raise AssertionError(
1686
'_find_entry_index didnt find a key match'
1687
' but walking the data did, for %s'
1689
basis_kind = maybe_entry[1][1][0]
1690
if basis_kind not in 'ar':
1691
self._raise_invalid(new_path, file_id,
1692
"we have an add record for path, but the path"
1693
" is already present with another file_id %s"
1694
% (maybe_entry[0][2],))
1696
entry = (entry_key, [DirState.NULL_PARENT_DETAILS,
1698
block.insert(entry_index, entry)
1700
active_kind = entry[1][0][0]
1701
if active_kind == 'a':
1702
# The active record shows up as absent, this could be genuine,
1703
# or it could be present at some other location. We need to
1705
id_index = self._get_id_index()
1706
# The id_index may not be perfectly accurate for tree1, because
1707
# we haven't been keeping it updated. However, it should be
1708
# fine for tree0, and that gives us enough info for what we
1710
keys = id_index.get(file_id, ())
1712
block_i, entry_i, d_present, f_present = \
1713
self._get_block_entry_index(key[0], key[1], 0)
1716
active_entry = self._dirblocks[block_i][1][entry_i]
1717
if (active_entry[0][2] != file_id):
1718
# Some other file is at this path, we don't need to
1721
real_active_kind = active_entry[1][0][0]
1722
if real_active_kind in 'ar':
1723
# We found a record, which was not *this* record,
1724
# which matches the file_id, but is not actually
1725
# present. Something seems *really* wrong.
1726
self._raise_invalid(new_path, file_id,
1727
"We found a tree0 entry that doesnt make sense")
1728
# Now, we've found a tree0 entry which matches the file_id
1729
# but is at a different location. So update them to be
1731
active_dir, active_name = active_entry[0][:2]
1733
active_path = active_dir + '/' + active_name
1735
active_path = active_name
1736
active_entry[1][1] = st('r', new_path, 0, False, '')
1737
entry[1][0] = st('r', active_path, 0, False, '')
1738
elif active_kind == 'r':
1739
raise NotImplementedError()
1741
new_kind = new_details[0]
1743
self._ensure_block(block_index, entry_index, new_path)
1745
def _update_basis_apply_changes(self, changes):
1746
"""Apply a sequence of changes to tree 1 during update_basis_by_delta.
1748
:param changes: A sequence of changes. Each change is a tuple:
1749
(path_utf8, path_utf8, file_id, (entry_details))
1752
for old_path, new_path, file_id, new_details in changes:
1753
# the entry for this file_id must be in tree 1.
1754
entry = self._get_entry(1, file_id, new_path)
1755
if entry[0] is None or entry[1][1][0] in 'ar':
1756
self._raise_invalid(new_path, file_id,
1757
'changed entry considered not present')
1758
entry[1][1] = new_details
1760
def _update_basis_apply_deletes(self, deletes):
1761
"""Apply a sequence of deletes to tree 1 during update_basis_by_delta.
1763
They may be deletes, or renames that have been split into add/delete
1766
:param deletes: A sequence of deletes. Each delete is a tuple:
1767
(old_path_utf8, new_path_utf8, file_id, None, real_delete).
1768
real_delete is True when the desired outcome is an actual deletion
1769
rather than the rename handling logic temporarily deleting a path
1770
during the replacement of a parent.
1772
null = DirState.NULL_PARENT_DETAILS
1773
for old_path, new_path, file_id, _, real_delete in deletes:
1774
if real_delete != (new_path is None):
1775
self._raise_invalid(old_path, file_id, "bad delete delta")
1776
# the entry for this file_id must be in tree 1.
1777
dirname, basename = osutils.split(old_path)
1778
block_index, entry_index, dir_present, file_present = \
1779
self._get_block_entry_index(dirname, basename, 1)
1780
if not file_present:
1781
self._raise_invalid(old_path, file_id,
1782
'basis tree does not contain removed entry')
1783
entry = self._dirblocks[block_index][1][entry_index]
1784
# The state of the entry in the 'active' WT
1785
active_kind = entry[1][0][0]
1786
if entry[0][2] != file_id:
1787
self._raise_invalid(old_path, file_id,
1788
'mismatched file_id in tree 1')
1790
old_kind = entry[1][1][0]
1791
if active_kind in 'ar':
1792
# The active tree doesn't have this file_id.
1793
# The basis tree is changing this record. If this is a
1794
# rename, then we don't want the record here at all
1795
# anymore. If it is just an in-place change, we want the
1796
# record here, but we'll add it if we need to. So we just
1798
if active_kind == 'r':
1799
active_path = entry[1][0][1]
1800
active_entry = self._get_entry(0, file_id, active_path)
1801
if active_entry[1][1][0] != 'r':
1802
self._raise_invalid(old_path, file_id,
1803
"Dirstate did not have matching rename entries")
1804
elif active_entry[1][0][0] in 'ar':
1805
self._raise_invalid(old_path, file_id,
1806
"Dirstate had a rename pointing at an inactive"
1808
active_entry[1][1] = null
1809
del self._dirblocks[block_index][1][entry_index]
1811
# This was a directory, and the active tree says it
1812
# doesn't exist, and now the basis tree says it doesn't
1813
# exist. Remove its dirblock if present
1815
present) = self._find_block_index_from_key(
1818
dir_block = self._dirblocks[dir_block_index][1]
1820
# This entry is empty, go ahead and just remove it
1821
del self._dirblocks[dir_block_index]
1823
# There is still an active record, so just mark this
1826
block_i, entry_i, d_present, f_present = \
1827
self._get_block_entry_index(old_path, '', 1)
1829
dir_block = self._dirblocks[block_i][1]
1830
for child_entry in dir_block:
1831
child_basis_kind = child_entry[1][1][0]
1832
if child_basis_kind not in 'ar':
1833
self._raise_invalid(old_path, file_id,
1834
"The file id was deleted but its children were "
1837
def _after_delta_check_parents(self, parents, index):
1838
"""Check that parents required by the delta are all intact.
1840
:param parents: An iterable of (path_utf8, file_id) tuples which are
1841
required to be present in tree 'index' at path_utf8 with id file_id
1843
:param index: The column in the dirstate to check for parents in.
1845
for dirname_utf8, file_id in parents:
1846
# Get the entry - this ensures that file_id, dirname_utf8 exists and
1847
# has the right file id.
1848
entry = self._get_entry(index, file_id, dirname_utf8)
1849
if entry[1] is None:
1850
self._raise_invalid(dirname_utf8.decode('utf8'),
1851
file_id, "This parent is not present.")
1852
# Parents of things must be directories
1853
if entry[1][index][0] != 'd':
1854
self._raise_invalid(dirname_utf8.decode('utf8'),
1855
file_id, "This parent is not a directory.")
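The parent check can be illustrated without a dirstate at all. In this sketch a tree is modelled as a dict from path to `(file_id, minikind)`; that model, the local `InconsistentDelta` class, and the `check_parents` name are assumptions made for the example, standing in for `_get_entry` and `errors.InconsistentDelta`.

```python
class InconsistentDelta(Exception):
    """Stand-in for errors.InconsistentDelta in this sketch."""


def check_parents(tree, parents):
    """Verify each (dirname, file_id) names an existing directory.

    tree maps path -> (file_id, minikind) for a single tree column.
    """
    for dirname, file_id in parents:
        found = tree.get(dirname)
        if found is None or found[0] != file_id:
            raise InconsistentDelta(
                dirname, file_id, "This parent is not present.")
        if found[1] != "d":
            raise InconsistentDelta(
                dirname, file_id, "This parent is not a directory.")


tree = {
    "": ("root-id", "d"),
    "src": ("src-id", "d"),
    "README": ("readme-id", "f"),
}

check_parents(tree, {("", "root-id"), ("src", "src-id")})
print("parents ok")
```

Running the check after the delta has been applied catches items that were added under a missing path or under a file, link, or tree reference.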
1857
def _observed_sha1(self, entry, sha1, stat_value,
1858
_stat_to_minikind=_stat_to_minikind):
1859
"""Note the sha1 of a file.
1861
:param entry: The entry the sha1 is for.
1862
:param sha1: The observed sha1.
1863
:param stat_value: The os.lstat for the file.
1104
1866
minikind = _stat_to_minikind[stat_value.st_mode & 0170000]
1105
1867
except KeyError:
1106
1868
# Unhandled kind
1108
packed_stat = _pack_stat(stat_value)
1109
(saved_minikind, saved_link_or_sha1, saved_file_size,
1110
saved_executable, saved_packed_stat) = entry[1][0]
1112
if (minikind == saved_minikind
1113
and packed_stat == saved_packed_stat):
1114
# The stat hasn't changed since we saved, so we can re-use the
1119
# size should also be in packed_stat
1120
if saved_file_size == stat_value.st_size:
1121
return saved_link_or_sha1
1123
# If we have gotten this far, that means that we need to actually
1124
# process this entry.
1126
1870
if minikind == 'f':
1127
link_or_sha1 = self._sha1_file(abspath, entry)
1128
executable = self._is_executable(stat_value.st_mode,
1130
if self._cutoff_time is None:
1131
self._sha_cutoff_time()
1132
if (stat_value.st_mtime < self._cutoff_time
1133
and stat_value.st_ctime < self._cutoff_time):
1134
entry[1][0] = ('f', link_or_sha1, stat_value.st_size,
1135
executable, packed_stat)
1137
entry[1][0] = ('f', '', stat_value.st_size,
1138
executable, DirState.NULLSTAT)
1139
elif minikind == 'd':
1141
entry[1][0] = ('d', '', 0, False, packed_stat)
1142
if saved_minikind != 'd':
1143
# This changed from something into a directory. Make sure we
1144
# have a directory block for it. This doesn't happen very
1145
# often, so this doesn't have to be super fast.
1146
block_index, entry_index, dir_present, file_present = \
1147
self._get_block_entry_index(entry[0][0], entry[0][1], 0)
1148
self._ensure_block(block_index, entry_index,
1149
osutils.pathjoin(entry[0][0], entry[0][1]))
1150
elif minikind == 'l':
1151
link_or_sha1 = self._read_link(abspath, saved_link_or_sha1)
1152
if self._cutoff_time is None:
1153
self._sha_cutoff_time()
1154
if (stat_value.st_mtime < self._cutoff_time
1155
and stat_value.st_ctime < self._cutoff_time):
1156
entry[1][0] = ('l', link_or_sha1, stat_value.st_size,
1159
entry[1][0] = ('l', '', stat_value.st_size,
1160
False, DirState.NULLSTAT)
1161
self._dirblock_state = DirState.IN_MEMORY_MODIFIED
1871
if self._cutoff_time is None:
1872
self._sha_cutoff_time()
1873
if (stat_value.st_mtime < self._cutoff_time
1874
and stat_value.st_ctime < self._cutoff_time):
1875
entry[1][0] = ('f', sha1, stat_value.st_size, entry[1][0][3],
1876
pack_stat(stat_value))
1877
self._mark_modified([entry])
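The cutoff test in `_observed_sha1` only records a hash when both timestamps are older than the cutoff, so a file modified within the same timestamp tick cannot be cached with a stale hash. A minimal sketch of that rule follows. The three-second window, the function names, and passing the packed stat in as an opaque value are assumptions for illustration.

```python
def sha_cutoff_time(now, window=3):
    """Return the newest timestamp considered safe to cache against.

    The window length is an assumption for this sketch.
    """
    return int(now) - window


def observed_sha1(details, sha1, st_mtime, st_ctime, st_size,
                  packed_stat, cutoff):
    """Return updated tree-0 details, or the old ones if too recent."""
    if st_mtime < cutoff and st_ctime < cutoff:
        # Keep the executable bit that was already recorded.
        return ("f", sha1, st_size, details[3], packed_stat)
    return details


cutoff = sha_cutoff_time(1000)
old_details = ("f", "", 0, True, "null-stat")

cached = observed_sha1(old_details, "abc", 900, 900, 42, "packed", cutoff)
skipped = observed_sha1(old_details, "abc", 999, 900, 42, "packed", cutoff)
print(cached)
print(skipped)
```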
1164
1879
def _sha_cutoff_time(self):
1165
1880
"""Return cutoff time.
2367
3349
self._split_path_cache = {}
2369
3351
def _requires_lock(self):
2370
"""Checks that a lock is currently held by someone on the dirstate"""
3352
"""Check that a lock is currently held by someone on the dirstate."""
2371
3353
if not self._lock_token:
2372
3354
raise errors.ObjectNotLocked(self)
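Both `update_entry` and `py_update_entry` in this diff classify a path by masking `st_mode` with `0170000` and looking the result up in `_stat_to_minikind`. The same lookup can be sketched in Python 3 with the standard `stat` module, whose `S_IFMT` applies that mask. The table contents and the `None` result for unhandled kinds are assumptions; the real table is not shown in the diff.

```python
import stat

# Assumed contents: file-type bits -> minikind letter.
STAT_TO_MINIKIND = {
    stat.S_IFDIR: "d",
    stat.S_IFREG: "f",
    stat.S_IFLNK: "l",
}


def minikind_for_mode(st_mode):
    """Return the minikind for a stat mode, or None if unhandled."""
    # stat.S_IFMT(mode) is equivalent to mode & 0o170000.
    return STAT_TO_MINIKIND.get(stat.S_IFMT(st_mode))


print(minikind_for_mode(stat.S_IFREG | 0o644))
print(minikind_for_mode(stat.S_IFDIR | 0o755))
print(minikind_for_mode(stat.S_IFIFO | 0o600))
```

Using `dict.get` replaces the `try` / `except KeyError` seen in the diff; either form treats FIFOs, sockets, and devices as unhandled.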
2375
def bisect_dirblock(dirblocks, dirname, lo=0, hi=None, cache={}):
2376
"""Return the index where to insert dirname into the dirblocks.
2378
The return value idx is such that all directory blocks in dirblock[:idx]
2379
have names < dirname, and all blocks in dirblock[idx:] have names >=
2382
Optional args lo (default 0) and hi (default len(dirblocks)) bound the
2383
slice of dirblocks to be searched.
3357
def py_update_entry(state, entry, abspath, stat_value,
3358
_stat_to_minikind=DirState._stat_to_minikind):
3359
"""Update the entry based on what is actually on disk.
3361
This function only calculates the sha if it needs to - if the entry is
3362
uncachable, or clearly different to the first parent's entry, no sha
3363
is calculated, and None is returned.
3365
:param state: The dirstate this entry is in.
3366
:param entry: This is the dirblock entry for the file in question.
3367
:param abspath: The path on disk for this file.
3368
:param stat_value: The stat value done on the path.
3369
:return: None, or the sha1 hexdigest of the file (40 bytes) or link
3370
target of a symlink.
2388
dirname_split = cache[dirname]
3373
minikind = _stat_to_minikind[stat_value.st_mode & 0170000]
2389
3374
except KeyError:
2390
dirname_split = dirname.split('/')
2391
cache[dirname] = dirname_split
2394
# Grab the dirname for the current dirblock
2395
cur = dirblocks[mid][0]
packed_stat = pack_stat(stat_value)
(saved_minikind, saved_link_or_sha1, saved_file_size,
saved_executable, saved_packed_stat) = entry[1][0]
if minikind == 'd' and saved_minikind == 't':
if (minikind == saved_minikind
and packed_stat == saved_packed_stat):
# The stat hasn't changed since we saved, so we can re-use the
# size should also be in packed_stat
if saved_file_size == stat_value.st_size:
return saved_link_or_sha1
# If we have gotten this far, that means that we need to actually
# process this entry.
executable = state._is_executable(stat_value.st_mode,
if state._cutoff_time is None:
state._sha_cutoff_time()
if (stat_value.st_mtime < state._cutoff_time
and stat_value.st_ctime < state._cutoff_time
and len(entry[1]) > 1
and entry[1][1][0] != 'a'):
# Could check for size changes for further optimised
# avoidance of sha1's. However the most prominent case of
# over-shaing is during initial add, which this catches.
# Besides, if content filtering happens, size and sha
# are calculated at the same time, so checking just the size
# gains nothing w.r.t. performance.
link_or_sha1 = state._sha1_file(abspath)
entry[1][0] = ('f', link_or_sha1, stat_value.st_size,
executable, packed_stat)
entry[1][0] = ('f', '', stat_value.st_size,
executable, DirState.NULLSTAT)
worth_saving = False
elif minikind == 'd':
entry[1][0] = ('d', '', 0, False, packed_stat)
if saved_minikind != 'd':
# This changed from something into a directory. Make sure we
# have a directory block for it. This doesn't happen very
# often, so this doesn't have to be super fast.
block_index, entry_index, dir_present, file_present = \
state._get_block_entry_index(entry[0][0], entry[0][1], 0)
state._ensure_block(block_index, entry_index,
osutils.pathjoin(entry[0][0], entry[0][1]))
worth_saving = False
elif minikind == 'l':
if saved_minikind == 'l':
worth_saving = False
link_or_sha1 = state._read_link(abspath, saved_link_or_sha1)
if state._cutoff_time is None:
state._sha_cutoff_time()
if (stat_value.st_mtime < state._cutoff_time
and stat_value.st_ctime < state._cutoff_time):
entry[1][0] = ('l', link_or_sha1, stat_value.st_size,
entry[1][0] = ('l', '', stat_value.st_size,
False, DirState.NULLSTAT)
state._mark_modified([entry])
class ProcessEntryPython(object):
__slots__ = ["old_dirname_to_file_id", "new_dirname_to_file_id",
"last_source_parent", "last_target_parent", "include_unchanged",
"partial", "use_filesystem_for_exec", "utf8_decode",
"searched_specific_files", "search_specific_files",
"searched_exact_paths", "search_specific_file_parents", "seen_ids",
"state", "source_index", "target_index", "want_unversioned", "tree"]
def __init__(self, include_unchanged, use_filesystem_for_exec,
search_specific_files, state, source_index, target_index,
want_unversioned, tree):
self.old_dirname_to_file_id = {}
self.new_dirname_to_file_id = {}
# Are we doing a partial iter_changes?
self.partial = search_specific_files != set([''])
# Using a list so that we can access the values and change them in
# nested scope. Each one is [path, file_id, entry]
self.last_source_parent = [None, None]
self.last_target_parent = [None, None]
self.include_unchanged = include_unchanged
self.use_filesystem_for_exec = use_filesystem_for_exec
self.utf8_decode = cache_utf8._utf8_decode
# for all search indexes in each path at or under each element of
# search_specific_files, if the detail is relocated: add the id, and
# add the relocated path as one to search if it's not searched already.
# If the detail is not relocated, add the id.
self.searched_specific_files = set()
# When we search exact paths without expanding downwards, we record
self.searched_exact_paths = set()
self.search_specific_files = search_specific_files
# The parents up to the root of the paths we are searching.
# After all normal paths are returned, these specific items are returned.
self.search_specific_file_parents = set()
# The ids we've sent out in the delta.
self.seen_ids = set()
self.source_index = source_index
self.target_index = target_index
if target_index != 0:
# A lot of code in here depends on target_index == 0
raise errors.BzrError('unsupported target index')
self.want_unversioned = want_unversioned
def _process_entry(self, entry, path_info, pathjoin=osutils.pathjoin):
"""Compare an entry and real disk to generate delta information.
:param path_info: top_relpath, basename, kind, lstat, abspath for
the path of entry. If None, then the path is considered absent in
the target (Perhaps we should pass in a concrete entry for this?)
Basename is returned as a utf8 string because we expect this
tuple will be ignored, and don't want to take the time to
:return: (iter_changes_result, changed). If the entry has not been
handled then changed is None. Otherwise it is False if no content
or metadata changes have occurred, and True if any content or
metadata change has occurred. If self.include_unchanged is True then
if changed is not None, iter_changes_result will always be a result
tuple. Otherwise, iter_changes_result is None unless changed is
if self.source_index is None:
source_details = DirState.NULL_PARENT_DETAILS
source_details = entry[1][self.source_index]
target_details = entry[1][self.target_index]
target_minikind = target_details[0]
if path_info is not None and target_minikind in 'fdlt':
if not (self.target_index == 0):
raise AssertionError()
link_or_sha1 = update_entry(self.state, entry,
abspath=path_info[4], stat_value=path_info[3])
# The entry may have been modified by update_entry
target_details = entry[1][self.target_index]
target_minikind = target_details[0]
file_id = entry[0][2]
source_minikind = source_details[0]
if source_minikind in 'fdltr' and target_minikind in 'fdlt':
# claimed content in both: diff
#   r    | fdlt   |      | add source to search, add id path move and perform
#        |        |      | diff check on source-target
#   r    | fdlt   |  a   | dangling file that was present in the basis.
if source_minikind in 'r':
# add the source to the search path to find any children it
# has. TODO ? : only add if it is a container ?
if not osutils.is_inside_any(self.searched_specific_files,
self.search_specific_files.add(source_details[1])
# generate the old path; this is needed for stating later
old_path = source_details[1]
old_dirname, old_basename = os.path.split(old_path)
path = pathjoin(entry[0][0], entry[0][1])
old_entry = self.state._get_entry(self.source_index,
# update the source details variable to be the real
if old_entry == (None, None):
raise errors.CorruptDirstate(self.state._filename,
"entry '%s/%s' is considered renamed from %r"
" but source does not exist\n"
"entry: %s" % (entry[0][0], entry[0][1], old_path, entry))
source_details = old_entry[1][self.source_index]
source_minikind = source_details[0]
old_dirname = entry[0][0]
old_basename = entry[0][1]
old_path = path = None
if path_info is None:
# the file is missing on disk, show as removed.
content_change = True
# source and target are both versioned and disk file is present.
target_kind = path_info[2]
if target_kind == 'directory':
old_path = path = pathjoin(old_dirname, old_basename)
self.new_dirname_to_file_id[path] = file_id
if source_minikind != 'd':
content_change = True
# directories have no fingerprint
content_change = False
elif target_kind == 'file':
if source_minikind != 'f':
content_change = True
# Check the sha. We can't just rely on the size as
# content filtering may mean different sizes actually
# map to the same content
if link_or_sha1 is None:
statvalue, link_or_sha1 = \
self.state._sha1_provider.stat_and_sha1(
self.state._observed_sha1(entry, link_or_sha1,
content_change = (link_or_sha1 != source_details[1])
# Target details are updated at update_entry time
if self.use_filesystem_for_exec:
# We don't need S_ISREG here, because we are sure
# we are dealing with a file.
target_exec = bool(stat.S_IEXEC & path_info[3].st_mode)
target_exec = target_details[3]
elif target_kind == 'symlink':
if source_minikind != 'l':
content_change = True
content_change = (link_or_sha1 != source_details[1])
elif target_kind == 'tree-reference':
if source_minikind != 't':
content_change = True
content_change = False
path = pathjoin(old_dirname, old_basename)
raise errors.BadFileKindError(path, path_info[2])
if source_minikind == 'd':
old_path = path = pathjoin(old_dirname, old_basename)
self.old_dirname_to_file_id[old_path] = file_id
# parent id is the entry for the path in the target tree
if old_basename and old_dirname == self.last_source_parent[0]:
source_parent_id = self.last_source_parent[1]
source_parent_id = self.old_dirname_to_file_id[old_dirname]
source_parent_entry = self.state._get_entry(self.source_index,
path_utf8=old_dirname)
source_parent_id = source_parent_entry[0][2]
if source_parent_id == entry[0][2]:
# This is the root, so the parent is None
source_parent_id = None
self.last_source_parent[0] = old_dirname
self.last_source_parent[1] = source_parent_id
new_dirname = entry[0][0]
if entry[0][1] and new_dirname == self.last_target_parent[0]:
target_parent_id = self.last_target_parent[1]
target_parent_id = self.new_dirname_to_file_id[new_dirname]
# TODO: We don't always need to do the lookup, because the
# parent entry will be the same as the source entry.
target_parent_entry = self.state._get_entry(self.target_index,
path_utf8=new_dirname)
if target_parent_entry == (None, None):
raise AssertionError(
"Could not find target parent in wt: %s\nparent of: %s"
% (new_dirname, entry))
target_parent_id = target_parent_entry[0][2]
if target_parent_id == entry[0][2]:
# This is the root, so the parent is None
target_parent_id = None
self.last_target_parent[0] = new_dirname
self.last_target_parent[1] = target_parent_id
source_exec = source_details[3]
changed = (content_change
or source_parent_id != target_parent_id
or old_basename != entry[0][1]
or source_exec != target_exec
if not changed and not self.include_unchanged:
if old_path is None:
old_path = path = pathjoin(old_dirname, old_basename)
old_path_u = self.utf8_decode(old_path)[0]
old_path_u = self.utf8_decode(old_path)[0]
if old_path == path:
path_u = self.utf8_decode(path)[0]
source_kind = DirState._minikind_to_kind[source_minikind]
return (entry[0][2],
(old_path_u, path_u),
(source_parent_id, target_parent_id),
(self.utf8_decode(old_basename)[0], self.utf8_decode(entry[0][1])[0]),
(source_kind, target_kind),
(source_exec, target_exec)), changed
elif source_minikind in 'a' and target_minikind in 'fdlt':
# looks like a new file
path = pathjoin(entry[0][0], entry[0][1])
# parent id is the entry for the path in the target tree
# TODO: these are the same for an entire directory: cache em.
parent_id = self.state._get_entry(self.target_index,
path_utf8=entry[0][0])[0][2]
if parent_id == entry[0][2]:
if path_info is not None:
if self.use_filesystem_for_exec:
# We need S_ISREG here, because we aren't sure if this
stat.S_ISREG(path_info[3].st_mode)
and stat.S_IEXEC & path_info[3].st_mode)
target_exec = target_details[3]
return (entry[0][2],
(None, self.utf8_decode(path)[0]),
(None, self.utf8_decode(entry[0][1])[0]),
(None, path_info[2]),
(None, target_exec)), True
# It's a missing file, report it as such.
return (entry[0][2],
(None, self.utf8_decode(path)[0]),
(None, self.utf8_decode(entry[0][1])[0]),
(None, False)), True
elif source_minikind in 'fdlt' and target_minikind in 'a':
# unversioned, possibly, or possibly not deleted: we don't care.
# if it's still on disk, *and* there's no other entry at this
# path [we don't know this in this routine at the moment -
# perhaps we should change this - then it would be an unknown.
old_path = pathjoin(entry[0][0], entry[0][1])
# parent id is the entry for the path in the target tree
parent_id = self.state._get_entry(self.source_index, path_utf8=entry[0][0])[0][2]
if parent_id == entry[0][2]:
return (entry[0][2],
(self.utf8_decode(old_path)[0], None),
(self.utf8_decode(entry[0][1])[0], None),
(DirState._minikind_to_kind[source_minikind], None),
(source_details[3], None)), True
elif source_minikind in 'fdlt' and target_minikind in 'r':
# a rename; could be a true rename, or a rename inherited from
# a renamed parent. TODO: handle this efficiently. It's not a
# common case to rename dirs though, so a correct but slow
# implementation will do.
if not osutils.is_inside_any(self.searched_specific_files, target_details[1]):
self.search_specific_files.add(target_details[1])
elif source_minikind in 'ra' and target_minikind in 'ra':
# neither of the selected trees contains this file,
# so skip over it. This is not currently directly tested, but
# is indirectly via test_too_much.TestCommands.test_conflicts.
raise AssertionError("don't know how to compare "
"source_minikind=%r, target_minikind=%r"
% (source_minikind, target_minikind))
def _gather_result_for_consistency(self, result):
"""Check a result we will yield to make sure we are consistent later.
This gathers result's parents into a set to output later.
:param result: A result tuple.
if not self.partial or not result[0]:
self.seen_ids.add(result[0])
new_path = result[1][1]
# Not the root and not a delete: queue up the parents of the path.
self.search_specific_file_parents.update(
osutils.parent_directories(new_path.encode('utf8')))
# Add the root directory which parent_directories does not
self.search_specific_file_parents.add('')
def iter_changes(self):
"""Iterate over the changes."""
utf8_decode = cache_utf8._utf8_decode
_cmp_by_dirs = cmp_by_dirs
_process_entry = self._process_entry
search_specific_files = self.search_specific_files
searched_specific_files = self.searched_specific_files
splitpath = osutils.splitpath
# compare source_index and target_index at or under each element of search_specific_files.
# follow the following comparison table. Note that we only want to do diff operations when
# the target is fdl because that's when the walkdirs logic will have exposed the pathinfo
# Source | Target | disk | action
#   r    | fdlt   |      | add source to search, add id path move and perform
#        |        |      | diff check on source-target
#   r    | fdlt   |  a   | dangling file that was present in the basis.
#   r    |  a     |      | add source to search
#   r    |  r     |      | this path is present in a non-examined tree, skip.
#   r    |  r     |  a   | this path is present in a non-examined tree, skip.
#   a    | fdlt   |      | add new id
#   a    | fdlt   |  a   | dangling locally added file, skip
#   a    |  a     |      | not present in either tree, skip
#   a    |  a     |  a   | not present in any tree, skip
#   a    |  r     |      | not present in either tree at this path, skip as it
#        |        |      | may not be selected by the user's list of paths.
#   a    |  r     |  a   | not present in either tree at this path, skip as it
#        |        |      | may not be selected by the user's list of paths.
#  fdlt  | fdlt   |      | content in both: diff them
#  fdlt  | fdlt   |  a   | deleted locally, but not unversioned - show as deleted ?
#  fdlt  |  a     |      | unversioned: output deleted id for now
#  fdlt  |  a     |  a   | unversioned and deleted: output deleted id
#  fdlt  |  r     |      | relocated in this tree, so add target to search.
#        |        |      | Don't diff, we will see an r,fd; pair when we reach
#        |        |      | this id at the other path.
#  fdlt  |  r     |  a   | relocated in this tree, so add target to search.
#        |        |      | Don't diff, we will see an r,fd; pair when we reach
#        |        |      | this id at the other path.
# TODO: jam 20070516 - Avoid the _get_entry lookup overhead by
# keeping a cache of directories that we have seen.
while search_specific_files:
# TODO: the pending list should be lexically sorted? the
# interface doesn't require it.
current_root = search_specific_files.pop()
current_root_unicode = current_root.decode('utf8')
searched_specific_files.add(current_root)
# process the entries for this containing directory: the rest will be
# found by their parents recursively.
root_entries = self.state._entries_for_path(current_root)
root_abspath = self.tree.abspath(current_root_unicode)
root_stat = os.lstat(root_abspath)
if e.errno == errno.ENOENT:
# the path does not exist: let _process_entry know that.
root_dir_info = None
# some other random error: hand it up.
root_dir_info = ('', current_root,
osutils.file_kind_from_stat_mode(root_stat.st_mode), root_stat,
if root_dir_info[2] == 'directory':
if self.tree._directory_is_tree_reference(
current_root.decode('utf8')):
root_dir_info = root_dir_info[:2] + \
('tree-reference',) + root_dir_info[3:]
if not root_entries and not root_dir_info:
# this specified path is not present at all, skip it.
path_handled = False
for entry in root_entries:
result, changed = _process_entry(entry, root_dir_info)
if changed is not None:
self._gather_result_for_consistency(result)
if changed or self.include_unchanged:
if self.want_unversioned and not path_handled and root_dir_info:
new_executable = bool(
stat.S_ISREG(root_dir_info[3].st_mode)
and stat.S_IEXEC & root_dir_info[3].st_mode)
(None, current_root_unicode),
(None, splitpath(current_root_unicode)[-1]),
(None, root_dir_info[2]),
(None, new_executable)
initial_key = (current_root, '', '')
block_index, _ = self.state._find_block_index_from_key(initial_key)
if block_index == 0:
# we have processed the total root already, but because the
# initial key matched it we should skip it here.
if root_dir_info and root_dir_info[2] == 'tree-reference':
current_dir_info = None
dir_iterator = osutils._walkdirs_utf8(root_abspath, prefix=current_root)
current_dir_info = dir_iterator.next()
# on win32, python2.4 has e.errno == ERROR_DIRECTORY, but
# python 2.5 has e.errno == EINVAL,
# and e.winerror == ERROR_DIRECTORY
e_winerror = getattr(e, 'winerror', None)
win_errors = (ERROR_DIRECTORY, ERROR_PATH_NOT_FOUND)
# there may be directories in the inventory even though
# this path is not a file on disk: so mark it as end of
if e.errno in (errno.ENOENT, errno.ENOTDIR, errno.EINVAL):
current_dir_info = None
elif (sys.platform == 'win32'
and (e.errno in win_errors
or e_winerror in win_errors)):
current_dir_info = None
if current_dir_info[0][0] == '':
# remove .bzr from iteration
bzr_index = bisect.bisect_left(current_dir_info[1], ('.bzr',))
if current_dir_info[1][bzr_index][0] != '.bzr':
raise AssertionError()
del current_dir_info[1][bzr_index]
# walk until both the directory listing and the versioned metadata
if (block_index < len(self.state._dirblocks) and
osutils.is_inside(current_root, self.state._dirblocks[block_index][0])):
current_block = self.state._dirblocks[block_index]
current_block = None
while (current_dir_info is not None or
current_block is not None):
if (current_dir_info and current_block
and current_dir_info[0][0] != current_block[0]):
if _cmp_by_dirs(current_dir_info[0][0], current_block[0]) < 0:
# filesystem data refers to paths not covered by the dirblock.
# this has two possibilities:
# A) it is versioned but empty, so there is no block for it
# B) it is not versioned.
# if (A) then we need to recurse into it to check for
# new unknown files or directories.
# if (B) then we should ignore it, because we don't
# recurse into unknown directories.
while path_index < len(current_dir_info[1]):
current_path_info = current_dir_info[1][path_index]
if self.want_unversioned:
if current_path_info[2] == 'directory':
if self.tree._directory_is_tree_reference(
current_path_info[0].decode('utf8')):
current_path_info = current_path_info[:2] + \
('tree-reference',) + current_path_info[3:]
new_executable = bool(
stat.S_ISREG(current_path_info[3].st_mode)
and stat.S_IEXEC & current_path_info[3].st_mode)
(None, utf8_decode(current_path_info[0])[0]),
(None, utf8_decode(current_path_info[1])[0]),
(None, current_path_info[2]),
(None, new_executable))
# don't descend into this unversioned path if it is
if current_path_info[2] in ('directory',
del current_dir_info[1][path_index]
# This dir info has been handled, go to the next
current_dir_info = dir_iterator.next()
except StopIteration:
current_dir_info = None
# We have a dirblock entry for this location, but there
# is no filesystem path for this. This is most likely
# because a directory was removed from the disk.
# We don't have to report the missing directory,
# because that should have already been handled, but we
# need to handle all of the files that are contained
for current_entry in current_block[1]:
# entry referring to file not present on disk.
# advance the entry only, after processing.
result, changed = _process_entry(current_entry, None)
if changed is not None:
self._gather_result_for_consistency(result)
if changed or self.include_unchanged:
if (block_index < len(self.state._dirblocks) and
osutils.is_inside(current_root,
self.state._dirblocks[block_index][0])):
current_block = self.state._dirblocks[block_index]
current_block = None
if current_block and entry_index < len(current_block[1]):
current_entry = current_block[1][entry_index]
current_entry = None
advance_entry = True
if current_dir_info and path_index < len(current_dir_info[1]):
current_path_info = current_dir_info[1][path_index]
if current_path_info[2] == 'directory':
if self.tree._directory_is_tree_reference(
current_path_info[0].decode('utf8')):
current_path_info = current_path_info[:2] + \
('tree-reference',) + current_path_info[3:]
current_path_info = None
path_handled = False
while (current_entry is not None or
current_path_info is not None):
if current_entry is None:
# the check for path_handled when the path is advanced
# will yield this path if needed.
elif current_path_info is None:
# no path is fine: the per entry code will handle it.
result, changed = _process_entry(current_entry, current_path_info)
if changed is not None:
self._gather_result_for_consistency(result)
if changed or self.include_unchanged:
elif (current_entry[0][1] != current_path_info[1]
or current_entry[1][self.target_index][0] in 'ar'):
# The current path on disk doesn't match the dirblock
# record. Either the dirblock is marked as absent, or
# the file on disk is not present at all in the
# dirblock. Either way, report about the dirblock
# entry, and let other code handle the filesystem one.
# Compare the basename for these files to determine
if current_path_info[1] < current_entry[0][1]:
# extra file on disk: pass for now, but only
# increment the path, not the entry
advance_entry = False
# entry referring to file not present on disk.
# advance the entry only, after processing.
result, changed = _process_entry(current_entry, None)
if changed is not None:
self._gather_result_for_consistency(result)
if changed or self.include_unchanged:
advance_path = False
result, changed = _process_entry(current_entry, current_path_info)
if changed is not None:
self._gather_result_for_consistency(result)
if changed or self.include_unchanged:
if advance_entry and current_entry is not None:
if entry_index < len(current_block[1]):
current_entry = current_block[1][entry_index]
current_entry = None
advance_entry = True # reset the advance flag
if advance_path and current_path_info is not None:
if not path_handled:
# unversioned in all regards
if self.want_unversioned:
new_executable = bool(
stat.S_ISREG(current_path_info[3].st_mode)
and stat.S_IEXEC & current_path_info[3].st_mode)
relpath_unicode = utf8_decode(current_path_info[0])[0]
except UnicodeDecodeError:
raise errors.BadFilenameEncoding(
current_path_info[0], osutils._fs_enc)
(None, relpath_unicode),
(None, utf8_decode(current_path_info[1])[0]),
(None, current_path_info[2]),
(None, new_executable))
# don't descend into this unversioned path if it is
if current_path_info[2] in ('directory',):
del current_dir_info[1][path_index]
# don't descend the disk iterator into any tree
if current_path_info[2] == 'tree-reference':
del current_dir_info[1][path_index]
if path_index < len(current_dir_info[1]):
current_path_info = current_dir_info[1][path_index]
if current_path_info[2] == 'directory':
if self.tree._directory_is_tree_reference(
current_path_info[0].decode('utf8')):
current_path_info = current_path_info[:2] + \
('tree-reference',) + current_path_info[3:]
current_path_info = None
path_handled = False
advance_path = True # reset the advance flag.
if current_block is not None:
if (block_index < len(self.state._dirblocks) and
osutils.is_inside(current_root, self.state._dirblocks[block_index][0])):
current_block = self.state._dirblocks[block_index]
current_block = None
if current_dir_info is not None:
current_dir_info = dir_iterator.next()
except StopIteration:
current_dir_info = None
for result in self._iter_specific_file_parents():
def _iter_specific_file_parents(self):
"""Iterate over the specific file parents."""
while self.search_specific_file_parents:
# Process the parent directories for the paths we were iterating.
# Even in extremely large trees this should be modest, so currently
# no attempt is made to optimise.
path_utf8 = self.search_specific_file_parents.pop()
if osutils.is_inside_any(self.searched_specific_files, path_utf8):
# We've examined this path.
if path_utf8 in self.searched_exact_paths:
# We've examined this path.
path_entries = self.state._entries_for_path(path_utf8)
# We need either one or two entries. If the path in
# self.target_index has moved (so the entry in source_index is in
# 'ar') then we need to also look for the entry for this path in
# self.source_index, to output the appropriate delete-or-rename.
selected_entries = []
for candidate_entry in path_entries:
# Find entries present in target at this path:
if candidate_entry[1][self.target_index][0] not in 'ar':
selected_entries.append(candidate_entry)
# Find entries present in source at this path:
elif (self.source_index is not None and
candidate_entry[1][self.source_index][0] not in 'ar'):
if candidate_entry[1][self.target_index][0] == 'a':
# Deleted, emit it here.
selected_entries.append(candidate_entry)
# renamed, emit it when we process the directory it
self.search_specific_file_parents.add(
candidate_entry[1][self.target_index][1])
raise AssertionError(
"Missing entry for specific path parent %r, %r" % (
path_utf8, path_entries))
path_info = self._path_info(path_utf8, path_utf8.decode('utf8'))
for entry in selected_entries:
if entry[0][2] in self.seen_ids:
result, changed = self._process_entry(entry, path_info)
raise AssertionError(
"Got entry<->path mismatch for specific path "
"%r entry %r path_info %r " % (
path_utf8, entry, path_info))
# Only include changes - we're outside the user's requested
self._gather_result_for_consistency(result)
if (result[6][0] == 'directory' and
result[6][1] != 'directory'):
# This stopped being a directory, the old children have
if entry[1][self.source_index][0] == 'r':
# renamed, take the source path
entry_path_utf8 = entry[1][self.source_index][1]
entry_path_utf8 = path_utf8
initial_key = (entry_path_utf8, '', '')
block_index, _ = self.state._find_block_index_from_key(
if block_index == 0:
# The children of the root are in block index 1.
current_block = None
if block_index < len(self.state._dirblocks):
current_block = self.state._dirblocks[block_index]
if not osutils.is_inside(
entry_path_utf8, current_block[0]):
# No entries for this directory at all.
current_block = None
if current_block is not None:
for entry in current_block[1]:
if entry[1][self.source_index][0] in 'ar':
# Not in the source tree, so doesn't have to be
# Path of the entry itself.
self.search_specific_file_parents.add(
osutils.pathjoin(*entry[0][:2]))
if changed or self.include_unchanged:
self.searched_exact_paths.add(path_utf8)
def _path_info(self, utf8_path, unicode_path):
"""Generate path_info for unicode_path.
:return: None if unicode_path does not exist, or a path_info tuple.
abspath = self.tree.abspath(unicode_path)
cur_split = cache[cur]
cur_split = cur.split('/')
cache[cur] = cur_split
if cur_split < dirname_split: lo = mid+1
stat = os.lstat(abspath)
if e.errno == errno.ENOENT:
# the path does not exist.
utf8_basename = utf8_path.rsplit('/', 1)[-1]
dir_info = (utf8_path, utf8_basename,
osutils.file_kind_from_stat_mode(stat.st_mode), stat,
if dir_info[2] == 'directory':
if self.tree._directory_is_tree_reference(
dir_info = dir_info[:2] + \
('tree-reference',) + dir_info[3:]
# Try to load the compiled form if possible
from bzrlib._dirstate_helpers_pyx import (
ProcessEntryC as _process_entry,
update_entry as update_entry,
except ImportError, e:
osutils.failed_to_load_extension(e)
from bzrlib._dirstate_helpers_py import (
# FIXME: It would be nice to be able to track moved lines so that the
# corresponding python code can be moved to the _dirstate_helpers_py
# module. I don't want to break the history for this important piece of
# code so I left the code here -- vila 20090622
update_entry = py_update_entry
_process_entry = ProcessEntryPython
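The closing lines of the module prefer the compiled helpers and fall back to the pure-Python versions when the import fails. A minimal, self-contained sketch of that pattern follows. The names `load_update_entry`, `py_update_entry_stub` and the module name `hypothetical_compiled_helpers_pyx` are illustrative assumptions, not bzrlib names, and the sketch uses Python 3 syntax rather than the Python 2 syntax of the listing.

```python
import importlib


def load_update_entry(compiled_module_name, pure_python_impl):
    """Return (implementation, used_compiled).

    compiled_module_name is a hypothetical module name used only for
    illustration; it is not a real bzrlib module.
    """
    try:
        module = importlib.import_module(compiled_module_name)
    except ImportError:
        # No compiled extension is available: keep the pure-Python version.
        return pure_python_impl, False
    # The compiled module is assumed to expose an update_entry callable.
    return module.update_entry, True


def py_update_entry_stub(entry):
    # Stand-in for the real pure-Python implementation.
    return ("python", entry)


update_entry, used_compiled = load_update_entry(
    "hypothetical_compiled_helpers_pyx", py_update_entry_stub)
```

Callers then use `update_entry` without knowing which implementation was selected, which is why both implementations need the same call signature.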