lines by NL. The field delimiters are omitted in the grammar, line delimiters
are not - this is done for clarity of reading. All string data is in utf8.
25
MINIKIND = "f" | "d" | "l" | "a" | "r" | "t";
28
WHOLE_NUMBER = {digit}, digit;
30
REVISION_ID = a non-empty utf8 string;
32
dirstate format = header line, full checksum, row count, parent details,
33
ghost_details, entries;
34
header line = "#bazaar dirstate flat format 3", NL;
35
full checksum = "crc32: ", ["-"], WHOLE_NUMBER, NL;
36
row count = "num_entries: ", WHOLE_NUMBER, NL;
37
parent_details = WHOLE_NUMBER, {REVISION_ID}*, NL;
ghost_details = WHOLE_NUMBER, {REVISION_ID}*, NL;
40
entry = entry_key, current_entry_details, {parent_entry_details};
41
entry_key = dirname, basename, fileid;
42
current_entry_details = common_entry_details, working_entry_details;
43
parent_entry_details = common_entry_details, history_entry_details;
44
common_entry_details = MINIKIND, fingerprint, size, executable;
working_entry_details = packed_stat;
46
history_entry_details = REVISION_ID;
49
fingerprint = a nonempty utf8 sequence with meaning defined by minikind.
51
Given this definition, the following is useful to know::
53
entry (aka row) - all the data for a given key.
54
entry[0]: The key (dirname, basename, fileid)
58
entry[1]: The tree(s) data for this path and id combination.
59
entry[1][0]: The current tree
60
entry[1][1]: The second tree
62
For an entry for a tree, we have (using tree 0 - current tree) to demonstrate::
64
entry[1][0][0]: minikind
65
entry[1][0][1]: fingerprint
67
entry[1][0][3]: executable
68
entry[1][0][4]: packed_stat
72
entry[1][1][4]: revision_id
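As a rough illustration of the layout above, the sketch below builds one entry by hand and reads it back with the same indexes. The paths, ids and fingerprint strings are invented for the example, and index 2 is taken to be ``size`` from the field order in ``common_entry_details``; this is not code from the module itself.

```python
# Hypothetical sample data laid out as described above; none of these
# values come from a real dirstate file.
key = ("docs", "readme.txt", "readme-file-id")   # dirname, basename, fileid

# Tree 0 (current tree): minikind, fingerprint, size, executable, packed_stat
current = ("f", "sha1-of-content", 120, False, "packed-stat-blob")
# Tree 1 (second tree): minikind, fingerprint, size, executable, revision_id
parent = ("f", "sha1-of-content", 120, False, "rev-1")

entry = (key, [current, parent])


def describe(entry):
    """Return a small dict naming the fields reached by each index."""
    return {
        "file_id": entry[0][2],
        "minikind": entry[1][0][0],
        "fingerprint": entry[1][0][1],
        "size": entry[1][0][2],        # assumed from the grammar's field order
        "executable": entry[1][0][3],
        "packed_stat": entry[1][0][4],
        "parent_revision": entry[1][1][4],
    }


info = describe(entry)
```

Note that slot 4 means different things per tree: ``packed_stat`` for tree 0 and ``revision_id`` for the other trees.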
23
MINIKIND = "f" | "d" | "l" | "a" | "r" | "t";
26
WHOLE_NUMBER = {digit}, digit;
28
REVISION_ID = a non-empty utf8 string;
30
dirstate format = header line, full checksum, row count, parent details,
31
ghost_details, entries;
32
header line = "#bazaar dirstate flat format 3", NL;
33
full checksum = "crc32: ", ["-"], WHOLE_NUMBER, NL;
34
row count = "num_entries: ", WHOLE_NUMBER, NL;
35
parent_details = WHOLE_NUMBER, {REVISION_ID}*, NL;
ghost_details = WHOLE_NUMBER, {REVISION_ID}*, NL;
38
entry = entry_key, current_entry_details, {parent_entry_details};
39
entry_key = dirname, basename, fileid;
40
current_entry_details = common_entry_details, working_entry_details;
41
parent_entry_details = common_entry_details, history_entry_details;
42
common_entry_details = MINIKIND, fingerprint, size, executable;
working_entry_details = packed_stat;
44
history_entry_details = REVISION_ID;
47
fingerprint = a nonempty utf8 sequence with meaning defined by minikind.
49
Given this definition, the following is useful to know:
50
entry (aka row) - all the data for a given key.
51
entry[0]: The key (dirname, basename, fileid)
55
entry[1]: The tree(s) data for this path and id combination.
56
entry[1][0]: The current tree
57
entry[1][1]: The second tree
59
For an entry for a tree, we have (using tree 0 - current tree) to demonstrate:
60
entry[1][0][0]: minikind
61
entry[1][0][1]: fingerprint
63
entry[1][0][3]: executable
64
entry[1][0][4]: packed_stat
66
entry[1][1][4]: revision_id
There may be multiple rows at the root, one per id present in the root, so the
in-memory root row is now::
77
self._dirblocks[0] -> ('', [entry ...]),
79
and the entries in there are::
83
entries[0][2]: file_id
84
entries[1][0]: The tree data for the current tree for this fileid at /
89
'r' is a relocated entry: This path is not present in this tree with this
id, but the id can be found at another location. The fingerprint is
used to point to the target location.
'a' is an absent entry: In that tree the id is not present at this path.
'd' is a directory entry: This path in this tree is a directory with the
current file id. There is no fingerprint for directories.
'f' is a file entry: As for directory, but it's a file. The fingerprint is
the sha1 value of the file's canonical form, i.e. after any read
filters have been applied to the convenience form stored in the working
99
'l' is a symlink entry: As for directory, but a symlink. The fingerprint is
101
't' is a reference to a nested subtree; the fingerprint is the referenced
69
in memory root row is now:
70
self._dirblocks[0] -> ('', [entry ...]),
71
and the entries in there are
74
entries[0][2]: file_id
75
entries[1][0]: The tree data for the current tree for this fileid at /
79
'r' is a relocated entry: This path is not present in this tree with this id,
80
but the id can be found at another location. The fingerprint is used to
81
point to the target location.
82
'a' is an absent entry: In that tree the id is not present at this path.
83
'd' is a directory entry: This path in this tree is a directory with the
84
current file id. There is no fingerprint for directories.
85
'f' is a file entry: As for directory, but its a file. The fingerprint is a
87
'l' is a symlink entry: As for directory, but a symlink. The fingerprint is the
89
't' is a reference to a nested subtree; the fingerprint is the referenced
106
The entries on disk and in memory are ordered according to the following keys::
94
The entries on disk and in memory are ordered according to the following keys:
108
96
directory, as a list of components
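A small sketch of why the directory is compared as a list of components rather than as a plain string. The ``sort_key`` helper is hypothetical, and using basename and file id as the remaining parts of the key is an assumption based on the (dirname, basename, fileid) key layout.

```python
def sort_key(entry_key):
    """Sort by directory components first, then the rest of the key.

    Hypothetical helper for illustration only.
    """
    dirname, basename, file_id = entry_key
    return (dirname.split("/"), basename, file_id)


keys = [
    ("a-b", "x", "id-3"),
    ("a/b", "x", "id-2"),
    ("a", "x", "id-1"),
]

# Component-wise ordering keeps 'a/b' directly after its parent 'a'.
by_components = [k[0] for k in sorted(keys, key=sort_key)]
# Plain string ordering puts 'a-b' between them, because '-' sorts before '/'.
by_plain_string = [k[0] for k in sorted(keys)]
```

With component-wise comparison a directory's children stay adjacent to it, which a plain string comparison does not guarantee.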
112
100
--- Format 1 had the following different definition: ---
116
rows = dirname, NULL, basename, NULL, MINIKIND, NULL, fileid_utf8, NULL,
117
WHOLE NUMBER (* size *), NULL, packed stat, NULL, sha1|symlink target,
119
PARENT ROW = NULL, revision_utf8, NULL, MINIKIND, NULL, dirname, NULL,
120
basename, NULL, WHOLE NUMBER (* size *), NULL, "y" | "n", NULL,
101
rows = dirname, NULL, basename, NULL, MINIKIND, NULL, fileid_utf8, NULL,
102
WHOLE NUMBER (* size *), NULL, packed stat, NULL, sha1|symlink target,
104
PARENT ROW = NULL, revision_utf8, NULL, MINIKIND, NULL, dirname, NULL,
105
basename, NULL, WHOLE NUMBER (* size *), NULL, "y" | "n", NULL,
PARENT ROWs are emitted for every parent that is not in the ghosts details
line. That is, if the parents are foo, bar, baz, and the ghosts are bar, then
1767
1413
null = DirState.NULL_PARENT_DETAILS
1768
1414
for old_path, new_path, file_id, _, real_delete in deletes:
1769
if real_delete != (new_path is None):
1770
self._raise_invalid(old_path, file_id, "bad delete delta")
1416
assert new_path is None
1418
assert new_path is not None
1771
1419
# the entry for this file_id must be in tree 1.
1772
1420
dirname, basename = osutils.split(old_path)
1773
1421
block_index, entry_index, dir_present, file_present = \
1774
1422
self._get_block_entry_index(dirname, basename, 1)
1775
1423
if not file_present:
1776
self._raise_invalid(old_path, file_id,
1424
self._changes_aborted = True
1425
raise errors.InconsistentDelta(old_path, file_id,
1777
1426
'basis tree does not contain removed entry')
1778
1427
entry = self._dirblocks[block_index][1][entry_index]
1779
# The state of the entry in the 'active' WT
1780
active_kind = entry[1][0][0]
1781
1428
if entry[0][2] != file_id:
1782
self._raise_invalid(old_path, file_id,
1429
self._changes_aborted = True
1430
raise errors.InconsistentDelta(old_path, file_id,
1783
1431
'mismatched file_id in tree 1')
1785
old_kind = entry[1][1][0]
1786
if active_kind in 'ar':
1787
# The active tree doesn't have this file_id.
1788
# The basis tree is changing this record. If this is a
1789
# rename, then we don't want the record here at all
1790
# anymore. If it is just an in-place change, we want the
1791
# record here, but we'll add it if we need to. So we just
1793
if active_kind == 'r':
1794
active_path = entry[1][0][1]
1795
active_entry = self._get_entry(0, file_id, active_path)
1796
if active_entry[1][1][0] != 'r':
1797
self._raise_invalid(old_path, file_id,
1798
"Dirstate did not have matching rename entries")
1799
elif active_entry[1][0][0] in 'ar':
1800
self._raise_invalid(old_path, file_id,
1801
"Dirstate had a rename pointing at an inactive"
1803
active_entry[1][1] = null
1433
if entry[1][0][0] != 'a':
1434
self._changes_aborted = True
1435
raise errors.InconsistentDelta(old_path, file_id,
1436
'This was marked as a real delete, but the WT state'
1437
' claims that it still exists and is versioned.')
1804
1438
del self._dirblocks[block_index][1][entry_index]
1806
# This was a directory, and the active tree says it
1807
# doesn't exist, and now the basis tree says it doesn't
1808
# exist. Remove its dirblock if present
1810
present) = self._find_block_index_from_key(
1813
dir_block = self._dirblocks[dir_block_index][1]
1815
# This entry is empty, go ahead and just remove it
1816
del self._dirblocks[dir_block_index]
1818
# There is still an active record, so just mark this
1821
block_i, entry_i, d_present, f_present = \
1822
self._get_block_entry_index(old_path, '', 1)
1824
dir_block = self._dirblocks[block_i][1]
1825
for child_entry in dir_block:
1826
child_basis_kind = child_entry[1][1][0]
1827
if child_basis_kind not in 'ar':
1828
self._raise_invalid(old_path, file_id,
1829
"The file id was deleted but its children were "
1832
def _after_delta_check_parents(self, parents, index):
1833
"""Check that parents required by the delta are all intact.
1835
:param parents: An iterable of (path_utf8, file_id) tuples which are
1836
required to be present in tree 'index' at path_utf8 with id file_id
1838
:param index: The column in the dirstate to check for parents in.
1840
for dirname_utf8, file_id in parents:
1841
# Get the entry - this ensures that file_id, dirname_utf8 exists and
1842
# has the right file id.
1843
entry = self._get_entry(index, file_id, dirname_utf8)
1844
if entry[1] is None:
1845
self._raise_invalid(dirname_utf8.decode('utf8'),
1846
file_id, "This parent is not present.")
1847
# Parents of things must be directories
1848
if entry[1][index][0] != 'd':
1849
self._raise_invalid(dirname_utf8.decode('utf8'),
1850
file_id, "This parent is not a directory.")
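The parent check above can be sketched in isolation as follows. The ``InconsistentDelta`` class and the dict-based ``tree`` are simplified stand-ins invented for this example, not the real error class or dirstate lookup.

```python
class InconsistentDelta(Exception):
    """Stand-in for the error raised on a bad delta (hypothetical)."""


def check_parents(parents, tree):
    """Verify every (path, file_id) pair names a directory in ``tree``.

    ``tree`` is a plain dict mapping path -> (file_id, minikind); it is a
    simplified stand-in for the real dirstate lookup.
    """
    for path, file_id in parents:
        found = tree.get(path)
        if found is None or found[0] != file_id:
            raise InconsistentDelta(path, file_id, "This parent is not present.")
        # Parents of things must be directories.
        if found[1] != "d":
            raise InconsistentDelta(path, file_id, "This parent is not a directory.")


tree = {"": ("root-id", "d"), "src": ("src-id", "d"), "src/a.py": ("a-id", "f")}
check_parents([("", "root-id"), ("src", "src-id")], tree)  # passes silently

try:
    check_parents([("src/a.py", "a-id")], tree)
    outcome = "accepted"
except InconsistentDelta as exc:
    outcome = exc.args[2]
```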
1852
def _observed_sha1(self, entry, sha1, stat_value,
1853
_stat_to_minikind=_stat_to_minikind):
1854
"""Note the sha1 of a file.
1856
:param entry: The entry the sha1 is for.
1857
:param sha1: The observed sha1.
1858
:param stat_value: The os.lstat for the file.
1440
if entry[1][0][0] == 'a':
1441
self._changes_aborted = True
1442
raise errors.InconsistentDelta(old_path, file_id,
1443
'The entry was considered a rename, but the source path'
1444
' is marked as absent.')
1445
# For whatever reason, we were asked to rename an entry
1446
# that was originally marked as deleted. This could be
1447
# because we are renaming the parent directory, and the WT
1448
# current state has the file marked as deleted.
1449
elif entry[1][0][0] == 'r':
1450
# implement the rename
1451
del self._dirblocks[block_index][1][entry_index]
1453
# it is being resurrected here, so blank it out temporarily.
1454
self._dirblocks[block_index][1][entry_index][1][1] = null
1456
def update_entry(self, entry, abspath, stat_value,
1457
_stat_to_minikind=_stat_to_minikind,
1458
_pack_stat=pack_stat):
1459
"""Update the entry based on what is actually on disk.
1461
:param entry: This is the dirblock entry for the file in question.
1462
:param abspath: The path on disk for this file.
1463
:param stat_value: (optional) if we already have done a stat on the
1465
:return: The sha1 hexdigest of the file (40 bytes) or link target of a
1861
1469
minikind = _stat_to_minikind[stat_value.st_mode & 0170000]
1862
1470
except KeyError:
1863
1471
# Unhandled kind
1473
packed_stat = _pack_stat(stat_value)
1474
(saved_minikind, saved_link_or_sha1, saved_file_size,
1475
saved_executable, saved_packed_stat) = entry[1][0]
1477
if (minikind == saved_minikind
1478
and packed_stat == saved_packed_stat):
1479
# The stat hasn't changed since we saved, so we can re-use the
1484
# size should also be in packed_stat
1485
if saved_file_size == stat_value.st_size:
1486
return saved_link_or_sha1
1488
# If we have gotten this far, that means that we need to actually
1489
# process this entry.
1865
1491
if minikind == 'f':
1866
if self._cutoff_time is None:
1867
self._sha_cutoff_time()
1868
if (stat_value.st_mtime < self._cutoff_time
1869
and stat_value.st_ctime < self._cutoff_time):
1870
entry[1][0] = ('f', sha1, stat_value.st_size, entry[1][0][3],
1871
pack_stat(stat_value))
1872
self._mark_modified([entry])
1492
link_or_sha1 = self._sha1_file(abspath)
1493
executable = self._is_executable(stat_value.st_mode,
1495
if self._cutoff_time is None:
1496
self._sha_cutoff_time()
1497
if (stat_value.st_mtime < self._cutoff_time
1498
and stat_value.st_ctime < self._cutoff_time):
1499
entry[1][0] = ('f', link_or_sha1, stat_value.st_size,
1500
executable, packed_stat)
1502
entry[1][0] = ('f', '', stat_value.st_size,
1503
executable, DirState.NULLSTAT)
1504
elif minikind == 'd':
1506
entry[1][0] = ('d', '', 0, False, packed_stat)
1507
if saved_minikind != 'd':
1508
# This changed from something into a directory. Make sure we
1509
# have a directory block for it. This doesn't happen very
1510
# often, so this doesn't have to be super fast.
1511
block_index, entry_index, dir_present, file_present = \
1512
self._get_block_entry_index(entry[0][0], entry[0][1], 0)
1513
self._ensure_block(block_index, entry_index,
1514
osutils.pathjoin(entry[0][0], entry[0][1]))
1515
elif minikind == 'l':
1516
link_or_sha1 = self._read_link(abspath, saved_link_or_sha1)
1517
if self._cutoff_time is None:
1518
self._sha_cutoff_time()
1519
if (stat_value.st_mtime < self._cutoff_time
1520
and stat_value.st_ctime < self._cutoff_time):
1521
entry[1][0] = ('l', link_or_sha1, stat_value.st_size,
1524
entry[1][0] = ('l', '', stat_value.st_size,
1525
False, DirState.NULLSTAT)
1526
self._dirblock_state = DirState.IN_MEMORY_MODIFIED
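The ``st_mode & 0170000`` lookup used in this function masks out the file-type bits of the mode. A minimal sketch of the same idea with the standard ``stat`` module follows; the table contents are an assumption made for the example and may not match the real ``_stat_to_minikind``.

```python
import stat

# Hypothetical lookup table in the spirit of _stat_to_minikind; the real
# table may cover different kinds.
STAT_TO_MINIKIND = {
    stat.S_IFDIR: "d",
    stat.S_IFREG: "f",
    stat.S_IFLNK: "l",
}


def minikind_for_mode(st_mode):
    """Return the minikind for a mode, or None for an unhandled kind."""
    # stat.S_IFMT masks with 0o170000, the modern spelling of 0170000.
    return STAT_TO_MINIKIND.get(stat.S_IFMT(st_mode))
```

Returning ``None`` here plays the role of the ``KeyError`` branch in the original code: anything that is not a file, directory or symlink is treated as an unhandled kind.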
1874
1529
def _sha_cutoff_time(self):
1875
1530
"""Return cutoff time.
3349
2743
raise errors.ObjectNotLocked(self)
3352
def py_update_entry(state, entry, abspath, stat_value,
3353
_stat_to_minikind=DirState._stat_to_minikind):
3354
"""Update the entry based on what is actually on disk.
3356
This function only calculates the sha if it needs to - if the entry is
3357
uncachable, or clearly different to the first parent's entry, no sha
3358
is calculated, and None is returned.
3360
:param state: The dirstate this entry is in.
3361
:param entry: This is the dirblock entry for the file in question.
3362
:param abspath: The path on disk for this file.
3363
:param stat_value: The stat value done on the path.
3364
:return: None, or The sha1 hexdigest of the file (40 bytes) or link
3365
target of a symlink.
3368
minikind = _stat_to_minikind[stat_value.st_mode & 0170000]
3372
packed_stat = pack_stat(stat_value)
3373
(saved_minikind, saved_link_or_sha1, saved_file_size,
3374
saved_executable, saved_packed_stat) = entry[1][0]
3376
if minikind == 'd' and saved_minikind == 't':
3378
if (minikind == saved_minikind
3379
and packed_stat == saved_packed_stat):
3380
# The stat hasn't changed since we saved, so we can re-use the
3385
# size should also be in packed_stat
3386
if saved_file_size == stat_value.st_size:
3387
return saved_link_or_sha1
3389
# If we have gotten this far, that means that we need to actually
3390
# process this entry.
3394
executable = state._is_executable(stat_value.st_mode,
3396
if state._cutoff_time is None:
3397
state._sha_cutoff_time()
3398
if (stat_value.st_mtime < state._cutoff_time
3399
and stat_value.st_ctime < state._cutoff_time
3400
and len(entry[1]) > 1
3401
and entry[1][1][0] != 'a'):
3402
# Could check for size changes for further optimised
3403
# avoidance of sha1's. However the most prominent case of
3404
# over-shaing is during initial add, which this catches.
3405
# Besides, if content filtering happens, size and sha
3406
# are calculated at the same time, so checking just the size
3407
# gains nothing w.r.t. performance.
3408
link_or_sha1 = state._sha1_file(abspath)
3409
entry[1][0] = ('f', link_or_sha1, stat_value.st_size,
3410
executable, packed_stat)
3412
entry[1][0] = ('f', '', stat_value.st_size,
3413
executable, DirState.NULLSTAT)
3414
worth_saving = False
3415
elif minikind == 'd':
3417
entry[1][0] = ('d', '', 0, False, packed_stat)
3418
if saved_minikind != 'd':
3419
# This changed from something into a directory. Make sure we
3420
# have a directory block for it. This doesn't happen very
3421
# often, so this doesn't have to be super fast.
3422
block_index, entry_index, dir_present, file_present = \
3423
state._get_block_entry_index(entry[0][0], entry[0][1], 0)
3424
state._ensure_block(block_index, entry_index,
3425
osutils.pathjoin(entry[0][0], entry[0][1]))
3427
worth_saving = False
3428
elif minikind == 'l':
3429
if saved_minikind == 'l':
3430
worth_saving = False
3431
link_or_sha1 = state._read_link(abspath, saved_link_or_sha1)
3432
if state._cutoff_time is None:
3433
state._sha_cutoff_time()
3434
if (stat_value.st_mtime < state._cutoff_time
3435
and stat_value.st_ctime < state._cutoff_time):
3436
entry[1][0] = ('l', link_or_sha1, stat_value.st_size,
3439
entry[1][0] = ('l', '', stat_value.st_size,
3440
False, DirState.NULLSTAT)
3442
state._mark_modified([entry])
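The caching rule for regular files in the function above can be summarised by the sketch below. It is a simplification under stated assumptions: ``refresh_file_details`` and its parameters are hypothetical, ``NULLSTAT`` is a placeholder value, and the executable bit is carried over unchanged instead of being recomputed.

```python
NULLSTAT = "x" * 32  # placeholder meaning "no cached stat" (assumed value)


def refresh_file_details(saved, packed_stat, size, mtime, ctime, cutoff,
                         has_basis, sha1_of):
    """Return (details, fingerprint_or_None) for a regular file.

    saved     -- (minikind, fingerprint, size, executable, packed_stat)
    has_basis -- True if a non-absent parent entry exists for the file
    sha1_of   -- zero-argument callable that hashes the file when needed
    """
    saved_kind, saved_sha, saved_size, executable, saved_stat = saved
    if saved_kind == "f" and packed_stat == saved_stat and saved_size == size:
        # Stat is unchanged since the last save: reuse the cached value.
        return saved, saved_sha
    if mtime < cutoff and ctime < cutoff and has_basis:
        # Old enough that a later write would change the stat: hash and cache.
        sha = sha1_of()
        return ("f", sha, size, executable, packed_stat), sha
    # Too recent (or nothing to compare against): store no fingerprint.
    return ("f", "", size, executable, NULLSTAT), None


calls = []


def fake_sha1():
    calls.append(1)
    return "abc123"


saved = ("f", "old-sha", 10, False, "stat-A")
unchanged = refresh_file_details(saved, "stat-A", 10, 100, 100, 200, True, fake_sha1)
old_change = refresh_file_details(saved, "stat-B", 12, 100, 100, 200, True, fake_sha1)
fresh_change = refresh_file_details(saved, "stat-C", 12, 250, 250, 200, True, fake_sha1)
```

Only the second call hashes the file; the first reuses the saved fingerprint and the third declines to cache because the timestamps are newer than the cutoff.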
3446
class ProcessEntryPython(object):
3448
__slots__ = ["old_dirname_to_file_id", "new_dirname_to_file_id",
3449
"last_source_parent", "last_target_parent", "include_unchanged",
3450
"partial", "use_filesystem_for_exec", "utf8_decode",
3451
"searched_specific_files", "search_specific_files",
3452
"searched_exact_paths", "search_specific_file_parents", "seen_ids",
3453
"state", "source_index", "target_index", "want_unversioned", "tree"]
3455
def __init__(self, include_unchanged, use_filesystem_for_exec,
3456
search_specific_files, state, source_index, target_index,
3457
want_unversioned, tree):
3458
self.old_dirname_to_file_id = {}
3459
self.new_dirname_to_file_id = {}
3460
# Are we doing a partial iter_changes?
3461
self.partial = search_specific_files != set([''])
3462
# Using a list so that we can access the values and change them in
3463
# nested scope. Each one is [dirname, parent_file_id]
3464
self.last_source_parent = [None, None]
3465
self.last_target_parent = [None, None]
3466
self.include_unchanged = include_unchanged
3467
self.use_filesystem_for_exec = use_filesystem_for_exec
3468
self.utf8_decode = cache_utf8._utf8_decode
3469
# for all search indexes in each path at or under each element of
# search_specific_files, if the detail is relocated: add the id, and
# add the relocated path as one to search if it's not searched already.
# If the detail is not relocated, add the id.
3473
self.searched_specific_files = set()
3474
# When we search exact paths without expanding downwards, we record
3476
self.searched_exact_paths = set()
3477
self.search_specific_files = search_specific_files
3478
# The parents up to the root of the paths we are searching.
3479
# After all normal paths are returned, these specific items are returned.
3480
self.search_specific_file_parents = set()
3481
# The ids we've sent out in the delta.
3482
self.seen_ids = set()
3484
self.source_index = source_index
3485
self.target_index = target_index
3486
if target_index != 0:
3487
# A lot of code in here depends on target_index == 0
3488
raise errors.BzrError('unsupported target index')
3489
self.want_unversioned = want_unversioned
3492
def _process_entry(self, entry, path_info, pathjoin=osutils.pathjoin):
3493
"""Compare an entry and real disk to generate delta information.
3495
:param path_info: top_relpath, basename, kind, lstat, abspath for
3496
the path of entry. If None, then the path is considered absent in
3497
the target (Perhaps we should pass in a concrete entry for this ?)
3498
Basename is returned as a utf8 string because we expect this
3499
tuple will be ignored, and don't want to take the time to
3501
:return: (iter_changes_result, changed). If the entry has not been
3502
handled then changed is None. Otherwise it is False if no content
3503
or metadata changes have occurred, and True if any content or
3504
metadata change has occurred. If self.include_unchanged is True then
3505
if changed is not None, iter_changes_result will always be a result
3506
tuple. Otherwise, iter_changes_result is None unless changed is
3509
if self.source_index is None:
3510
source_details = DirState.NULL_PARENT_DETAILS
3512
source_details = entry[1][self.source_index]
3513
target_details = entry[1][self.target_index]
3514
target_minikind = target_details[0]
3515
if path_info is not None and target_minikind in 'fdlt':
3516
if not (self.target_index == 0):
3517
raise AssertionError()
3518
link_or_sha1 = update_entry(self.state, entry,
3519
abspath=path_info[4], stat_value=path_info[3])
3520
# The entry may have been modified by update_entry
3521
target_details = entry[1][self.target_index]
3522
target_minikind = target_details[0]
3525
file_id = entry[0][2]
3526
source_minikind = source_details[0]
3527
if source_minikind in 'fdltr' and target_minikind in 'fdlt':
3528
# claimed content in both: diff
3529
# r | fdlt | | add source to search, add id path move and perform
3530
# | | | diff check on source-target
3531
# r | fdlt | a | dangling file that was present in the basis.
3533
if source_minikind in 'r':
3534
# add the source to the search path to find any children it
3535
# has. TODO ? : only add if it is a container ?
3536
if not osutils.is_inside_any(self.searched_specific_files,
3538
self.search_specific_files.add(source_details[1])
3539
# generate the old path; this is needed for stating later
3541
old_path = source_details[1]
3542
old_dirname, old_basename = os.path.split(old_path)
3543
path = pathjoin(entry[0][0], entry[0][1])
3544
old_entry = self.state._get_entry(self.source_index,
3546
# update the source details variable to be the real
3548
if old_entry == (None, None):
3549
raise errors.CorruptDirstate(self.state._filename,
3550
"entry '%s/%s' is considered renamed from %r"
3551
" but source does not exist\n"
3552
"entry: %s" % (entry[0][0], entry[0][1], old_path, entry))
3553
source_details = old_entry[1][self.source_index]
3554
source_minikind = source_details[0]
3556
old_dirname = entry[0][0]
3557
old_basename = entry[0][1]
3558
old_path = path = None
3559
if path_info is None:
3560
# the file is missing on disk, show as removed.
3561
content_change = True
3565
# source and target are both versioned and disk file is present.
3566
target_kind = path_info[2]
3567
if target_kind == 'directory':
3569
old_path = path = pathjoin(old_dirname, old_basename)
3570
self.new_dirname_to_file_id[path] = file_id
3571
if source_minikind != 'd':
3572
content_change = True
3574
# directories have no fingerprint
3575
content_change = False
3577
elif target_kind == 'file':
3578
if source_minikind != 'f':
3579
content_change = True
3581
# Check the sha. We can't just rely on the size as
3582
# content filtering may mean different sizes actually
3583
# map to the same content
3584
if link_or_sha1 is None:
3586
statvalue, link_or_sha1 = \
3587
self.state._sha1_provider.stat_and_sha1(
3589
self.state._observed_sha1(entry, link_or_sha1,
3591
content_change = (link_or_sha1 != source_details[1])
3592
# Target details is updated at update_entry time
3593
if self.use_filesystem_for_exec:
3594
# We don't need S_ISREG here, because we are sure
3595
# we are dealing with a file.
3596
target_exec = bool(stat.S_IEXEC & path_info[3].st_mode)
3598
target_exec = target_details[3]
3599
elif target_kind == 'symlink':
3600
if source_minikind != 'l':
3601
content_change = True
3603
content_change = (link_or_sha1 != source_details[1])
3605
elif target_kind == 'tree-reference':
3606
if source_minikind != 't':
3607
content_change = True
3609
content_change = False
3613
path = pathjoin(old_dirname, old_basename)
3614
raise errors.BadFileKindError(path, path_info[2])
3615
if source_minikind == 'd':
3617
old_path = path = pathjoin(old_dirname, old_basename)
3618
self.old_dirname_to_file_id[old_path] = file_id
3619
# parent id is the entry for the path in the target tree
3620
if old_basename and old_dirname == self.last_source_parent[0]:
3621
source_parent_id = self.last_source_parent[1]
3624
source_parent_id = self.old_dirname_to_file_id[old_dirname]
3626
source_parent_entry = self.state._get_entry(self.source_index,
3627
path_utf8=old_dirname)
3628
source_parent_id = source_parent_entry[0][2]
3629
if source_parent_id == entry[0][2]:
3630
# This is the root, so the parent is None
3631
source_parent_id = None
3633
self.last_source_parent[0] = old_dirname
3634
self.last_source_parent[1] = source_parent_id
3635
new_dirname = entry[0][0]
3636
if entry[0][1] and new_dirname == self.last_target_parent[0]:
3637
target_parent_id = self.last_target_parent[1]
3640
target_parent_id = self.new_dirname_to_file_id[new_dirname]
3642
# TODO: We don't always need to do the lookup, because the
3643
# parent entry will be the same as the source entry.
3644
target_parent_entry = self.state._get_entry(self.target_index,
3645
path_utf8=new_dirname)
3646
if target_parent_entry == (None, None):
3647
raise AssertionError(
3648
"Could not find target parent in wt: %s\nparent of: %s"
3649
% (new_dirname, entry))
3650
target_parent_id = target_parent_entry[0][2]
3651
if target_parent_id == entry[0][2]:
3652
# This is the root, so the parent is None
3653
target_parent_id = None
3655
self.last_target_parent[0] = new_dirname
3656
self.last_target_parent[1] = target_parent_id
3658
source_exec = source_details[3]
3659
changed = (content_change
3660
or source_parent_id != target_parent_id
3661
or old_basename != entry[0][1]
3662
or source_exec != target_exec
3664
if not changed and not self.include_unchanged:
3667
if old_path is None:
3668
old_path = path = pathjoin(old_dirname, old_basename)
3669
old_path_u = self.utf8_decode(old_path)[0]
3672
old_path_u = self.utf8_decode(old_path)[0]
3673
if old_path == path:
3676
path_u = self.utf8_decode(path)[0]
3677
source_kind = DirState._minikind_to_kind[source_minikind]
3678
return (entry[0][2],
3679
(old_path_u, path_u),
3682
(source_parent_id, target_parent_id),
3683
(self.utf8_decode(old_basename)[0], self.utf8_decode(entry[0][1])[0]),
3684
(source_kind, target_kind),
3685
(source_exec, target_exec)), changed
3686
elif source_minikind in 'a' and target_minikind in 'fdlt':
3687
# looks like a new file
3688
path = pathjoin(entry[0][0], entry[0][1])
3689
# parent id is the entry for the path in the target tree
3690
# TODO: these are the same for an entire directory: cache em.
3691
parent_id = self.state._get_entry(self.target_index,
3692
path_utf8=entry[0][0])[0][2]
3693
if parent_id == entry[0][2]:
3695
if path_info is not None:
3697
if self.use_filesystem_for_exec:
3698
# We need S_ISREG here, because we aren't sure if this
3701
stat.S_ISREG(path_info[3].st_mode)
3702
and stat.S_IEXEC & path_info[3].st_mode)
3704
target_exec = target_details[3]
3705
return (entry[0][2],
3706
(None, self.utf8_decode(path)[0]),
3710
(None, self.utf8_decode(entry[0][1])[0]),
3711
(None, path_info[2]),
3712
(None, target_exec)), True
3714
# It's a missing file, report it as such.
3715
return (entry[0][2],
3716
(None, self.utf8_decode(path)[0]),
3720
(None, self.utf8_decode(entry[0][1])[0]),
3722
(None, False)), True
3723
elif source_minikind in 'fdlt' and target_minikind in 'a':
3724
# unversioned, possibly, or possibly not deleted: we don't care.
# if it's still on disk, *and* there's no other entry at this
# path [we don't know this in this routine at the moment -
# perhaps we should change this] - then it would be an unknown.
3728
old_path = pathjoin(entry[0][0], entry[0][1])
3729
# parent id is the entry for the path in the target tree
3730
parent_id = self.state._get_entry(self.source_index, path_utf8=entry[0][0])[0][2]
3731
if parent_id == entry[0][2]:
3733
return (entry[0][2],
3734
(self.utf8_decode(old_path)[0], None),
3738
(self.utf8_decode(entry[0][1])[0], None),
3739
(DirState._minikind_to_kind[source_minikind], None),
3740
(source_details[3], None)), True
3741
elif source_minikind in 'fdlt' and target_minikind in 'r':
3742
# a rename; could be a true rename, or a rename inherited from
# a renamed parent. TODO: handle this efficiently. It's not a
# common case to rename dirs though, so a correct but slow
# implementation will do.
3746
if not osutils.is_inside_any(self.searched_specific_files, target_details[1]):
3747
self.search_specific_files.add(target_details[1])
3748
elif source_minikind in 'ra' and target_minikind in 'ra':
3749
# neither of the selected trees contain this file,
3750
# so skip over it. This is not currently directly tested, but
3751
# is indirectly via test_too_much.TestCommands.test_conflicts.
3754
raise AssertionError("don't know how to compare "
3755
"source_minikind=%r, target_minikind=%r"
3756
% (source_minikind, target_minikind))
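The if/elif chain in ``_process_entry`` dispatches on the pair of source and target minikinds. The sketch below reproduces only that dispatch; ``classify`` and the action names it returns are invented for this example and do not appear in the module.

```python
def classify(source_minikind, target_minikind):
    """Name the branch taken for a (source, target) minikind pair."""
    present = "fdlt"
    if source_minikind in "fdltr" and target_minikind in present:
        return "diff"              # content claimed on both sides
    if source_minikind == "a" and target_minikind in present:
        return "added"             # looks like a new file
    if source_minikind in present and target_minikind == "a":
        return "removed"           # unversioned or deleted in the target
    if source_minikind in present and target_minikind == "r":
        return "follow-rename"     # queue the relocation target for search
    if source_minikind in "ra" and target_minikind in "ra":
        return "skip"              # neither selected tree contains the file
    raise AssertionError("don't know how to compare %r and %r"
                         % (source_minikind, target_minikind))
```

For instance, ``classify("r", "f")`` takes the diff branch because a relocated source still has content at its other path, while ``classify("r", "a")`` is skipped.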
3762
def _gather_result_for_consistency(self, result):
3763
"""Check a result we will yield to make sure we are consistent later.
3765
This gathers result's parents into a set to output later.
3767
:param result: A result tuple.
3769
if not self.partial or not result[0]:
3771
self.seen_ids.add(result[0])
3772
new_path = result[1][1]
3774
# Not the root and not a delete: queue up the parents of the path.
3775
self.search_specific_file_parents.update(
3776
osutils.parent_directories(new_path.encode('utf8')))
3777
# Add the root directory which parent_directories does not
3779
self.search_specific_file_parents.add('')
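A minimal sketch of the parent-gathering step above. The ``parent_directories`` function here is a simplified stand-in written for this example, assuming '/'-separated relative paths; it is not the ``osutils`` implementation.

```python
def parent_directories(path):
    """Return the ancestors of ``path``, nearest first, excluding the root.

    Simplified stand-in for illustration only.
    """
    parents = []
    parts = path.split("/")[:-1]
    while parts:
        parents.append("/".join(parts))
        parts.pop()
    return parents


pending = set()
new_path = "src/lib/util.py"
pending.update(parent_directories(new_path))
pending.add("")  # the root is not produced by the helper, so add it explicitly
```

Collecting the parents in a set means each directory is queued once, however many changed paths sit beneath it.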
3781
def iter_changes(self):
3782
"""Iterate over the changes."""
3783
utf8_decode = cache_utf8._utf8_decode
3784
_cmp_by_dirs = cmp_by_dirs
3785
_process_entry = self._process_entry
3786
search_specific_files = self.search_specific_files
3787
searched_specific_files = self.searched_specific_files
3788
splitpath = osutils.splitpath
3790
# compare source_index and target_index at or under each element of search_specific_files.
3791
# follow the following comparison table. Note that we only want to do diff operations when
3792
# the target is fdl because that's when the walkdirs logic will have exposed the pathinfo
3796
# Source | Target | disk | action
3797
# r | fdlt | | add source to search, add id path move and perform
3798
# | | | diff check on source-target
3799
# r | fdlt | a | dangling file that was present in the basis.
3801
# r | a | | add source to search
3803
# r | r | | this path is present in a non-examined tree, skip.
3804
# r | r | a | this path is present in a non-examined tree, skip.
3805
# a | fdlt | | add new id
3806
# a | fdlt | a | dangling locally added file, skip
3807
# a | a | | not present in either tree, skip
3808
# a | a | a | not present in any tree, skip
3809
# a | r | | not present in either tree at this path, skip as it
3810
# | | | may not be selected by the user's list of paths.
3811
# a | r | a | not present in either tree at this path, skip as it
3812
# | | | may not be selected by the user's list of paths.
3813
# fdlt | fdlt | | content in both: diff them
3814
# fdlt | fdlt | a | deleted locally, but not unversioned - show as deleted ?
3815
# fdlt | a | | unversioned: output deleted id for now
3816
# fdlt | a | a | unversioned and deleted: output deleted id
3817
# fdlt | r | | relocated in this tree, so add target to search.
3818
# | | | Don't diff, we will see an r,fd; pair when we reach
3819
# | | | this id at the other path.
3820
# fdlt | r | a | relocated in this tree, so add target to search.
3821
# | | | Don't diff, we will see an r,fd; pair when we reach
3822
# | | | this id at the other path.
# TODO: jam 20070516 - Avoid the _get_entry lookup overhead by
# keeping a cache of directories that we have seen.
while search_specific_files:
# TODO: the pending list should be lexically sorted? the
# interface doesn't require it.
current_root = search_specific_files.pop()
current_root_unicode = current_root.decode('utf8')
searched_specific_files.add(current_root)
# process the entries for this containing directory: the rest will be
# found by their parents recursively.
root_entries = self.state._entries_for_path(current_root)
root_abspath = self.tree.abspath(current_root_unicode)
root_stat = os.lstat(root_abspath)
if e.errno == errno.ENOENT:
# the path does not exist: let _process_entry know that.
root_dir_info = None
# some other random error: hand it up.
root_dir_info = ('', current_root,
osutils.file_kind_from_stat_mode(root_stat.st_mode), root_stat,
if root_dir_info[2] == 'directory':
if self.tree._directory_is_tree_reference(
current_root.decode('utf8')):
root_dir_info = root_dir_info[:2] + \
('tree-reference',) + root_dir_info[3:]
if not root_entries and not root_dir_info:
# this specified path is not present at all, skip it.
path_handled = False
for entry in root_entries:
result, changed = _process_entry(entry, root_dir_info)
if changed is not None:
self._gather_result_for_consistency(result)
if changed or self.include_unchanged:
if self.want_unversioned and not path_handled and root_dir_info:
new_executable = bool(
stat.S_ISREG(root_dir_info[3].st_mode)
and stat.S_IEXEC & root_dir_info[3].st_mode)
(None, current_root_unicode),
(None, splitpath(current_root_unicode)[-1]),
(None, root_dir_info[2]),
(None, new_executable)
initial_key = (current_root, '', '')
block_index, _ = self.state._find_block_index_from_key(initial_key)
if block_index == 0:
# we have processed the total root already, but because the
# initial key matched it we should skip it here.
if root_dir_info and root_dir_info[2] == 'tree-reference':
current_dir_info = None
dir_iterator = osutils._walkdirs_utf8(root_abspath, prefix=current_root)
current_dir_info = dir_iterator.next()
# on win32, python2.4 has e.errno == ERROR_DIRECTORY, but
# python 2.5 has e.errno == EINVAL,
# and e.winerror == ERROR_DIRECTORY
e_winerror = getattr(e, 'winerror', None)
win_errors = (ERROR_DIRECTORY, ERROR_PATH_NOT_FOUND)
# there may be directories in the inventory even though
# this path is not a file on disk: so mark it as end of
if e.errno in (errno.ENOENT, errno.ENOTDIR, errno.EINVAL):
current_dir_info = None
elif (sys.platform == 'win32'
and (e.errno in win_errors
or e_winerror in win_errors)):
current_dir_info = None
if current_dir_info[0][0] == '':
# remove .bzr from iteration
bzr_index = bisect.bisect_left(current_dir_info[1], ('.bzr',))
if current_dir_info[1][bzr_index][0] != '.bzr':
raise AssertionError()
del current_dir_info[1][bzr_index]
# walk until both the directory listing and the versioned metadata
if (block_index < len(self.state._dirblocks) and
osutils.is_inside(current_root, self.state._dirblocks[block_index][0])):
current_block = self.state._dirblocks[block_index]
current_block = None
while (current_dir_info is not None or
current_block is not None):
if (current_dir_info and current_block
and current_dir_info[0][0] != current_block[0]):
if _cmp_by_dirs(current_dir_info[0][0], current_block[0]) < 0:
# filesystem data refers to paths not covered by the dirblock.
# this has two possibilities:
# A) it is versioned but empty, so there is no block for it
# B) it is not versioned.
# if (A) then we need to recurse into it to check for
# new unknown files or directories.
# if (B) then we should ignore it, because we don't
# recurse into unknown directories.
while path_index < len(current_dir_info[1]):
current_path_info = current_dir_info[1][path_index]
if self.want_unversioned:
if current_path_info[2] == 'directory':
if self.tree._directory_is_tree_reference(
current_path_info[0].decode('utf8')):
current_path_info = current_path_info[:2] + \
('tree-reference',) + current_path_info[3:]
new_executable = bool(
stat.S_ISREG(current_path_info[3].st_mode)
and stat.S_IEXEC & current_path_info[3].st_mode)
(None, utf8_decode(current_path_info[0])[0]),
(None, utf8_decode(current_path_info[1])[0]),
(None, current_path_info[2]),
(None, new_executable))
# don't descend into this unversioned path if it is
if current_path_info[2] in ('directory',
del current_dir_info[1][path_index]
# This dir info has been handled, go to the next
current_dir_info = dir_iterator.next()
except StopIteration:
current_dir_info = None
# We have a dirblock entry for this location, but there
# is no filesystem path for this. This is most likely
# because a directory was removed from the disk.
# We don't have to report the missing directory,
# because that should have already been handled, but we
# need to handle all of the files that are contained
for current_entry in current_block[1]:
# entry referring to file not present on disk.
# advance the entry only, after processing.
result, changed = _process_entry(current_entry, None)
if changed is not None:
self._gather_result_for_consistency(result)
if changed or self.include_unchanged:
if (block_index < len(self.state._dirblocks) and
osutils.is_inside(current_root,
self.state._dirblocks[block_index][0])):
current_block = self.state._dirblocks[block_index]
current_block = None
if current_block and entry_index < len(current_block[1]):
current_entry = current_block[1][entry_index]
current_entry = None
advance_entry = True
if current_dir_info and path_index < len(current_dir_info[1]):
current_path_info = current_dir_info[1][path_index]
if current_path_info[2] == 'directory':
if self.tree._directory_is_tree_reference(
current_path_info[0].decode('utf8')):
current_path_info = current_path_info[:2] + \
('tree-reference',) + current_path_info[3:]
current_path_info = None
path_handled = False
while (current_entry is not None or
current_path_info is not None):
if current_entry is None:
# the check for path_handled when the path is advanced
# will yield this path if needed.
elif current_path_info is None:
# no path is fine: the per entry code will handle it.
result, changed = _process_entry(current_entry, current_path_info)
if changed is not None:
self._gather_result_for_consistency(result)
if changed or self.include_unchanged:
elif (current_entry[0][1] != current_path_info[1]
or current_entry[1][self.target_index][0] in 'ar'):
# The current path on disk doesn't match the dirblock
# record. Either the dirblock is marked as absent, or
# the file on disk is not present at all in the
# dirblock. Either way, report about the dirblock
# entry, and let other code handle the filesystem one.
# Compare the basename for these files to determine
if current_path_info[1] < current_entry[0][1]:
# extra file on disk: pass for now, but only
# increment the path, not the entry
advance_entry = False
# entry referring to file not present on disk.
# advance the entry only, after processing.
result, changed = _process_entry(current_entry, None)
if changed is not None:
self._gather_result_for_consistency(result)
if changed or self.include_unchanged:
advance_path = False
result, changed = _process_entry(current_entry, current_path_info)
if changed is not None:
self._gather_result_for_consistency(result)
if changed or self.include_unchanged:
if advance_entry and current_entry is not None:
if entry_index < len(current_block[1]):
current_entry = current_block[1][entry_index]
current_entry = None
advance_entry = True # reset the advance flag
if advance_path and current_path_info is not None:
if not path_handled:
# unversioned in all regards
if self.want_unversioned:
new_executable = bool(
stat.S_ISREG(current_path_info[3].st_mode)
and stat.S_IEXEC & current_path_info[3].st_mode)
relpath_unicode = utf8_decode(current_path_info[0])[0]
except UnicodeDecodeError:
raise errors.BadFilenameEncoding(
current_path_info[0], osutils._fs_enc)
(None, relpath_unicode),
(None, utf8_decode(current_path_info[1])[0]),
(None, current_path_info[2]),
(None, new_executable))
# don't descend into this unversioned path if it is
if current_path_info[2] in ('directory',):
del current_dir_info[1][path_index]
# don't descend the disk iterator into any tree
if current_path_info[2] == 'tree-reference':
del current_dir_info[1][path_index]
if path_index < len(current_dir_info[1]):
current_path_info = current_dir_info[1][path_index]
if current_path_info[2] == 'directory':
if self.tree._directory_is_tree_reference(
current_path_info[0].decode('utf8')):
current_path_info = current_path_info[:2] + \
('tree-reference',) + current_path_info[3:]
current_path_info = None
path_handled = False
advance_path = True # reset the advance flag.
if current_block is not None:
if (block_index < len(self.state._dirblocks) and
osutils.is_inside(current_root, self.state._dirblocks[block_index][0])):
current_block = self.state._dirblocks[block_index]
current_block = None
if current_dir_info is not None:
current_dir_info = dir_iterator.next()
except StopIteration:
current_dir_info = None
for result in self._iter_specific_file_parents():
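# The inner loop above advances two sorted sequences in step: the dirblock
# entries and the names found on disk, moving only one side when a name is
# missing from the other. The sketch below shows that merge walk on plain
# sorted lists of names. It is a simplified illustration with hypothetical
# names (merge_walk, entry_names, disk_names), written for Python 3, and it
# leaves out the absent/relocated checks and tree-reference handling that
# the real loop performs.

```python
def merge_walk(entry_names, disk_names):
    """Yield (entry_name, disk_name) pairs from two sorted lists.

    A side that has no counterpart for the current name is reported as None,
    and only the other side is advanced.
    """
    entry_index = 0
    path_index = 0
    while entry_index < len(entry_names) or path_index < len(disk_names):
        entry = entry_names[entry_index] if entry_index < len(entry_names) else None
        path = disk_names[path_index] if path_index < len(disk_names) else None
        if entry is None:
            # Only disk names remain: unversioned paths.
            yield (None, path)
            path_index += 1
        elif path is None:
            # Only entries remain: versioned but missing on disk.
            yield (entry, None)
            entry_index += 1
        elif path < entry:
            # Extra file on disk: advance the path, not the entry.
            yield (None, path)
            path_index += 1
        elif entry < path:
            # Entry with no file on disk: advance the entry only.
            yield (entry, None)
            entry_index += 1
        else:
            # Same basename on both sides: advance both.
            yield (entry, path)
            entry_index += 1
            path_index += 1
```

# Keeping separate "advance" decisions for each side is what lets the walk
# report a missing file and an unversioned file without losing its place in
# either sequence.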
def _iter_specific_file_parents(self):
"""Iterate over the specific file parents."""
while self.search_specific_file_parents:
# Process the parent directories for the paths we were iterating.
# Even in extremely large trees this should be modest, so currently
# no attempt is made to optimise.
path_utf8 = self.search_specific_file_parents.pop()
if osutils.is_inside_any(self.searched_specific_files, path_utf8):
# We've examined this path.
if path_utf8 in self.searched_exact_paths:
# We've examined this path.
path_entries = self.state._entries_for_path(path_utf8)
# We need either one or two entries. If the path in
# self.target_index has moved (so the entry in source_index is in
# 'ar') then we need to also look for the entry for this path in
# self.source_index, to output the appropriate delete-or-rename.
selected_entries = []
for candidate_entry in path_entries:
# Find entries present in target at this path:
if candidate_entry[1][self.target_index][0] not in 'ar':
selected_entries.append(candidate_entry)
# Find entries present in source at this path:
elif (self.source_index is not None and
candidate_entry[1][self.source_index][0] not in 'ar'):
if candidate_entry[1][self.target_index][0] == 'a':
# Deleted, emit it here.
selected_entries.append(candidate_entry)
# renamed, emit it when we process the directory it
self.search_specific_file_parents.add(
candidate_entry[1][self.target_index][1])
raise AssertionError(
"Missing entry for specific path parent %r, %r" % (
path_utf8, path_entries))
path_info = self._path_info(path_utf8, path_utf8.decode('utf8'))
for entry in selected_entries:
if entry[0][2] in self.seen_ids:
result, changed = self._process_entry(entry, path_info)
raise AssertionError(
"Got entry<->path mismatch for specific path "
"%r entry %r path_info %r " % (
path_utf8, entry, path_info))
# Only include changes - we're outside the user's requested
self._gather_result_for_consistency(result)
if (result[6][0] == 'directory' and
result[6][1] != 'directory'):
# This stopped being a directory, the old children have
if entry[1][self.source_index][0] == 'r':
# renamed, take the source path
entry_path_utf8 = entry[1][self.source_index][1]
entry_path_utf8 = path_utf8
initial_key = (entry_path_utf8, '', '')
block_index, _ = self.state._find_block_index_from_key(
if block_index == 0:
# The children of the root are in block index 1.
current_block = None
if block_index < len(self.state._dirblocks):
current_block = self.state._dirblocks[block_index]
if not osutils.is_inside(
entry_path_utf8, current_block[0]):
# No entries for this directory at all.
current_block = None
if current_block is not None:
for entry in current_block[1]:
if entry[1][self.source_index][0] in 'ar':
# Not in the source tree, so doesn't have to be
# Path of the entry itself.
self.search_specific_file_parents.add(
osutils.pathjoin(*entry[0][:2]))
if changed or self.include_unchanged:
self.searched_exact_paths.add(path_utf8)
def _path_info(self, utf8_path, unicode_path):
"""Generate path_info for unicode_path.
:return: None if unicode_path does not exist, or a path_info tuple.
abspath = self.tree.abspath(unicode_path)
stat = os.lstat(abspath)
if e.errno == errno.ENOENT:
# the path does not exist.
utf8_basename = utf8_path.rsplit('/', 1)[-1]
dir_info = (utf8_path, utf8_basename,
osutils.file_kind_from_stat_mode(stat.st_mode), stat,
if dir_info[2] == 'directory':
if self.tree._directory_is_tree_reference(
dir_info = dir_info[:2] + \
('tree-reference',) + dir_info[3:]
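# _path_info above follows a common pattern: lstat the path, treat ENOENT as
# "does not exist" and return None, and re-raise anything else. The sketch
# below shows that pattern using only the standard library. It is an
# assumption-laden illustration for Python 3: path_info and
# file_kind_from_mode are hypothetical stand-ins (the latter for
# osutils.file_kind_from_stat_mode), and the returned tuple is simpler than
# the one bzrlib builds.

```python
import errno
import os
import stat


def file_kind_from_mode(mode):
    """Map a stat mode to a kind name (hypothetical, reduced set of kinds)."""
    if stat.S_ISREG(mode):
        return 'file'
    if stat.S_ISDIR(mode):
        return 'directory'
    if stat.S_ISLNK(mode):
        return 'symlink'
    return 'unknown'


def path_info(abspath):
    """Return (basename, kind, stat_result), or None if the path is missing."""
    try:
        # lstat does not follow symlinks, so a symlink is reported as itself.
        st = os.lstat(abspath)
    except OSError as e:
        if e.errno == errno.ENOENT:
            # The path does not exist.
            return None
        # Some other error: hand it up to the caller.
        raise
    return (os.path.basename(abspath), file_kind_from_mode(st.st_mode), st)
```

# Returning None for a missing path lets the caller treat "absent on disk"
# as ordinary data rather than as an exception.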
# Try to load the compiled form if possible
from bzrlib._dirstate_helpers_pyx import (
ProcessEntryC as _process_entry,
update_entry as update_entry,
from bzrlib._dirstate_helpers_c import (
_read_dirblocks_c as _read_dirblocks,
bisect_dirblock_c as bisect_dirblock,
_bisect_path_left_c as _bisect_path_left,
_bisect_path_right_c as _bisect_path_right,
cmp_by_dirs_c as cmp_by_dirs,
except ImportError, e:
osutils.failed_to_load_extension(e)
from bzrlib._dirstate_helpers_py import (
_read_dirblocks_py as _read_dirblocks,
bisect_dirblock_py as bisect_dirblock,
_bisect_path_left_py as _bisect_path_left,
_bisect_path_right_py as _bisect_path_right,
cmp_by_dirs_py as cmp_by_dirs,
# FIXME: It would be nice to be able to track moved lines so that the
# corresponding python code can be moved to the _dirstate_helpers_py
# module. I don't want to break the history for this important piece of
# code so I left the code here -- vila 20090622
update_entry = py_update_entry
_process_entry = ProcessEntryPython
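# The import block above prefers a compiled helper module and falls back to
# a pure-Python one when the import fails. A minimal sketch of that
# fallback, written for Python 3 with importlib, might look like this. The
# function name load_first_available is hypothetical, and the module names
# in the usage note are placeholders rather than bzrlib modules.

```python
import importlib


def load_first_available(module_names):
    """Import and return the first module in module_names that loads.

    Earlier names are preferred (for example a compiled extension), later
    names act as fallbacks (for example a pure-Python implementation).
    """
    failures = []
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError as e:
            # Remember why this candidate failed and try the next one.
            failures.append((name, e))
    raise ImportError("none of the candidates could be imported: %r"
                      % [name for name, _ in failures])
```

# For example, load_first_available(["_no_such_compiled_helper", "json"])
# skips the missing first name and returns the json module.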