lines by NL. The field delimiters are omitted in the grammar, line delimiters
are not - this is done for clarity of reading. All string data is in utf8.
MINIKIND = "f" | "d" | "l" | "a" | "r" | "t";
WHOLE_NUMBER = {digit}, digit;
REVISION_ID = a non-empty utf8 string;

dirstate format = header line, full checksum, row count, parent_details,
ghost_details, entries;
header line = "#bazaar dirstate flat format 3", NL;
full checksum = "crc32: ", ["-"], WHOLE_NUMBER, NL;
row count = "num_entries: ", WHOLE_NUMBER, NL;
parent_details = WHOLE_NUMBER, {REVISION_ID}*, NL;
ghost_details = WHOLE_NUMBER, {REVISION_ID}*, NL;
entry = entry_key, current_entry_details, {parent_entry_details};
entry_key = dirname, basename, fileid;
current_entry_details = common_entry_details, working_entry_details;
parent_entry_details = common_entry_details, history_entry_details;
common_entry_details = MINIKIND, fingerprint, size, executable;
working_entry_details = packed_stat;
history_entry_details = REVISION_ID;
fingerprint = a nonempty utf8 sequence with meaning defined by minikind.

Given this definition, the following is useful to know::

    entry (aka row) - all the data for a given key.
    entry[0]: The key (dirname, basename, fileid)
    entry[1]: The tree(s) data for this path and id combination.
    entry[1][0]: The current tree
    entry[1][1]: The second tree

For an entry for a tree, we have (using tree 0 - current tree) to demonstrate::

    entry[1][0][0]: minikind
    entry[1][0][1]: fingerprint
    entry[1][0][3]: executable
    entry[1][0][4]: packed_stat
    entry[1][1][4]: revision_id
MINIKIND = "f" | "d" | "l" | "a" | "r" | "t";
WHOLE_NUMBER = {digit}, digit;
REVISION_ID = a non-empty utf8 string;
dirstate format = header line, full checksum, row count, parent details,
ghost_details, entries;
header line = "#bazaar dirstate flat format 3", NL;
full checksum = "crc32: ", ["-"], WHOLE_NUMBER, NL;
row count = "num_entries: ", WHOLE_NUMBER, NL;
parent_details = WHOLE NUMBER, {REVISION_ID}* NL;
ghost_details = WHOLE NUMBER, {REVISION_ID}*, NL;
entry = entry_key, current_entry_details, {parent_entry_details};
entry_key = dirname, basename, fileid;
current_entry_details = common_entry_details, working_entry_details;
parent_entry_details = common_entry_details, history_entry_details;
common_entry_details = MINIKIND, fingerprint, size, executable
working_entry_details = packed_stat
history_entry_details = REVISION_ID;
fingerprint = a nonempty utf8 sequence with meaning defined by minikind.
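
The header portion of the grammar (header line, crc32 line, num_entries line) can be read with a few lines of Python. The following is only a minimal sketch: `parse_dirstate_header` and the sample bytes are hypothetical, it is not bzrlib's parser, and the sample checksum is a made-up number rather than the CRC of any real content.

```python
HEADER_LINE = b"#bazaar dirstate flat format 3"


def parse_dirstate_header(data):
    """Parse the first three NL-terminated lines of a dirstate file.

    Hypothetical helper for illustration only.
    Returns (crc32, num_entries, remaining_bytes).
    """
    header, crc_line, count_line, rest = data.split(b"\n", 3)
    if header != HEADER_LINE:
        raise ValueError("unexpected header line: %r" % (header,))
    if not crc_line.startswith(b"crc32: "):
        raise ValueError("missing crc32 line")
    if not count_line.startswith(b"num_entries: "):
        raise ValueError("missing num_entries line")
    # The grammar allows an optional leading "-", which int() accepts.
    crc32 = int(crc_line[len(b"crc32: "):].decode("ascii"))
    num_entries = int(count_line[len(b"num_entries: "):].decode("ascii"))
    return crc32, num_entries, rest


# Made-up sample data, not taken from a real working tree.
sample = (b"#bazaar dirstate flat format 3\n"
          b"crc32: -12345\n"
          b"num_entries: 2\n"
          b"rest-of-file")
crc32, num_entries, rest = parse_dirstate_header(sample)
```

The remaining bytes would hold the parent details, ghost details and entries, which this sketch does not attempt to decode.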
Given this definition, the following is useful to know:
entry (aka row) - all the data for a given key.
entry[0]: The key (dirname, basename, fileid)
entry[1]: The tree(s) data for this path and id combination.
entry[1][0]: The current tree
entry[1][1]: The second tree
For an entry for a tree, we have (using tree 0 - current tree) to demonstrate:
entry[1][0][0]: minikind
entry[1][0][1]: fingerprint
entry[1][0][3]: executable
entry[1][0][4]: packed_stat
entry[1][1][4]: revision_id
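
To make the index layout above concrete, here is a small sketch that builds one entry by hand and unpacks it. Every value is a placeholder and `summarize` is a hypothetical helper, not part of bzrlib; index 2 of each tree tuple is taken to be the size, following the `common_entry_details` rule in the grammar.

```python
# A made-up entry shaped like the layout above; every value is a placeholder.
entry = (
    (b"src", b"main.py", b"file-id-1"),  # entry[0]: (dirname, basename, fileid)
    [
        # entry[1][0]: the current tree, ending with packed_stat
        (b"f", b"sha1-placeholder", 120, False, b"packed-stat-placeholder"),
        # entry[1][1]: the second tree, ending with revision_id
        (b"f", b"sha1-placeholder", 118, False, b"rev-placeholder"),
    ],
)


def summarize(entry):
    """Unpack one entry into a small dict (hypothetical helper)."""
    dirname, basename, file_id = entry[0]
    minikind, fingerprint, size, executable, packed_stat = entry[1][0]
    return {
        "path": dirname + b"/" + basename if dirname else basename,
        "file_id": file_id,
        "current_minikind": minikind,
        "current_executable": executable,
        "second_tree_revision": entry[1][1][4],
    }


summary = summarize(entry)
```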
There may be multiple rows at the root, one per id present in the root, so the
in memory root row is now::

    self._dirblocks[0] -> ('', [entry ...]),

and the entries in there are::

    entries[0][2]: file_id
    entries[1][0]: The tree data for the current tree for this fileid at /

'r' is a relocated entry: This path is not present in this tree with this
    id, but the id can be found at another location. The fingerprint is
    used to point to the target location.
'a' is an absent entry: In that tree the id is not present at this path.
'd' is a directory entry: This path in this tree is a directory with the
    current file id. There is no fingerprint for directories.
'f' is a file entry: As for directory, but it's a file. The fingerprint is
    the sha1 value of the file's canonical form, i.e. after any read
    filters have been applied to the convenience form stored in the working
'l' is a symlink entry: As for directory, but a symlink. The fingerprint is
't' is a reference to a nested subtree; the fingerprint is the referenced
in memory root row is now:
self._dirblocks[0] -> ('', [entry ...]),
and the entries in there are
entries[0][2]: file_id
entries[1][0]: The tree data for the current tree for this fileid at /
'r' is a relocated entry: This path is not present in this tree with this id,
but the id can be found at another location. The fingerprint is used to
point to the target location.
'a' is an absent entry: In that tree the id is not present at this path.
'd' is a directory entry: This path in this tree is a directory with the
current file id. There is no fingerprint for directories.
'f' is a file entry: As for directory, but its a file. The fingerprint is a
'l' is a symlink entry: As for directory, but a symlink. The fingerprint is the
't' is a reference to a nested subtree; the fingerprint is the referenced
The entries on disk and in memory are ordered according to the following keys::
The entries on disk and in memory are ordered according to the following keys:

    directory, as a list of components
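
Sorting a directory "as a list of components" gives a different order from sorting the raw path strings, because `/` is then treated as a separator rather than as an ordinary byte. The sketch below shows only this first key; `dir_sort_key` is a hypothetical name and the directory names are invented for the example.

```python
def dir_sort_key(dirname):
    """Split a utf8 directory path into its components (hypothetical helper)."""
    return dirname.split(b"/")


# Invented directory names, chosen so that the two orderings differ.
dirnames = [b"a-b", b"a/b", b"a"]

# Plain byte comparison: "-" (0x2d) sorts before "/" (0x2f).
plain_order = sorted(dirnames)

# Component-wise comparison: [b"a", b"b"] sorts before [b"a-b"].
component_order = sorted(dirnames, key=dir_sort_key)
```

With component-wise ordering, `a/b` stays next to its parent `a` instead of being separated from it by `a-b`.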

--- Format 1 had the following different definition: ---

rows = dirname, NULL, basename, NULL, MINIKIND, NULL, fileid_utf8, NULL,
WHOLE NUMBER (* size *), NULL, packed stat, NULL, sha1|symlink target,
PARENT ROW = NULL, revision_utf8, NULL, MINIKIND, NULL, dirname, NULL,
basename, NULL, WHOLE NUMBER (* size *), NULL, "y" | "n", NULL,

PARENT ROWs are emitted for every parent that is not in the ghosts details
line. That is, if the parents are foo, bar, baz, and the ghosts are bar, then
1781
1340
null = DirState.NULL_PARENT_DETAILS
1782
1341
for old_path, new_path, file_id, _, real_delete in deletes:
1783
if real_delete != (new_path is None):
1784
self._raise_invalid(old_path, file_id, "bad delete delta")
1343
assert new_path is None
1345
assert new_path is not None
1785
1346
# the entry for this file_id must be in tree 1.
1786
1347
dirname, basename = osutils.split(old_path)
1787
1348
block_index, entry_index, dir_present, file_present = \
1788
1349
self._get_block_entry_index(dirname, basename, 1)
1789
1350
if not file_present:
1790
self._raise_invalid(old_path, file_id,
1791
'basis tree does not contain removed entry')
1351
raise errors.BzrError('dirstate: cannot apply delta, basis'
1352
' tree does not contain new entry %r %r' %
1353
(old_path, file_id))
1792
1354
entry = self._dirblocks[block_index][1][entry_index]
1793
# The state of the entry in the 'active' WT
1794
active_kind = entry[1][0][0]
1795
1355
if entry[0][2] != file_id:
1796
self._raise_invalid(old_path, file_id,
1797
'mismatched file_id in tree 1')
1799
old_kind = entry[1][1][0]
1800
if active_kind in 'ar':
1801
# The active tree doesn't have this file_id.
1802
# The basis tree is changing this record. If this is a
1803
# rename, then we don't want the record here at all
1804
# anymore. If it is just an in-place change, we want the
1805
# record here, but we'll add it if we need to. So we just
1807
if active_kind == 'r':
1808
active_path = entry[1][0][1]
1809
active_entry = self._get_entry(0, file_id, active_path)
1810
if active_entry[1][1][0] != 'r':
1811
self._raise_invalid(old_path, file_id,
1812
"Dirstate did not have matching rename entries")
1813
elif active_entry[1][0][0] in 'ar':
1814
self._raise_invalid(old_path, file_id,
1815
"Dirstate had a rename pointing at an inactive"
1817
active_entry[1][1] = null
1356
raise errors.BzrError('mismatched file_id in tree 1 %r %r' %
1357
(old_path, file_id))
1359
if entry[1][0][0] != 'a':
1360
raise errors.BzrError('dirstate: inconsistent delta, with '
1361
'tree 0. %r %r' % (old_path, file_id))
1818
1362
del self._dirblocks[block_index][1][entry_index]
1820
# This was a directory, and the active tree says it
1821
# doesn't exist, and now the basis tree says it doesn't
1822
# exist. Remove its dirblock if present
1824
present) = self._find_block_index_from_key(
1827
dir_block = self._dirblocks[dir_block_index][1]
1829
# This entry is empty, go ahead and just remove it
1830
del self._dirblocks[dir_block_index]
1832
# There is still an active record, so just mark this
1835
block_i, entry_i, d_present, f_present = \
1836
self._get_block_entry_index(old_path, '', 1)
1838
dir_block = self._dirblocks[block_i][1]
1839
for child_entry in dir_block:
1840
child_basis_kind = child_entry[1][1][0]
1841
if child_basis_kind not in 'ar':
1842
self._raise_invalid(old_path, file_id,
1843
"The file id was deleted but its children were "
1846
def _after_delta_check_parents(self, parents, index):
1847
"""Check that parents required by the delta are all intact.
1849
:param parents: An iterable of (path_utf8, file_id) tuples which are
1850
required to be present in tree 'index' at path_utf8 with id file_id
1852
:param index: The column in the dirstate to check for parents in.
1854
for dirname_utf8, file_id in parents:
# Get the entry - this ensures that file_id, dirname_utf8 exists and
# has the right file id.
1857
entry = self._get_entry(index, file_id, dirname_utf8)
1858
if entry[1] is None:
1859
self._raise_invalid(dirname_utf8.decode('utf8'),
1860
file_id, "This parent is not present.")
1861
# Parents of things must be directories
1862
if entry[1][index][0] != 'd':
1863
self._raise_invalid(dirname_utf8.decode('utf8'),
1864
file_id, "This parent is not a directory.")
1866
def _observed_sha1(self, entry, sha1, stat_value,
1867
_stat_to_minikind=_stat_to_minikind):
1868
"""Note the sha1 of a file.
1870
:param entry: The entry the sha1 is for.
1871
:param sha1: The observed sha1.
1872
:param stat_value: The os.lstat for the file.
1364
if entry[1][0][0] == 'a':
1365
raise errors.BzrError('dirstate: inconsistent delta, with '
1366
'tree 0. %r %r' % (old_path, file_id))
1367
elif entry[1][0][0] == 'r':
1368
# implement the rename
1369
del self._dirblocks[block_index][1][entry_index]
1371
# it is being resurrected here, so blank it out temporarily.
1372
self._dirblocks[block_index][1][entry_index][1][1] = null
1374
def update_entry(self, entry, abspath, stat_value,
1375
_stat_to_minikind=_stat_to_minikind,
1376
_pack_stat=pack_stat):
1377
"""Update the entry based on what is actually on disk.
1379
:param entry: This is the dirblock entry for the file in question.
1380
:param abspath: The path on disk for this file.
1381
:param stat_value: (optional) if we already have done a stat on the
1383
:return: The sha1 hexdigest of the file (40 bytes) or link target of a
1875
1387
minikind = _stat_to_minikind[stat_value.st_mode & 0170000]
1876
1388
except KeyError:
1877
1389
# Unhandled kind
1391
packed_stat = _pack_stat(stat_value)
1392
(saved_minikind, saved_link_or_sha1, saved_file_size,
1393
saved_executable, saved_packed_stat) = entry[1][0]
1395
if (minikind == saved_minikind
1396
and packed_stat == saved_packed_stat):
1397
# The stat hasn't changed since we saved, so we can re-use the
1402
# size should also be in packed_stat
1403
if saved_file_size == stat_value.st_size:
1404
return saved_link_or_sha1
1406
# If we have gotten this far, that means that we need to actually
1407
# process this entry.
1879
1409
if minikind == 'f':
1880
if self._cutoff_time is None:
1881
self._sha_cutoff_time()
1882
if (stat_value.st_mtime < self._cutoff_time
1883
and stat_value.st_ctime < self._cutoff_time):
1884
entry[1][0] = ('f', sha1, stat_value.st_size, entry[1][0][3],
1885
pack_stat(stat_value))
1886
self._mark_modified([entry])
1410
link_or_sha1 = self._sha1_file(abspath)
1411
executable = self._is_executable(stat_value.st_mode,
1413
if self._cutoff_time is None:
1414
self._sha_cutoff_time()
1415
if (stat_value.st_mtime < self._cutoff_time
1416
and stat_value.st_ctime < self._cutoff_time):
1417
entry[1][0] = ('f', link_or_sha1, stat_value.st_size,
1418
executable, packed_stat)
1420
entry[1][0] = ('f', '', stat_value.st_size,
1421
executable, DirState.NULLSTAT)
1422
elif minikind == 'd':
1424
entry[1][0] = ('d', '', 0, False, packed_stat)
1425
if saved_minikind != 'd':
1426
# This changed from something into a directory. Make sure we
1427
# have a directory block for it. This doesn't happen very
1428
# often, so this doesn't have to be super fast.
1429
block_index, entry_index, dir_present, file_present = \
1430
self._get_block_entry_index(entry[0][0], entry[0][1], 0)
1431
self._ensure_block(block_index, entry_index,
1432
osutils.pathjoin(entry[0][0], entry[0][1]))
1433
elif minikind == 'l':
1434
link_or_sha1 = self._read_link(abspath, saved_link_or_sha1)
1435
if self._cutoff_time is None:
1436
self._sha_cutoff_time()
1437
if (stat_value.st_mtime < self._cutoff_time
1438
and stat_value.st_ctime < self._cutoff_time):
1439
entry[1][0] = ('l', link_or_sha1, stat_value.st_size,
1442
entry[1][0] = ('l', '', stat_value.st_size,
1443
False, DirState.NULLSTAT)
1444
self._dirblock_state = DirState.IN_MEMORY_MODIFIED
1888
1447
def _sha_cutoff_time(self):
1889
1448
"""Return cutoff time.
3366
2653
raise errors.ObjectNotLocked(self)
3369
def py_update_entry(state, entry, abspath, stat_value,
3370
_stat_to_minikind=DirState._stat_to_minikind):
3371
"""Update the entry based on what is actually on disk.
3373
This function only calculates the sha if it needs to - if the entry is
3374
uncachable, or clearly different to the first parent's entry, no sha
3375
is calculated, and None is returned.
3377
:param state: The dirstate this entry is in.
3378
:param entry: This is the dirblock entry for the file in question.
3379
:param abspath: The path on disk for this file.
3380
:param stat_value: The stat value done on the path.
3381
:return: None, or The sha1 hexdigest of the file (40 bytes) or link
3382
target of a symlink.
3385
minikind = _stat_to_minikind[stat_value.st_mode & 0170000]
3389
packed_stat = pack_stat(stat_value)
3390
(saved_minikind, saved_link_or_sha1, saved_file_size,
3391
saved_executable, saved_packed_stat) = entry[1][0]
3393
if minikind == 'd' and saved_minikind == 't':
3395
if (minikind == saved_minikind
3396
and packed_stat == saved_packed_stat):
3397
# The stat hasn't changed since we saved, so we can re-use the
3402
# size should also be in packed_stat
3403
if saved_file_size == stat_value.st_size:
3404
return saved_link_or_sha1
3406
# If we have gotten this far, that means that we need to actually
3407
# process this entry.
3411
executable = state._is_executable(stat_value.st_mode,
3413
if state._cutoff_time is None:
3414
state._sha_cutoff_time()
3415
if (stat_value.st_mtime < state._cutoff_time
3416
and stat_value.st_ctime < state._cutoff_time
3417
and len(entry[1]) > 1
3418
and entry[1][1][0] != 'a'):
# Could check for size changes for further optimised
# avoidance of sha1's. However the most prominent case of
# over-sha'ing is during initial add, which this catches.
# Besides, if content filtering happens, size and sha
# are calculated at the same time, so checking just the size
# gains nothing w.r.t. performance.
3425
link_or_sha1 = state._sha1_file(abspath)
3426
entry[1][0] = ('f', link_or_sha1, stat_value.st_size,
3427
executable, packed_stat)
3429
entry[1][0] = ('f', '', stat_value.st_size,
3430
executable, DirState.NULLSTAT)
3431
worth_saving = False
3432
elif minikind == 'd':
3434
entry[1][0] = ('d', '', 0, False, packed_stat)
3435
if saved_minikind != 'd':
3436
# This changed from something into a directory. Make sure we
3437
# have a directory block for it. This doesn't happen very
3438
# often, so this doesn't have to be super fast.
3439
block_index, entry_index, dir_present, file_present = \
3440
state._get_block_entry_index(entry[0][0], entry[0][1], 0)
3441
state._ensure_block(block_index, entry_index,
3442
osutils.pathjoin(entry[0][0], entry[0][1]))
3444
worth_saving = False
3445
elif minikind == 'l':
3446
if saved_minikind == 'l':
3447
worth_saving = False
3448
link_or_sha1 = state._read_link(abspath, saved_link_or_sha1)
3449
if state._cutoff_time is None:
3450
state._sha_cutoff_time()
3451
if (stat_value.st_mtime < state._cutoff_time
3452
and stat_value.st_ctime < state._cutoff_time):
3453
entry[1][0] = ('l', link_or_sha1, stat_value.st_size,
3456
entry[1][0] = ('l', '', stat_value.st_size,
3457
False, DirState.NULLSTAT)
3459
state._mark_modified([entry])
3463
class ProcessEntryPython(object):
3465
__slots__ = ["old_dirname_to_file_id", "new_dirname_to_file_id",
3466
"last_source_parent", "last_target_parent", "include_unchanged",
3467
"partial", "use_filesystem_for_exec", "utf8_decode",
3468
"searched_specific_files", "search_specific_files",
3469
"searched_exact_paths", "search_specific_file_parents", "seen_ids",
3470
"state", "source_index", "target_index", "want_unversioned", "tree"]
3472
def __init__(self, include_unchanged, use_filesystem_for_exec,
3473
search_specific_files, state, source_index, target_index,
3474
want_unversioned, tree):
3475
self.old_dirname_to_file_id = {}
3476
self.new_dirname_to_file_id = {}
3477
# Are we doing a partial iter_changes?
3478
self.partial = search_specific_files != set([''])
3479
# Using a list so that we can access the values and change them in
3480
# nested scope. Each one is [path, file_id, entry]
3481
self.last_source_parent = [None, None]
3482
self.last_target_parent = [None, None]
3483
self.include_unchanged = include_unchanged
3484
self.use_filesystem_for_exec = use_filesystem_for_exec
3485
self.utf8_decode = cache_utf8._utf8_decode
# for all search_indexes in each path at or under each element of
# search_specific_files, if the detail is relocated: add the id, and
# add the relocated path as one to search if it's not searched already.
# If the detail is not relocated, add the id.
3490
self.searched_specific_files = set()
3491
# When we search exact paths without expanding downwards, we record
3493
self.searched_exact_paths = set()
3494
self.search_specific_files = search_specific_files
3495
# The parents up to the root of the paths we are searching.
3496
# After all normal paths are returned, these specific items are returned.
3497
self.search_specific_file_parents = set()
3498
# The ids we've sent out in the delta.
3499
self.seen_ids = set()
3501
self.source_index = source_index
3502
self.target_index = target_index
3503
if target_index != 0:
3504
# A lot of code in here depends on target_index == 0
3505
raise errors.BzrError('unsupported target index')
3506
self.want_unversioned = want_unversioned
3509
def _process_entry(self, entry, path_info, pathjoin=osutils.pathjoin):
3510
"""Compare an entry and real disk to generate delta information.
3512
:param path_info: top_relpath, basename, kind, lstat, abspath for
3513
the path of entry. If None, then the path is considered absent in
3514
the target (Perhaps we should pass in a concrete entry for this ?)
3515
Basename is returned as a utf8 string because we expect this
3516
tuple will be ignored, and don't want to take the time to
3518
:return: (iter_changes_result, changed). If the entry has not been
3519
handled then changed is None. Otherwise it is False if no content
3520
or metadata changes have occurred, and True if any content or
3521
metadata change has occurred. If self.include_unchanged is True then
3522
if changed is not None, iter_changes_result will always be a result
3523
tuple. Otherwise, iter_changes_result is None unless changed is
3526
if self.source_index is None:
3527
source_details = DirState.NULL_PARENT_DETAILS
3529
source_details = entry[1][self.source_index]
3530
target_details = entry[1][self.target_index]
3531
target_minikind = target_details[0]
3532
if path_info is not None and target_minikind in 'fdlt':
3533
if not (self.target_index == 0):
3534
raise AssertionError()
3535
link_or_sha1 = update_entry(self.state, entry,
3536
abspath=path_info[4], stat_value=path_info[3])
3537
# The entry may have been modified by update_entry
3538
target_details = entry[1][self.target_index]
3539
target_minikind = target_details[0]
3542
file_id = entry[0][2]
3543
source_minikind = source_details[0]
3544
if source_minikind in 'fdltr' and target_minikind in 'fdlt':
3545
# claimed content in both: diff
3546
# r | fdlt | | add source to search, add id path move and perform
3547
# | | | diff check on source-target
3548
# r | fdlt | a | dangling file that was present in the basis.
3550
if source_minikind in 'r':
3551
# add the source to the search path to find any children it
3552
# has. TODO ? : only add if it is a container ?
3553
if not osutils.is_inside_any(self.searched_specific_files,
3555
self.search_specific_files.add(source_details[1])
3556
# generate the old path; this is needed for stating later
3558
old_path = source_details[1]
3559
old_dirname, old_basename = os.path.split(old_path)
3560
path = pathjoin(entry[0][0], entry[0][1])
3561
old_entry = self.state._get_entry(self.source_index,
3563
# update the source details variable to be the real
3565
if old_entry == (None, None):
3566
raise errors.CorruptDirstate(self.state._filename,
3567
"entry '%s/%s' is considered renamed from %r"
3568
" but source does not exist\n"
3569
"entry: %s" % (entry[0][0], entry[0][1], old_path, entry))
3570
source_details = old_entry[1][self.source_index]
3571
source_minikind = source_details[0]
3573
old_dirname = entry[0][0]
3574
old_basename = entry[0][1]
3575
old_path = path = None
3576
if path_info is None:
3577
# the file is missing on disk, show as removed.
3578
content_change = True
3582
# source and target are both versioned and disk file is present.
3583
target_kind = path_info[2]
3584
if target_kind == 'directory':
3586
old_path = path = pathjoin(old_dirname, old_basename)
3587
self.new_dirname_to_file_id[path] = file_id
3588
if source_minikind != 'd':
3589
content_change = True
3591
# directories have no fingerprint
3592
content_change = False
3594
elif target_kind == 'file':
3595
if source_minikind != 'f':
3596
content_change = True
# Check the sha. We can't just rely on the size as
# content filtering may mean different sizes actually
# map to the same content
3601
if link_or_sha1 is None:
3603
statvalue, link_or_sha1 = \
3604
self.state._sha1_provider.stat_and_sha1(
3606
self.state._observed_sha1(entry, link_or_sha1,
3608
content_change = (link_or_sha1 != source_details[1])
3609
# Target details is updated at update_entry time
3610
if self.use_filesystem_for_exec:
3611
# We don't need S_ISREG here, because we are sure
3612
# we are dealing with a file.
3613
target_exec = bool(stat.S_IEXEC & path_info[3].st_mode)
3615
target_exec = target_details[3]
3616
elif target_kind == 'symlink':
3617
if source_minikind != 'l':
3618
content_change = True
3620
content_change = (link_or_sha1 != source_details[1])
3622
elif target_kind == 'tree-reference':
3623
if source_minikind != 't':
3624
content_change = True
3626
content_change = False
3630
path = pathjoin(old_dirname, old_basename)
3631
raise errors.BadFileKindError(path, path_info[2])
3632
if source_minikind == 'd':
3634
old_path = path = pathjoin(old_dirname, old_basename)
3635
self.old_dirname_to_file_id[old_path] = file_id
3636
# parent id is the entry for the path in the target tree
3637
if old_basename and old_dirname == self.last_source_parent[0]:
3638
source_parent_id = self.last_source_parent[1]
3641
source_parent_id = self.old_dirname_to_file_id[old_dirname]
3643
source_parent_entry = self.state._get_entry(self.source_index,
3644
path_utf8=old_dirname)
3645
source_parent_id = source_parent_entry[0][2]
3646
if source_parent_id == entry[0][2]:
3647
# This is the root, so the parent is None
3648
source_parent_id = None
3650
self.last_source_parent[0] = old_dirname
3651
self.last_source_parent[1] = source_parent_id
3652
new_dirname = entry[0][0]
3653
if entry[0][1] and new_dirname == self.last_target_parent[0]:
3654
target_parent_id = self.last_target_parent[1]
3657
target_parent_id = self.new_dirname_to_file_id[new_dirname]
3659
# TODO: We don't always need to do the lookup, because the
3660
# parent entry will be the same as the source entry.
3661
target_parent_entry = self.state._get_entry(self.target_index,
3662
path_utf8=new_dirname)
3663
if target_parent_entry == (None, None):
3664
raise AssertionError(
3665
"Could not find target parent in wt: %s\nparent of: %s"
3666
% (new_dirname, entry))
3667
target_parent_id = target_parent_entry[0][2]
3668
if target_parent_id == entry[0][2]:
3669
# This is the root, so the parent is None
3670
target_parent_id = None
3672
self.last_target_parent[0] = new_dirname
3673
self.last_target_parent[1] = target_parent_id
3675
source_exec = source_details[3]
3676
changed = (content_change
3677
or source_parent_id != target_parent_id
3678
or old_basename != entry[0][1]
3679
or source_exec != target_exec
3681
if not changed and not self.include_unchanged:
3684
if old_path is None:
3685
old_path = path = pathjoin(old_dirname, old_basename)
3686
old_path_u = self.utf8_decode(old_path)[0]
3689
old_path_u = self.utf8_decode(old_path)[0]
3690
if old_path == path:
3693
path_u = self.utf8_decode(path)[0]
3694
source_kind = DirState._minikind_to_kind[source_minikind]
3695
return (entry[0][2],
3696
(old_path_u, path_u),
3699
(source_parent_id, target_parent_id),
3700
(self.utf8_decode(old_basename)[0], self.utf8_decode(entry[0][1])[0]),
3701
(source_kind, target_kind),
3702
(source_exec, target_exec)), changed
3703
elif source_minikind in 'a' and target_minikind in 'fdlt':
3704
# looks like a new file
3705
path = pathjoin(entry[0][0], entry[0][1])
3706
# parent id is the entry for the path in the target tree
3707
# TODO: these are the same for an entire directory: cache em.
3708
parent_id = self.state._get_entry(self.target_index,
3709
path_utf8=entry[0][0])[0][2]
3710
if parent_id == entry[0][2]:
3712
if path_info is not None:
3714
if self.use_filesystem_for_exec:
3715
# We need S_ISREG here, because we aren't sure if this
3718
stat.S_ISREG(path_info[3].st_mode)
3719
and stat.S_IEXEC & path_info[3].st_mode)
3721
target_exec = target_details[3]
3722
return (entry[0][2],
3723
(None, self.utf8_decode(path)[0]),
3727
(None, self.utf8_decode(entry[0][1])[0]),
3728
(None, path_info[2]),
3729
(None, target_exec)), True
# It's a missing file, report it as such.
3732
return (entry[0][2],
3733
(None, self.utf8_decode(path)[0]),
3737
(None, self.utf8_decode(entry[0][1])[0]),
3739
(None, False)), True
3740
elif source_minikind in 'fdlt' and target_minikind in 'a':
# unversioned, possibly, or possibly not deleted: we don't care.
# if it's still on disk, *and* there's no other entry at this
# path [we don't know this in this routine at the moment -
# perhaps we should change this] - then it would be an unknown.
3745
old_path = pathjoin(entry[0][0], entry[0][1])
3746
# parent id is the entry for the path in the target tree
3747
parent_id = self.state._get_entry(self.source_index, path_utf8=entry[0][0])[0][2]
3748
if parent_id == entry[0][2]:
3750
return (entry[0][2],
3751
(self.utf8_decode(old_path)[0], None),
3755
(self.utf8_decode(entry[0][1])[0], None),
3756
(DirState._minikind_to_kind[source_minikind], None),
3757
(source_details[3], None)), True
3758
elif source_minikind in 'fdlt' and target_minikind in 'r':
# a rename; could be a true rename, or a rename inherited from
# a renamed parent. TODO: handle this efficiently. It's not a
# common case to rename dirs though, so a correct but slow
# implementation will do.
3763
if not osutils.is_inside_any(self.searched_specific_files, target_details[1]):
3764
self.search_specific_files.add(target_details[1])
3765
elif source_minikind in 'ra' and target_minikind in 'ra':
3766
# neither of the selected trees contain this file,
3767
# so skip over it. This is not currently directly tested, but
3768
# is indirectly via test_too_much.TestCommands.test_conflicts.
3771
raise AssertionError("don't know how to compare "
3772
"source_minikind=%r, target_minikind=%r"
3773
% (source_minikind, target_minikind))
3779
def _gather_result_for_consistency(self, result):
3780
"""Check a result we will yield to make sure we are consistent later.
3782
This gathers result's parents into a set to output later.
3784
:param result: A result tuple.
3786
if not self.partial or not result[0]:
3788
self.seen_ids.add(result[0])
3789
new_path = result[1][1]
3791
# Not the root and not a delete: queue up the parents of the path.
3792
self.search_specific_file_parents.update(
3793
osutils.parent_directories(new_path.encode('utf8')))
3794
# Add the root directory which parent_directories does not
3796
self.search_specific_file_parents.add('')
3798
def iter_changes(self):
3799
"""Iterate over the changes."""
3800
utf8_decode = cache_utf8._utf8_decode
3801
_cmp_by_dirs = cmp_by_dirs
3802
_process_entry = self._process_entry
3803
search_specific_files = self.search_specific_files
3804
searched_specific_files = self.searched_specific_files
3805
splitpath = osutils.splitpath
# compare source_index and target_index at or under each element of search_specific_files.
# follow the following comparison table. Note that we only want to do diff operations when
# the target is fdl because that's when the walkdirs logic will have exposed the pathinfo
# Source | Target | disk | action
#   r    | fdlt   |      | add source to search, add id path move and perform
#        |        |      | diff check on source-target
#   r    | fdlt   |  a   | dangling file that was present in the basis.
#   r    |  a     |      | add source to search
#   r    |  r     |      | this path is present in a non-examined tree, skip.
#   r    |  r     |  a   | this path is present in a non-examined tree, skip.
#   a    | fdlt   |      | add new id
#   a    | fdlt   |  a   | dangling locally added file, skip
#   a    |  a     |      | not present in either tree, skip
#   a    |  a     |  a   | not present in any tree, skip
#   a    |  r     |      | not present in either tree at this path, skip as it
#        |        |      | may not be selected by the user's list of paths.
#   a    |  r     |  a   | not present in either tree at this path, skip as it
#        |        |      | may not be selected by the user's list of paths.
#  fdlt  | fdlt   |      | content in both: diff them
#  fdlt  | fdlt   |  a   | deleted locally, but not unversioned - show as deleted ?
#  fdlt  |  a     |      | unversioned: output deleted id for now
#  fdlt  |  a     |  a   | unversioned and deleted: output deleted id
#  fdlt  |  r     |      | relocated in this tree, so add target to search.
#        |        |      | Don't diff, we will see an r,fd; pair when we reach
#        |        |      | this id at the other path.
#  fdlt  |  r     |  a   | relocated in this tree, so add target to search.
#        |        |      | Don't diff, we will see an r,fd; pair when we reach
#        |        |      | this id at the other path.
# TODO: jam 20070516 - Avoid the _get_entry lookup overhead by
# keeping a cache of directories that we have seen.
while search_specific_files:
# TODO: the pending list should be lexically sorted? the
# interface doesn't require it.
current_root = search_specific_files.pop()
current_root_unicode = current_root.decode('utf8')
searched_specific_files.add(current_root)
# process the entries for this containing directory: the rest will be
# found by their parents recursively.
root_entries = self.state._entries_for_path(current_root)
root_abspath = self.tree.abspath(current_root_unicode)
root_stat = os.lstat(root_abspath)
if e.errno == errno.ENOENT:
# the path does not exist: let _process_entry know that.
root_dir_info = None
# some other random error: hand it up.
root_dir_info = ('', current_root,
osutils.file_kind_from_stat_mode(root_stat.st_mode), root_stat,
if root_dir_info[2] == 'directory':
if self.tree._directory_is_tree_reference(
current_root.decode('utf8')):
root_dir_info = root_dir_info[:2] + \
('tree-reference',) + root_dir_info[3:]
if not root_entries and not root_dir_info:
# this specified path is not present at all, skip it.
path_handled = False
for entry in root_entries:
result, changed = _process_entry(entry, root_dir_info)
if changed is not None:
self._gather_result_for_consistency(result)
if changed or self.include_unchanged:
if self.want_unversioned and not path_handled and root_dir_info:
new_executable = bool(
stat.S_ISREG(root_dir_info[3].st_mode)
and stat.S_IEXEC & root_dir_info[3].st_mode)
(None, current_root_unicode),
(None, splitpath(current_root_unicode)[-1]),
(None, root_dir_info[2]),
(None, new_executable)
initial_key = (current_root, '', '')
block_index, _ = self.state._find_block_index_from_key(initial_key)
if block_index == 0:
# we have processed the total root already, but because the
# initial key matched it we should skip it here.
if root_dir_info and root_dir_info[2] == 'tree-reference':
current_dir_info = None
dir_iterator = osutils._walkdirs_utf8(root_abspath, prefix=current_root)
current_dir_info = dir_iterator.next()
# on win32, python2.4 has e.errno == ERROR_DIRECTORY, but
# python 2.5 has e.errno == EINVAL,
# and e.winerror == ERROR_DIRECTORY
e_winerror = getattr(e, 'winerror', None)
win_errors = (ERROR_DIRECTORY, ERROR_PATH_NOT_FOUND)
# there may be directories in the inventory even though
# this path is not a file on disk: so mark it as end of
if e.errno in (errno.ENOENT, errno.ENOTDIR, errno.EINVAL):
current_dir_info = None
elif (sys.platform == 'win32'
and (e.errno in win_errors
or e_winerror in win_errors)):
current_dir_info = None
if current_dir_info[0][0] == '':
# remove .bzr from iteration
bzr_index = bisect.bisect_left(current_dir_info[1], ('.bzr',))
if current_dir_info[1][bzr_index][0] != '.bzr':
raise AssertionError()
del current_dir_info[1][bzr_index]
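# The .bzr removal above relies on the directory listing being a sorted list
# of tuples whose first element is the name: a one-element tuple ('.bzr',)
# sorts immediately before any longer tuple starting with '.bzr', so
# bisect_left lands on that entry. A minimal standalone sketch (Python 3),
# assuming such a sorted listing; drop_control_dir is a hypothetical name and
# not part of this module.

```python
import bisect


def drop_control_dir(entries, name='.bzr'):
    """Remove the entry called `name` from a sorted list of tuples.

    `entries` must be sorted and each tuple must start with the entry name.
    """
    index = bisect.bisect_left(entries, (name,))
    if index >= len(entries) or entries[index][0] != name:
        raise AssertionError('%r not found in listing' % (name,))
    del entries[index]
    return entries
```

# Unlike the code above, the sketch also guards against the index running off
# the end of the list before comparing names.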
# walk until both the directory listing and the versioned metadata
if (block_index < len(self.state._dirblocks) and
osutils.is_inside(current_root, self.state._dirblocks[block_index][0])):
current_block = self.state._dirblocks[block_index]
current_block = None
while (current_dir_info is not None or
current_block is not None):
if (current_dir_info and current_block
and current_dir_info[0][0] != current_block[0]):
if _cmp_by_dirs(current_dir_info[0][0], current_block[0]) < 0:
# filesystem data refers to paths not covered by the dirblock.
# this has two possibilities:
# A) it is versioned but empty, so there is no block for it
# B) it is not versioned.
# if (A) then we need to recurse into it to check for
# new unknown files or directories.
# if (B) then we should ignore it, because we don't
# recurse into unknown directories.
while path_index < len(current_dir_info[1]):
current_path_info = current_dir_info[1][path_index]
if self.want_unversioned:
if current_path_info[2] == 'directory':
if self.tree._directory_is_tree_reference(
current_path_info[0].decode('utf8')):
current_path_info = current_path_info[:2] + \
('tree-reference',) + current_path_info[3:]
new_executable = bool(
stat.S_ISREG(current_path_info[3].st_mode)
and stat.S_IEXEC & current_path_info[3].st_mode)
(None, utf8_decode(current_path_info[0])[0]),
(None, utf8_decode(current_path_info[1])[0]),
(None, current_path_info[2]),
(None, new_executable))
# don't descend into this unversioned path if it is
if current_path_info[2] in ('directory',
del current_dir_info[1][path_index]
# This dir info has been handled, go to the next
current_dir_info = dir_iterator.next()
except StopIteration:
current_dir_info = None
# We have a dirblock entry for this location, but there
# is no filesystem path for this. This is most likely
# because a directory was removed from the disk.
# We don't have to report the missing directory,
# because that should have already been handled, but we
# need to handle all of the files that are contained
for current_entry in current_block[1]:
# entry referring to file not present on disk.
# advance the entry only, after processing.
result, changed = _process_entry(current_entry, None)
if changed is not None:
self._gather_result_for_consistency(result)
if changed or self.include_unchanged:
if (block_index < len(self.state._dirblocks) and
osutils.is_inside(current_root,
self.state._dirblocks[block_index][0])):
current_block = self.state._dirblocks[block_index]
current_block = None
if current_block and entry_index < len(current_block[1]):
current_entry = current_block[1][entry_index]
current_entry = None
advance_entry = True
if current_dir_info and path_index < len(current_dir_info[1]):
current_path_info = current_dir_info[1][path_index]
if current_path_info[2] == 'directory':
if self.tree._directory_is_tree_reference(
current_path_info[0].decode('utf8')):
current_path_info = current_path_info[:2] + \
('tree-reference',) + current_path_info[3:]
current_path_info = None
path_handled = False
while (current_entry is not None or
current_path_info is not None):
if current_entry is None:
# the check for path_handled when the path is advanced
# will yield this path if needed.
elif current_path_info is None:
# no path is fine: the per entry code will handle it.
result, changed = _process_entry(current_entry, current_path_info)
if changed is not None:
self._gather_result_for_consistency(result)
if changed or self.include_unchanged:
elif (current_entry[0][1] != current_path_info[1]
or current_entry[1][self.target_index][0] in 'ar'):
# The current path on disk doesn't match the dirblock
# record. Either the dirblock is marked as absent, or
# the file on disk is not present at all in the
# dirblock. Either way, report about the dirblock
# entry, and let other code handle the filesystem one.
# Compare the basename for these files to determine
if current_path_info[1] < current_entry[0][1]:
# extra file on disk: pass for now, but only
# increment the path, not the entry
advance_entry = False
# entry referring to file not present on disk.
# advance the entry only, after processing.
result, changed = _process_entry(current_entry, None)
if changed is not None:
self._gather_result_for_consistency(result)
if changed or self.include_unchanged:
advance_path = False
result, changed = _process_entry(current_entry, current_path_info)
if changed is not None:
self._gather_result_for_consistency(result)
if changed or self.include_unchanged:
if advance_entry and current_entry is not None:
if entry_index < len(current_block[1]):
current_entry = current_block[1][entry_index]
current_entry = None
advance_entry = True # reset the advance flag
if advance_path and current_path_info is not None:
if not path_handled:
# unversioned in all regards
if self.want_unversioned:
new_executable = bool(
stat.S_ISREG(current_path_info[3].st_mode)
and stat.S_IEXEC & current_path_info[3].st_mode)
relpath_unicode = utf8_decode(current_path_info[0])[0]
except UnicodeDecodeError:
raise errors.BadFilenameEncoding(
current_path_info[0], osutils._fs_enc)
(None, relpath_unicode),
(None, utf8_decode(current_path_info[1])[0]),
(None, current_path_info[2]),
(None, new_executable))
# don't descend into this unversioned path if it is
if current_path_info[2] in ('directory',):
del current_dir_info[1][path_index]
# don't descend the disk iterator into any tree
if current_path_info[2] == 'tree-reference':
del current_dir_info[1][path_index]
if path_index < len(current_dir_info[1]):
current_path_info = current_dir_info[1][path_index]
if current_path_info[2] == 'directory':
if self.tree._directory_is_tree_reference(
current_path_info[0].decode('utf8')):
current_path_info = current_path_info[:2] + \
('tree-reference',) + current_path_info[3:]
current_path_info = None
path_handled = False
advance_path = True # reset the advance flag.
if current_block is not None:
if (block_index < len(self.state._dirblocks) and
osutils.is_inside(current_root, self.state._dirblocks[block_index][0])):
current_block = self.state._dirblocks[block_index]
current_block = None
if current_dir_info is not None:
current_dir_info = dir_iterator.next()
except StopIteration:
current_dir_info = None
for result in self._iter_specific_file_parents():
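# The inner loop above advances two sorted sequences in step: the entries of
# the current dirblock and the paths found on disk, using advance_entry and
# advance_path to decide which side moves. A minimal standalone sketch of
# that two-pointer walk (Python 3), assuming both inputs are sorted lists of
# basenames. merge_walk is a hypothetical name, and the sketch leaves out the
# extra rule above that treats entries marked 'a' or 'r' in the target tree
# as not matching the path on disk.

```python
def merge_walk(entry_names, disk_names):
    """Yield (entry_name, disk_name) pairs from two sorted lists.

    A side is None when the name exists only on the other side.
    """
    i = j = 0
    while i < len(entry_names) or j < len(disk_names):
        entry = entry_names[i] if i < len(entry_names) else None
        path = disk_names[j] if j < len(disk_names) else None
        if entry is None:
            # Only disk paths remain: advance the path side.
            yield (None, path)
            j += 1
        elif path is None:
            # Only entries remain: advance the entry side.
            yield (entry, None)
            i += 1
        elif entry == path:
            # Both sides agree: advance both.
            yield (entry, path)
            i += 1
            j += 1
        elif path < entry:
            # Extra file on disk: advance the path only.
            yield (None, path)
            j += 1
        else:
            # Entry with no file on disk: advance the entry only.
            yield (entry, None)
            i += 1
```

# For instance, walking ['a', 'c'] against ['b', 'c', 'd'] reports 'a' as
# missing on disk, 'b' and 'd' as extra on disk, and 'c' as present in both.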
def _iter_specific_file_parents(self):
"""Iter over the specific file parents."""
while self.search_specific_file_parents:
# Process the parent directories for the paths we were iterating.
# Even in extremely large trees this should be modest, so currently
# no attempt is made to optimise.
path_utf8 = self.search_specific_file_parents.pop()
if osutils.is_inside_any(self.searched_specific_files, path_utf8):
# We've examined this path.
if path_utf8 in self.searched_exact_paths:
# We've examined this path.
path_entries = self.state._entries_for_path(path_utf8)
# We need either one or two entries. If the path in
# self.target_index has moved (so the entry in source_index is in
# 'ar') then we need to also look for the entry for this path in
# self.source_index, to output the appropriate delete-or-rename.
selected_entries = []
for candidate_entry in path_entries:
# Find entries present in target at this path:
if candidate_entry[1][self.target_index][0] not in 'ar':
selected_entries.append(candidate_entry)
# Find entries present in source at this path:
elif (self.source_index is not None and
candidate_entry[1][self.source_index][0] not in 'ar'):
if candidate_entry[1][self.target_index][0] == 'a':
# Deleted, emit it here.
selected_entries.append(candidate_entry)
# renamed, emit it when we process the directory it
self.search_specific_file_parents.add(
candidate_entry[1][self.target_index][1])
raise AssertionError(
"Missing entry for specific path parent %r, %r" % (
path_utf8, path_entries))
path_info = self._path_info(path_utf8, path_utf8.decode('utf8'))
for entry in selected_entries:
if entry[0][2] in self.seen_ids:
result, changed = self._process_entry(entry, path_info)
raise AssertionError(
"Got entry<->path mismatch for specific path "
"%r entry %r path_info %r " % (
path_utf8, entry, path_info))
# Only include changes - we're outside the user's requested
self._gather_result_for_consistency(result)
if (result[6][0] == 'directory' and
result[6][1] != 'directory'):
# This stopped being a directory, the old children have
if entry[1][self.source_index][0] == 'r':
# renamed, take the source path
entry_path_utf8 = entry[1][self.source_index][1]
entry_path_utf8 = path_utf8
initial_key = (entry_path_utf8, '', '')
block_index, _ = self.state._find_block_index_from_key(
if block_index == 0:
# The children of the root are in block index 1.
current_block = None
if block_index < len(self.state._dirblocks):
current_block = self.state._dirblocks[block_index]
if not osutils.is_inside(
entry_path_utf8, current_block[0]):
# No entries for this directory at all.
current_block = None
if current_block is not None:
for entry in current_block[1]:
if entry[1][self.source_index][0] in 'ar':
# Not in the source tree, so doesn't have to be
# Path of the entry itself.
self.search_specific_file_parents.add(
osutils.pathjoin(*entry[0][:2]))
if changed or self.include_unchanged:
self.searched_exact_paths.add(path_utf8)
def _path_info(self, utf8_path, unicode_path):
"""Generate path_info for unicode_path.
:return: None if unicode_path does not exist, or a path_info tuple.
abspath = self.tree.abspath(unicode_path)
stat = os.lstat(abspath)
if e.errno == errno.ENOENT:
# the path does not exist.
utf8_basename = utf8_path.rsplit('/', 1)[-1]
dir_info = (utf8_path, utf8_basename,
osutils.file_kind_from_stat_mode(stat.st_mode), stat,
if dir_info[2] == 'directory':
if self.tree._directory_is_tree_reference(
dir_info = dir_info[:2] + \
('tree-reference',) + dir_info[3:]
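# _path_info above follows a common pattern: lstat the path, map ENOENT to
# None, re-raise anything else, and otherwise build a tuple from the stat
# result. A minimal standalone sketch of that pattern (Python 3), assuming a
# plain base directory instead of a tree object. kind_from_mode and path_info
# are hypothetical stand-ins (kind_from_mode is a simplified substitute for
# osutils.file_kind_from_stat_mode), the returned tuple is trimmed to four
# elements, and tree references are not detected.

```python
import errno
import os
import stat


def kind_from_mode(mode):
    """Simplified, hypothetical mapping from a stat mode to a kind string."""
    if stat.S_ISREG(mode):
        return 'file'
    if stat.S_ISDIR(mode):
        return 'directory'
    if stat.S_ISLNK(mode):
        return 'symlink'
    return 'unknown'


def path_info(base_dir, relpath):
    """Return None if relpath is missing, else (relpath, basename, kind, stat)."""
    abspath = os.path.join(base_dir, relpath)
    try:
        st = os.lstat(abspath)
    except OSError as e:
        if e.errno == errno.ENOENT:
            # the path does not exist.
            return None
        # some other error: hand it up.
        raise
    basename = relpath.rsplit('/', 1)[-1]
    return (relpath, basename, kind_from_mode(st.st_mode), st)
```

# os.lstat is used rather than os.stat so that a symlink is reported as a
# symlink instead of being followed to its target.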
# Try to load the compiled form if possible
from bzrlib._dirstate_helpers_pyx import (
ProcessEntryC as _process_entry,
update_entry as update_entry,
from bzrlib._dirstate_helpers_c import (
_read_dirblocks_c as _read_dirblocks,
bisect_dirblock_c as bisect_dirblock,
_bisect_path_left_c as _bisect_path_left,
_bisect_path_right_c as _bisect_path_right,
cmp_by_dirs_c as cmp_by_dirs,
except ImportError, e:
osutils.failed_to_load_extension(e)
from bzrlib._dirstate_helpers_py import (
_read_dirblocks_py as _read_dirblocks,
bisect_dirblock_py as bisect_dirblock,
_bisect_path_left_py as _bisect_path_left,
_bisect_path_right_py as _bisect_path_right,
cmp_by_dirs_py as cmp_by_dirs,
# FIXME: It would be nice to be able to track moved lines so that the
# corresponding python code can be moved to the _dirstate_helpers_py
# module. I don't want to break the history for this important piece of
# code so I left the code here -- vila 20090622
update_entry = py_update_entry
_process_entry = ProcessEntryPython
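# The block above prefers a compiled helper module and falls back to the pure
# Python one when the import fails. A minimal standalone sketch of the same
# idea (Python 3), assuming the candidates are given as module names in order
# of preference; load_first_available is a hypothetical helper and not part
# of bzrlib.

```python
import importlib


def load_first_available(module_names):
    """Import and return the first module in module_names that loads."""
    last_error = None
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError as e:
            # Remember the failure and try the next, slower candidate.
            last_error = e
    raise ImportError('none of %r could be imported: %s'
                      % (list(module_names), last_error))
```

# Callers then bind the names they need from whichever module was returned,
# much as the imports above rebind _read_dirblocks, bisect_dirblock and the
# other helpers.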