lines by NL. The field delimiters are omitted in the grammar, line delimiters
are not - this is done for clarity of reading. All string data is in utf8.
MINIKIND = "f" | "d" | "l" | "a" | "r" | "t";
WHOLE_NUMBER = {digit}, digit;
REVISION_ID = a non-empty utf8 string;
dirstate format = header line, full checksum, row count, parent_details,
    ghost_details, entries;
header line = "#bazaar dirstate flat format 3", NL;
full checksum = "crc32: ", ["-"], WHOLE_NUMBER, NL;
row count = "num_entries: ", WHOLE_NUMBER, NL;
parent_details = WHOLE_NUMBER, {REVISION_ID}*, NL;
ghost_details = WHOLE_NUMBER, {REVISION_ID}*, NL;
entry = entry_key, current_entry_details, {parent_entry_details};
entry_key = dirname, basename, fileid;
current_entry_details = common_entry_details, working_entry_details;
parent_entry_details = common_entry_details, history_entry_details;
common_entry_details = MINIKIND, fingerprint, size, executable;
working_entry_details = packed_stat;
history_entry_details = REVISION_ID;
fingerprint = a nonempty utf8 sequence with meaning defined by minikind.
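
As a rough illustration of the header rules above, the sketch below parses the first three lines of a format 3 dirstate. The name `parse_dirstate_header` is hypothetical and not part of bzrlib; the function only follows the grammar as written and does not verify the checksum.

```python
HEADER_LINE = b"#bazaar dirstate flat format 3"


def parse_dirstate_header(data):
    """Return (crc32, num_entries) parsed from the start of `data`.

    Hypothetical helper: it follows the grammar above and performs no
    CRC verification.
    """
    # Split off the three header lines; the remainder is left untouched.
    lines = data.split(b"\n", 3)
    if len(lines) < 4 or lines[0] != HEADER_LINE:
        raise ValueError("not a format 3 dirstate header")
    crc_prefix = b"crc32: "
    count_prefix = b"num_entries: "
    if not lines[1].startswith(crc_prefix):
        raise ValueError("missing crc32 line")
    if not lines[2].startswith(count_prefix):
        raise ValueError("missing num_entries line")
    # The grammar allows an optional leading "-" on the checksum.
    crc32 = int(lines[1][len(crc_prefix):])
    num_entries = int(lines[2][len(count_prefix):])
    return crc32, num_entries
```

The checksum is kept as a signed integer because the grammar permits a leading "-".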
Given this definition, the following is useful to know::

    entry (aka row) - all the data for a given key.
    entry[0]: The key (dirname, basename, fileid)
    entry[1]: The tree(s) data for this path and id combination.
    entry[1][0]: The current tree
    entry[1][1]: The second tree

For an entry for a tree, we have (using tree 0 - current tree) to demonstrate::

    entry[1][0][0]: minikind
    entry[1][0][1]: fingerprint
    entry[1][0][3]: executable
    entry[1][0][4]: packed_stat
    entry[1][1][4]: revision_id
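
To make the indexing concrete, here is a hand-built entry using plain tuples and a list. All values (paths, ids, fingerprint, packed stat, revision id) are made up for illustration, and the size is assumed to sit at index 2, following the field order in common_entry_details.

```python
# A made-up entry: the key first, then one details tuple per tree.
key = (b"docs", b"README", b"readme-file-id")
current_tree = (b"f", b"0" * 40, 120, False, b"packed-stat-placeholder")
basis_tree = (b"f", b"0" * 40, 120, False, b"rev-id-placeholder")
entry = (key, [current_tree, basis_tree])

dirname, basename, file_id = entry[0]

# Tree 0 (the current tree) ends with a packed stat...
minikind = entry[1][0][0]
fingerprint = entry[1][0][1]
size = entry[1][0][2]
executable = entry[1][0][3]
packed_stat = entry[1][0][4]

# ...while any other tree ends with a revision id instead.
revision_id = entry[1][1][4]
```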
There may be multiple rows at the root, one per id present in the root, so the
in-memory root row is now::

    self._dirblocks[0] -> ('', [entry ...]),

and the entries in there are::

    entries[0][2]: file_id
    entries[1][0]: The tree data for the current tree for this fileid at /

'r' is a relocated entry: This path is not present in this tree with this
    id, but the id can be found at another location. The fingerprint is
    used to point to the target location.
'a' is an absent entry: In that tree the id is not present at this path.
'd' is a directory entry: This path in this tree is a directory with the
    current file id. There is no fingerprint for directories.
'f' is a file entry: As for directory, but it's a file. The fingerprint is
    the sha1 value of the file's canonical form, i.e. after any read
    filters have been applied to the convenience form stored in the working
'l' is a symlink entry: As for directory, but a symlink. The fingerprint is
't' is a reference to a nested subtree; the fingerprint is the referenced
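
The one-letter codes described above can be summarised as a lookup table. The dict below is an illustrative stand-in written from those descriptions (the code later refers to a `DirState._minikind_to_kind` mapping whose exact contents are not shown here); `describe_minikind` and `is_present` are hypothetical helpers.

```python
# Illustrative table built from the prose descriptions, not copied from bzrlib.
MINIKIND_DESCRIPTIONS = {
    "a": "absent",
    "d": "directory",
    "f": "file",
    "l": "symlink",
    "r": "relocated",
    "t": "tree-reference",
}


def describe_minikind(minikind):
    """Return a readable name for a one-letter minikind code."""
    try:
        return MINIKIND_DESCRIPTIONS[minikind]
    except KeyError:
        raise ValueError("unknown minikind: %r" % (minikind,))


def is_present(minikind):
    """True when the id really is at this path in the tree in question."""
    return minikind in ("f", "d", "l", "t")
```

Entries marked 'a' or 'r' carry no content at this path, which is why the comparison code treats them separately from 'f', 'd', 'l' and 't'.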
The entries on disk and in memory are ordered according to the following keys::

    directory, as a list of components
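
Comparing the directory as a list of components, rather than as a raw string, keeps everything under one directory together. A small sketch of the difference, using a hypothetical `dir_sort_key` helper and assuming '/' as the path separator:

```python
def dir_sort_key(dirname):
    """Split a directory path into its components for comparison."""
    return dirname.split("/")


paths = ["a-b", "a/b", "a"]

# Plain string ordering compares character by character.
by_string = sorted(paths)

# Component ordering compares whole path segments.
by_components = sorted(paths, key=dir_sort_key)
```

With plain string ordering, 'a-b' lands between 'a' and 'a/b' because '-' sorts before '/'; with component ordering, 'a/b' stays directly after its parent 'a'.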
--- Format 1 had the following different definition: ---
rows = dirname, NULL, basename, NULL, MINIKIND, NULL, fileid_utf8, NULL,
    WHOLE NUMBER (* size *), NULL, packed stat, NULL, sha1|symlink target,
PARENT ROW = NULL, revision_utf8, NULL, MINIKIND, NULL, dirname, NULL,
    basename, NULL, WHOLE NUMBER (* size *), NULL, "y" | "n", NULL,
PARENT ROWs are emitted for every parent that is not in the ghosts details
line. That is, if the parents are foo, bar, baz, and the ghosts are bar, then
raise errors.ObjectNotLocked(self)
def py_update_entry(state, entry, abspath, stat_value,
_stat_to_minikind=DirState._stat_to_minikind,
_pack_stat=pack_stat):
"""Update the entry based on what is actually on disk.
This function only calculates the sha if it needs to - if the entry is
uncacheable, or clearly different to the first parent's entry, no sha
is calculated, and None is returned.
:param state: The dirstate this entry is in.
:param entry: This is the dirblock entry for the file in question.
:param abspath: The path on disk for this file.
:param stat_value: The stat value done on the path.
:return: None, or the sha1 hexdigest of the file (40 bytes) or link
target of a symlink.
minikind = _stat_to_minikind[stat_value.st_mode & 0170000]
packed_stat = _pack_stat(stat_value)
(saved_minikind, saved_link_or_sha1, saved_file_size,
saved_executable, saved_packed_stat) = entry[1][0]
if minikind == 'd' and saved_minikind == 't':
if (minikind == saved_minikind
and packed_stat == saved_packed_stat):
# The stat hasn't changed since we saved, so we can re-use the
# size should also be in packed_stat
if saved_file_size == stat_value.st_size:
return saved_link_or_sha1
# If we have gotten this far, that means that we need to actually
# process this entry.
executable = state._is_executable(stat_value.st_mode,
if state._cutoff_time is None:
state._sha_cutoff_time()
if (stat_value.st_mtime < state._cutoff_time
and stat_value.st_ctime < state._cutoff_time
and len(entry[1]) > 1
and entry[1][1][0] != 'a'):
# Could check for size changes for further optimised
# avoidance of sha1's. However the most prominent case of
# over-shaing is during initial add, which this catches.
# Besides, if content filtering happens, size and sha
# are calculated at the same time, so checking just the size
# gains nothing w.r.t. performance.
link_or_sha1 = state._sha1_file(abspath)
entry[1][0] = ('f', link_or_sha1, stat_value.st_size,
executable, packed_stat)
entry[1][0] = ('f', '', stat_value.st_size,
executable, DirState.NULLSTAT)
worth_saving = False
elif minikind == 'd':
entry[1][0] = ('d', '', 0, False, packed_stat)
if saved_minikind != 'd':
# This changed from something into a directory. Make sure we
# have a directory block for it. This doesn't happen very
# often, so this doesn't have to be super fast.
block_index, entry_index, dir_present, file_present = \
state._get_block_entry_index(entry[0][0], entry[0][1], 0)
state._ensure_block(block_index, entry_index,
osutils.pathjoin(entry[0][0], entry[0][1]))
worth_saving = False
elif minikind == 'l':
if saved_minikind == 'l':
worth_saving = False
link_or_sha1 = state._read_link(abspath, saved_link_or_sha1)
if state._cutoff_time is None:
state._sha_cutoff_time()
if (stat_value.st_mtime < state._cutoff_time
and stat_value.st_ctime < state._cutoff_time):
entry[1][0] = ('l', link_or_sha1, stat_value.st_size,
entry[1][0] = ('l', '', stat_value.st_size,
False, DirState.NULLSTAT)
state._mark_modified([entry])
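
The stat-cache shortcut near the top of py_update_entry can be looked at in isolation. The sketch below is a simplified stand-in, not the bzrlib function: `cached_fingerprint` and its arguments are hypothetical, and it only models the check that reuses a saved fingerprint when the kind, packed stat and size are all unchanged. Returning None for directories is an assumption based on the statement that directories have no fingerprint.

```python
def cached_fingerprint(saved_details, minikind, packed_stat, size):
    """Return the saved fingerprint if the cached details still apply.

    saved_details is (minikind, fingerprint, size, executable, packed_stat).
    None means the caller has to examine the file itself.
    """
    (saved_minikind, saved_fingerprint, saved_size,
     _saved_executable, saved_packed_stat) = saved_details
    if minikind != saved_minikind or packed_stat != saved_packed_stat:
        # The kind or the stat changed: the cached value cannot be trusted.
        return None
    if minikind == "d":
        # Assumption: directories carry no fingerprint to reuse.
        return None
    if saved_size != size:
        return None
    return saved_fingerprint
```

For example, `cached_fingerprint(("f", "abc", 10, False, "s1"), "f", "s1", 10)` reuses the saved value, while any change to the packed stat forces a fresh look at the file.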
class ProcessEntryPython(object):
__slots__ = ["old_dirname_to_file_id", "new_dirname_to_file_id",
"last_source_parent", "last_target_parent", "include_unchanged",
"partial", "use_filesystem_for_exec", "utf8_decode",
"searched_specific_files", "search_specific_files",
"searched_exact_paths", "search_specific_file_parents", "seen_ids",
"state", "source_index", "target_index", "want_unversioned", "tree"]
def __init__(self, include_unchanged, use_filesystem_for_exec,
search_specific_files, state, source_index, target_index,
want_unversioned, tree):
self.old_dirname_to_file_id = {}
self.new_dirname_to_file_id = {}
# Are we doing a partial iter_changes?
self.partial = search_specific_files != set([''])
# Using a list so that we can access the values and change them in
# nested scope. Each one is [path, file_id, entry]
self.last_source_parent = [None, None]
self.last_target_parent = [None, None]
self.include_unchanged = include_unchanged
self.use_filesystem_for_exec = use_filesystem_for_exec
self.utf8_decode = cache_utf8._utf8_decode
# for all search indexes in each path at or under each element of
# search_specific_files, if the detail is relocated: add the id, and
# add the relocated path as one to search if it's not searched already.
# If the detail is not relocated, add the id.
self.searched_specific_files = set()
# When we search exact paths without expanding downwards, we record
self.searched_exact_paths = set()
self.search_specific_files = search_specific_files
# The parents up to the root of the paths we are searching.
# After all normal paths are returned, these specific items are returned.
self.search_specific_file_parents = set()
# The ids we've sent out in the delta.
self.seen_ids = set()
self.source_index = source_index
self.target_index = target_index
if target_index != 0:
# A lot of code in here depends on target_index == 0
raise errors.BzrError('unsupported target index')
self.want_unversioned = want_unversioned
def _process_entry(self, entry, path_info, pathjoin=osutils.pathjoin):
"""Compare an entry and real disk to generate delta information.
:param path_info: top_relpath, basename, kind, lstat, abspath for
the path of entry. If None, then the path is considered absent in
the target (Perhaps we should pass in a concrete entry for this ?)
Basename is returned as a utf8 string because we expect this
tuple will be ignored, and don't want to take the time to
:return: (iter_changes_result, changed). If the entry has not been
handled then changed is None. Otherwise it is False if no content
or metadata changes have occurred, and True if any content or
metadata change has occurred. If self.include_unchanged is True then
if changed is not None, iter_changes_result will always be a result
tuple. Otherwise, iter_changes_result is None unless changed is
if self.source_index is None:
source_details = DirState.NULL_PARENT_DETAILS
source_details = entry[1][self.source_index]
target_details = entry[1][self.target_index]
target_minikind = target_details[0]
if path_info is not None and target_minikind in 'fdlt':
if not (self.target_index == 0):
raise AssertionError()
link_or_sha1 = update_entry(self.state, entry,
abspath=path_info[4], stat_value=path_info[3])
# The entry may have been modified by update_entry
target_details = entry[1][self.target_index]
target_minikind = target_details[0]
file_id = entry[0][2]
source_minikind = source_details[0]
if source_minikind in 'fdltr' and target_minikind in 'fdlt':
# claimed content in both: diff
# r | fdlt | | add source to search, add id path move and perform
# | | | diff check on source-target
# r | fdlt | a | dangling file that was present in the basis.
if source_minikind in 'r':
# add the source to the search path to find any children it
# has. TODO ? : only add if it is a container ?
if not osutils.is_inside_any(self.searched_specific_files,
self.search_specific_files.add(source_details[1])
# generate the old path; this is needed for stating later
old_path = source_details[1]
old_dirname, old_basename = os.path.split(old_path)
path = pathjoin(entry[0][0], entry[0][1])
old_entry = self.state._get_entry(self.source_index,
# update the source details variable to be the real
if old_entry == (None, None):
raise errors.CorruptDirstate(self.state._filename,
"entry '%s/%s' is considered renamed from %r"
" but source does not exist\n"
"entry: %s" % (entry[0][0], entry[0][1], old_path, entry))
source_details = old_entry[1][self.source_index]
source_minikind = source_details[0]
old_dirname = entry[0][0]
old_basename = entry[0][1]
old_path = path = None
if path_info is None:
# the file is missing on disk, show as removed.
content_change = True
# source and target are both versioned and disk file is present.
target_kind = path_info[2]
if target_kind == 'directory':
old_path = path = pathjoin(old_dirname, old_basename)
self.new_dirname_to_file_id[path] = file_id
if source_minikind != 'd':
content_change = True
# directories have no fingerprint
content_change = False
elif target_kind == 'file':
if source_minikind != 'f':
content_change = True
# Check the sha. We can't just rely on the size as
# content filtering may mean different sizes actually
# map to the same content
if link_or_sha1 is None:
statvalue, link_or_sha1 = \
self.state._sha1_provider.stat_and_sha1(
self.state._observed_sha1(entry, link_or_sha1,
content_change = (link_or_sha1 != source_details[1])
# Target details is updated at update_entry time
if self.use_filesystem_for_exec:
# We don't need S_ISREG here, because we are sure
# we are dealing with a file.
target_exec = bool(stat.S_IEXEC & path_info[3].st_mode)
target_exec = target_details[3]
elif target_kind == 'symlink':
if source_minikind != 'l':
content_change = True
content_change = (link_or_sha1 != source_details[1])
elif target_kind == 'tree-reference':
if source_minikind != 't':
content_change = True
content_change = False
path = pathjoin(old_dirname, old_basename)
raise errors.BadFileKindError(path, path_info[2])
if source_minikind == 'd':
old_path = path = pathjoin(old_dirname, old_basename)
self.old_dirname_to_file_id[old_path] = file_id
# parent id is the entry for the path in the target tree
if old_basename and old_dirname == self.last_source_parent[0]:
source_parent_id = self.last_source_parent[1]
source_parent_id = self.old_dirname_to_file_id[old_dirname]
source_parent_entry = self.state._get_entry(self.source_index,
path_utf8=old_dirname)
source_parent_id = source_parent_entry[0][2]
if source_parent_id == entry[0][2]:
# This is the root, so the parent is None
source_parent_id = None
self.last_source_parent[0] = old_dirname
self.last_source_parent[1] = source_parent_id
new_dirname = entry[0][0]
if entry[0][1] and new_dirname == self.last_target_parent[0]:
target_parent_id = self.last_target_parent[1]
target_parent_id = self.new_dirname_to_file_id[new_dirname]
# TODO: We don't always need to do the lookup, because the
# parent entry will be the same as the source entry.
target_parent_entry = self.state._get_entry(self.target_index,
path_utf8=new_dirname)
if target_parent_entry == (None, None):
raise AssertionError(
"Could not find target parent in wt: %s\nparent of: %s"
% (new_dirname, entry))
target_parent_id = target_parent_entry[0][2]
if target_parent_id == entry[0][2]:
# This is the root, so the parent is None
target_parent_id = None
self.last_target_parent[0] = new_dirname
self.last_target_parent[1] = target_parent_id
source_exec = source_details[3]
changed = (content_change
or source_parent_id != target_parent_id
or old_basename != entry[0][1]
or source_exec != target_exec
if not changed and not self.include_unchanged:
if old_path is None:
old_path = path = pathjoin(old_dirname, old_basename)
old_path_u = self.utf8_decode(old_path)[0]
old_path_u = self.utf8_decode(old_path)[0]
if old_path == path:
path_u = self.utf8_decode(path)[0]
source_kind = DirState._minikind_to_kind[source_minikind]
return (entry[0][2],
(old_path_u, path_u),
(source_parent_id, target_parent_id),
(self.utf8_decode(old_basename)[0], self.utf8_decode(entry[0][1])[0]),
(source_kind, target_kind),
(source_exec, target_exec)), changed
elif source_minikind in 'a' and target_minikind in 'fdlt':
# looks like a new file
path = pathjoin(entry[0][0], entry[0][1])
# parent id is the entry for the path in the target tree
# TODO: these are the same for an entire directory: cache them.
parent_id = self.state._get_entry(self.target_index,
path_utf8=entry[0][0])[0][2]
if parent_id == entry[0][2]:
if path_info is not None:
if self.use_filesystem_for_exec:
# We need S_ISREG here, because we aren't sure if this
stat.S_ISREG(path_info[3].st_mode)
and stat.S_IEXEC & path_info[3].st_mode)
target_exec = target_details[3]
return (entry[0][2],
(None, self.utf8_decode(path)[0]),
(None, self.utf8_decode(entry[0][1])[0]),
(None, path_info[2]),
(None, target_exec)), True
# It's a missing file, report it as such.
return (entry[0][2],
(None, self.utf8_decode(path)[0]),
(None, self.utf8_decode(entry[0][1])[0]),
(None, False)), True
elif source_minikind in 'fdlt' and target_minikind in 'a':
# unversioned, possibly, or possibly not deleted: we don't care.
# if it's still on disk, *and* there's no other entry at this
# path [we don't know this in this routine at the moment -
# perhaps we should change this] - then it would be an unknown.
old_path = pathjoin(entry[0][0], entry[0][1])
# parent id is the entry for the path in the target tree
parent_id = self.state._get_entry(self.source_index, path_utf8=entry[0][0])[0][2]
if parent_id == entry[0][2]:
return (entry[0][2],
(self.utf8_decode(old_path)[0], None),
(self.utf8_decode(entry[0][1])[0], None),
(DirState._minikind_to_kind[source_minikind], None),
(source_details[3], None)), True
elif source_minikind in 'fdlt' and target_minikind in 'r':
# a rename; could be a true rename, or a rename inherited from
# a renamed parent. TODO: handle this efficiently. It's not a
# common case to rename dirs though, so a correct but slow
# implementation will do.
if not osutils.is_inside_any(self.searched_specific_files, target_details[1]):
self.search_specific_files.add(target_details[1])
elif source_minikind in 'ra' and target_minikind in 'ra':
# neither of the selected trees contains this file,
# so skip over it. This is not currently directly tested, but
# is indirectly via test_too_much.TestCommands.test_conflicts.
raise AssertionError("don't know how to compare "
"source_minikind=%r, target_minikind=%r"
% (source_minikind, target_minikind))
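
The branch structure of _process_entry amounts to a lookup on the (source, target) minikind pair. The sketch below restates those branches as a hypothetical `classify_pair` function that returns a label; the labels are invented for illustration, and the real method does far more work inside each branch.

```python
PRESENT = frozenset("fdlt")      # kinds with real content at this path
NOT_HERE = frozenset("ra")       # relocated or absent


def classify_pair(source_minikind, target_minikind):
    """Name the branch taken for a (source, target) minikind pair."""
    if (source_minikind in PRESENT or source_minikind == "r") \
            and target_minikind in PRESENT:
        # Content claimed on both sides (a relocated source is resolved
        # to its real entry first), so the two are compared.
        return "diff"
    if source_minikind == "a" and target_minikind in PRESENT:
        return "added"
    if source_minikind in PRESENT and target_minikind == "a":
        return "removed"
    if source_minikind in PRESENT and target_minikind == "r":
        # Only queue the relocation target for searching.
        return "search-target"
    if source_minikind in NOT_HERE and target_minikind in NOT_HERE:
        return "skip"
    raise AssertionError(
        "don't know how to compare source_minikind=%r, target_minikind=%r"
        % (source_minikind, target_minikind))
```

Using sets rather than substring tests keeps the checks exact for single-letter codes.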
def _gather_result_for_consistency(self, result):
"""Check a result we will yield to make sure we are consistent later.
This gathers result's parents into a set to output later.
:param result: A result tuple.
if not self.partial or not result[0]:
self.seen_ids.add(result[0])
new_path = result[1][1]
# Not the root and not a delete: queue up the parents of the path.
self.search_specific_file_parents.update(
osutils.parent_directories(new_path.encode('utf8')))
# Add the root directory which parent_directories does not
self.search_specific_file_parents.add('')
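
The parent-queueing step relies on osutils.parent_directories, which, per the comment above, does not include the root. The stand-in below is hypothetical, written for illustration rather than copied from bzrlib, and shows which directories end up queued for a given path once the root ('') is added explicitly.

```python
def parent_directories(path):
    """Return the parents of `path`, nearest first, excluding the root."""
    parents = []
    parts = path.split("/")[:-1]
    while parts:
        parents.append("/".join(parts))
        parts.pop()
    return parents


def parents_to_search(new_path):
    """Collect every parent of `new_path`, plus the root directory."""
    queued = set(parent_directories(new_path))
    # The root is not returned by parent_directories, so add it by hand.
    queued.add("")
    return queued
```

For a path such as 'a/b/c' this queues 'a/b', 'a' and the root, so each ancestor can be reported once the normal paths have been returned.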
def iter_changes(self):
"""Iterate over the changes."""
utf8_decode = cache_utf8._utf8_decode
_cmp_by_dirs = cmp_by_dirs
_process_entry = self._process_entry
search_specific_files = self.search_specific_files
searched_specific_files = self.searched_specific_files
splitpath = osutils.splitpath
# compare source_index and target_index at or under each element of search_specific_files.
# Follow the comparison table below. Note that we only want to do diff operations when
# the target is fdl because that's when the walkdirs logic will have exposed the pathinfo
# Source | Target | disk | action
#   r    |  fdlt  |      | add source to search, add id path move and perform
#        |        |      | diff check on source-target
#   r    |  fdlt  |  a   | dangling file that was present in the basis.
#   r    |   a    |      | add source to search
#   r    |   r    |      | this path is present in a non-examined tree, skip.
#   r    |   r    |  a   | this path is present in a non-examined tree, skip.
#   a    |  fdlt  |      | add new id
#   a    |  fdlt  |  a   | dangling locally added file, skip
#   a    |   a    |      | not present in either tree, skip
#   a    |   a    |  a   | not present in any tree, skip
#   a    |   r    |      | not present in either tree at this path, skip as it
#        |        |      | may not be selected by the user's list of paths.
#   a    |   r    |  a   | not present in either tree at this path, skip as it
#        |        |      | may not be selected by the user's list of paths.
#  fdlt  |  fdlt  |      | content in both: diff them
#  fdlt  |  fdlt  |  a   | deleted locally, but not unversioned - show as deleted ?
#  fdlt  |   a    |      | unversioned: output deleted id for now
#  fdlt  |   a    |  a   | unversioned and deleted: output deleted id
#  fdlt  |   r    |      | relocated in this tree, so add target to search.
#        |        |      | Don't diff, we will see an r,fd; pair when we reach
#        |        |      | this id at the other path.
#  fdlt  |   r    |  a   | relocated in this tree, so add target to search.
#        |        |      | Don't diff, we will see an r,fd; pair when we reach
#        |        |      | this id at the other path.
# TODO: jam 20070516 - Avoid the _get_entry lookup overhead by
# keeping a cache of directories that we have seen.
while search_specific_files:
# TODO: the pending list should be lexically sorted? the
# interface doesn't require it.
current_root = search_specific_files.pop()
current_root_unicode = current_root.decode('utf8')
searched_specific_files.add(current_root)
# process the entries for this containing directory: the rest will be
# found by their parents recursively.
root_entries = self.state._entries_for_path(current_root)
root_abspath = self.tree.abspath(current_root_unicode)
root_stat = os.lstat(root_abspath)
if e.errno == errno.ENOENT:
# the path does not exist: let _process_entry know that.
root_dir_info = None
# some other random error: hand it up.
root_dir_info = ('', current_root,
osutils.file_kind_from_stat_mode(root_stat.st_mode), root_stat,
if root_dir_info[2] == 'directory':
if self.tree._directory_is_tree_reference(
current_root.decode('utf8')):
root_dir_info = root_dir_info[:2] + \
('tree-reference',) + root_dir_info[3:]
if not root_entries and not root_dir_info:
# this specified path is not present at all, skip it.
path_handled = False
for entry in root_entries:
result, changed = _process_entry(entry, root_dir_info)
if changed is not None:
self._gather_result_for_consistency(result)
if changed or self.include_unchanged:
if self.want_unversioned and not path_handled and root_dir_info:
new_executable = bool(
stat.S_ISREG(root_dir_info[3].st_mode)
and stat.S_IEXEC & root_dir_info[3].st_mode)
(None, current_root_unicode),
(None, splitpath(current_root_unicode)[-1]),
(None, root_dir_info[2]),
(None, new_executable)
initial_key = (current_root, '', '')
block_index, _ = self.state._find_block_index_from_key(initial_key)
if block_index == 0:
# we have processed the total root already, but because the
# initial key matched it we should skip it here.
if root_dir_info and root_dir_info[2] == 'tree-reference':
current_dir_info = None
dir_iterator = osutils._walkdirs_utf8(root_abspath, prefix=current_root)
current_dir_info = dir_iterator.next()
# on win32, python2.4 has e.errno == ERROR_DIRECTORY, but
# python 2.5 has e.errno == EINVAL,
# and e.winerror == ERROR_DIRECTORY
e_winerror = getattr(e, 'winerror', None)
win_errors = (ERROR_DIRECTORY, ERROR_PATH_NOT_FOUND)
# there may be directories in the inventory even though
# this path is not a file on disk: so mark it as end of
if e.errno in (errno.ENOENT, errno.ENOTDIR, errno.EINVAL):
current_dir_info = None
elif (sys.platform == 'win32'
and (e.errno in win_errors
or e_winerror in win_errors)):
current_dir_info = None
if current_dir_info[0][0] == '':
# remove .bzr from iteration
bzr_index = bisect.bisect_left(current_dir_info[1], ('.bzr',))
if current_dir_info[1][bzr_index][0] != '.bzr':
raise AssertionError()
del current_dir_info[1][bzr_index]
# walk until both the directory listing and the versioned metadata
if (block_index < len(self.state._dirblocks) and
osutils.is_inside(current_root, self.state._dirblocks[block_index][0])):
current_block = self.state._dirblocks[block_index]
current_block = None
while (current_dir_info is not None or
current_block is not None):
if (current_dir_info and current_block
and current_dir_info[0][0] != current_block[0]):
if _cmp_by_dirs(current_dir_info[0][0], current_block[0]) < 0:
# filesystem data refers to paths not covered by the dirblock.
# this has two possibilities:
# A) it is versioned but empty, so there is no block for it
# B) it is not versioned.
# if (A) then we need to recurse into it to check for
# new unknown files or directories.
# if (B) then we should ignore it, because we don't
# recurse into unknown directories.
while path_index < len(current_dir_info[1]):
current_path_info = current_dir_info[1][path_index]
if self.want_unversioned:
if current_path_info[2] == 'directory':
if self.tree._directory_is_tree_reference(
current_path_info[0].decode('utf8')):
current_path_info = current_path_info[:2] + \
('tree-reference',) + current_path_info[3:]
new_executable = bool(
stat.S_ISREG(current_path_info[3].st_mode)
and stat.S_IEXEC & current_path_info[3].st_mode)
(None, utf8_decode(current_path_info[0])[0]),
(None, utf8_decode(current_path_info[1])[0]),
(None, current_path_info[2]),
(None, new_executable))
# don't descend into this unversioned path if it is
if current_path_info[2] in ('directory',
del current_dir_info[1][path_index]
# This dir info has been handled, go to the next
current_dir_info = dir_iterator.next()
except StopIteration:
current_dir_info = None
# We have a dirblock entry for this location, but there
# is no filesystem path for this. This is most likely
# because a directory was removed from the disk.
# We don't have to report the missing directory,
# because that should have already been handled, but we
# need to handle all of the files that are contained
for current_entry in current_block[1]:
# entry referring to file not present on disk.
# advance the entry only, after processing.
result, changed = _process_entry(current_entry, None)
if changed is not None:
self._gather_result_for_consistency(result)
if changed or self.include_unchanged:
if (block_index < len(self.state._dirblocks) and
osutils.is_inside(current_root,
self.state._dirblocks[block_index][0])):
current_block = self.state._dirblocks[block_index]
current_block = None
if current_block and entry_index < len(current_block[1]):
current_entry = current_block[1][entry_index]
current_entry = None
advance_entry = True
if current_dir_info and path_index < len(current_dir_info[1]):
current_path_info = current_dir_info[1][path_index]
if current_path_info[2] == 'directory':
if self.tree._directory_is_tree_reference(
current_path_info[0].decode('utf8')):
current_path_info = current_path_info[:2] + \
('tree-reference',) + current_path_info[3:]
current_path_info = None
path_handled = False
while (current_entry is not None or
current_path_info is not None):
if current_entry is None:
# the check for path_handled when the path is advanced
# will yield this path if needed.
elif current_path_info is None:
# no path is fine: the per entry code will handle it.
result, changed = _process_entry(current_entry, current_path_info)
if changed is not None:
self._gather_result_for_consistency(result)
if changed or self.include_unchanged:
elif (current_entry[0][1] != current_path_info[1]
or current_entry[1][self.target_index][0] in 'ar'):
# The current path on disk doesn't match the dirblock
# record. Either the dirblock is marked as absent, or
# the file on disk is not present at all in the
# dirblock. Either way, report about the dirblock
# entry, and let other code handle the filesystem one.
# Compare the basename for these files to determine
if current_path_info[1] < current_entry[0][1]:
# extra file on disk: pass for now, but only
# increment the path, not the entry
advance_entry = False
# entry referring to file not present on disk.
# advance the entry only, after processing.
result, changed = _process_entry(current_entry, None)
if changed is not None:
self._gather_result_for_consistency(result)
if changed or self.include_unchanged:
advance_path = False
result, changed = _process_entry(current_entry, current_path_info)
if changed is not None:
self._gather_result_for_consistency(result)
if changed or self.include_unchanged:
if advance_entry and current_entry is not None:
if entry_index < len(current_block[1]):
current_entry = current_block[1][entry_index]
current_entry = None
advance_entry = True # reset the advance flag
if advance_path and current_path_info is not None:
if not path_handled:
# unversioned in all regards
if self.want_unversioned:
new_executable = bool(
stat.S_ISREG(current_path_info[3].st_mode)
and stat.S_IEXEC & current_path_info[3].st_mode)
relpath_unicode = utf8_decode(current_path_info[0])[0]
except UnicodeDecodeError:
raise errors.BadFilenameEncoding(
current_path_info[0], osutils._fs_enc)
(None, relpath_unicode),
(None, utf8_decode(current_path_info[1])[0]),
(None, current_path_info[2]),
(None, new_executable))
# don't descend into this unversioned path if it is
if current_path_info[2] == 'directory':
del current_dir_info[1][path_index]
# don't descend the disk iterator into any tree
if current_path_info[2] == 'tree-reference':
del current_dir_info[1][path_index]
if path_index < len(current_dir_info[1]):
current_path_info = current_dir_info[1][path_index]
if current_path_info[2] == 'directory':
if self.tree._directory_is_tree_reference(
current_path_info[0].decode('utf8')):
current_path_info = current_path_info[:2] + \
('tree-reference',) + current_path_info[3:]
current_path_info = None
path_handled = False
advance_path = True # reset the advance flag.
if current_block is not None:
if (block_index < len(self.state._dirblocks) and
osutils.is_inside(current_root, self.state._dirblocks[block_index][0])):
current_block = self.state._dirblocks[block_index]
current_block = None
if current_dir_info is not None:
current_dir_info = dir_iterator.next()
except StopIteration:
current_dir_info = None
for result in self._iter_specific_file_parents():
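
The loop above advances two sorted sequences in lockstep: the recorded dirblock entries and the names found on disk. The comparison of basenames decides which side moves forward. The following is a minimal standalone sketch of that idea, not bzrlib code; `merge_walk`, `entry_names` and `disk_names` are hypothetical names, and the sketch ignores the special handling of absent or relocated ('ar') entries.

```python
def merge_walk(entry_names, disk_names):
    """Pair up two sorted lists of basenames.

    Yields (entry_name, disk_name) tuples; None marks the missing side.
    """
    i = j = 0
    while i < len(entry_names) or j < len(disk_names):
        if i >= len(entry_names):
            # Only on-disk names remain: these are unversioned paths.
            yield (None, disk_names[j])
            j += 1
        elif j >= len(disk_names):
            # Only recorded entries remain: they are missing from disk.
            yield (entry_names[i], None)
            i += 1
        elif disk_names[j] < entry_names[i]:
            # Extra file on disk: advance the path only.
            yield (None, disk_names[j])
            j += 1
        elif entry_names[i] < disk_names[j]:
            # Entry with no file on disk: advance the entry only.
            yield (entry_names[i], None)
            i += 1
        else:
            # Names match: report the pair and advance both sides.
            yield (entry_names[i], disk_names[j])
            i += 1
            j += 1
```

Keeping separate "advance" decisions for each side, as the original does with `advance_entry` and `advance_path`, lets a single loop handle unversioned files, missing files and matches without re-sorting either input.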
def _iter_specific_file_parents(self):
"""Iter over the specific file parents."""
while self.search_specific_file_parents:
# Process the parent directories for the paths we were iterating.
# Even in extremely large trees this should be modest, so currently
# no attempt is made to optimise.
path_utf8 = self.search_specific_file_parents.pop()
if osutils.is_inside_any(self.searched_specific_files, path_utf8):
# We've examined this path.
if path_utf8 in self.searched_exact_paths:
# We've examined this path.
path_entries = self.state._entries_for_path(path_utf8)
# We need either one or two entries. If the path in
# self.target_index has moved (so the entry in source_index is in
# 'ar') then we need to also look for the entry for this path in
# self.source_index, to output the appropriate delete-or-rename.
selected_entries = []
for candidate_entry in path_entries:
# Find entries present in target at this path:
if candidate_entry[1][self.target_index][0] not in 'ar':
selected_entries.append(candidate_entry)
# Find entries present in source at this path:
elif (self.source_index is not None and
candidate_entry[1][self.source_index][0] not in 'ar'):
if candidate_entry[1][self.target_index][0] == 'a':
# Deleted, emit it here.
selected_entries.append(candidate_entry)
# renamed, emit it when we process the directory it
self.search_specific_file_parents.add(
candidate_entry[1][self.target_index][1])
raise AssertionError(
"Missing entry for specific path parent %r, %r" % (
path_utf8, path_entries))
path_info = self._path_info(path_utf8, path_utf8.decode('utf8'))
for entry in selected_entries:
if entry[0][2] in self.seen_ids:
result, changed = self._process_entry(entry, path_info)
raise AssertionError(
"Got entry<->path mismatch for specific path "
"%r entry %r path_info %r " % (
path_utf8, entry, path_info))
# Only include changes - we're outside the user's requested
self._gather_result_for_consistency(result)
if (result[6][0] == 'directory' and
result[6][1] != 'directory'):
# This stopped being a directory, the old children have
if entry[1][self.source_index][0] == 'r':
# renamed, take the source path
entry_path_utf8 = entry[1][self.source_index][1]
entry_path_utf8 = path_utf8
initial_key = (entry_path_utf8, '', '')
block_index, _ = self.state._find_block_index_from_key(
if block_index == 0:
# The children of the root are in block index 1.
current_block = None
if block_index < len(self.state._dirblocks):
current_block = self.state._dirblocks[block_index]
if not osutils.is_inside(
entry_path_utf8, current_block[0]):
# No entries for this directory at all.
current_block = None
if current_block is not None:
for entry in current_block[1]:
if entry[1][self.source_index][0] in 'ar':
# Not in the source tree, so doesn't have to be
# Path of the entry itself.
self.search_specific_file_parents.add(
osutils.pathjoin(*entry[0][:2]))
if changed or self.include_unchanged:
self.searched_exact_paths.add(path_utf8)
def _path_info(self, utf8_path, unicode_path):
"""Generate path_info for unicode_path.
:return: None if unicode_path does not exist, or a path_info tuple.
abspath = self.tree.abspath(unicode_path)
stat = os.lstat(abspath)
if e.errno == errno.ENOENT:
# the path does not exist.
utf8_basename = utf8_path.rsplit('/', 1)[-1]
dir_info = (utf8_path, utf8_basename,
osutils.file_kind_from_stat_mode(stat.st_mode), stat,
if dir_info[2] == 'directory':
if self.tree._directory_is_tree_reference(
self.root_dir_info = self.root_dir_info[:2] + \
('tree-reference',) + self.root_dir_info[3:]
# Try to load the compiled form if possible
from bzrlib._dirstate_helpers_pyx import (
ProcessEntryC as _process_entry,
update_entry as update_entry,
from bzrlib._dirstate_helpers_c import (
_read_dirblocks_c as _read_dirblocks,
bisect_dirblock_c as bisect_dirblock,
_bisect_path_left_c as _bisect_path_left,
_bisect_path_right_c as _bisect_path_right,
cmp_by_dirs_c as cmp_by_dirs,
except ImportError, e:
osutils.failed_to_load_extension(e)
from bzrlib._dirstate_helpers_py import (
_read_dirblocks_py as _read_dirblocks,
bisect_dirblock_py as bisect_dirblock,
_bisect_path_left_py as _bisect_path_left,
_bisect_path_right_py as _bisect_path_right,
cmp_by_dirs_py as cmp_by_dirs,
# FIXME: It would be nice to be able to track moved lines so that the
# corresponding python code can be moved to the _dirstate_helpers_py
# module. I don't want to break the history for this important piece of
# code so I left the code here -- vila 20090622
update_entry = py_update_entry
_process_entry = ProcessEntryPython
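
The import block above binds each helper to one module-level name, preferring a compiled implementation and falling back to pure Python when the extension cannot be imported. The following is a minimal sketch of that pattern in isolation. The module `_example_fast_helpers` and the names `bisect_left_c` and `bisect_left_py` are hypothetical and exist only for illustration; they are not part of bzrlib.

```python
def bisect_left_py(items, key):
    """Pure-Python fallback: leftmost insertion index for key in sorted items."""
    lo, hi = 0, len(items)
    while lo < hi:
        mid = (lo + hi) // 2
        if items[mid] < key:
            lo = mid + 1
        else:
            hi = mid
    return lo


try:
    # Hypothetical compiled module; the import is expected to fail when
    # no such extension has been built.
    from _example_fast_helpers import bisect_left_c as bisect_left
except ImportError:
    # Expose the pure-Python implementation under the same public name.
    bisect_left = bisect_left_py
```

Callers only ever use `bisect_left`, so they need no knowledge of which implementation was loaded.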