lines by NL. The field delimiters are omitted in the grammar, line delimiters
are not - this is done for clarity of reading. All string data is in utf8.
    MINIKIND = "f" | "d" | "l" | "a" | "r" | "t";
    WHOLE_NUMBER = {digit}, digit;
    REVISION_ID = a non-empty utf8 string;

    dirstate format = header line, full checksum, row count, parent_details,
     ghost_details, entries;
    header line = "#bazaar dirstate flat format 3", NL;
    full checksum = "crc32: ", ["-"], WHOLE_NUMBER, NL;
    row count = "num_entries: ", WHOLE_NUMBER, NL;
    parent_details = WHOLE_NUMBER, {REVISION_ID}*, NL;
    ghost_details = WHOLE_NUMBER, {REVISION_ID}*, NL;

    entry = entry_key, current_entry_details, {parent_entry_details};
    entry_key = dirname, basename, fileid;
    current_entry_details = common_entry_details, working_entry_details;
    parent_entry_details = common_entry_details, history_entry_details;
    common_entry_details = MINIKIND, fingerprint, size, executable;
    working_entry_details = packed_stat;
    history_entry_details = REVISION_ID;

    fingerprint = a nonempty utf8 sequence with meaning defined by minikind.

Given this definition, the following is useful to know::

    entry (aka row) - all the data for a given key.
    entry[0]: The key (dirname, basename, fileid)
    entry[1]: The tree(s) data for this path and id combination.
    entry[1][0]: The current tree
    entry[1][1]: The second tree

For an entry for a tree, we have (using tree 0 - current tree) to demonstrate::

    entry[1][0][0]: minikind
    entry[1][0][1]: fingerprint
    entry[1][0][2]: size
    entry[1][0][3]: executable
    entry[1][0][4]: packed_stat

    entry[1][1][4]: revision_id
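The index layout listed above can be sketched with plain tuples and lists. This is only an illustration: every value below is a hypothetical placeholder, `describe` is a helper invented for this example and not part of the module, and ordinary strings stand in for the utf8 byte strings used on disk.

```python
# Hypothetical values; only the nesting and index positions follow the text.
key = ("dir", "file.txt", "file-id-1")  # entry[0]: (dirname, basename, fileid)

# entry[1][0]: current tree -> minikind, fingerprint, size, executable, packed_stat
current = ("f", "sha1-placeholder", 12, False, "packed-stat-placeholder")

# entry[1][1]: second tree -> minikind, fingerprint, size, executable, revision_id
basis = ("f", "sha1-placeholder", 12, False, "rev-id-1")

entry = (key, [current, basis])


def describe(entry):
    """Unpack one entry into named fields (hypothetical helper)."""
    (dirname, basename, file_id), trees = entry
    minikind, fingerprint, size, executable, packed_stat = trees[0]
    # Entries at the root have an empty dirname, so no separator is needed.
    path = dirname + "/" + basename if dirname else basename
    return {
        "path": path,
        "file_id": file_id,
        "minikind": minikind,
        "size": size,
        "executable": executable,
        "basis_revision": trees[1][4],
    }


info = describe(entry)
```

The per-tree details share their first four fields; only the last slot differs between the current tree (packed_stat) and a parent tree (revision_id).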
There may be multiple rows at the root, one per id present in the root, so the
in memory root row is now::

    self._dirblocks[0] -> ('', [entry ...]),

and the entries in there are::

    entries[0][2]: file_id
    entries[1][0]: The tree data for the current tree for this fileid at /

'r' is a relocated entry: This path is not present in this tree with this
    id, but the id can be found at another location. The fingerprint is
    used to point to the target location.
'a' is an absent entry: In that tree the id is not present at this path.
'd' is a directory entry: This path in this tree is a directory with the
    current file id. There is no fingerprint for directories.
'f' is a file entry: As for directory, but it's a file. The fingerprint is
    the sha1 value of the file's canonical form, i.e. after any read
    filters have been applied to the convenience form stored in the working
    tree.
'l' is a symlink entry: As for directory, but a symlink. The fingerprint is
't' is a reference to a nested subtree; the fingerprint is the referenced
The entries on disk and in memory are ordered according to the following keys::

    directory, as a list of components

--- Format 1 had the following different definition: ---

    rows = dirname, NULL, basename, NULL, MINIKIND, NULL, fileid_utf8, NULL,
    WHOLE NUMBER (* size *), NULL, packed stat, NULL, sha1|symlink target,
    PARENT ROW = NULL, revision_utf8, NULL, MINIKIND, NULL, dirname, NULL,
    basename, NULL, WHOLE NUMBER (* size *), NULL, "y" | "n", NULL,

PARENT ROW's are emitted for every parent that is not in the ghosts details
line. That is, if the parents are foo, bar, baz, and the ghosts are bar, then
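The rule stated above, that a PARENT ROW is emitted for every parent not listed in the ghosts details line, can be sketched as a simple filter. The function name `parents_with_rows` is hypothetical and chosen only for this illustration; it is not part of the module.

```python
def parents_with_rows(parents, ghosts):
    """Return the parents that get a PARENT ROW, keeping their order.

    Hypothetical helper: a parent listed as a ghost gets no row.
    """
    ghost_set = set(ghosts)
    return [parent for parent in parents if parent not in ghost_set]


rows_for = parents_with_rows(["foo", "bar", "baz"], ["bar"])
```

With the parents and ghosts named in the text, rows are emitted for foo and baz only.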
        self._last_block_index = None
        self._last_entry_index = None
        # The set of known hash changes
        self._known_hash_changes = set()
        # How many hash changed entries can we have without saving
        self._worth_saving_limit = worth_saving_limit

    def __repr__(self):
        return "%s(%r)" % \
            (self.__class__.__name__, self._filename)

    def _mark_modified(self, hash_changed_entries=None, header_modified=False):
        """Mark this dirstate as modified.

        :param hash_changed_entries: if non-None, mark just these entries as
            having their hash modified.
        :param header_modified: mark the header modified as well, not just the
            dirblocks.
        """
        #trace.mutter_callsite(3, "modified hash entries: %s", hash_changed_entries)
        if hash_changed_entries:
            self._known_hash_changes.update([e[0] for e in hash_changed_entries])
            if self._dirblock_state in (DirState.NOT_IN_MEMORY,
                                        DirState.IN_MEMORY_UNMODIFIED):
                # If the dirstate is already marked as IN_MEMORY_MODIFIED, then
                # that takes precedence.
                self._dirblock_state = DirState.IN_MEMORY_HASH_MODIFIED
        else:
            # TODO: Since we now have a IN_MEMORY_HASH_MODIFIED state, we
            #       should fail noisily if someone tries to set
            #       IN_MEMORY_MODIFIED but we don't have a write-lock!
            # We don't know exactly what changed so disable smart saving
            self._dirblock_state = DirState.IN_MEMORY_MODIFIED
        if header_modified:
            self._header_state = DirState.IN_MEMORY_MODIFIED

    def _mark_unmodified(self):
        """Mark this dirstate as unmodified."""
        self._header_state = DirState.IN_MEMORY_UNMODIFIED
        self._dirblock_state = DirState.IN_MEMORY_UNMODIFIED
        self._known_hash_changes = set()
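The state handling in `_mark_modified` and `_mark_unmodified` can be exercised in isolation with a small stand-in class. This is a sketch under assumptions, not the real `DirState`: the class name `StateSketch`, the un-prefixed attribute names, and the numeric values of the state constants are hypothetical, and the `else:` and `if header_modified:` branches are inferred from the comments and the docstring.

```python
class StateSketch:
    """Hypothetical stand-in that mirrors only the modification flags."""

    # Assumed constant values; the real module defines its own.
    NOT_IN_MEMORY = 0
    IN_MEMORY_UNMODIFIED = 1
    IN_MEMORY_MODIFIED = 2
    IN_MEMORY_HASH_MODIFIED = 3

    def __init__(self):
        self.header_state = self.NOT_IN_MEMORY
        self.dirblock_state = self.NOT_IN_MEMORY
        self.known_hash_changes = set()

    def mark_modified(self, hash_changed_entries=None, header_modified=False):
        if hash_changed_entries:
            # Remember the keys (entry[0]) whose hashes changed.
            self.known_hash_changes.update(e[0] for e in hash_changed_entries)
            if self.dirblock_state in (self.NOT_IN_MEMORY,
                                       self.IN_MEMORY_UNMODIFIED):
                # A full IN_MEMORY_MODIFIED state takes precedence, so only
                # upgrade from the two "clean" states.
                self.dirblock_state = self.IN_MEMORY_HASH_MODIFIED
        else:
            # Unknown change: fall back to the fully modified state.
            self.dirblock_state = self.IN_MEMORY_MODIFIED
        if header_modified:
            self.header_state = self.IN_MEMORY_MODIFIED

    def mark_unmodified(self):
        self.header_state = self.IN_MEMORY_UNMODIFIED
        self.dirblock_state = self.IN_MEMORY_UNMODIFIED
        self.known_hash_changes = set()


state = StateSketch()
hash_only = [(("", "a", "id-a"), [])]  # hypothetical entry: (key, tree details)
state.mark_modified(hash_changed_entries=hash_only)
after_hash_change = state.dirblock_state
state.mark_modified()
after_full_change = state.dirblock_state
```

The point of the separate hash-modified state is that a hash-only change never downgrades a dirstate that is already fully modified, while any unspecified change disables the cheaper hash-only bookkeeping.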

    def add(self, path, file_id, kind, stat, fingerprint):
        """Add a path to be tracked.
1484
1550
if basename_utf8:
1485
1551
parents.add((dirname_utf8, inv_entry.parent_id))
1486
1552
if old_path is None:
1487
adds.append((None, encode(new_path), file_id,
1553
old_path_utf8 = None
1555
old_path_utf8 = encode(old_path)
1556
if old_path is None:
1557
adds.append((None, new_path_utf8, file_id,
1488
1558
inv_to_entry(inv_entry), True))
1489
1559
new_ids.add(file_id)
1490
1560
elif new_path is None:
1491
deletes.append((encode(old_path), None, file_id, None, True))
1492
elif (old_path, new_path) != root_only:
1561
deletes.append((old_path_utf8, None, file_id, None, True))
1562
elif (old_path, new_path) == root_only:
1563
# change things in-place
1564
# Note: the case of a parent directory changing its file_id
1565
# tends to break optimizations here, because officially
1566
# the file has actually been moved, it just happens to
1567
# end up at the same path. If we can figure out how to
1568
# handle that case, we can avoid a lot of add+delete
1569
# pairs for objects that stay put.
1570
# elif old_path == new_path:
1571
changes.append((old_path_utf8, new_path_utf8, file_id,
1572
inv_to_entry(inv_entry)))
1494
1575
# Because renames must preserve their children we must have
1495
1576
# processed all relocations and removes beforehand. The sort
1505
1586
self._update_basis_apply_deletes(deletes)
1507
1588
# Split into an add/delete pair recursively.
1508
adds.append((None, new_path_utf8, file_id,
1509
inv_to_entry(inv_entry), False))
1589
adds.append((old_path_utf8, new_path_utf8, file_id,
1590
inv_to_entry(inv_entry), False))
1510
1591
# Expunge deletes that we've seen so that deleted/renamed
1511
1592
# children of a rename directory are handled correctly.
1512
new_deletes = reversed(list(self._iter_child_entries(1,
1593
new_deletes = reversed(list(
1594
self._iter_child_entries(1, old_path_utf8)))
1514
1595
# Remove the current contents of the tree at orig_path, and
1515
1596
# reinsert at the correct new path.
1516
1597
for entry in new_deletes:
1518
source_path = entry[0][0] + '/' + entry[0][1]
1598
child_dirname, child_basename, child_file_id = entry[0]
1600
source_path = child_dirname + '/' + child_basename
1520
source_path = entry[0][1]
1602
source_path = child_basename
1521
1603
if new_path_utf8:
1522
1604
target_path = new_path_utf8 + source_path[len(old_path):]
1524
1606
if old_path == '':
1525
1607
raise AssertionError("cannot rename directory to"
1527
1609
target_path = source_path[len(old_path) + 1:]
1528
1610
adds.append((None, target_path, entry[0][2], entry[1][1], False))
1529
1611
deletes.append(
1530
1612
(source_path, target_path, entry[0][2], None, False))
1532
(encode(old_path), new_path, file_id, None, False))
1534
# changes to just the root should not require remove/insertion
1536
changes.append((encode(old_path), encode(new_path), file_id,
1537
inv_to_entry(inv_entry)))
1613
deletes.append((old_path_utf8, new_path, file_id, None, False))
1538
1614
self._check_delta_ids_absent(new_ids, delta, 1)
1540
1616
# Finish expunging deletes/first half of renames.
1598
1673
# Adds are accumulated partly from renames, so can be in any input
1599
1674
# order - sort it.
1675
# TODO: we may want to sort in dirblocks order. That way each entry
1676
# will end up in the same directory, allowing the _get_entry
1677
# fast-path for looking up 2 items in the same dir work.
1678
adds.sort(key=lambda x: x[1])
1601
1679
# adds is now in lexicographic order, which places all parents before
1602
1680
# their children, so we can process it linearly.
1682
st = static_tuple.StaticTuple
1604
1683
for old_path, new_path, file_id, new_details, real_add in adds:
1605
# the entry for this file_id must be in tree 0.
1606
entry = self._get_entry(0, file_id, new_path)
1607
if entry[0] is None or entry[0][2] != file_id:
1608
self._changes_aborted = True
1609
raise errors.InconsistentDelta(new_path, file_id,
1610
'working tree does not contain new entry')
1611
if real_add and entry[1][1][0] not in absent:
1612
self._changes_aborted = True
1613
raise errors.InconsistentDelta(new_path, file_id,
1614
'The entry was considered to be a genuinely new record,'
1615
' but there was already an old record for it.')
1616
# We don't need to update the target of an 'r' because the handling
1617
# of renames turns all 'r' situations into a delete at the original
1619
entry[1][1] = new_details
1684
dirname, basename = osutils.split(new_path)
1685
entry_key = st(dirname, basename, file_id)
1686
block_index, present = self._find_block_index_from_key(entry_key)
1688
self._raise_invalid(new_path, file_id,
1689
"Unable to find block for this record."
1690
" Was the parent added?")
1691
block = self._dirblocks[block_index][1]
1692
entry_index, present = self._find_entry_index(entry_key, block)
1694
if old_path is not None:
1695
self._raise_invalid(new_path, file_id,
1696
'considered a real add but still had old_path at %s'
1699
entry = block[entry_index]
1700
basis_kind = entry[1][1][0]
1701
if basis_kind == 'a':
1702
entry[1][1] = new_details
1703
elif basis_kind == 'r':
1704
raise NotImplementedError()
1706
self._raise_invalid(new_path, file_id,
1707
"An entry was marked as a new add"
1708
" but the basis target already existed")
1710
# The exact key was not found in the block. However, we need to
1711
# check if there is a key next to us that would have matched.
1712
# We only need to check 2 locations, because there are only 2
1714
for maybe_index in range(entry_index-1, entry_index+1):
1715
if maybe_index < 0 or maybe_index >= len(block):
1717
maybe_entry = block[maybe_index]
1718
if maybe_entry[0][:2] != (dirname, basename):
1719
# Just a random neighbor
1721
if maybe_entry[0][2] == file_id:
1722
raise AssertionError(
1723
'_find_entry_index didnt find a key match'
1724
' but walking the data did, for %s'
1726
basis_kind = maybe_entry[1][1][0]
1727
if basis_kind not in 'ar':
1728
self._raise_invalid(new_path, file_id,
1729
"we have an add record for path, but the path"
1730
" is already present with another file_id %s"
1731
% (maybe_entry[0][2],))
1733
entry = (entry_key, [DirState.NULL_PARENT_DETAILS,
1735
block.insert(entry_index, entry)
1737
active_kind = entry[1][0][0]
1738
if active_kind == 'a':
1739
# The active record shows up as absent, this could be genuine,
1740
# or it could be present at some other location. We need to
1742
id_index = self._get_id_index()
1743
# The id_index may not be perfectly accurate for tree1, because
1744
# we haven't been keeping it updated. However, it should be
1745
# fine for tree0, and that gives us enough info for what we
1747
keys = id_index.get(file_id, ())
1749
block_i, entry_i, d_present, f_present = \
1750
self._get_block_entry_index(key[0], key[1], 0)
1753
active_entry = self._dirblocks[block_i][1][entry_i]
1754
if (active_entry[0][2] != file_id):
1755
# Some other file is at this path, we don't need to
1758
real_active_kind = active_entry[1][0][0]
1759
if real_active_kind in 'ar':
1760
# We found a record, which was not *this* record,
1761
# which matches the file_id, but is not actually
1762
# present. Something seems *really* wrong.
1763
self._raise_invalid(new_path, file_id,
1764
"We found a tree0 entry that doesnt make sense")
1765
# Now, we've found a tree0 entry which matches the file_id
1766
# but is at a different location. So update them to be
1768
active_dir, active_name = active_entry[0][:2]
1770
active_path = active_dir + '/' + active_name
1772
active_path = active_name
1773
active_entry[1][1] = st('r', new_path, 0, False, '')
1774
entry[1][0] = st('r', active_path, 0, False, '')
1775
elif active_kind == 'r':
1776
raise NotImplementedError()
1778
new_kind = new_details[0]
1780
self._ensure_block(block_index, entry_index, new_path)
1621
1782
def _update_basis_apply_changes(self, changes):
1622
1783
"""Apply a sequence of changes to tree 1 during update_basis_by_delta.
1654
1809
null = DirState.NULL_PARENT_DETAILS
1655
1810
for old_path, new_path, file_id, _, real_delete in deletes:
1656
1811
if real_delete != (new_path is None):
1657
self._changes_aborted = True
1658
raise AssertionError("bad delete delta")
1812
self._raise_invalid(old_path, file_id, "bad delete delta")
1659
1813
# the entry for this file_id must be in tree 1.
1660
1814
dirname, basename = osutils.split(old_path)
1661
1815
block_index, entry_index, dir_present, file_present = \
1662
1816
self._get_block_entry_index(dirname, basename, 1)
1663
1817
if not file_present:
1664
self._changes_aborted = True
1665
raise errors.InconsistentDelta(old_path, file_id,
1818
self._raise_invalid(old_path, file_id,
1666
1819
'basis tree does not contain removed entry')
1667
1820
entry = self._dirblocks[block_index][1][entry_index]
1821
# The state of the entry in the 'active' WT
1822
active_kind = entry[1][0][0]
1668
1823
if entry[0][2] != file_id:
1669
self._changes_aborted = True
1670
raise errors.InconsistentDelta(old_path, file_id,
1824
self._raise_invalid(old_path, file_id,
1671
1825
'mismatched file_id in tree 1')
1673
if entry[1][0][0] != 'a':
1674
self._changes_aborted = True
1675
raise errors.InconsistentDelta(old_path, file_id,
1676
'This was marked as a real delete, but the WT state'
1677
' claims that it still exists and is versioned.')
1827
old_kind = entry[1][1][0]
1828
if active_kind in 'ar':
1829
# The active tree doesn't have this file_id.
1830
# The basis tree is changing this record. If this is a
1831
# rename, then we don't want the record here at all
1832
# anymore. If it is just an in-place change, we want the
1833
# record here, but we'll add it if we need to. So we just
1835
if active_kind == 'r':
1836
active_path = entry[1][0][1]
1837
active_entry = self._get_entry(0, file_id, active_path)
1838
if active_entry[1][1][0] != 'r':
1839
self._raise_invalid(old_path, file_id,
1840
"Dirstate did not have matching rename entries")
1841
elif active_entry[1][0][0] in 'ar':
1842
self._raise_invalid(old_path, file_id,
1843
"Dirstate had a rename pointing at an inactive"
1845
active_entry[1][1] = null
1678
1846
del self._dirblocks[block_index][1][entry_index]
1848
# This was a directory, and the active tree says it
1849
# doesn't exist, and now the basis tree says it doesn't
1850
# exist. Remove its dirblock if present
1852
present) = self._find_block_index_from_key(
1855
dir_block = self._dirblocks[dir_block_index][1]
1857
# This entry is empty, go ahead and just remove it
1858
del self._dirblocks[dir_block_index]
1680
if entry[1][0][0] == 'a':
1681
self._changes_aborted = True
1682
raise errors.InconsistentDelta(old_path, file_id,
1683
'The entry was considered a rename, but the source path'
1684
' is marked as absent.')
1685
# For whatever reason, we were asked to rename an entry
1686
# that was originally marked as deleted. This could be
1687
# because we are renaming the parent directory, and the WT
1688
# current state has the file marked as deleted.
1689
elif entry[1][0][0] == 'r':
1690
# implement the rename
1691
del self._dirblocks[block_index][1][entry_index]
1693
# it is being resurrected here, so blank it out temporarily.
1694
self._dirblocks[block_index][1][entry_index][1][1] = null
1860
# There is still an active record, so just mark this
1863
block_i, entry_i, d_present, f_present = \
1864
self._get_block_entry_index(old_path, '', 1)
1866
dir_block = self._dirblocks[block_i][1]
1867
for child_entry in dir_block:
1868
child_basis_kind = child_entry[1][1][0]
1869
if child_basis_kind not in 'ar':
1870
self._raise_invalid(old_path, file_id,
1871
"The file id was deleted but its children were "
1696
1874
def _after_delta_check_parents(self, parents, index):
1697
1875
"""Check that parents required by the delta are all intact.