# Copyright (C) 2005-2011 Canonical Ltd
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA


"""Copying of history from one branch to another.

The basic plan is that every branch knows the history of everything
that has merged into it. As the first step of a merge, pull, or
branch operation we copy history from the source into the destination
branch.
"""

import operator

from bzrlib.lazy_import import lazy_import
lazy_import(globals(), """
from bzrlib import (
    graph,
    tsort,
    versionedfile,
    )
""")
from bzrlib import (
    errors,
    ui,
    )
from bzrlib.revision import NULL_REVISION
from bzrlib.trace import mutter


class RepoFetcher(object):
    """Pull revisions and texts from one repository to another.

    This should not be used directly; it's essentially an object to
    encapsulate the logic in InterRepository.fetch().
    """

    def __init__(self, to_repository, from_repository, last_revision=None,
        find_ghosts=True, fetch_spec=None):
        """Create a repo fetcher.

        :param last_revision: If set, try to limit to the data this revision
            references.
        :param fetch_spec: A SearchResult specifying which revisions to fetch.
            If set, this overrides last_revision.
        :param find_ghosts: If True search the entire history for ghosts.
        """
        # repository.fetch has the responsibility for short-circuiting
        # attempts to copy between a repository and itself.
        self.to_repository = to_repository
        self.from_repository = from_repository
        self.sink = to_repository._get_sink()
        # must not mutate self._last_revision as it's potentially a shared instance
        self._last_revision = last_revision
        self._fetch_spec = fetch_spec
        self.find_ghosts = find_ghosts
        self.from_repository.lock_read()
        mutter("Using fetch logic to copy between %s(%s) and %s(%s)",
               self.from_repository, self.from_repository._format,
               self.to_repository, self.to_repository._format)
        try:
            self.__fetch()
        finally:
            self.from_repository.unlock()

    def __fetch(self):
        """Primary worker function.

        This initialises all the needed variables, and then fetches the
        requested revisions, finally clearing the progress bar.
        """
        # Roughly this is what we're aiming for fetch to become:
        #
        # missing = self.sink.insert_stream(self.source.get_stream(search))
        # if missing:
        #     missing = self.sink.insert_stream(self.source.get_items(missing))
        # assert not missing
        self.count_total = 0
        self.file_ids_names = {}
        pb = ui.ui_factory.nested_progress_bar()
        pb.show_pct = pb.show_count = False
        try:
            pb.update("Finding revisions", 0, 2)
            search_result = self._revids_to_fetch()
            mutter('fetching: %s', search_result)
            if search_result.is_empty():
                return
            pb.update("Fetching revisions", 1, 2)
            self._fetch_everything_for_search(search_result)
        finally:
            pb.finished()

    def _fetch_everything_for_search(self, search):
        """Fetch all data for the given set of revisions."""
        # The first phase is "file".  We pass the progress bar for it directly
        # into item_keys_introduced_by, which has more information about how
        # that phase is progressing than we do.  Progress updates for the other
        # phases are taken care of in this function.
        # XXX: there should be a clear owner of the progress reporting.  Perhaps
        # item_keys_introduced_by should have a richer API than it does at the
        # moment, so that it can feed the progress information back to this
        # function?
        if (self.from_repository._format.rich_root_data and
            not self.to_repository._format.rich_root_data):
            raise errors.IncompatibleRepositories(
                self.from_repository, self.to_repository,
                "different rich-root support")
        pb = ui.ui_factory.nested_progress_bar()
        try:
            pb.update("Get stream source")
            source = self.from_repository._get_source(
                self.to_repository._format)
            stream = source.get_stream(search)
            from_format = self.from_repository._format
            pb.update("Inserting stream")
            resume_tokens, missing_keys = self.sink.insert_stream(
                stream, from_format, [])
            if missing_keys:
                pb.update("Missing keys")
                stream = source.get_stream_for_missing_keys(missing_keys)
                pb.update("Inserting missing keys")
                resume_tokens, missing_keys = self.sink.insert_stream(
                    stream, from_format, resume_tokens)
            if missing_keys:
                raise AssertionError(
                    "second push failed to complete a fetch %r." % (
                        missing_keys,))
            if resume_tokens:
                raise AssertionError(
                    "second push failed to commit the fetch %r." % (
                        resume_tokens,))
            pb.update("Finishing stream")
            self.sink.finished()
        finally:
            pb.finished()

    def _revids_to_fetch(self):
        """Determines the exact revisions needed from self.from_repository to
        install self._last_revision in self.to_repository.

        :returns: A SearchResult of some sort.  (Possibly a
            PendingAncestryResult, EmptySearchResult, etc.)
        """
        if self._fetch_spec is not None:
            # The fetch spec is already a concrete search result.
            return self._fetch_spec
        elif self._last_revision == NULL_REVISION:
            # fetch_spec is None + last_revision is null => empty fetch.
            # explicit limit of no revisions needed
            return graph.EmptySearchResult()
        elif self._last_revision is not None:
            return graph.NotInOtherForRevs(self.to_repository,
                self.from_repository, [self._last_revision],
                find_ghosts=self.find_ghosts).execute()
        else: # self._last_revision is None:
            return graph.EverythingNotInOther(self.to_repository,
                self.from_repository,
                find_ghosts=self.find_ghosts).execute()


class Inter1and2Helper(object):
    """Helper for operations that convert data between model 1 and 2.

    This is for use by fetchers and converters.
    """

    # This is a class variable so that the test suite can override it.
    known_graph_threshold = 100

    def __init__(self, source):
        """Constructor.

        :param source: The repository data comes from
        """
        self.source = source

    def iter_rev_trees(self, revs):
        """Iterate through RevisionTrees efficiently.

        Additionally, the inventory's revision_id is set if unset.

        Trees are retrieved in batches of 100, and then yielded in the order
        they were requested.

        :param revs: A list of revision ids
        """
        # In case revs is not a list.
        revs = list(revs)
        while revs:
            for tree in self.source.revision_trees(revs[:100]):
                if tree.inventory.revision_id is None:
                    tree.inventory.revision_id = tree.get_revision_id()
                yield tree
            revs = revs[100:]

    def _find_root_ids(self, revs, parent_map, graph):
        revision_root = {}
        for tree in self.iter_rev_trees(revs):
            revision_id = tree.inventory.root.revision
            root_id = tree.get_root_id()
            revision_root[revision_id] = root_id
        # Find out which parents we don't already know root ids for
        parents = set()
        for revision_parents in parent_map.itervalues():
            parents.update(revision_parents)
        parents.difference_update(revision_root.keys() + [NULL_REVISION])
        # Limit to revisions present in the versionedfile
        parents = graph.get_parent_map(parents).keys()
        for tree in self.iter_rev_trees(parents):
            root_id = tree.get_root_id()
            revision_root[tree.get_revision_id()] = root_id
        return revision_root

    def generate_root_texts(self, revs):
        """Generate VersionedFiles for all root ids.

        :param revs: the revisions to include
        """
        graph = self.source.get_graph()
        parent_map = graph.get_parent_map(revs)
        rev_order = tsort.topo_sort(parent_map)
        rev_id_to_root_id = self._find_root_ids(revs, parent_map, graph)
        root_id_order = [(rev_id_to_root_id[rev_id], rev_id) for rev_id in
            rev_order]
        # Guaranteed stable, this groups all the file id operations together
        # retaining topological order within the revisions of a file id.
        # File id splits and joins would invalidate this, but they don't exist
        # yet, and are unlikely to in non-rich-root environments anyway.
        root_id_order.sort(key=operator.itemgetter(0))
        # Create a record stream containing the roots to create.
        if len(revs) > self.known_graph_threshold:
            graph = self.source.get_known_graph_ancestry(revs)
        new_roots_stream = _new_root_data_stream(
            root_id_order, rev_id_to_root_id, parent_map, self.source, graph)
        return [('texts', new_roots_stream)]


def _new_root_data_stream(
    root_keys_to_create, rev_id_to_root_id_map, parent_map, repo, graph=None):
    """Generate a texts substream of synthesised root entries.

    Used in fetches that do rich-root upgrades.

    :param root_keys_to_create: iterable of (root_id, rev_id) pairs describing
        the root entries to create.
    :param rev_id_to_root_id_map: dict of known rev_id -> root_id mappings for
        calculating the parents.  If a parent rev_id is not found here then it
        will be recalculated.
    :param parent_map: a parent map for all the revisions in
        root_keys_to_create.
    :param graph: a graph to use instead of repo.get_graph().
    """
    for root_key in root_keys_to_create:
        root_id, rev_id = root_key
        parent_keys = _parent_keys_for_root_version(
            root_id, rev_id, rev_id_to_root_id_map, parent_map, repo, graph)
        yield versionedfile.FulltextContentFactory(
            root_key, parent_keys, None, '')


def _parent_keys_for_root_version(
    root_id, rev_id, rev_id_to_root_id_map, parent_map, repo, graph=None):
    """Get the parent keys for a given root id.

    A helper function for _new_root_data_stream.
    """
    # Include direct parents of the revision, but only if they used the same
    # root_id and are heads.
    rev_parents = parent_map[rev_id]
    parent_ids = []
    for parent_id in rev_parents:
        if parent_id == NULL_REVISION:
            continue
        if parent_id not in rev_id_to_root_id_map:
            # We probably didn't read this revision, go spend the extra effort
            # to actually check
            try:
                tree = repo.revision_tree(parent_id)
            except errors.NoSuchRevision:
                # Ghost, fill out rev_id_to_root_id in case we encounter this
                # again.
                # But set parent_root_id to None since we don't really know
                parent_root_id = None
            else:
                parent_root_id = tree.get_root_id()
            rev_id_to_root_id_map[parent_id] = None
            # XXX: why not:
            #   rev_id_to_root_id_map[parent_id] = parent_root_id
            # memory consumption maybe?
        else:
            parent_root_id = rev_id_to_root_id_map[parent_id]
        if root_id == parent_root_id:
            # With stacking we _might_ want to refer to a non-local revision,
            # but this code path only applies when we have the full content
            # available, so ghosts really are ghosts, not just the edge of
            # local data.
            parent_ids.append(parent_id)
        else:
            # root_id may be in the parent anyway.
            try:
                tree = repo.revision_tree(parent_id)
            except errors.NoSuchRevision:
                # ghost, can't refer to it.
                pass
            else:
                try:
                    parent_ids.append(tree.inventory[root_id].revision)
                except errors.NoSuchId:
                    # not in the tree
                    pass
    # Drop non-head parents
    if graph is None:
        graph = repo.get_graph()
    heads = graph.heads(parent_ids)
    selected_ids = []
    for parent_id in parent_ids:
        if parent_id in heads and parent_id not in selected_ids:
            selected_ids.append(parent_id)
    parent_keys = [(root_id, parent_id) for parent_id in selected_ids]
    return parent_keys


class TargetRepoKinds(object):
    """An enum-like set of constants.

    They are the possible values of FetchSpecFactory.target_repo_kinds.
    """

    PREEXISTING = 'preexisting'
    STACKED = 'stacked'
    EMPTY = 'empty'


347 |
class FetchSpecFactory(object): |
|
348 |
"""A helper for building the best fetch spec for a sprout call.
|
|
349 |
||
350 |
Factors that go into determining the sort of fetch to perform:
|
|
351 |
* did the caller specify any revision IDs?
|
|
5672.1.3
by Andrew Bennetts
Rename a variable, update a docstring. |
352 |
* did the caller specify a source branch (need to fetch its
|
353 |
heads_to_fetch(), usually the tip + tags)
|
|
5535.4.23
by Andrew Bennetts
Move FetchSpecFactory and TargetRepoKinds to bzrlib.fetch (from bzrlib.controldir). |
354 |
* is there an existing target repo (don't need to refetch revs it
|
355 |
already has)
|
|
356 |
* target is stacked? (similar to pre-existing target repo: even if
|
|
357 |
the target itself is new don't want to refetch existing revs)
|
|
358 |
||
359 |
:ivar source_branch: the source branch if one specified, else None.
|
|
360 |
:ivar source_branch_stop_revision_id: fetch up to this revision of
|
|
361 |
source_branch, rather than its tip.
|
|
362 |
:ivar source_repo: the source repository if one found, else None.
|
|
363 |
:ivar target_repo: the target repository acquired by sprout.
|
|
364 |
:ivar target_repo_kind: one of the TargetRepoKinds constants.
|
|
365 |
"""
|
|
366 |
||
367 |
def __init__(self): |
|
368 |
self._explicit_rev_ids = set() |
|
369 |
self.source_branch = None |
|
370 |
self.source_branch_stop_revision_id = None |
|
371 |
self.source_repo = None |
|
372 |
self.target_repo = None |
|
373 |
self.target_repo_kind = None |
|
374 |
||
375 |
def add_revision_ids(self, revision_ids): |
|
376 |
"""Add revision_ids to the set of revision_ids to be fetched."""
|
|
377 |
self._explicit_rev_ids.update(revision_ids) |
|
378 |
||
379 |
def make_fetch_spec(self): |
|
380 |
"""Build a SearchResult or PendingAncestryResult or etc."""
|
|
381 |
if self.target_repo_kind is None or self.source_repo is None: |
|
382 |
raise AssertionError( |
|
383 |
'Incomplete FetchSpecFactory: %r' % (self.__dict__,)) |
|
384 |
if len(self._explicit_rev_ids) == 0 and self.source_branch is None: |
|
385 |
# Caller hasn't specified any revisions or source branch
|
|
386 |
if self.target_repo_kind == TargetRepoKinds.EMPTY: |
|
387 |
return graph.EverythingResult(self.source_repo) |
|
388 |
else: |
|
389 |
# We want everything not already in the target (or target's
|
|
390 |
# fallbacks).
|
|
391 |
return graph.EverythingNotInOther( |
|
5535.4.25
by Andrew Bennetts
Update some more code paths for the change to only accepting SearchResults to fetch(). |
392 |
self.target_repo, self.source_repo).execute() |
5535.4.23
by Andrew Bennetts
Move FetchSpecFactory and TargetRepoKinds to bzrlib.fetch (from bzrlib.controldir). |
393 |
heads_to_fetch = set(self._explicit_rev_ids) |
394 |
if self.source_branch is not None: |
|
5672.1.3
by Andrew Bennetts
Rename a variable, update a docstring. |
395 |
must_fetch, if_present_fetch = self.source_branch.heads_to_fetch() |
5535.4.23
by Andrew Bennetts
Move FetchSpecFactory and TargetRepoKinds to bzrlib.fetch (from bzrlib.controldir). |
396 |
if self.source_branch_stop_revision_id is not None: |
5672.1.1
by Andrew Bennetts
Refactor some of FetchSpecFactory into new Branch.heads_to_fetch method so that branch implementations like looms can override it. |
397 |
# Replace the tip rev from must_fetch with the stop revision
|
398 |
# XXX: this might be wrong if the tip rev is also in the
|
|
399 |
# must_fetch set for other reasons (e.g. it's the tip of
|
|
400 |
# multiple loom threads?), but then it's pretty unclear what it
|
|
401 |
# should mean to specify a stop_revision in that case anyway.
|
|
402 |
must_fetch.discard(self.source_branch.last_revision()) |
|
403 |
must_fetch.add(self.source_branch_stop_revision_id) |
|
404 |
heads_to_fetch.update(must_fetch) |
|
405 |
else: |
|
5672.1.3
by Andrew Bennetts
Rename a variable, update a docstring. |
406 |
if_present_fetch = set() |
5535.4.23
by Andrew Bennetts
Move FetchSpecFactory and TargetRepoKinds to bzrlib.fetch (from bzrlib.controldir). |
407 |
if self.target_repo_kind == TargetRepoKinds.EMPTY: |
408 |
# PendingAncestryResult does not raise errors if a requested head
|
|
409 |
# is absent. Ideally it would support the
|
|
410 |
# required_ids/if_present_ids distinction, but in practice
|
|
411 |
# heads_to_fetch will almost certainly be present so this doesn't
|
|
412 |
# matter much.
|
|
5672.1.3
by Andrew Bennetts
Rename a variable, update a docstring. |
413 |
all_heads = heads_to_fetch.union(if_present_fetch) |
5535.4.23
by Andrew Bennetts
Move FetchSpecFactory and TargetRepoKinds to bzrlib.fetch (from bzrlib.controldir). |
414 |
return graph.PendingAncestryResult(all_heads, self.source_repo) |
415 |
return graph.NotInOtherForRevs(self.target_repo, self.source_repo, |
|
5672.1.3
by Andrew Bennetts
Rename a variable, update a docstring. |
416 |
required_ids=heads_to_fetch, if_present_ids=if_present_fetch |
5535.4.25
by Andrew Bennetts
Update some more code paths for the change to only accepting SearchResults to fetch(). |
417 |
).execute() |
5535.4.23
by Andrew Bennetts
Move FetchSpecFactory and TargetRepoKinds to bzrlib.fetch (from bzrlib.controldir). |
418 |
|
419 |