# Copyright (C) 2005-2011 Canonical Ltd
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA

"""Copying of history from one branch to another.

The basic plan is that every branch knows the history of everything
that has merged into it.  As the first step of a merge, pull, or
branch operation we copy history from the source into the destination
branch.
"""

from __future__ import absolute_import

import operator

from bzrlib.lazy_import import lazy_import
lazy_import(globals(), """
from bzrlib import (
    tsort,
    versionedfile,
    vf_search,
    )
""")
from bzrlib import (
    errors,
    ui,
    )
from bzrlib.i18n import gettext
from bzrlib.revision import NULL_REVISION
from bzrlib.trace import mutter


class RepoFetcher(object):
    """Pull revisions and texts from one repository to another.

    This should not be used directly; it's essentially an object that
    encapsulates the logic in InterRepository.fetch().
    """

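    # Illustrative sketch (not part of the original module): callers do not
    # normally construct this class themselves.  Assuming the usual bzrlib
    # entry points, a fetch between two repositories looks roughly like:
    #
    #   to_repo.fetch(from_repo, revision_id=last_revid)
    #
    # where 'to_repo', 'from_repo' and 'last_revid' are assumed names for the
    # target Repository, source Repository and a revision id.  That call
    # dispatches through InterRepository.fetch(), which builds a RepoFetcher;
    # the constructor below then performs the whole copy.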
    def __init__(self, to_repository, from_repository, last_revision=None,
        find_ghosts=True, fetch_spec=None):
        """Create a repo fetcher.

        :param last_revision: If set, try to limit to the data this revision
            references.
        :param fetch_spec: A SearchResult specifying which revisions to fetch.
            If set, this overrides last_revision.
        :param find_ghosts: If True search the entire history for ghosts.
        """
        # repository.fetch has the responsibility for short-circuiting
        # attempts to copy between a repository and itself.
        self.to_repository = to_repository
        self.from_repository = from_repository
        self.sink = to_repository._get_sink()
        # must not mutate self._last_revision as its potentially a shared instance
        self._last_revision = last_revision
        self._fetch_spec = fetch_spec
        self.find_ghosts = find_ghosts
        self.from_repository.lock_read()
        mutter("Using fetch logic to copy between %s(%s) and %s(%s)",
               self.from_repository, self.from_repository._format,
               self.to_repository, self.to_repository._format)
        try:
            self.__fetch()
        finally:
            self.from_repository.unlock()

    def __fetch(self):
        """Primary worker function.

        This initialises all the needed variables, and then fetches the
        requested revisions, finally clearing the progress bar.
        """
        # Roughly this is what we're aiming for fetch to become:
        #
        # missing = self.sink.insert_stream(self.source.get_stream(search))
        # if missing:
        #     missing = self.sink.insert_stream(self.source.get_items(missing))
        # assert not missing
        self.count_total = 0
        self.file_ids_names = {}
        pb = ui.ui_factory.nested_progress_bar()
        pb.show_pct = pb.show_count = False
        try:
            pb.update(gettext("Finding revisions"), 0, 2)
            search_result = self._revids_to_fetch()
            mutter('fetching: %s', search_result)
            if search_result.is_empty():
                return
            pb.update(gettext("Fetching revisions"), 1, 2)
            self._fetch_everything_for_search(search_result)
        finally:
            pb.finished()

    def _fetch_everything_for_search(self, search):
        """Fetch all data for the given set of revisions."""
        # The first phase is "file".  We pass the progress bar for it directly
        # into item_keys_introduced_by, which has more information about how
        # that phase is progressing than we do.  Progress updates for the other
        # phases are taken care of in this function.
        # XXX: there should be a clear owner of the progress reporting.  Perhaps
        # item_keys_introduced_by should have a richer API than it does at the
        # moment, so that it can feed the progress information back to this
        # function?
        if (self.from_repository._format.rich_root_data and
            not self.to_repository._format.rich_root_data):
            raise errors.IncompatibleRepositories(
                self.from_repository, self.to_repository,
                "different rich-root support")
        pb = ui.ui_factory.nested_progress_bar()
        try:
            pb.update("Get stream source")
            source = self.from_repository._get_source(
                self.to_repository._format)
            stream = source.get_stream(search)
            from_format = self.from_repository._format
            pb.update("Inserting stream")
            resume_tokens, missing_keys = self.sink.insert_stream(
                stream, from_format, [])
            if missing_keys:
                pb.update("Missing keys")
                stream = source.get_stream_for_missing_keys(missing_keys)
                pb.update("Inserting missing keys")
                resume_tokens, missing_keys = self.sink.insert_stream(
                    stream, from_format, resume_tokens)
            if missing_keys:
                raise AssertionError(
                    "second push failed to complete a fetch %r." % (
                        missing_keys,))
            if resume_tokens:
                raise AssertionError(
                    "second push failed to commit the fetch %r." % (
                        resume_tokens,))
            pb.update("Finishing stream")
            self.sink.finished()
        finally:
            pb.finished()

    def _revids_to_fetch(self):
        """Determines the exact revisions needed from self.from_repository to
        install self._last_revision in self.to_repository.

        :returns: A SearchResult of some sort.  (Possibly a
            PendingAncestryResult, EmptySearchResult, etc.)
        """
        if self._fetch_spec is not None:
            # The fetch spec is already a concrete search result.
            return self._fetch_spec
        elif self._last_revision == NULL_REVISION:
            # fetch_spec is None + last_revision is null => empty fetch.
            # explicit limit of no revisions needed
            return vf_search.EmptySearchResult()
        elif self._last_revision is not None:
            return vf_search.NotInOtherForRevs(self.to_repository,
                self.from_repository, [self._last_revision],
                find_ghosts=self.find_ghosts).execute()
        else: # self._last_revision is None:
            return vf_search.EverythingNotInOther(self.to_repository,
                self.from_repository,
                find_ghosts=self.find_ghosts).execute()


class Inter1and2Helper(object):
    """Helper for operations that convert data from model 1 and 2

    This is for use by fetchers and converters.
    """

    # This is a class variable so that the test suite can override it.
    known_graph_threshold = 100

    def __init__(self, source):
        """Constructor.

        :param source: The repository data comes from
        """
        self.source = source

    def iter_rev_trees(self, revs):
        """Iterate through RevisionTrees efficiently.

        Additionally, the inventory's revision_id is set if unset.

        Trees are retrieved in batches of 100, and then yielded in the order
        they were requested.

        :param revs: A list of revision ids
        """
        # In case that revs is not a list.
        revs = list(revs)
        while revs:
            for tree in self.source.revision_trees(revs[:100]):
                if tree.root_inventory.revision_id is None:
                    tree.root_inventory.revision_id = tree.get_revision_id()
                yield tree
            revs = revs[100:]

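    # Illustrative sketch (not part of the original module): iter_rev_trees
    # is a generator, so callers consume it lazily while it pulls trees from
    # the source repository 100 at a time, e.g.:
    #
    #   helper = Inter1and2Helper(source_repo)
    #   for tree in helper.iter_rev_trees(revision_ids):
    #       root_id = tree.get_root_id()
    #
    # 'source_repo' and 'revision_ids' are assumed names for a source
    # Repository and a list of revision ids.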
    def _find_root_ids(self, revs, parent_map, graph):
        revision_root = {}
        for tree in self.iter_rev_trees(revs):
            root_id = tree.get_root_id()
            revision_id = tree.get_file_revision(root_id, u"")
            revision_root[revision_id] = root_id
        # Find out which parents we don't already know root ids for
        parents = set()
        for revision_parents in parent_map.itervalues():
            parents.update(revision_parents)
        parents.difference_update(revision_root.keys() + [NULL_REVISION])
        # Limit to revisions present in the versionedfile
        parents = graph.get_parent_map(parents).keys()
        for tree in self.iter_rev_trees(parents):
            root_id = tree.get_root_id()
            revision_root[tree.get_revision_id()] = root_id
        return revision_root

    def generate_root_texts(self, revs):
        """Generate VersionedFiles for all root ids.

        :param revs: the revisions to include
        """
        graph = self.source.get_graph()
        parent_map = graph.get_parent_map(revs)
        rev_order = tsort.topo_sort(parent_map)
        rev_id_to_root_id = self._find_root_ids(revs, parent_map, graph)
        root_id_order = [(rev_id_to_root_id[rev_id], rev_id) for rev_id in
            rev_order]
        # Guaranteed stable, this groups all the file id operations together
        # retaining topological order within the revisions of a file id.
        # File id splits and joins would invalidate this, but they don't exist
        # yet, and are unlikely to in non-rich-root environments anyway.
        root_id_order.sort(key=operator.itemgetter(0))
        # Create a record stream containing the roots to create.
        if len(revs) > self.known_graph_threshold:
            graph = self.source.get_known_graph_ancestry(revs)
        new_roots_stream = _new_root_data_stream(
            root_id_order, rev_id_to_root_id, parent_map, self.source, graph)
        return [('texts', new_roots_stream)]


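# Illustrative note (not part of the original module): the keys passed to
# _new_root_data_stream below are the (root_id, rev_id) pairs built by
# generate_root_texts above, for example
#
#   [('TREE_ROOT', 'rev-1'), ('TREE_ROOT', 'rev-2')]
#
# ('TREE_ROOT' is just a typical root file id, used here for illustration).
# Each key becomes an empty FulltextContentFactory whose parent keys are
# computed by _parent_keys_for_root_version.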
def _new_root_data_stream(
    root_keys_to_create, rev_id_to_root_id_map, parent_map, repo, graph=None):
    """Generate a texts substream of synthesised root entries.

    Used in fetches that do rich-root upgrades.

    :param root_keys_to_create: iterable of (root_id, rev_id) pairs describing
        the root entries to create.
    :param rev_id_to_root_id_map: dict of known rev_id -> root_id mappings for
        calculating the parents.  If a parent rev_id is not found here then it
        will be recalculated.
    :param parent_map: a parent map for all the revisions in
        root_keys_to_create.
    :param graph: a graph to use instead of repo.get_graph().
    """
    for root_key in root_keys_to_create:
        root_id, rev_id = root_key
        parent_keys = _parent_keys_for_root_version(
            root_id, rev_id, rev_id_to_root_id_map, parent_map, repo, graph)
        yield versionedfile.FulltextContentFactory(
            root_key, parent_keys, None, '')


def _parent_keys_for_root_version(
    root_id, rev_id, rev_id_to_root_id_map, parent_map, repo, graph=None):
    """Get the parent keys for a given root id.

    A helper function for _new_root_data_stream.
    """
    # Include direct parents of the revision, but only if they used the same
    # root_id and are heads.
    rev_parents = parent_map[rev_id]
    parent_ids = []
    for parent_id in rev_parents:
        if parent_id == NULL_REVISION:
            continue
        if parent_id not in rev_id_to_root_id_map:
            # We probably didn't read this revision, go spend the extra effort
            # to actually check
            try:
                tree = repo.revision_tree(parent_id)
            except errors.NoSuchRevision:
                # Ghost, fill out rev_id_to_root_id in case we encounter this
                # again.
                # But set parent_root_id to None since we don't really know
                parent_root_id = None
            else:
                parent_root_id = tree.get_root_id()
            rev_id_to_root_id_map[parent_id] = None
            # XXX: why not:
            #   rev_id_to_root_id_map[parent_id] = parent_root_id
            # memory consumption maybe?
        else:
            parent_root_id = rev_id_to_root_id_map[parent_id]
        if root_id == parent_root_id:
            # With stacking we _might_ want to refer to a non-local revision,
            # but this code path only applies when we have the full content
            # available, so ghosts really are ghosts, not just the edge of
            # local data.
            parent_ids.append(parent_id)
        else:
            # root_id may be in the parent anyway.
            try:
                tree = repo.revision_tree(parent_id)
            except errors.NoSuchRevision:
                # ghost, can't refer to it.
                pass
            else:
                try:
                    parent_ids.append(tree.get_file_revision(root_id))
                except errors.NoSuchId:
                    # not in the tree
                    pass
    # Drop non-head parents
    if graph is None:
        graph = repo.get_graph()
    heads = graph.heads(parent_ids)
    selected_ids = []
    for parent_id in parent_ids:
        if parent_id in heads and parent_id not in selected_ids:
            selected_ids.append(parent_id)
    parent_keys = [(root_id, parent_id) for parent_id in selected_ids]
    return parent_keys


class TargetRepoKinds(object):
    """An enum-like set of constants.

    They are the possible values of FetchSpecFactory.target_repo_kinds.
    """

    PREEXISTING = 'preexisting'
    STACKED = 'stacked'
    EMPTY = 'empty'


class FetchSpecFactory(object):
    """A helper for building the best fetch spec for a sprout call.

    Factors that go into determining the sort of fetch to perform:
     * did the caller specify any revision IDs?
     * did the caller specify a source branch (need to fetch its
       heads_to_fetch(), usually the tip + tags)
     * is there an existing target repo (don't need to refetch revs it
       already has)
     * target is stacked?  (similar to pre-existing target repo: even if
       the target itself is new don't want to refetch existing revs)

    :ivar source_branch: the source branch if one specified, else None.
    :ivar source_branch_stop_revision_id: fetch up to this revision of
        source_branch, rather than its tip.
    :ivar source_repo: the source repository if one found, else None.
    :ivar target_repo: the target repository acquired by sprout.
    :ivar target_repo_kind: one of the TargetRepoKinds constants.
    """

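    # Illustrative sketch (not part of the original module): a sprout-like
    # caller fills in the attributes below and then asks for the spec, e.g.:
    #
    #   factory = FetchSpecFactory()
    #   factory.source_repo = source_repo
    #   factory.target_repo = target_repo
    #   factory.target_repo_kind = TargetRepoKinds.EMPTY
    #   factory.add_revision_ids([stop_revision_id])
    #   fetch_spec = factory.make_fetch_spec()
    #
    # 'source_repo', 'target_repo' and 'stop_revision_id' are assumed names;
    # make_fetch_spec() raises AssertionError if source_repo or
    # target_repo_kind is left unset.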
    def __init__(self):
        self._explicit_rev_ids = set()
        self.source_branch = None
        self.source_branch_stop_revision_id = None
        self.source_repo = None
        self.target_repo = None
        self.target_repo_kind = None
        self.limit = None

    def add_revision_ids(self, revision_ids):
        """Add revision_ids to the set of revision_ids to be fetched."""
        self._explicit_rev_ids.update(revision_ids)

    def make_fetch_spec(self):
        """Build a SearchResult or PendingAncestryResult or etc."""
        if self.target_repo_kind is None or self.source_repo is None:
            raise AssertionError(
                'Incomplete FetchSpecFactory: %r' % (self.__dict__,))
        if len(self._explicit_rev_ids) == 0 and self.source_branch is None:
            if self.limit is not None:
                raise NotImplementedError(
                    "limit is only supported with a source branch set")
            # Caller hasn't specified any revisions or source branch
            if self.target_repo_kind == TargetRepoKinds.EMPTY:
                return vf_search.EverythingResult(self.source_repo)
            else:
                # We want everything not already in the target (or target's
                # fallbacks).
                return vf_search.EverythingNotInOther(
                    self.target_repo, self.source_repo).execute()
        heads_to_fetch = set(self._explicit_rev_ids)
        if self.source_branch is not None:
            must_fetch, if_present_fetch = self.source_branch.heads_to_fetch()
            if self.source_branch_stop_revision_id is not None:
                # Replace the tip rev from must_fetch with the stop revision
                # XXX: this might be wrong if the tip rev is also in the
                # must_fetch set for other reasons (e.g. it's the tip of
                # multiple loom threads?), but then it's pretty unclear what it
                # should mean to specify a stop_revision in that case anyway.
                must_fetch.discard(self.source_branch.last_revision())
                must_fetch.add(self.source_branch_stop_revision_id)
            heads_to_fetch.update(must_fetch)
        else:
            if_present_fetch = set()
        if self.target_repo_kind == TargetRepoKinds.EMPTY:
            # PendingAncestryResult does not raise errors if a requested head
            # is absent.  Ideally it would support the
            # required_ids/if_present_ids distinction, but in practice
            # heads_to_fetch will almost certainly be present so this doesn't
            # matter much.
            all_heads = heads_to_fetch.union(if_present_fetch)
            ret = vf_search.PendingAncestryResult(all_heads, self.source_repo)
            if self.limit is not None:
                graph = self.source_repo.get_graph()
                topo_order = list(graph.iter_topo_order(ret.get_keys()))
                result_set = topo_order[:self.limit]
                ret = self.source_repo.revision_ids_to_search_result(result_set)
            return ret
        else:
            return vf_search.NotInOtherForRevs(self.target_repo, self.source_repo,
                required_ids=heads_to_fetch, if_present_ids=if_present_fetch,
                limit=self.limit).execute()