~bzr-pqm/bzr/bzr.dev : contents of bzrlib/graph.py at revision 6234.4.1

5557.1.7 by John Arbash Meinel Merge in the bzr.dev 5582	1	# Copyright (C) 2007-2011 Canonical Ltd
2490.2.5 by Aaron Bentley Use GraphWalker.unique_ancestor to determine merge base	2	#
	3	# This program is free software; you can redistribute it and/or modify
	4	# it under the terms of the GNU General Public License as published by
	5	# the Free Software Foundation; either version 2 of the License, or
	6	# (at your option) any later version.
	7	#
	8	# This program is distributed in the hope that it will be useful,
	9	# but WITHOUT ANY WARRANTY; without even the implied warranty of
	10	# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
	11	# GNU General Public License for more details.
	12	#
	13	# You should have received a copy of the GNU General Public License
	14	# along with this program; if not, write to the Free Software
4183.7.1 by Sabin Iacob update FSF mailing address	15	# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
2490.2.5 by Aaron Bentley Use GraphWalker.unique_ancestor to determine merge base	16
3377.4.5 by John Arbash Meinel Several updates. A bit more debug logging, only step the all_unique searcher 1/10th of the time.	17	import time
	18
2490.2.30 by Aaron Bentley Add functionality for tsorting graphs	19	from bzrlib import (
3377.3.33 by John Arbash Meinel Add some logging with -Dgraph	20	debug,
2490.2.30 by Aaron Bentley Add functionality for tsorting graphs	21	errors,
4574.3.6 by Martin Pool More warnings when failing to load extensions	22	osutils,
3052.1.3 by John Arbash Meinel deprecate revision.is_ancestor, update the callers and the tests.	23	revision,
3377.3.10 by John Arbash Meinel Tweak _BreadthFirstSearcher.find_seen_ancestors()	24	trace,
2490.2.30 by Aaron Bentley Add functionality for tsorting graphs	25	)
2490.2.1 by Aaron Bentley Start work on GraphWalker	26
3377.4.9 by John Arbash Meinel STEP every 5	27	STEP_UNIQUE_SEARCHER_EVERY = 5
3377.4.5 by John Arbash Meinel Several updates. A bit more debug logging, only step the all_unique searcher 1/10th of the time.	28
2490.2.25 by Aaron Bentley Update from review	29	# DIAGRAM of terminology
	30	#       A
	31	#       /\
	32	#      B  C
	33	#      |  |\
	34	#      D  E F
	35	#      |\/| |
	36	#      |/\|/
	37	#      G  H
	38	#
	39	# In this diagram, relative to G and H:
	40	# A, B, C, D, E are common ancestors.
	41	# C, D and E are border ancestors, because each has a non-common descendant.
	42	# D and E are least common ancestors because none of their descendants are
	43	# common ancestors.
	44	# C is not a least common ancestor because its descendant, E, is a common
	45	# ancestor.
	46	#
	47	# The find_unique_lca algorithm will pick A in two steps:
	48	# 1. find_lca('G', 'H') => ['D', 'E']
	49	# 2. Since len(['D', 'E']) > 1, find_lca('D', 'E') => ['A']
	50
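The terminology in the diagram can be checked with a small standalone sketch. It is a brute-force illustration, not the incremental search that bzrlib performs: `PARENTS`, `ancestors`, `naive_find_lca` and `naive_find_unique_lca` are hypothetical names introduced here, and the parent map is read off the diagram (G has parents D and E; H has parents D, E and F).

```python
# Parent map for the terminology diagram: child -> tuple of parents.
# (Assumed reading of the ASCII diagram; not taken from bzrlib itself.)
PARENTS = {
    'A': (), 'B': ('A',), 'C': ('A',),
    'D': ('B',), 'E': ('C',), 'F': ('C',),
    'G': ('D', 'E'), 'H': ('D', 'E', 'F'),
}


def ancestors(parents, node):
    """Return node plus everything reachable through parent links."""
    seen = set()
    pending = [node]
    while pending:
        current = pending.pop()
        if current in seen:
            continue
        seen.add(current)
        pending.extend(parents.get(current, ()))
    return seen


def naive_find_lca(parents, *nodes):
    """Common ancestors that have no common-ancestor descendants."""
    common = set.intersection(*(ancestors(parents, n) for n in nodes))
    return sorted(
        c for c in common
        if not any(c in ancestors(parents, other)
                   for other in common if other != c))


def naive_find_unique_lca(parents, left, right):
    """Repeat the LCA step until a single node remains."""
    lcas = naive_find_lca(parents, left, right)
    while len(lcas) > 1:
        lcas = naive_find_lca(parents, *lcas)
    return lcas[0] if lcas else None


print(naive_find_lca(PARENTS, 'G', 'H'))         # ['D', 'E']
print(naive_find_unique_lca(PARENTS, 'G', 'H'))  # A
```

This walks whole ancestries, so it is only practical on tiny graphs; the point is to make the definitions of common and least common ancestors concrete.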
	51
2988.1.3 by Robert Collins Add a new repositoy method _generate_text_key_index for use by reconcile/check.	52	class DictParentsProvider(object):
3172.1.2 by Robert Collins Parent Providers should now implement ``get_parent_map`` returning a	53	"""A parents provider for Graph objects."""
2988.1.3 by Robert Collins Add a new repositoy method _generate_text_key_index for use by reconcile/check.	54
	55	def __init__(self, ancestry):
	56	self.ancestry = ancestry
	57
	58	def __repr__(self):
	59	return 'DictParentsProvider(%r)' % self.ancestry
	60
6015.24.8 by John Arbash Meinel Some cleanups suggested by Vincent.	61	# Note: DictParentsProvider does not implement get_cached_parent_map.
	62	# Arguably, the data is clearly cached in memory. However, this class
	63	# is mostly used for testing, and it keeps the tests clean to not
	64	# change it.
	65
3099.3.1 by John Arbash Meinel Implement get_parent_map for ParentProviders	66	def get_parent_map(self, keys):
4379.3.3 by Gary van der Merwe Rename and add doc string for StackedParentsProvider.	67	"""See StackedParentsProvider.get_parent_map"""
3099.3.1 by John Arbash Meinel Implement get_parent_map for ParentProviders	68	ancestry = self.ancestry
6015.23.10 by John Arbash Meinel Small tweaks to search performance, though still at depth=100 the primary time	69	return dict([(k, ancestry[k]) for k in keys if k in ancestry])
3099.3.1 by John Arbash Meinel Implement get_parent_map for ParentProviders	70
2490.2.5 by Aaron Bentley Use GraphWalker.unique_ancestor to determine merge base	71
4379.3.3 by Gary van der Merwe Rename and add doc string for StackedParentsProvider.	72	class StackedParentsProvider(object):
	73	"""A parents provider which stacks (or unions) multiple providers.
6015.24.3 by John Arbash Meinel Implement get_cached_parent_map	74
4379.3.3 by Gary van der Merwe Rename and add doc string for StackedParentsProvider.	75	The providers are queried in the order of the provided parent_providers.
	76	"""
6015.24.3 by John Arbash Meinel Implement get_cached_parent_map	77
2490.2.13 by Aaron Bentley Update distinct -> lowest, refactor, add ParentsProvider concept	78	def __init__(self, parent_providers):
	79	self._parent_providers = parent_providers
	80
2490.2.28 by Aaron Bentley Fix handling of null revision	81	def __repr__(self):
4379.3.4 by Gary van der Merwe Make StackedParentsProvider.__repr__ more dynamic.	82	return "%s(%r)" % (self.__class__.__name__, self._parent_providers)
2490.2.28 by Aaron Bentley Fix handling of null revision	83
3099.3.1 by John Arbash Meinel Implement get_parent_map for ParentProviders	84	def get_parent_map(self, keys):
	85	"""Get a mapping of keys => parents
	86
	87	A dictionary is returned with an entry for each key present in this
	88	source. If this source doesn't have information about a key, it should
	89	not include an entry.
	90
	91	[NULL_REVISION] is used as the parent of the first user-committed
	92	revision. Its parent list is empty.
	93
	94	:param keys: An iterable returning keys to check (eg revision_ids)
	95	:return: A dictionary mapping each key to its parents
	96	"""
2490.2.13 by Aaron Bentley Update distinct -> lowest, refactor, add ParentsProvider concept	97	found = {}
3099.3.1 by John Arbash Meinel Implement get_parent_map for ParentProviders	98	remaining = set(keys)
6015.24.9 by John Arbash Meinel Discuss why the extra getattr() calls should be fine.	99	# This adds getattr() overhead to each get_parent_map call. However,
	100	# this is StackedParentsProvider, which means we're dealing with I/O
	101	# (either local indexes, or remote RPCs), so CPU overhead should be
	102	# minimal.
6015.24.3 by John Arbash Meinel Implement get_cached_parent_map	103	for parents_provider in self._parent_providers:
	104	get_cached = getattr(parents_provider, 'get_cached_parent_map',
	105	None)
	106	if get_cached is None:
	107	continue
	108	new_found = get_cached(remaining)
	109	found.update(new_found)
	110	remaining.difference_update(new_found)
	111	if not remaining:
	112	break
	113	if not remaining:
	114	return found
2490.2.13 by Aaron Bentley Update distinct -> lowest, refactor, add ParentsProvider concept	115	for parents_provider in self._parent_providers:
3099.3.1 by John Arbash Meinel Implement get_parent_map for ParentProviders	116	new_found = parents_provider.get_parent_map(remaining)
2490.2.13 by Aaron Bentley Update distinct -> lowest, refactor, add ParentsProvider concept	117	found.update(new_found)
3099.3.1 by John Arbash Meinel Implement get_parent_map for ParentProviders	118	remaining.difference_update(new_found)
	119	if not remaining:
2490.2.13 by Aaron Bentley Update distinct -> lowest, refactor, add ParentsProvider concept	120	break
3099.3.1 by John Arbash Meinel Implement get_parent_map for ParentProviders	121	return found
	122
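The two-pass lookup in StackedParentsProvider.get_parent_map can be sketched outside bzrlib as follows. This is a simplified standalone illustration under stated assumptions: `DictProvider` and `stacked_get_parent_map` are hypothetical names, and the providers are plain in-memory dictionaries rather than repository indexes or remote calls.

```python
class DictProvider:
    """Hypothetical stand-in for a dict-backed parents provider."""

    def __init__(self, ancestry):
        self.ancestry = ancestry

    def get_parent_map(self, keys):
        # Keys this provider does not know about are simply omitted.
        return {k: self.ancestry[k] for k in keys if k in self.ancestry}


def stacked_get_parent_map(providers, keys):
    """Ask each provider in order, stopping once every key is resolved."""
    found = {}
    remaining = set(keys)
    # Pass 1: consult only providers that expose an in-memory cache.
    for provider in providers:
        get_cached = getattr(provider, 'get_cached_parent_map', None)
        if get_cached is None:
            continue
        new_found = get_cached(remaining)
        found.update(new_found)
        remaining.difference_update(new_found)
        if not remaining:
            return found
    # Pass 2: fall back to the (possibly expensive) full lookup.
    for provider in providers:
        if not remaining:
            break
        new_found = provider.get_parent_map(remaining)
        found.update(new_found)
        remaining.difference_update(new_found)
    return found


local = DictProvider({'rev2': ('rev1',)})
fallback = DictProvider({'rev1': (), 'rev2': ('other',)})
result = stacked_get_parent_map([local, fallback], ['rev1', 'rev2', 'ghost'])
print(result)
```

Because a key is removed from `remaining` as soon as one provider answers, an earlier provider's answer for `rev2` wins over the fallback's, and the unknown key `ghost` never appears in the result.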
	123
	124	class CachingParentsProvider(object):
3835.1.13 by Aaron Bentley Update documentation	125	"""A parents provider which will cache the revision => parents as a dict.
	126
	127	This is useful for providers which have an expensive look up.
	128
	129	Either a ParentsProvider or a get_parent_map-like callback may be
	130	supplied. If it provides extra un-asked-for parents, they will be cached,
	131	but filtered out of get_parent_map.
3835.1.16 by Aaron Bentley Updates from review	132
3835.1.16 by Aaron Bentley Updates from review	133	The cache is enabled by default, but may be disabled and re-enabled.
3835.1.10 by Aaron Bentley Move CachingExtraParentsProvider to Graph	134	"""
3896.1.1 by Andrew Bennetts Remove broken debugging cruft, and some unused imports.	135	def __init__(self, parent_provider=None, get_parent_map=None):
3835.1.13 by Aaron Bentley Update documentation	136	"""Constructor.
	137
	138	:param parent_provider: The ParentProvider to use. It or
	139	get_parent_map must be supplied.
	140	:param get_parent_map: The get_parent_map callback to use. It or
	141	parent_provider must be supplied.
	142	"""
3835.1.12 by Aaron Bentley Unify CachingExtraParentsProvider and CachingParentsProvider.	143	self._real_provider = parent_provider
	144	if get_parent_map is None:
	145	self._get_parent_map = self._real_provider.get_parent_map
	146	else:
	147	self._get_parent_map = get_parent_map
4190.1.1 by Robert Collins Negatively cache misses during read-locks in RemoteRepository.	148	self._cache = None
	149	self.enable_cache(True)
3835.1.10 by Aaron Bentley Move CachingExtraParentsProvider to Graph	150
3835.1.12 by Aaron Bentley Unify CachingExtraParentsProvider and CachingParentsProvider.	151	def __repr__(self):
	152	return "%s(%r)" % (self.__class__.__name__, self._real_provider)
	153
3835.1.15 by Aaron Bentley Allow miss caching to be disabled.	154	def enable_cache(self, cache_misses=True):
3835.1.10 by Aaron Bentley Move CachingExtraParentsProvider to Graph	155	"""Enable cache."""
3835.1.19 by Aaron Bentley Raise exception when caching is enabled twice.	156	if self._cache is not None:
3835.1.20 by Aaron Bentley Change custom error to an AssertionError.	157	raise AssertionError('Cache enabled when already enabled.')
3835.1.16 by Aaron Bentley Updates from review	158	self._cache = {}
3835.1.15 by Aaron Bentley Allow miss caching to be disabled.	159	self._cache_misses = cache_misses
4190.1.4 by Robert Collins Cache ghosts when we can get them from a RemoteRepository in get_parent_map.	160	self.missing_keys = set()
3835.1.10 by Aaron Bentley Move CachingExtraParentsProvider to Graph	161
	162	def disable_cache(self):
3835.1.16 by Aaron Bentley Updates from review	163	"""Disable and clear the cache."""
3835.1.16 by Aaron Bentley Updates from review	164	self._cache = None
4190.1.1 by Robert Collins Negatively cache misses during read-locks in RemoteRepository.	165	self._cache_misses = None
4190.1.4 by Robert Collins Cache ghosts when we can get them from a RemoteRepository in get_parent_map.	166	self.missing_keys = set()
3835.1.10 by Aaron Bentley Move CachingExtraParentsProvider to Graph	167
	168	def get_cached_map(self):
	169	"""Return any cached get_parent_map values."""
3835.1.16 by Aaron Bentley Updates from review	170	if self._cache is None:
3835.1.12 by Aaron Bentley Unify CachingExtraParentsProvider and CachingParentsProvider.	171	return None
4190.1.1 by Robert Collins Negatively cache misses during read-locks in RemoteRepository.	172	return dict(self._cache)
3835.1.10 by Aaron Bentley Move CachingExtraParentsProvider to Graph	173
6015.24.3 by John Arbash Meinel Implement get_cached_parent_map	174	def get_cached_parent_map(self, keys):
6015.24.7 by John Arbash Meinel Update some comments and docstrings.	175	"""Return items from the cache.
	176
	177	This returns the same info as get_parent_map, but explicitly does not
	178	invoke the supplied ParentsProvider to search for uncached values.
	179	"""
6015.24.3 by John Arbash Meinel Implement get_cached_parent_map	180	cache = self._cache
	181	if cache is None:
	182	return {}
	183	return dict([(key, cache[key]) for key in keys if key in cache])
	184
3835.1.10 by Aaron Bentley Move CachingExtraParentsProvider to Graph	185	def get_parent_map(self, keys):
4379.3.3 by Gary van der Merwe Rename and add doc string for StackedParentsProvider.	186	"""See StackedParentsProvider.get_parent_map."""
4190.1.1 by Robert Collins Negatively cache misses during read-locks in RemoteRepository.	187	cache = self._cache
	188	if cache is None:
	189	cache = self._get_parent_map(keys)
3835.1.10 by Aaron Bentley Move CachingExtraParentsProvider to Graph	190	else:
4190.1.1 by Robert Collins Negatively cache misses during read-locks in RemoteRepository.	191	needed_revisions = set(key for key in keys if key not in cache)
	192	# Do not ask for negatively cached keys
4190.1.4 by Robert Collins Cache ghosts when we can get them from a RemoteRepository in get_parent_map.	193	needed_revisions.difference_update(self.missing_keys)
4190.1.1 by Robert Collins Negatively cache misses during read-locks in RemoteRepository.	194	if needed_revisions:
	195	parent_map = self._get_parent_map(needed_revisions)
	196	cache.update(parent_map)
	197	if self._cache_misses:
	198	for key in needed_revisions:
	199	if key not in parent_map:
4190.1.4 by Robert Collins Cache ghosts when we can get them from a RemoteRepository in get_parent_map.	200	self.note_missing_key(key)
4190.1.1 by Robert Collins Negatively cache misses during read-locks in RemoteRepository.	201	result = {}
	202	for key in keys:
	203	value = cache.get(key)
	204	if value is not None:
	205	result[key] = value
	206	return result
3835.1.10 by Aaron Bentley Move CachingExtraParentsProvider to Graph	207
4190.1.4 by Robert Collins Cache ghosts when we can get them from a RemoteRepository in get_parent_map.	208	def note_missing_key(self, key):
	209	"""Note that key is a missing key."""
	210	if self._cache_misses:
	211	self.missing_keys.add(key)
	212
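The effect of negatively caching misses can be seen in a reduced standalone sketch. It assumes a hypothetical `CountingBackend` that records every request it receives and a hypothetical `MissCachingProvider` that keeps only the cache and the missing-key set; enabling and disabling the cache is left out.

```python
class CountingBackend:
    """Hypothetical backend that records each batch of keys it is asked for."""

    def __init__(self, ancestry):
        self.ancestry = ancestry
        self.requests = []

    def get_parent_map(self, keys):
        self.requests.append(sorted(keys))
        return {k: self.ancestry[k] for k in keys if k in self.ancestry}


class MissCachingProvider:
    """Cache hits and remember misses so neither is requested twice."""

    def __init__(self, backend):
        self._get_parent_map = backend.get_parent_map
        self._cache = {}
        self.missing_keys = set()

    def get_parent_map(self, keys):
        keys = list(keys)
        needed = {k for k in keys if k not in self._cache}
        # Do not ask again for keys already known to be missing.
        needed -= self.missing_keys
        if needed:
            parent_map = self._get_parent_map(needed)
            self._cache.update(parent_map)
            self.missing_keys.update(needed.difference(parent_map))
        return {k: self._cache[k] for k in keys if k in self._cache}


backend = CountingBackend({'rev1': ()})
provider = MissCachingProvider(backend)
first = provider.get_parent_map(['rev1', 'ghost'])
second = provider.get_parent_map(['rev1', 'ghost'])
print(first, second, backend.requests)
```

The second call is answered entirely from memory: `rev1` comes from the cache and `ghost` is skipped because it was recorded as missing, so the backend sees a single request.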
3835.1.10 by Aaron Bentley Move CachingExtraParentsProvider to Graph	213
5816.8.1 by Andrew Bennetts Be a little more clever about constructing a parents provider for stacked repositories, so that get_parent_map with local-stacked-on-remote doesn't use HPSS VFS calls.	214	class CallableToParentsProviderAdapter(object):
	215	"""A parents provider that adapts any callable to the parents provider API.
	216
	217	i.e. it accepts calls to self.get_parent_map and relays them to the
	218	callable it was constructed with.
	219	"""
	220
	221	def __init__(self, a_callable):
	222	self.callable = a_callable
	223
	224	def __repr__(self):
	225	return "%s(%r)" % (self.__class__.__name__, self.callable)
	226
	227	def get_parent_map(self, keys):
	228	return self.callable(keys)
	229
	230
2490.2.22 by Aaron Bentley Rename GraphWalker -> Graph, _AncestryWalker -> _BreadthFirstSearcher	231	class Graph(object):
2490.2.10 by Aaron Bentley Clarify text, remove unused _get_ancestry method	232	"""Provide incremental access to revision graphs.
	233
	234	This is the generic implementation; it is intended to be subclassed to
	235	specialize it for other repository types.
	236	"""
2490.2.1 by Aaron Bentley Start work on GraphWalker	237
2490.2.13 by Aaron Bentley Update distinct -> lowest, refactor, add ParentsProvider concept	238	def __init__(self, parents_provider):
2490.2.22 by Aaron Bentley Rename GraphWalker -> Graph, _AncestryWalker -> _BreadthFirstSearcher	239	"""Construct a Graph that uses several graphs as its input
2490.2.10 by Aaron Bentley Clarify text, remove unused _get_ancestry method	240
	241	This should not normally be invoked directly, because there may be
	242	specialized implementations for particular repository types. See
3172.1.2 by Robert Collins Parent Providers should now implement ``get_parent_map`` returning a	243	Repository.get_graph().
2490.2.10 by Aaron Bentley Clarify text, remove unused _get_ancestry method	244
3172.1.2 by Robert Collins Parent Providers should now implement ``get_parent_map`` returning a	245	:param parents_provider: An object providing a get_parent_map call
	246	conforming to the behavior of
	247	StackedParentsProvider.get_parent_map.
2490.2.13 by Aaron Bentley Update distinct -> lowest, refactor, add ParentsProvider concept	248	"""
3172.1.2 by Robert Collins Parent Providers should now implement ``get_parent_map`` returning a	249	if getattr(parents_provider, 'get_parents', None) is not None:
	250	self.get_parents = parents_provider.get_parents
	251	if getattr(parents_provider, 'get_parent_map', None) is not None:
	252	self.get_parent_map = parents_provider.get_parent_map
2490.2.29 by Aaron Bentley Make parents provider private	253	self._parents_provider = parents_provider
2490.2.28 by Aaron Bentley Fix handling of null revision	254
2490.2.28 by Aaron Bentley Fix handling of null revision	255	def __repr__(self):
2490.2.29 by Aaron Bentley Make parents provider private	256	return 'Graph(%r)' % self._parents_provider
2490.2.13 by Aaron Bentley Update distinct -> lowest, refactor, add ParentsProvider concept	257
	258	def find_lca(self, *revisions):
	259	"""Determine the lowest common ancestors of the provided revisions
	260
	261	A lowest common ancestor is a common ancestor none of whose
	262	descendants are common ancestors. In graphs, unlike trees, there may
	263	be multiple lowest common ancestors.
2490.2.12 by Aaron Bentley Improve documentation	264
2490.2.12 by Aaron Bentley Improve documentation	265	This algorithm has two phases. Phase 1 identifies border ancestors,
2490.2.13 by Aaron Bentley Update distinct -> lowest, refactor, add ParentsProvider concept	266	and phase 2 filters border ancestors to determine lowest common
	267	ancestors.
2490.2.12 by Aaron Bentley Improve documentation	268
	269	In phase 1, border ancestors are identified, using a breadth-first
	270	search starting at the bottom of the graph. Searches are stopped
	271	whenever a node or one of its descendants is determined to be common.
	272
2490.2.13 by Aaron Bentley Update distinct -> lowest, refactor, add ParentsProvider concept	273	In phase 2, the border ancestors are filtered to find the least
2490.2.12 by Aaron Bentley Improve documentation	274	common ancestors. This is done by searching the ancestries of each
	275	border ancestor.
	276
2490.2.13 by Aaron Bentley Update distinct -> lowest, refactor, add ParentsProvider concept	277	Phase 2 is performed on the principle that a border ancestor that is
	278	not an ancestor of any other border ancestor is a least common
	279	ancestor.
2490.2.12 by Aaron Bentley Improve documentation	280
	281	Searches are stopped when they find a node that is determined to be a
	282	common ancestor of all border ancestors, because this shows that it
	283	cannot be a descendant of any border ancestor.
	284
5891.1.2 by Andrew Bennetts Fix a bunch of docstring formatting nits, making pydoctor a bit happier.	285	The scaling of this operation should be proportional to:
	286
2490.2.12 by Aaron Bentley Improve documentation	287	1. The number of uncommon ancestors
	288	2. The number of border ancestors
	289	3. The length of the shortest path between a border ancestor and an
	290	ancestor of all border ancestors.
2490.2.3 by Aaron Bentley Implement new merge base picker	291	"""
2490.2.23 by Aaron Bentley Adapt find_borders to produce a graph difference	292	border_common, common, sides = self._find_border_ancestors(revisions)
2776.3.1 by Robert Collins * Deprecated method ``find_previous_heads`` on	293	# We may have common ancestors that can be reached from each other.
	294	# - ask for the heads of them to filter it down to only ones that
	295	# cannot be reached from each other - phase 2.
	296	return self.heads(border_common)
2490.2.9 by Aaron Bentley Fix minimal common ancestor algorithm for non-minimal perhipheral ancestors	297
2490.2.23 by Aaron Bentley Adapt find_borders to produce a graph difference	298	def find_difference(self, left_revision, right_revision):
2490.2.25 by Aaron Bentley Update from review	299	"""Determine the graph difference between two revisions"""
3377.3.1 by John Arbash Meinel Bring in some of the changes from graph_update and graph_optimization	300	border, common, searchers = self._find_border_ancestors(
2490.2.23 by Aaron Bentley Adapt find_borders to produce a graph difference	301	[left_revision, right_revision])
3377.3.1 by John Arbash Meinel Bring in some of the changes from graph_update and graph_optimization	302	self._search_for_extra_common(common, searchers)
	303	left = searchers[0].seen
	304	right = searchers[1].seen
	305	return (left.difference(right), right.difference(left))
2490.2.23 by Aaron Bentley Adapt find_borders to produce a graph difference	306
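What find_difference returns can be stated as a brute-force computation over full ancestries. The sketch below is an assumption-laden illustration rather than the border-ancestor search used above: `ancestry` and `brute_force_difference` are hypothetical helpers, and the parent map is the terminology diagram from the top of the file.

```python
def ancestry(parent_map, tip):
    """Return tip and all of its ancestors found in parent_map."""
    seen = set()
    pending = [tip]
    while pending:
        node = pending.pop()
        if node in seen:
            continue
        seen.add(node)
        pending.extend(parent_map.get(node, ()))
    return seen


def brute_force_difference(parent_map, left, right):
    """Return (only in left's ancestry, only in right's ancestry)."""
    left_seen = ancestry(parent_map, left)
    right_seen = ancestry(parent_map, right)
    return left_seen - right_seen, right_seen - left_seen


diagram = {
    'A': (), 'B': ('A',), 'C': ('A',),
    'D': ('B',), 'E': ('C',), 'F': ('C',),
    'G': ('D', 'E'), 'H': ('D', 'E', 'F'),
}
left_only, right_only = brute_force_difference(diagram, 'G', 'H')
print(sorted(left_only), sorted(right_only))  # ['G'] ['F', 'H']
```

The real method avoids walking the shared history by stopping its searchers at common ancestors, which matters when the common ancestry is much larger than the difference.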
5365.6.1 by Aaron Bentley Implement find_descendants.	307	def find_descendants(self, old_key, new_key):
	308	"""Find descendants of old_key that are ancestors of new_key."""
	309	child_map = self.get_child_map(self._find_descendant_ancestors(
	310	old_key, new_key))
	311	graph = Graph(DictParentsProvider(child_map))
	312	searcher = graph._make_breadth_first_searcher([old_key])
	313	list(searcher)
	314	return searcher.seen
	315
	316	def _find_descendant_ancestors(self, old_key, new_key):
	317	"""Find ancestors of new_key that may be descendants of old_key."""
	318	stop = self._make_breadth_first_searcher([old_key])
	319	descendants = self._make_breadth_first_searcher([new_key])
	320	for revisions in descendants:
	321	old_stop = stop.seen.intersection(revisions)
	322	descendants.stop_searching_any(old_stop)
	323	seen_stop = descendants.find_seen_ancestors(stop.step())
	324	descendants.stop_searching_any(seen_stop)
	325	return descendants.seen.difference(stop.seen)
	326
	327	def get_child_map(self, keys):
	328	"""Get a mapping from parents to children of the specified keys.
	329
	330	This is simply the inversion of get_parent_map. Only supplied keys
	331	will be discovered as children.
	332	:return: a dict of key:child_list for keys.
	333	"""
	334	parent_map = self._parents_provider.get_parent_map(keys)
	335	parent_child = {}
	336	for child, parents in sorted(parent_map.items()):
	337	for parent in parents:
	338	parent_child.setdefault(parent, []).append(child)
	339	return parent_child
	340
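The inversion performed by get_child_map is easy to try on a plain dictionary. A minimal standalone sketch, assuming a hypothetical `invert_parent_map` helper that takes the parent map directly instead of fetching it from a parents provider:

```python
def invert_parent_map(parent_map):
    """Turn a child -> parents mapping into a parent -> children mapping."""
    child_map = {}
    # Sorting makes the order of each child list deterministic.
    for child, parents in sorted(parent_map.items()):
        for parent in parents:
            child_map.setdefault(parent, []).append(child)
    return child_map


parent_map = {'B': ('A',), 'C': ('A',), 'D': ('B', 'C')}
child_map = invert_parent_map(parent_map)
print(child_map)  # {'A': ['B', 'C'], 'B': ['D'], 'C': ['D']}
```

Note that `D` has no entry in the result: only keys that appear as someone's parent become keys of the child map, and only the supplied keys can show up as children.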
3445.1.4 by John Arbash Meinel Change the function to be called 'find_distance_to_null'	341	def find_distance_to_null(self, target_revision_id, known_revision_ids):
	342	"""Find the left-hand distance to the NULL_REVISION.
	343
	344	(This can also be considered the revno of a branch at
	345	target_revision_id.)
3445.1.1 by John Arbash Meinel Start working on a new Graph api to make finding revision numbers faster.	346
	347	:param target_revision_id: A revision_id which we would like to know
	348	the revno for.
	349	:param known_revision_ids: [(revision_id, revno)] A list of known
	350	(revision_id, revno) tuples. We'll use this to seed the search.
	351	"""
	352	# Map from revision_ids to a known value for their revno
	353	known_revnos = dict(known_revision_ids)
	354	cur_tip = target_revision_id
	355	num_steps = 0
	356	NULL_REVISION = revision.NULL_REVISION
3445.1.2 by John Arbash Meinel Handle when a known revision is an ancestor.	357	known_revnos[NULL_REVISION] = 0
3445.1.1 by John Arbash Meinel Start working on a new Graph api to make finding revision numbers faster.	358
3445.1.3 by John Arbash Meinel Search from all of the known revisions.	359	searching_known_tips = list(known_revnos.keys())
	360
	361	unknown_searched = {}
	362
3445.1.2 by John Arbash Meinel Handle when a known revision is an ancestor.	363	while cur_tip not in known_revnos:
3445.1.3 by John Arbash Meinel Search from all of the known revisions.	364	unknown_searched[cur_tip] = num_steps
	365	num_steps += 1
3445.1.2 by John Arbash Meinel Handle when a known revision is an ancestor.	366	to_search = set([cur_tip])
3445.1.3 by John Arbash Meinel Search from all of the known revisions.	367	to_search.update(searching_known_tips)
3445.1.2 by John Arbash Meinel Handle when a known revision is an ancestor.	368	parent_map = self.get_parent_map(to_search)
3445.1.1 by John Arbash Meinel Start working on a new Graph api to make finding revision numbers faster.	369	parents = parent_map.get(cur_tip, None)
3445.1.8 by John Arbash Meinel Clarity tweaks recommended by Ian	370	if not parents: # An empty list or None is a ghost
3445.1.1 by John Arbash Meinel Start working on a new Graph api to make finding revision numbers faster.	371	raise errors.GhostRevisionsHaveNoRevno(target_revision_id,
	372	cur_tip)
	373	cur_tip = parents[0]
3445.1.3 by John Arbash Meinel Search from all of the known revisions.	374	next_known_tips = []
	375	for revision_id in searching_known_tips:
	376	parents = parent_map.get(revision_id, None)
	377	if not parents:
	378	continue
	379	next = parents[0]
	380	next_revno = known_revnos[revision_id] - 1
	381	if next in unknown_searched:
	382	# We have enough information to return a value right now
	383	return next_revno + unknown_searched[next]
	384	if next in known_revnos:
	385	continue
	386	known_revnos[next] = next_revno
	387	next_known_tips.append(next)
	388	searching_known_tips = next_known_tips
3445.1.1 by John Arbash Meinel Start working on a new Graph api to make finding revision numbers faster.	389
3445.1.2 by John Arbash Meinel Handle when a known revision is an ancestor.	390	# We reached a known revision, so just add in how many steps it took to
	391	# get there.
	392	return known_revnos[cur_tip] + num_steps
3445.1.1 by John Arbash Meinel Start working on a new Graph api to make finding revision numbers faster.	393
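The core of find_distance_to_null is a walk along left-hand (first) parents until a revision with a known revno is reached. The sketch below shows only that walk; it leaves out the concurrent backward walk from the known tips. `NULL`, `lefthand_distance` and the use of `ValueError` for ghosts are assumptions made for the illustration, standing in for revision.NULL_REVISION and errors.GhostRevisionsHaveNoRevno.

```python
NULL = 'null:'  # stand-in for the null revision marker


def lefthand_distance(parent_map, target, known=()):
    """Count left-hand steps from target to a revision with a known revno."""
    known_revnos = dict(known)
    known_revnos[NULL] = 0
    tip = target
    steps = 0
    while tip not in known_revnos:
        parents = parent_map.get(tip)
        if not parents:  # absent or empty parent list: treat as a ghost
            raise ValueError('ghost revision has no revno: %r' % (tip,))
        tip = parents[0]  # follow only the first (left-hand) parent
        steps += 1
    return known_revnos[tip] + steps


history = {
    'rev1': (NULL,),
    'rev2': ('rev1',),
    'side1': ('rev1',),
    'rev3': ('rev2', 'side1'),
}
print(lefthand_distance(history, 'rev3'))                 # 3
print(lefthand_distance(history, 'rev3', [('rev2', 2)]))  # 3
```

Seeding the search with `('rev2', 2)` gives the same answer after one step instead of three, which is the reason the method accepts known revnos.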
4332.3.4 by Robert Collins Add a graph API for getting multiple distances to NULL at once.	394	def find_lefthand_distances(self, keys):
	395	"""Find the distance to null for all the keys in keys.
	396
	397	:param keys: keys to lookup.
	398	:return: A dict key->distance for all of keys.
	399	"""
	400	# Optimisable by concurrent searching, but a random spread should get
	401	# some sort of hit rate.
	402	result = {}
	403	known_revnos = []
4332.3.6 by Robert Collins Teach graph.find_lefthand_distances about ghosts.	404	ghosts = []
4332.3.4 by Robert Collins Add a graph API for getting multiple distances to NULL at once.	405	for key in keys:
4332.3.6 by Robert Collins Teach graph.find_lefthand_distances about ghosts.	406	try:
	407	known_revnos.append(
	408	(key, self.find_distance_to_null(key, known_revnos)))
	409	except errors.GhostRevisionsHaveNoRevno:
	410	ghosts.append(key)
	411	for key in ghosts:
	412	known_revnos.append((key, -1))
4332.3.4 by Robert Collins Add a graph API for getting multiple distances to NULL at once.	413	return dict(known_revnos)
	414
3377.3.21 by John Arbash Meinel Simple brute-force implementation of find_unique_ancestors	415	def find_unique_ancestors(self, unique_revision, common_revisions):
	416	"""Find the unique ancestors for a revision versus others.
	417
	418	This returns the ancestry of unique_revision, excluding all revisions
	419	in the ancestry of common_revisions. If unique_revision is itself in
	420	that ancestry, then the empty set will be returned.
	421
	422	:param unique_revision: The revision_id whose ancestry we are
	423	interested in.
5891.1.2 by Andrew Bennetts Fix a bunch of docstring formatting nits, making pydoctor a bit happier.	424	(XXX: Would this API be better if we allowed multiple revisions
	425	to be searched here?)
3377.3.21 by John Arbash Meinel Simple brute-force implementation of find_unique_ancestors	426	:param common_revisions: Revision_ids of ancestries to exclude.
	427	:return: A set of revisions in the ancestry of unique_revision
	428	"""
	429	if unique_revision in common_revisions:
	430	return set()
3377.3.23 by John Arbash Meinel Implement find_unique_ancestors using more explicit graph searching.	431
	432	# Algorithm description
	433	# 1) Walk backwards from the unique node and all common nodes.
	434	# 2) When a node is seen by both sides, stop searching it in the unique
	435	# walker, include it in the common walker.
	436	# 3) Stop searching when there are no nodes left for the unique walker.
	437	# At this point, you have a maximal set of unique nodes. Some of
	438	# them may actually be common, and you haven't reached them yet.
	439	# 4) Start new searchers for the unique nodes, seeded with the
	440	# information you have so far.
	441	# 5) Continue searching, stopping the common searches when the search
	442	# tip is an ancestor of all unique nodes.
3377.4.6 by John Arbash Meinel Lots of refactoring for find_unique_ancestors.	443	# 6) Aggregate together unique searchers when they are searching the
	444	# same tips. When all unique searchers are searching the same node,
	445	# stop them and move that node to a single 'all_unique_searcher'.
	446	# 7) The 'all_unique_searcher' represents the very 'tip' of searching.
	447	# Most of the time this produces very little important information.
	448	# So don't step it as quickly as the other searchers.
	449	# 8) Search is done when all common searchers have completed.
	450
	451	unique_searcher, common_searcher = self._find_initial_unique_nodes(
	452	[unique_revision], common_revisions)
	453
	454	unique_nodes = unique_searcher.seen.difference(common_searcher.seen)
	455	if not unique_nodes:
	456	return unique_nodes
	457
	458	(all_unique_searcher,
	459	unique_tip_searchers) = self._make_unique_searchers(unique_nodes,
	460	unique_searcher, common_searcher)
	461
	462	self._refine_unique_nodes(unique_searcher, all_unique_searcher,
	463	unique_tip_searchers, common_searcher)
	464	true_unique_nodes = unique_nodes.difference(common_searcher.seen)
3377.3.33 by John Arbash Meinel Add some logging with -Dgraph	465	if 'graph' in debug.debug_flags:
3377.4.8 by John Arbash Meinel Final tweaks from Ian	466	trace.mutter('Found %d truly unique nodes out of %d',
3377.4.6 by John Arbash Meinel Lots of refactoring for find_unique_ancestors.	467	len(true_unique_nodes), len(unique_nodes))
	468	return true_unique_nodes
	469
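The result that find_unique_ancestors is meant to produce can be expressed as a brute-force set difference, which is a convenient reference when reading the multi-searcher algorithm described above. This sketch is not that algorithm: `full_ancestry` and `brute_force_unique_ancestors` are hypothetical helpers that walk entire ancestries, and the graph is the terminology diagram from the top of the file.

```python
def full_ancestry(parent_map, tip):
    """Return tip and all of its ancestors found in parent_map."""
    seen = set()
    pending = [tip]
    while pending:
        node = pending.pop()
        if node in seen:
            continue
        seen.add(node)
        pending.extend(parent_map.get(node, ()))
    return seen


def brute_force_unique_ancestors(parent_map, unique_revision,
                                 common_revisions):
    """Ancestry of unique_revision minus the ancestry of common_revisions."""
    if unique_revision in common_revisions:
        return set()
    common = set()
    for rev in common_revisions:
        common |= full_ancestry(parent_map, rev)
    return full_ancestry(parent_map, unique_revision) - common


diagram = {
    'A': (), 'B': ('A',), 'C': ('A',),
    'D': ('B',), 'E': ('C',), 'F': ('C',),
    'G': ('D', 'E'), 'H': ('D', 'E', 'F'),
}
print(sorted(brute_force_unique_ancestors(diagram, 'H', ['G'])))  # ['F', 'H']
```

The cost of this version grows with the size of the whole history, whereas the searcher-based approach aims to touch little more than the unique nodes and the border around them.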
	470	def _find_initial_unique_nodes(self, unique_revisions, common_revisions):
	471	"""Steps 1-3 of find_unique_ancestors.
	472
	473	Find the maximal set of unique nodes. Some of these might actually
	474	still be common, but we are sure that there are no other unique nodes.
	475
	476	:return: (unique_searcher, common_searcher)
	477	"""
	478
	479	unique_searcher = self._make_breadth_first_searcher(unique_revisions)
	480	# we know that unique_revisions aren't in common_revisions, so skip
	481	# past them.
3377.3.27 by John Arbash Meinel some simple updates	482	unique_searcher.next()
3377.3.23 by John Arbash Meinel Implement find_unique_ancestors using more explicit graph searching.	483	common_searcher = self._make_breadth_first_searcher(common_revisions)
	484
3377.4.6 by John Arbash Meinel Lots of refactoring for find_unique_ancestors.	485	# As long as we are still finding unique nodes, keep searching
3377.3.27 by John Arbash Meinel some simple updates	486	while unique_searcher._next_query:
3377.3.23 by John Arbash Meinel Implement find_unique_ancestors using more explicit graph searching.	487	next_unique_nodes = set(unique_searcher.step())
	488	next_common_nodes = set(common_searcher.step())
	489
	490	# Check if either searcher encounters new nodes seen by the other
	491	# side.
	492	unique_are_common_nodes = next_unique_nodes.intersection(
	493	common_searcher.seen)
	494	unique_are_common_nodes.update(
	495	next_common_nodes.intersection(unique_searcher.seen))
	496	if unique_are_common_nodes:
3377.4.6 by John Arbash Meinel Lots of refactoring for find_unique_ancestors.	497	ancestors = unique_searcher.find_seen_ancestors(
	498	unique_are_common_nodes)
3377.4.5 by John Arbash Meinel Several updates. A bit more debug logging, only step the all_unique searcher 1/10th of the time.	499	# TODO: This is a bit overboard, we only really care about
	500	# the ancestors of the tips because the rest we
	501	# already know. This is correct but causes us to
3377.4.6 by John Arbash Meinel Lots of refactoring for find_unique_ancestors.	502	# search too much ancestry.
3377.3.29 by John Arbash Meinel Revert the _find_any_seen change.	503	ancestors.update(common_searcher.find_seen_ancestors(ancestors))
3377.3.23 by John Arbash Meinel Implement find_unique_ancestors using more explicit graph searching.	504	unique_searcher.stop_searching_any(ancestors)
	505	common_searcher.start_searching(ancestors)
	506
3377.4.6 by John Arbash Meinel Lots of refactoring for find_unique_ancestors.	507	return unique_searcher, common_searcher
	508
	509	def _make_unique_searchers(self, unique_nodes, unique_searcher,
	510	common_searcher):
	511	"""Create a searcher for all the unique search tips (step 4).
	512
	513	As a side effect, the common_searcher will stop searching any nodes
	514	that are ancestors of the unique searcher tips.
	515
	516	:return: (all_unique_searcher, unique_tip_searchers)
	517	"""
3377.3.23 by John Arbash Meinel Implement find_unique_ancestors using more explicit graph searching.	518	unique_tips = self._remove_simple_descendants(unique_nodes,
	519	self.get_parent_map(unique_nodes))
	520
3377.4.4 by John Arbash Meinel Restore the previous code, but bring in a couple changes. Including an update to have lsprof show where the time is spent.	521	if len(unique_tips) == 1:
3377.4.6 by John Arbash Meinel Lots of refactoring for find_unique_ancestors.	522	unique_tip_searchers = []
3377.4.4 by John Arbash Meinel Restore the previous code, but bring in a couple changes. Including an update to have lsprof show where the time is spent.	523	ancestor_all_unique = unique_searcher.find_seen_ancestors(unique_tips)
	524	else:
3377.4.6 by John Arbash Meinel Lots of refactoring for find_unique_ancestors.	525	unique_tip_searchers = []
3377.4.4 by John Arbash Meinel Restore the previous code, but bring in a couple changes. Including an update to have lsprof show where the time is spent.	526	for tip in unique_tips:
	527	revs_to_search = unique_searcher.find_seen_ancestors([tip])
	528	revs_to_search.update(
	529	common_searcher.find_seen_ancestors(revs_to_search))
	530	searcher = self._make_breadth_first_searcher(revs_to_search)
	531	# We don't care about the starting nodes.
	532	searcher._label = tip
	533	searcher.step()
3377.4.6 by John Arbash Meinel Lots of refactoring for find_unique_ancestors.	534	unique_tip_searchers.append(searcher)
3377.3.23 by John Arbash Meinel Implement find_unique_ancestors using more explicit graph searching.	535
3377.4.4 by John Arbash Meinel Restore the previous code, but bring in a couple changes. Including an update to have lsprof show where the time is spent.	536	ancestor_all_unique = None
3377.4.6 by John Arbash Meinel Lots of refactoring for find_unique_ancestors.	537	for searcher in unique_tip_searchers:
3377.4.4 by John Arbash Meinel Restore the previous code, but bring in a couple changes. Including an update to have lsprof show where the time is spent.	538	if ancestor_all_unique is None:
	539	ancestor_all_unique = set(searcher.seen)
	540	else:
	541	ancestor_all_unique = ancestor_all_unique.intersection(
	542	searcher.seen)
3377.3.33 by John Arbash Meinel Add some logging with -Dgraph	543	# Collapse all the common nodes into a single searcher
3377.4.6 by John Arbash Meinel Lots of refactoring for find_unique_ancestors.	544	all_unique_searcher = self._make_breadth_first_searcher(
	545	ancestor_all_unique)
3377.4.4 by John Arbash Meinel Restore the previous code, but bring in a couple changes. Including an update to have lsprof show where the time is spent.	546	if ancestor_all_unique:
3377.4.6 by John Arbash Meinel Lots of refactoring for find_unique_ancestors.	547	# We've seen these nodes in all the searchers, so we'll just go to
	548	# the next
3377.4.4 by John Arbash Meinel Restore the previous code, but bring in a couple changes. Including an update to have lsprof show where the time is spent.	549	all_unique_searcher.step()
	550
	551	# Stop any search tips that are already known as ancestors of the
	552	# unique nodes
	553	stopped_common = common_searcher.stop_searching_any(
	554	common_searcher.find_seen_ancestors(ancestor_all_unique))
	555
	556	total_stopped = 0
3377.4.6 by John Arbash Meinel Lots of refactoring for find_unique_ancestors.	557	for searcher in unique_tip_searchers:
3377.4.4 by John Arbash Meinel Restore the previous code, but bring in a couple changes. Including an update to have lsprof show where the time is spent.	558	total_stopped += len(searcher.stop_searching_any(
	559	searcher.find_seen_ancestors(ancestor_all_unique)))
3377.4.6 by John Arbash Meinel Lots of refactoring for find_unique_ancestors.	560	if 'graph' in debug.debug_flags:
3377.4.8 by John Arbash Meinel Final tweaks from Ian	561	trace.mutter('For %d unique nodes, created %d + 1 unique searchers'
	562	' (%d stopped search tips, %d common ancestors'
	563	' (%d stopped common))',
3377.4.6 by John Arbash Meinel Lots of refactoring for find_unique_ancestors.	564	len(unique_nodes), len(unique_tip_searchers),
	565	total_stopped, len(ancestor_all_unique),
	566	len(stopped_common))
	567	return all_unique_searcher, unique_tip_searchers
	568
	569	def _step_unique_and_common_searchers(self, common_searcher,
	570	unique_tip_searchers,
	571	unique_searcher):
3377.4.7 by John Arbash Meinel Small documentation and code wrapping cleanup	572	"""Step all the searchers"""
3377.4.6 by John Arbash Meinel Lots of refactoring for find_unique_ancestors.	573	newly_seen_common = set(common_searcher.step())
	574	newly_seen_unique = set()
	575	for searcher in unique_tip_searchers:
	576	next = set(searcher.step())
	577	next.update(unique_searcher.find_seen_ancestors(next))
	578	next.update(common_searcher.find_seen_ancestors(next))
	579	for alt_searcher in unique_tip_searchers:
	580	if alt_searcher is searcher:
	581	continue
	582	next.update(alt_searcher.find_seen_ancestors(next))
	583	searcher.start_searching(next)
	584	newly_seen_unique.update(next)
3377.4.8 by John Arbash Meinel Final tweaks from Ian	585	return newly_seen_common, newly_seen_unique
3377.4.6 by John Arbash Meinel Lots of refactoring for find_unique_ancestors.	586
	587	def _find_nodes_common_to_all_unique(self, unique_tip_searchers,
	588	all_unique_searcher,
	589	newly_seen_unique, step_all_unique):
	590	"""Find nodes that are common to all unique_tip_searchers.
	591
	592	If it is time, step the all_unique_searcher, and add its nodes to the
	593	result.
	594	"""
3377.4.8 by John Arbash Meinel Final tweaks from Ian	595	common_to_all_unique_nodes = newly_seen_unique.copy()
3377.4.6 by John Arbash Meinel Lots of refactoring for find_unique_ancestors.	596	for searcher in unique_tip_searchers:
3377.4.8 by John Arbash Meinel Final tweaks from Ian	597	common_to_all_unique_nodes.intersection_update(searcher.seen)
3377.4.8 by John Arbash Meinel Final tweaks from Ian	598	common_to_all_unique_nodes.intersection_update(
3377.4.6 by John Arbash Meinel Lots of refactoring for find_unique_ancestors.	599	all_unique_searcher.seen)
	600	# Step all-unique less frequently than the other searchers.
	601	# In the common case, we don't need to spider out far here, so
	602	# avoid doing extra work.
3377.4.8 by John Arbash Meinel Final tweaks from Ian	603	if step_all_unique:
3377.4.6 by John Arbash Meinel Lots of refactoring for find_unique_ancestors.	604	tstart = time.clock()
	605	nodes = all_unique_searcher.step()
3377.4.8 by John Arbash Meinel Final tweaks from Ian	606	common_to_all_unique_nodes.update(nodes)
3377.4.6 by John Arbash Meinel Lots of refactoring for find_unique_ancestors.	607	if 'graph' in debug.debug_flags:
3377.4.8 by John Arbash Meinel Final tweaks from Ian	608	tdelta = time.clock() - tstart
3377.4.6 by John Arbash Meinel Lots of refactoring for find_unique_ancestors.	609	trace.mutter('all_unique_searcher step() took %.3fs'
	610	' for %d nodes (%d total), iteration: %s',
	611	tdelta, len(nodes), len(all_unique_searcher.seen),
	612	all_unique_searcher._iterations)
3377.4.8 by John Arbash Meinel Final tweaks from Ian	613	return common_to_all_unique_nodes
3377.4.6 by John Arbash Meinel Lots of refactoring for find_unique_ancestors.	614
	615	def _collapse_unique_searchers(self, unique_tip_searchers,
3377.4.8 by John Arbash Meinel Final tweaks from Ian	616	common_to_all_unique_nodes):
3377.4.6 by John Arbash Meinel Lots of refactoring for find_unique_ancestors.	617	"""Combine searchers that are searching the same tips.
	618
	619	When two searchers are searching the same tips, we can stop one of the
	620	searchers. We also know that the maximal set of common ancestors is the
	621	intersection of the two original searchers.
	622
	623	:return: A list of searchers that are searching unique nodes.
	624	"""
	625	# Filter out searchers that don't actually search different
	626	# nodes. We already have the ancestry intersection for them
	627	unique_search_tips = {}
	628	for searcher in unique_tip_searchers:
3377.4.8 by John Arbash Meinel Final tweaks from Ian	629	stopped = searcher.stop_searching_any(common_to_all_unique_nodes)
3377.4.6 by John Arbash Meinel Lots of refactoring for find_unique_ancestors.	630	will_search_set = frozenset(searcher._next_query)
	631	if not will_search_set:
	632	if 'graph' in debug.debug_flags:
	633	trace.mutter('Unique searcher %s was stopped.'
	634	' (%s iterations) %d nodes stopped',
	635	searcher._label,
	636	searcher._iterations,
	637	len(stopped))
	638	elif will_search_set not in unique_search_tips:
	639	# This searcher is searching a unique set of nodes, let it
	640	unique_search_tips[will_search_set] = [searcher]
	641	else:
	642	unique_search_tips[will_search_set].append(searcher)
	643	# TODO: it might be possible to collapse searchers faster when they
	644	# only have some search tips in common.
	645	next_unique_searchers = []
	646	for searchers in unique_search_tips.itervalues():
	647	if len(searchers) == 1:
	648	# Searching unique tips, go for it
	649	next_unique_searchers.append(searchers[0])
	650	else:
	651	# These searchers have started searching the same tips, we
	652	# don't need them to cover the same ground. The
	653	# intersection of their ancestry won't change, so create a
	654	# new searcher, combining their histories.
	655	next_searcher = searchers[0]
	656	for searcher in searchers[1:]:
	657	next_searcher.seen.intersection_update(searcher.seen)
	658	if 'graph' in debug.debug_flags:
3377.4.8 by John Arbash Meinel Final tweaks from Ian	659	trace.mutter('Combining %d searchers into a single'
	660	' searcher searching %d nodes with'
	661	' %d ancestry',
3377.4.6 by John Arbash Meinel Lots of refactoring for find_unique_ancestors.	662	len(searchers),
	663	len(next_searcher._next_query),
	664	len(next_searcher.seen))
	665	next_unique_searchers.append(next_searcher)
	666	return next_unique_searchers
	667
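Editorial sketch, not part of bzrlib: the grouping idea in _collapse_unique_searchers can be shown on a toy searcher that only carries its pending tips and its seen set. ToySearcher and collapse are hypothetical names, and the sketch skips the stop_searching_any call that the real method makes first.

```python
from collections import OrderedDict


class ToySearcher(object):
    """Hypothetical stand-in for a searcher: pending tips plus seen nodes."""

    def __init__(self, label, next_query, seen):
        self.label = label
        self.next_query = set(next_query)
        self.seen = set(seen)


def collapse(searchers):
    """Keep one searcher per distinct set of pending tips."""
    groups = OrderedDict()
    for searcher in searchers:
        key = frozenset(searcher.next_query)
        if not key:
            # Nothing left to search, so this searcher is dropped.
            continue
        groups.setdefault(key, []).append(searcher)
    survivors = []
    for group in groups.values():
        survivor = group[0]
        for other in group[1:]:
            # Searchers walking the same tips share all future ancestry, so
            # only the intersection of what they have seen stays relevant.
            survivor.seen.intersection_update(other.seen)
        survivors.append(survivor)
    return survivors
```

Two searchers that are both about to visit the same tips are merged into the first one, whose seen set shrinks to the nodes both had visited.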
	668	def _refine_unique_nodes(self, unique_searcher, all_unique_searcher,
	669	unique_tip_searchers, common_searcher):
	670	"""Steps 5-8 of find_unique_ancestors.
3943.8.1 by Marius Kruger remove all trailing whitespace from bzr source	671
3377.4.6 by John Arbash Meinel Lots of refactoring for find_unique_ancestors.	672	This function returns when common_searcher has stopped searching for
	673	more nodes.
	674	"""
	675	# We step the ancestor_all_unique searcher only every
	676	# STEP_UNIQUE_SEARCHER_EVERY steps.
	677	step_all_unique_counter = 0
3377.3.23 by John Arbash Meinel Implement find_unique_ancestors using more explicit graph searching.	678	# While we still have common nodes to search
	679	while common_searcher._next_query:
3377.4.6 by John Arbash Meinel Lots of refactoring for find_unique_ancestors.	680	(newly_seen_common,
	681	newly_seen_unique) = self._step_unique_and_common_searchers(
	682	common_searcher, unique_tip_searchers, unique_searcher)
3377.3.23 by John Arbash Meinel Implement find_unique_ancestors using more explicit graph searching.	683	# These nodes are common ancestors of all unique nodes
3377.4.8 by John Arbash Meinel Final tweaks from Ian	684	common_to_all_unique_nodes = self._find_nodes_common_to_all_unique(
3377.4.6 by John Arbash Meinel Lots of refactoring for find_unique_ancestors.	685	unique_tip_searchers, all_unique_searcher, newly_seen_unique,
	686	step_all_unique_counter==0)
	687	step_all_unique_counter = ((step_all_unique_counter + 1)
	688	% STEP_UNIQUE_SEARCHER_EVERY)
3377.4.4 by John Arbash Meinel Restore the previous code, but bring in a couple changes. Including an update to have lsprof show where the time is spent.	689
	690	if newly_seen_common:
	691	# If a 'common' node is an ancestor of all unique searchers, we
	692	# can stop searching it.
	693	common_searcher.stop_searching_any(
	694	all_unique_searcher.seen.intersection(newly_seen_common))
3377.4.8 by John Arbash Meinel Final tweaks from Ian	695	if common_to_all_unique_nodes:
3377.4.8 by John Arbash Meinel Final tweaks from Ian	696	common_to_all_unique_nodes.update(
3377.4.7 by John Arbash Meinel Small documentation and code wrapping cleanup	697	common_searcher.find_seen_ancestors(
3377.4.8 by John Arbash Meinel Final tweaks from Ian	698	common_to_all_unique_nodes))
3377.4.4 by John Arbash Meinel Restore the previous code, but bring in a couple changes. Including an update to have lsprof show where the time is spent.	699	# The all_unique searcher can start searching the common nodes
	700	# but everyone else can stop.
3377.4.7 by John Arbash Meinel Small documentation and code wrapping cleanup	701	# This is the sort of thing where we would like to not have it
	702	# start_searching all of the nodes, but only mark all of them
	703	# as seen, and have it search only the actual tips. Otherwise
	704	# it is another get_parent_map() traversal for it to figure out
	705	# what we already should know.
3377.4.8 by John Arbash Meinel Final tweaks from Ian	706	all_unique_searcher.start_searching(common_to_all_unique_nodes)
3377.4.8 by John Arbash Meinel Final tweaks from Ian	707	common_searcher.stop_searching_any(common_to_all_unique_nodes)
3377.4.4 by John Arbash Meinel Restore the previous code, but bring in a couple changes. Including an update to have lsprof show where the time is spent.	708
3377.4.6 by John Arbash Meinel Lots of refactoring for find_unique_ancestors.	709	next_unique_searchers = self._collapse_unique_searchers(
3377.4.8 by John Arbash Meinel Final tweaks from Ian	710	unique_tip_searchers, common_to_all_unique_nodes)
3377.4.6 by John Arbash Meinel Lots of refactoring for find_unique_ancestors.	711	if len(unique_tip_searchers) != len(next_unique_searchers):
	712	if 'graph' in debug.debug_flags:
3377.4.8 by John Arbash Meinel Final tweaks from Ian	713	trace.mutter('Collapsed %d unique searchers => %d'
3377.4.6 by John Arbash Meinel Lots of refactoring for find_unique_ancestors.	714	' at %s iterations',
	715	len(unique_tip_searchers),
	716	len(next_unique_searchers),
	717	all_unique_searcher._iterations)
	718	unique_tip_searchers = next_unique_searchers
3377.3.21 by John Arbash Meinel Simple brute-force implementation of find_unique_ancestors	719
3172.1.2 by Robert Collins Parent Providers should now implement ``get_parent_map`` returning a	720	def get_parent_map(self, revisions):
	721	"""Get a map of key:parent_list for revisions.
	722
	723	This implementation delegates to get_parents, for old parent_providers
	724	that do not supply get_parent_map.
	725	"""
	726	result = {}
	727	for rev, parents in self.get_parents(revisions):
	728	if parents is not None:
	729	result[rev] = parents
	730	return result
	731
2490.2.22 by Aaron Bentley Rename GraphWalker -> Graph, _AncestryWalker -> _BreadthFirstSearcher	732	def _make_breadth_first_searcher(self, revisions):
	733	return _BreadthFirstSearcher(revisions, self)
	734
2490.2.10 by Aaron Bentley Clarify text, remove unused _get_ancestry method	735	def _find_border_ancestors(self, revisions):
2490.2.12 by Aaron Bentley Improve documentation	736	"""Find common ancestors with at least one uncommon descendant.
	737
	738	Border ancestors are identified using a breadth-first
	739	search starting at the bottom of the graph. Searches are stopped
	740	whenever a node or one of its descendants is determined to be common.
	741
	742	This will scale with the number of uncommon ancestors.
2490.2.25 by Aaron Bentley Update from review	743
	744	As well as the border ancestors, a set of seen common ancestors and a
	745	list of sets of seen ancestors for each input revision is returned.
	746	This allows calculation of graph difference from the results of this
	747	operation.
2490.2.12 by Aaron Bentley Improve documentation	748	"""
2490.2.28 by Aaron Bentley Fix handling of null revision	749	if None in revisions:
2490.2.28 by Aaron Bentley Fix handling of null revision	750	raise errors.InvalidRevisionId(None, self)
2490.2.19 by Aaron Bentley Implement common-ancestor-based culling	751	common_ancestors = set()
2490.2.22 by Aaron Bentley Rename GraphWalker -> Graph, _AncestryWalker -> _BreadthFirstSearcher	752	searchers = [self._make_breadth_first_searcher([r])
	753	for r in revisions]
	754	active_searchers = searchers[:]
2490.2.13 by Aaron Bentley Update distinct -> lowest, refactor, add ParentsProvider concept	755	border_ancestors = set()
2490.2.19 by Aaron Bentley Implement common-ancestor-based culling	756
2490.2.7 by Aaron Bentley Start implementing mca that scales with number of uncommon ancestors	757	while True:
	758	newly_seen = set()
3377.3.2 by John Arbash Meinel find_difference is fixed by updating _find_border_ancestors.... is that reasonable?	759	for searcher in searchers:
	760	new_ancestors = searcher.step()
	761	if new_ancestors:
	762	newly_seen.update(new_ancestors)
	763	new_common = set()
2490.2.7 by Aaron Bentley Start implementing mca that scales with number of uncommon ancestors	764	for revision in newly_seen:
2490.2.19 by Aaron Bentley Implement common-ancestor-based culling	765	if revision in common_ancestors:
3377.3.2 by John Arbash Meinel find_difference is fixed by updating _find_border_ancestors.... is that reasonable?	766	# Not a border ancestor because it was seen as common
	767	# already
	768	new_common.add(revision)
2490.2.19 by Aaron Bentley Implement common-ancestor-based culling	769	continue
2490.2.22 by Aaron Bentley Rename GraphWalker -> Graph, _AncestryWalker -> _BreadthFirstSearcher	770	for searcher in searchers:
	771	if revision not in searcher.seen:
2490.2.7 by Aaron Bentley Start implementing mca that scales with number of uncommon ancestors	772	break
	773	else:
3377.3.2 by John Arbash Meinel find_difference is fixed by updating _find_border_ancestors.... is that reasonable?	774	# This is a border because it is a first common that we see
	775	# after walking for a while.
2490.2.13 by Aaron Bentley Update distinct -> lowest, refactor, add ParentsProvider concept	776	border_ancestors.add(revision)
3377.3.2 by John Arbash Meinel find_difference is fixed by updating _find_border_ancestors.... is that reasonable?	777	new_common.add(revision)
	778	if new_common:
	779	for searcher in searchers:
	780	new_common.update(searcher.find_seen_ancestors(new_common))
	781	for searcher in searchers:
	782	searcher.start_searching(new_common)
	783	common_ancestors.update(new_common)
	784
	785	# Figure out what the searchers will be searching next, and if
	786	# there is only 1 set being searched, then we are done searching,
	787	# since all searchers would have to be searching the same data,
	788	# thus it must be in common.
	789	unique_search_sets = set()
	790	for searcher in searchers:
	791	will_search_set = frozenset(searcher._next_query)
	792	if will_search_set not in unique_search_sets:
	793	# This searcher is searching a unique set of nodes, let it
	794	unique_search_sets.add(will_search_set)
	795
	796	if len(unique_search_sets) == 1:
	797	nodes = unique_search_sets.pop()
	798	uncommon_nodes = nodes.difference(common_ancestors)
3376.2.14 by Martin Pool Remove recently-introduced assert statements	799	if uncommon_nodes:
	800	raise AssertionError("Somehow we ended up converging"
	801	" without actually marking them as"
	802	" in common."
	803	"\nStart_nodes: %s"
	804	"\nuncommon_nodes: %s"
	805	% (revisions, uncommon_nodes))
3377.3.2 by John Arbash Meinel find_difference is fixed by updating _find_border_ancestors.... is that reasonable?	806	break
	807	return border_ancestors, common_ancestors, searchers
2490.2.9 by Aaron Bentley Fix minimal common ancestor algorithm for non-minimal perhipheral ancestors	808
2776.3.1 by Robert Collins * Deprecated method ``find_previous_heads`` on	809	def heads(self, keys):
	810	"""Return the heads from amongst keys.
	811
	812	This is done by searching the ancestries of each key. Any key that is
	813	reachable from another key is not returned; all the others are.
	814
	815	This operation scales with the relative depth between any two keys. If
	816	any two keys are completely disconnected all ancestry of both sides
	817	will be retrieved.
	818
	819	:param keys: An iterable of keys.
2776.1.4 by Robert Collins Trivial review feedback changes.	820	:return: A set of the heads. Note that as a set there is no ordering
	821	information. Callers will need to filter their input to create
	822	order if they need it.
2490.2.12 by Aaron Bentley Improve documentation	823	"""
2776.1.4 by Robert Collins Trivial review feedback changes.	824	candidate_heads = set(keys)
3052.5.5 by John Arbash Meinel Special case Graph.heads() for NULL_REVISION rather than is_ancestor.	825	if revision.NULL_REVISION in candidate_heads:
	826	# NULL_REVISION is only a head if it is the only entry
	827	candidate_heads.remove(revision.NULL_REVISION)
	828	if not candidate_heads:
	829	return set([revision.NULL_REVISION])
2850.2.1 by Robert Collins (robertc) Special case the zero-or-no-heads case for Graph.heads(). (Robert Collins)	830	if len(candidate_heads) < 2:
	831	return candidate_heads
2490.2.22 by Aaron Bentley Rename GraphWalker -> Graph, _AncestryWalker -> _BreadthFirstSearcher	832	searchers = dict((c, self._make_breadth_first_searcher([c]))
2776.1.4 by Robert Collins Trivial review feedback changes.	833	for c in candidate_heads)
2490.2.22 by Aaron Bentley Rename GraphWalker -> Graph, _AncestryWalker -> _BreadthFirstSearcher	834	active_searchers = dict(searchers)
	835	# skip over the actual candidate for each searcher
	836	for searcher in active_searchers.itervalues():
1551.15.81 by Aaron Bentley Remove testing code	837	searcher.next()
2921.3.1 by Robert Collins * Graph ``heads()`` queries have been bugfixed to no longer access all	838	# The common walker finds nodes that are common to two or more of the
	839	# input keys, so that we don't access all history when a currently
	840	# uncommon search point actually meets up with something behind a
	841	# common search point. Common search points do not keep searches
	842	# active; they just allow us to make searches inactive without
	843	# accessing all history.
	844	common_walker = self._make_breadth_first_searcher([])
2490.2.22 by Aaron Bentley Rename GraphWalker -> Graph, _AncestryWalker -> _BreadthFirstSearcher	845	while len(active_searchers) > 0:
2921.3.1 by Robert Collins * Graph ``heads()`` queries have been bugfixed to no longer access all	846	ancestors = set()
	847	# advance searches
	848	try:
	849	common_walker.next()
	850	except StopIteration:
2921.3.4 by Robert Collins Review feedback.	851	# No common points being searched at this time.
2921.3.1 by Robert Collins * Graph ``heads()`` queries have been bugfixed to no longer access all	852	pass
1551.15.78 by Aaron Bentley Fix KeyError in filter_candidate_lca	853	for candidate in active_searchers.keys():
	854	try:
	855	searcher = active_searchers[candidate]
	856	except KeyError:
	857	# rare case: we deleted candidate in a previous iteration
	858	# through this for loop, because it was determined to be
	859	# a descendant of another candidate.
	860	continue
2490.2.9 by Aaron Bentley Fix minimal common ancestor algorithm for non-minimal perhipheral ancestors	861	try:
2921.3.1 by Robert Collins * Graph ``heads()`` queries have been bugfixed to no longer access all	862	ancestors.update(searcher.next())
2490.2.9 by Aaron Bentley Fix minimal common ancestor algorithm for non-minimal perhipheral ancestors	863	except StopIteration:
2490.2.22 by Aaron Bentley Rename GraphWalker -> Graph, _AncestryWalker -> _BreadthFirstSearcher	864	del active_searchers[candidate]
2490.2.9 by Aaron Bentley Fix minimal common ancestor algorithm for non-minimal perhipheral ancestors	865	continue
2921.3.1 by Robert Collins * Graph ``heads()`` queries have been bugfixed to no longer access all	866	# process found nodes
	867	new_common = set()
	868	for ancestor in ancestors:
	869	if ancestor in candidate_heads:
	870	candidate_heads.remove(ancestor)
	871	del searchers[ancestor]
	872	if ancestor in active_searchers:
	873	del active_searchers[ancestor]
	874	# it may meet up with a known common node
2921.3.4 by Robert Collins Review feedback.	875	if ancestor in common_walker.seen:
	876	# some searcher has encountered our known common nodes:
	877	# just stop it
	878	ancestor_set = set([ancestor])
	879	for searcher in searchers.itervalues():
	880	searcher.stop_searching_any(ancestor_set)
	881	else:
2921.3.1 by Robert Collins * Graph ``heads()`` queries have been bugfixed to no longer access all	882	# or it may have been just reached by all the searchers:
2490.2.22 by Aaron Bentley Rename GraphWalker -> Graph, _AncestryWalker -> _BreadthFirstSearcher	883	for searcher in searchers.itervalues():
	884	if ancestor not in searcher.seen:
2490.2.9 by Aaron Bentley Fix minimal common ancestor algorithm for non-minimal perhipheral ancestors	885	break
	886	else:
2921.3.4 by Robert Collins Review feedback.	887	# The final active searcher has just reached this node,
	888	# making it known to be an ancestor of all candidates,
	889	# so we can stop searching it, and any seen ancestors
	890	new_common.add(ancestor)
	891	for searcher in searchers.itervalues():
	892	seen_ancestors =\
3377.3.1 by John Arbash Meinel Bring in some of the changes from graph_update and graph_optimization	893	searcher.find_seen_ancestors([ancestor])
2921.3.4 by Robert Collins Review feedback.	894	searcher.stop_searching_any(seen_ancestors)
2921.3.1 by Robert Collins * Graph ``heads()`` queries have been bugfixed to no longer access all	895	common_walker.start_searching(new_common)
2776.1.4 by Robert Collins Trivial review feedback changes.	896	return candidate_heads
2490.2.13 by Aaron Bentley Update distinct -> lowest, refactor, add ParentsProvider concept	897
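Editorial sketch, not part of bzrlib: the contract of heads() ("any key that is reachable from another key is not returned") can be checked against a brute-force version over a plain dict. toy_heads and ancestors_of are hypothetical names, the parent map is assumed to be a dict of key to parent tuple, NULL_REVISION handling is left out, and unlike the real method this walks each candidate's whole ancestry instead of stopping early.

```python
def ancestors_of(parent_map, key):
    """Return every strict ancestor of key reachable through parent_map."""
    seen = set()
    pending = list(parent_map.get(key, ()))
    while pending:
        node = pending.pop()
        if node in seen:
            continue
        seen.add(node)
        pending.extend(parent_map.get(node, ()))
    return seen


def toy_heads(parent_map, keys):
    """Return the keys that are not an ancestor of any other key."""
    candidates = set(keys)
    reachable = set()
    for key in candidates:
        reachable.update(ancestors_of(parent_map, key))
    return candidates - reachable
```

Such a reference version is mainly useful for cross-checking the optimized search on small graphs.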
3514.2.8 by John Arbash Meinel The insertion ordering into the weave has an impact on conflicts.	898	def find_merge_order(self, tip_revision_id, lca_revision_ids):
	899	"""Find the order that each revision was merged into tip.
	900
	901	This basically just walks backwards with a stack, and walks left-first
	902	until it finds a node to stop.
	903	"""
	904	if len(lca_revision_ids) == 1:
	905	return list(lca_revision_ids)
	906	looking_for = set(lca_revision_ids)
	907	# TODO: Is there a way we could do this "faster" by batching up the
	908	# get_parent_map requests?
	909	# TODO: Should we also be culling the ancestry search right away? We
	910	# could add looking_for to the "stop" list, and walk their
	911	# ancestry in batched mode. The flip side is it might mean we walk a
	912	# lot of "stop" nodes, rather than only the minimum.
	913	# Then again, without it we may trace back into ancestry we could have
	914	# stopped early.
	915	stack = [tip_revision_id]
	916	found = []
	917	stop = set()
	918	while stack and looking_for:
	919	next = stack.pop()
	920	stop.add(next)
	921	if next in looking_for:
	922	found.append(next)
	923	looking_for.remove(next)
	924	if len(looking_for) == 1:
	925	found.append(looking_for.pop())
	926	break
	927	continue
	928	parent_ids = self.get_parent_map([next]).get(next, None)
	929	if not parent_ids: # Ghost, nothing to search here
	930	continue
	931	for parent_id in reversed(parent_ids):
	932	# TODO: (performance) We see the parent at this point, but we
	933	# wait to mark it until later to make sure we get left
	934	# parents before right parents. However, instead of
	935	# waiting until we have traversed enough parents, we
	936	# could instead note that we've found it, and once all
	937	# parents are in the stack, just reverse iterate the
	938	# stack for them.
	939	if parent_id not in stop:
	940	# this will need to be searched
	941	stack.append(parent_id)
	942	stop.add(parent_id)
	943	return found
	944
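Editorial sketch, not part of bzrlib: the left-first stack walk described in find_merge_order, restated over a plain dict so it can be run in isolation. toy_merge_order is a hypothetical name, and the parent map is assumed to map each key to a tuple of parents with the left-hand parent first.

```python
def toy_merge_order(parent_map, tip, lcas):
    """Return lcas in the order a left-first walk from tip reaches them."""
    if len(lcas) == 1:
        return list(lcas)
    looking_for = set(lcas)
    stack = [tip]
    found = []
    stop = set()
    while stack and looking_for:
        node = stack.pop()
        stop.add(node)
        if node in looking_for:
            found.append(node)
            looking_for.remove(node)
            if len(looking_for) == 1:
                # Only one candidate is left, so it must come last.
                found.append(looking_for.pop())
                break
            continue
        parents = parent_map.get(node)
        if not parents:
            # Ghost or root: nothing further to walk.
            continue
        # Push right-hand parents first so the left-hand parent is popped
        # (and therefore explored) first.
        for parent in reversed(parents):
            if parent not in stop:
                stack.append(parent)
            stop.add(parent)
    return found
```

Swapping the parent order of the tip swaps the result, which is the property the merge code relies on.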
5365.6.3 by Aaron Bentley Implement find_lefthand_merger.	945	def find_lefthand_merger(self, merged_key, tip_key):
	946	"""Find the first lefthand ancestor of tip_key that merged merged_key.
	947
	948	We do this by first finding the descendants of merged_key, then
	949	walking through the lefthand ancestry of tip_key until we find a key
	950	that doesn't descend from merged_key. Its child is the key that
	951	merged merged_key.
	952
	953	:return: The first lefthand ancestor of tip_key to merge merged_key.
	954	merged_key if it is a lefthand ancestor of tip_key.
	955	None if no ancestor of tip_key merged merged_key.
	956	"""
	957	descendants = self.find_descendants(merged_key, tip_key)
	958	candidate_iterator = self.iter_lefthand_ancestry(tip_key)
	959	last_candidate = None
	960	for candidate in candidate_iterator:
	961	if candidate not in descendants:
	962	return last_candidate
	963	last_candidate = candidate
	964
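Editorial sketch, not part of bzrlib: the same walk as find_lefthand_merger on a plain dict. toy_lefthand_merger, lefthand_ancestry and descends_from are hypothetical names. Roots are assumed to have an empty parent tuple, so the sketch returns the last candidate when the walk runs off the end of the graph; that final return is a choice of this sketch, not something the source does.

```python
def lefthand_ancestry(parent_map, tip):
    """Yield tip, then its first parent, and so on down the left-hand side."""
    key = tip
    while key is not None:
        yield key
        parents = parent_map.get(key, ())
        key = parents[0] if parents else None


def descends_from(parent_map, node, base):
    """True if base is node itself or one of node's ancestors."""
    pending = [node]
    seen = set()
    while pending:
        current = pending.pop()
        if current == base:
            return True
        if current in seen:
            continue
        seen.add(current)
        pending.extend(parent_map.get(current, ()))
    return False


def toy_lefthand_merger(parent_map, merged_key, tip_key):
    """Return the oldest left-hand ancestor of tip_key containing merged_key."""
    last = None
    for candidate in lefthand_ancestry(parent_map, tip_key):
        if not descends_from(parent_map, candidate, merged_key):
            # The previous candidate was the first to include merged_key.
            return last
        last = candidate
    return last
```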
1551.19.10 by Aaron Bentley Merge now warns when it encounters a criss-cross	965	def find_unique_lca(self, left_revision, right_revision,
	966	count_steps=False):
2490.2.13 by Aaron Bentley Update distinct -> lowest, refactor, add ParentsProvider concept	967	"""Find a unique LCA.
	968
	969	Find lowest common ancestors. If there is no unique common
	970	ancestor, find the lowest common ancestors of those ancestors.
	971
	972	Iteration stops when a unique lowest common ancestor is found.
	973	The graph origin is necessarily a unique lowest common ancestor.
2490.2.5 by Aaron Bentley Use GraphWalker.unique_ancestor to determine merge base	974
	975	Note that None is not an acceptable substitute for NULL_REVISION
2490.2.13 by Aaron Bentley Update distinct -> lowest, refactor, add ParentsProvider concept	976	in the input for this method.
1551.19.12 by Aaron Bentley Add documentation for the count_steps parameter of Graph.find_unique_lca	977
	978	:param count_steps: If True, the return value will be a tuple of
	979	(unique_lca, steps) where steps is the number of times that
	980	find_lca was run. If False, only unique_lca is returned.
2490.2.3 by Aaron Bentley Implement new merge base picker	981	"""
2490.2.3 by Aaron Bentley Implement new merge base picker	982	revisions = [left_revision, right_revision]
1551.19.10 by Aaron Bentley Merge now warns when it encounters a criss-cross	983	steps = 0
2490.2.3 by Aaron Bentley Implement new merge base picker	984	while True:
1551.19.10 by Aaron Bentley Merge now warns when it encounters a criss-cross	985	steps += 1
2490.2.13 by Aaron Bentley Update distinct -> lowest, refactor, add ParentsProvider concept	986	lca = self.find_lca(*revisions)
	987	if len(lca) == 1:
1551.19.10 by Aaron Bentley Merge now warns when it encounters a criss-cross	988	result = lca.pop()
	989	if count_steps:
	990	return result, steps
	991	else:
	992	return result
2520.4.104 by Aaron Bentley Avoid infinite loop when there is no unique lca	993	if len(lca) == 0:
	994	raise errors.NoCommonAncestor(left_revision, right_revision)
2490.2.13 by Aaron Bentley Update distinct -> lowest, refactor, add ParentsProvider concept	995	revisions = lca
2490.2.7 by Aaron Bentley Start implementing mca that scales with number of uncommon ancestors	996
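Editorial sketch, not part of bzrlib: the loop in find_unique_lca repeatedly feeds the current set of lowest common ancestors back into find_lca until one remains. The version below pairs that loop with a brute-force toy_find_lca over a plain dict; all three function names are hypothetical, and ValueError stands in for errors.NoCommonAncestor.

```python
def ancestry(parent_map, key):
    """Return key plus every ancestor of key."""
    seen = set()
    pending = [key]
    while pending:
        node = pending.pop()
        if node in seen:
            continue
        seen.add(node)
        pending.extend(parent_map.get(node, ()))
    return seen


def toy_find_lca(parent_map, *revisions):
    """Brute force: common ancestors that no other common ancestor descends from."""
    common = None
    for rev in revisions:
        anc = ancestry(parent_map, rev)
        common = anc if common is None else common & anc
    return set(c for c in common
               if not any(c in ancestry(parent_map, other)
                          for other in common if other != c))


def toy_find_unique_lca(parent_map, left, right):
    """Return (unique_lca, steps), recursing on the LCA set until it is unique."""
    revisions = [left, right]
    steps = 0
    while True:
        steps += 1
        lca = toy_find_lca(parent_map, *revisions)
        if len(lca) == 1:
            return lca.pop(), steps
        if not lca:
            raise ValueError('no common ancestor')
        revisions = lca
```

On a criss-cross history the first pass yields two LCAs, and a second pass over those two is needed to reach the unique one; a step count above one is how callers can detect that situation.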
3228.4.4 by John Arbash Meinel Change iter_ancestry to take a group instead of a single node,	997	def iter_ancestry(self, revision_ids):
3228.4.2 by John Arbash Meinel Add a Graph.iter_ancestry()	998	"""Iterate the ancestry of the given revisions.
3228.4.2 by John Arbash Meinel Add a Graph.iter_ancestry()	999
3228.4.4 by John Arbash Meinel Change iter_ancestry to take a group instead of a single node,	1000	:param revision_ids: Nodes to start the search
3228.4.2 by John Arbash Meinel Add a Graph.iter_ancestry()	1001	:return: Yield tuples mapping a revision_id to its parents for the
3228.4.2 by John Arbash Meinel Add a Graph.iter_ancestry()	1002	ancestry of revision_ids.
3228.4.10 by John Arbash Meinel Respond to abentley's review comments.	1003	Ghosts will be returned with None as their parents, and nodes
3228.4.4 by John Arbash Meinel Change iter_ancestry to take a group instead of a single node,	1004	with no parents will have NULL_REVISION as their only parent. (As
	1005	defined by get_parent_map.)
3228.4.10 by John Arbash Meinel Respond to abentley's review comments.	1006	There will also be a node for (NULL_REVISION, ())
3228.4.2 by John Arbash Meinel Add a Graph.iter_ancestry()	1007	"""
3228.4.4 by John Arbash Meinel Change iter_ancestry to take a group instead of a single node,	1008	pending = set(revision_ids)
3228.4.2 by John Arbash Meinel Add a Graph.iter_ancestry()	1009	processed = set()
	1010	while pending:
	1011	processed.update(pending)
	1012	next_map = self.get_parent_map(pending)
	1013	next_pending = set()
	1014	for item in next_map.iteritems():
	1015	yield item
	1016	next_pending.update(p for p in item[1] if p not in processed)
	1017	ghosts = pending.difference(next_map)
	1018	for ghost in ghosts:
3228.4.10 by John Arbash Meinel Respond to abentley's review comments.	1019	yield (ghost, None)
3228.4.2 by John Arbash Meinel Add a Graph.iter_ancestry()	1020	pending = next_pending
3228.4.2 by John Arbash Meinel Add a Graph.iter_ancestry()	1021
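# A minimal standalone sketch of the traversal in iter_ancestry() above. It
# does not import bzrlib: a plain dict (the hypothetical toy_map) stands in
# for get_parent_map(), and root nodes use an empty tuple rather than
# NULL_REVISION. It shows that a key missing from the map is reported as a
# ghost, with None as its parents.

```python
def iter_ancestry(parent_map, revision_ids):
    """Yield (revision_id, parents) pairs; ghosts get None as parents."""
    pending = set(revision_ids)
    processed = set()
    while pending:
        processed.update(pending)
        # Emulate get_parent_map(): keys that are not present are left out.
        next_map = dict((k, parent_map[k]) for k in pending if k in parent_map)
        next_pending = set()
        for key, parents in next_map.items():
            yield key, parents
            next_pending.update(p for p in parents if p not in processed)
        # Anything we asked about but got no answer for is a ghost.
        for ghost in pending.difference(next_map):
            yield ghost, None
        pending = next_pending


# Hypothetical history: 'c' merges 'b' and a ghost revision 'g'.
toy_map = {'a': (), 'b': ('a',), 'c': ('b', 'g')}
ancestry = dict(iter_ancestry(toy_map, ['c']))
```

# Each revision is yielded once, one breadth-first layer at a time; the order
# within a layer is not defined, hence the dict() above.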
5365.6.2 by Aaron Bentley Extract iter_lefthand_ancestry from Repository.iter_ancestry.	1022	def iter_lefthand_ancestry(self, start_key, stop_keys=None):
	1023	if stop_keys is None:
	1024	stop_keys = ()
	1025	next_key = start_key
	1026	def get_parents(key):
	1027	try:
	1028	return self._parents_provider.get_parent_map([key])[key]
	1029	except KeyError:
	1030	raise errors.RevisionNotPresent(next_key, self)
	1031	while True:
	1032	if next_key in stop_keys:
	1033	return
	1034	parents = get_parents(next_key)
	1035	yield next_key
	1036	if len(parents) == 0:
	1037	return
	1038	else:
	1039	next_key = parents[0]
	1040
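# A minimal standalone sketch of the left-hand (mainline) walk performed by
# iter_lefthand_ancestry() above. The dict toy_map is a hypothetical stand-in
# for the parents provider, and a plain KeyError takes the place of
# errors.RevisionNotPresent.

```python
def iter_lefthand_ancestry(parent_map, start_key, stop_keys=()):
    """Yield start_key, then follow first parents until a stop key or a root."""
    next_key = start_key
    while True:
        if next_key in stop_keys:
            return
        # KeyError here plays the role of errors.RevisionNotPresent.
        parents = parent_map[next_key]
        yield next_key
        if len(parents) == 0:
            return
        next_key = parents[0]


# Hypothetical history: 'c' merges 'x' into the mainline a -> b -> c.
toy_map = {'a': (), 'b': ('a',), 'x': ('a',), 'c': ('b', 'x')}
mainline = list(iter_lefthand_ancestry(toy_map, 'c'))
stopped = list(iter_lefthand_ancestry(toy_map, 'c', stop_keys=('a',)))
```

# Note that a stop key is itself excluded from the output, and the merged
# revision 'x' is never visited because only parents[0] is followed.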
2490.2.31 by Aaron Bentley Fix iter_topo_order to permit un-included parents	1041	def iter_topo_order(self, revisions):
2490.2.30 by Aaron Bentley Add functionality for tsorting graphs	1042	"""Iterate through the input revisions in topological order.
	1043
	1044	This sorting only ensures that parents come before their children.
	1045	An ancestor may sort after a descendant if the relationship is not
	1046	visible in the supplied list of revisions.
	1047	"""
4593.5.30 by John Arbash Meinel Move the topo_sort implementation into KnownGraph, rather than calling back to it.	1048	from bzrlib import tsort
3099.3.3 by John Arbash Meinel Deprecate get_parents() in favor of get_parent_map()	1049	sorter = tsort.TopoSorter(self.get_parent_map(revisions))
2490.2.34 by Aaron Bentley Update NEWS and change implementation to return an iterator	1050	return sorter.iter_topo_order()
2490.2.30 by Aaron Bentley Add functionality for tsorting graphs	1051
2653.2.1 by Aaron Bentley Implement Graph.is_ancestor	1052	def is_ancestor(self, candidate_ancestor, candidate_descendant):
2653.2.5 by Aaron Bentley Update to clarify algorithm	1053	"""Determine whether a revision is an ancestor of another.
2653.2.5 by Aaron Bentley Update to clarify algorithm	1054
2921.3.1 by Robert Collins * Graph ``heads()`` queries have been bugfixed to no longer access all	1055	We answer this using heads() as heads() has the logic to perform the
3078.2.6 by Ian Clatworthy fix efficiency of local commit detection as recommended by jameinel's review	1056	smallest number of parent lookups to determine the ancestral
2921.3.1 by Robert Collins * Graph ``heads()`` queries have been bugfixed to no longer access all	1057	relationship between N revisions.
2653.2.5 by Aaron Bentley Update to clarify algorithm	1058	"""
2921.3.1 by Robert Collins * Graph ``heads()`` queries have been bugfixed to no longer access all	1059	return set([candidate_descendant]) == self.heads(
	1060	[candidate_ancestor, candidate_descendant])
2653.2.1 by Aaron Bentley Implement Graph.is_ancestor	1061
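# A small standalone sketch of why the heads() comparison in is_ancestor()
# answers the question. The heads() below is a deliberately naive stand-in
# (an assumption for illustration, not the bzrlib algorithm, which avoids
# walking the full ancestry): a key is a head if no other key descends from it.

```python
def ancestors(parent_map, key):
    """Return all strict ancestors of key found in parent_map."""
    seen = set()
    pending = [key]
    while pending:
        for parent in parent_map.get(pending.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                pending.append(parent)
    return seen


def heads(parent_map, keys):
    """Naive heads: drop every key that is an ancestor of another key."""
    keys = set(keys)
    return set(k for k in keys
               if not any(k in ancestors(parent_map, other)
                          for other in keys if other != k))


def is_ancestor(parent_map, candidate_ancestor, candidate_descendant):
    # Same comparison as the method above.
    return set([candidate_descendant]) == heads(
        parent_map, [candidate_ancestor, candidate_descendant])


# Hypothetical history: 'b' and 'x' both descend from 'a'; 'c' merges them.
toy_map = {'a': (), 'b': ('a',), 'x': ('a',), 'c': ('b', 'x')}
```

# If the descendant is the only head, the candidate ancestor was culled as one
# of its ancestors. Unrelated siblings such as 'b' and 'x' are both heads, so
# the comparison is False in either direction.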
3921.3.5 by Marius Kruger extract graph.is_between from builtins.cmd_tags.run, and test it	1062	def is_between(self, revid, lower_bound_revid, upper_bound_revid):
	1063	"""Determine whether a revision is between two others.
	1064
	1065	Returns True if and only if:
	1066	lower_bound_revid <= revid <= upper_bound_revid
	1067	"""
	1068	return ((upper_bound_revid is None or
	1069	self.is_ancestor(revid, upper_bound_revid)) and
	1070	(lower_bound_revid is None or
	1071	self.is_ancestor(lower_bound_revid, revid)))
	1072
3377.3.1 by John Arbash Meinel Bring in some of the changes from graph_update and graph_optimization	1073	def _search_for_extra_common(self, common, searchers):
	1074	"""Make sure that unique nodes are genuinely unique.
	1075
3377.3.10 by John Arbash Meinel Tweak _BreadthFirstSearcher.find_seen_ancestors()	1076	After _find_border_ancestors, all nodes marked "common" are indeed
	1077	common. Some of the nodes considered unique are not, due to history
	1078	shortcuts stopping the searches early.
3377.3.1 by John Arbash Meinel Bring in some of the changes from graph_update and graph_optimization	1079
	1080	We know that we have searched enough when all common search tips are
	1081	descended from all unique (uncommon) nodes because we know that a node
	1082	cannot be an ancestor of its own ancestor.
	1083
	1084	:param common: A set of common nodes
	1085	:param searchers: The searchers returned from _find_border_ancestors
	1086	:return: None
	1087	"""
	1088	# Basic algorithm...
	1089	# A) The passed in searchers should all be on the same tips, thus
	1090	# they should be considered the "common" searchers.
	1091	# B) We find the difference between the searchers, these are the
	1092	# "unique" nodes for each side.
	1093	# C) We do a quick culling so that we only start searching from the
	1094	# more interesting unique nodes. (A unique ancestor is more
3377.3.10 by John Arbash Meinel Tweak _BreadthFirstSearcher.find_seen_ancestors()	1095	# interesting than any of its children.)
3377.3.1 by John Arbash Meinel Bring in some of the changes from graph_update and graph_optimization	1096	# D) We start searching for ancestors common to all unique nodes.
	1097	# E) We have the common searchers stop searching any ancestors of
	1098	# nodes found by (D)
	1099	# F) When there are no more common search tips, we stop
3377.3.10 by John Arbash Meinel Tweak _BreadthFirstSearcher.find_seen_ancestors()	1100
	1101	# TODO: We need a way to remove unique_searchers when they overlap with
	1102	# other unique searchers.
3376.2.14 by Martin Pool Remove recently-introduced assert statements	1103	if len(searchers) != 2:
	1104	raise NotImplementedError(
	1105	"Algorithm not yet implemented for > 2 searchers")
3377.3.1 by John Arbash Meinel Bring in some of the changes from graph_update and graph_optimization	1106	common_searchers = searchers
	1107	left_searcher = searchers[0]
	1108	right_searcher = searchers[1]
3377.3.15 by John Arbash Meinel minor update	1109	unique = left_searcher.seen.symmetric_difference(right_searcher.seen)
3377.3.17 by John Arbash Meinel Keep track of the intersection of unique ancestry,	1110	if not unique: # No unique nodes, nothing to do
	1111	return
3377.3.10 by John Arbash Meinel Tweak _BreadthFirstSearcher.find_seen_ancestors()	1112	total_unique = len(unique)
3377.3.1 by John Arbash Meinel Bring in some of the changes from graph_update and graph_optimization	1113	unique = self._remove_simple_descendants(unique,
	1114	self.get_parent_map(unique))
3377.3.10 by John Arbash Meinel Tweak _BreadthFirstSearcher.find_seen_ancestors()	1115	simple_unique = len(unique)
3377.3.14 by John Arbash Meinel Take another tack on _search_for_extra	1116
	1117	unique_searchers = []
	1118	for revision_id in unique:
3377.3.15 by John Arbash Meinel minor update	1119	if revision_id in left_searcher.seen:
3377.3.14 by John Arbash Meinel Take another tack on _search_for_extra	1120	parent_searcher = left_searcher
	1121	else:
	1122	parent_searcher = right_searcher
	1123	revs_to_search = parent_searcher.find_seen_ancestors([revision_id])
	1124	if not revs_to_search: # XXX: This shouldn't be possible
	1125	revs_to_search = [revision_id]
3377.3.15 by John Arbash Meinel minor update	1126	searcher = self._make_breadth_first_searcher(revs_to_search)
	1127	# We don't care about the starting nodes.
	1128	searcher.step()
	1129	unique_searchers.append(searcher)
3377.3.14 by John Arbash Meinel Take another tack on _search_for_extra	1130
3377.3.16 by John Arbash Meinel small cleanups	1131	# possible todo: aggregate the common searchers into a single common
	1132	# searcher, just make sure that we include the nodes into the .seen
	1133	# properties of the original searchers
3377.3.1 by John Arbash Meinel Bring in some of the changes from graph_update and graph_optimization	1134
3377.3.17 by John Arbash Meinel Keep track of the intersection of unique ancestry,	1135	ancestor_all_unique = None
	1136	for searcher in unique_searchers:
	1137	if ancestor_all_unique is None:
	1138	ancestor_all_unique = set(searcher.seen)
	1139	else:
	1140	ancestor_all_unique = ancestor_all_unique.intersection(
	1141	searcher.seen)
	1142
3377.3.23 by John Arbash Meinel Implement find_unique_ancestors using more explicit graph searching.	1143	trace.mutter('Started %s unique searchers for %s unique revisions',
	1144	simple_unique, total_unique)
3377.3.19 by John Arbash Meinel Start culling unique searchers once they converge.	1145
3377.3.1 by John Arbash Meinel Bring in some of the changes from graph_update and graph_optimization	1146	while True: # If we have no more nodes we have nothing to do
	1147	newly_seen_common = set()
	1148	for searcher in common_searchers:
	1149	newly_seen_common.update(searcher.step())
	1150	newly_seen_unique = set()
	1151	for searcher in unique_searchers:
	1152	newly_seen_unique.update(searcher.step())
	1153	new_common_unique = set()
	1154	for revision in newly_seen_unique:
	1155	for searcher in unique_searchers:
	1156	if revision not in searcher.seen:
	1157	break
	1158	else:
	1159	# This is a border because it is the first common node that we
	1160	# see after walking for a while.
	1161	new_common_unique.add(revision)
	1162	if newly_seen_common:
	1163	# These are nodes descended from one of the 'common' searchers.
	1164	# Make sure all searchers are on the same page
	1165	for searcher in common_searchers:
3377.3.16 by John Arbash Meinel small cleanups	1166	newly_seen_common.update(
3377.3.16 by John Arbash Meinel small cleanups	1167	searcher.find_seen_ancestors(newly_seen_common))
3377.3.14 by John Arbash Meinel Take another tack on _search_for_extra	1168	# We start searching the whole ancestry. It is a bit wasteful,
	1169	# though. We really just want to mark all of these nodes as
	1170	# 'seen' and then start just the tips. However, it requires a
	1171	# get_parent_map() call to figure out the tips anyway, and all
	1172	# redundant requests should be fairly fast.
3377.3.1 by John Arbash Meinel Bring in some of the changes from graph_update and graph_optimization	1173	for searcher in common_searchers:
	1174	searcher.start_searching(newly_seen_common)
3377.3.13 by John Arbash Meinel Change _search_for_extra_common slightly.	1175
3377.3.17 by John Arbash Meinel Keep track of the intersection of unique ancestry,	1176	# If a 'common' node is an ancestor of all unique searchers, we
3377.3.13 by John Arbash Meinel Change _search_for_extra_common slightly.	1177	# can stop searching it.
3377.3.17 by John Arbash Meinel Keep track of the intersection of unique ancestry,	1178	stop_searching_common = ancestor_all_unique.intersection(
	1179	newly_seen_common)
3377.3.13 by John Arbash Meinel Change _search_for_extra_common slightly.	1180	if stop_searching_common:
	1181	for searcher in common_searchers:
	1182	searcher.stop_searching_any(stop_searching_common)
3377.3.1 by John Arbash Meinel Bring in some of the changes from graph_update and graph_optimization	1183	if new_common_unique:
3377.3.20 by John Arbash Meinel comment cleanups.	1184	# We found some ancestors that are common
3377.3.10 by John Arbash Meinel Tweak _BreadthFirstSearcher.find_seen_ancestors()	1185	for searcher in unique_searchers:
3377.3.16 by John Arbash Meinel small cleanups	1186	new_common_unique.update(
3377.3.16 by John Arbash Meinel small cleanups	1187	searcher.find_seen_ancestors(new_common_unique))
3377.3.1 by John Arbash Meinel Bring in some of the changes from graph_update and graph_optimization	1188	# Since these are common, we can grab another set of ancestors
	1189	# that we have seen
	1190	for searcher in common_searchers:
3377.3.16 by John Arbash Meinel small cleanups	1191	new_common_unique.update(
3377.3.16 by John Arbash Meinel small cleanups	1192	searcher.find_seen_ancestors(new_common_unique))
3377.3.1 by John Arbash Meinel Bring in some of the changes from graph_update and graph_optimization	1193
	1194	# We can tell all of the unique searchers to start at these
	1195	# nodes, and tell all of the common searchers to stop
	1196	# searching these nodes
	1197	for searcher in unique_searchers:
	1198	searcher.start_searching(new_common_unique)
	1199	for searcher in common_searchers:
	1200	searcher.stop_searching_any(new_common_unique)
3377.3.17 by John Arbash Meinel Keep track of the intersection of unique ancestry,	1201	ancestor_all_unique.update(new_common_unique)
3377.3.19 by John Arbash Meinel Start culling unique searchers once they converge.	1202
3377.3.20 by John Arbash Meinel comment cleanups.	1203	# Filter out searchers that don't actually search different
3377.3.20 by John Arbash Meinel comment cleanups.	1204	# nodes. We already have the ancestry intersection for them
3377.3.19 by John Arbash Meinel Start culling unique searchers once they converge.	1205	next_unique_searchers = []
	1206	unique_search_sets = set()
	1207	for searcher in unique_searchers:
	1208	will_search_set = frozenset(searcher._next_query)
	1209	if will_search_set not in unique_search_sets:
	1210	# This searcher is searching a unique set of nodes, let it
	1211	unique_search_sets.add(will_search_set)
	1212	next_unique_searchers.append(searcher)
	1213	unique_searchers = next_unique_searchers
3377.3.2 by John Arbash Meinel find_difference is fixed by updating _find_border_ancestors.... is that reasonable?	1214	for searcher in common_searchers:
	1215	if searcher._next_query:
	1216	break
	1217	else:
	1218	# All common searchers have stopped searching
3377.3.16 by John Arbash Meinel small cleanups	1219	return
3377.3.1 by John Arbash Meinel Bring in some of the changes from graph_update and graph_optimization	1220
	1221	def _remove_simple_descendants(self, revisions, parent_map):
	1222	"""Remove revisions which are children of other ones in the set.
	1223
	1224	This doesn't do any graph searching, it just checks the immediate
	1225	parent_map to find if there are any children which can be removed.
	1226
	1227	:param revisions: A set of revision_ids
	1228	:return: A set of revision_ids with the children removed
	1229	"""
	1230	simple_ancestors = revisions.copy()
	1231	# TODO: jam 20071214 we could restrict it to searching only the
	1232	# parent_map of revisions already present in 'revisions', but
	1233	# considering the general use case, I think this is actually
	1234	# better.
	1235
	1236	# This is the same as the following loop. I don't know that it is any
	1237	# faster.
	1238	## simple_ancestors.difference_update(r for r, p_ids in parent_map.iteritems()
	1239	## if p_ids is not None and revisions.intersection(p_ids))
	1240	## return simple_ancestors
	1241
	1242	# Yet Another Way, invert the parent map (which can be cached)
	1243	## descendants = {}
	1244	## for revision_id, parent_ids in parent_map.iteritems():
	1245	## for p_id in parent_ids:
	1246	## descendants.setdefault(p_id, []).append(revision_id)
	1247	## for revision in revisions.intersection(descendants):
	1248	## simple_ancestors.difference_update(descendants[revision])
	1249	## return simple_ancestors
	1250	for revision, parent_ids in parent_map.iteritems():
	1251	if parent_ids is None:
	1252	continue
	1253	for parent_id in parent_ids:
	1254	if parent_id in revisions:
	1255	# This node has a parent present in the set, so we can
	1256	# remove it
	1257	simple_ancestors.discard(revision)
	1258	break
	1259	return simple_ancestors
	1260
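# A minimal standalone sketch of the culling done by
# _remove_simple_descendants() above, written as a free function over plain
# sets and dicts. The toy_map data is hypothetical. It illustrates that only
# immediate parents are consulted, so no graph walking happens.

```python
def remove_simple_descendants(revisions, parent_map):
    """Drop every revision that has an immediate parent in `revisions`."""
    simple_ancestors = set(revisions)
    for revision, parent_ids in parent_map.items():
        if parent_ids is None:
            # Ghosts have no known parents, so they cannot be culled.
            continue
        if any(parent_id in revisions for parent_id in parent_ids):
            simple_ancestors.discard(revision)
    return simple_ancestors


# Hypothetical linear history a -> b -> c, plus 'd' whose parent is outside.
toy_map = {'a': (), 'b': ('a',), 'c': ('b',), 'd': ('z',)}
kept = remove_simple_descendants(set(['a', 'b', 'c', 'd']), toy_map)
# 'c' descends from 'a', but only via 'b', which is not in the input set.
kept_indirect = remove_simple_descendants(set(['a', 'c']), toy_map)
```

# In the second call 'c' survives: the check is a cheap first pass, not a full
# ancestry test.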
2490.2.7 by Aaron Bentley Start implementing mca that scales with number of uncommon ancestors	1261
2911.4.1 by Robert Collins Factor out the Graph.heads() cache from _RevisionTextVersionCache for reuse, and use it in commit.	1262	class HeadsCache(object):
	1263	"""A cache of results for graph heads calls."""
	1264
	1265	def __init__(self, graph):
	1266	self.graph = graph
	1267	self._heads = {}
	1268
	1269	def heads(self, keys):
	1270	"""Return the heads of keys.
	1271
2911.4.3 by Robert Collins Make the contract of HeadsCache.heads() more clear.	1272	This matches the API of Graph.heads(), specifically the return value is
	1273	a set which can be mutated, and ordering of the input is not preserved
	1274	in the output.
	1275
2911.4.1 by Robert Collins Factor out the Graph.heads() cache from _RevisionTextVersionCache for reuse, and use it in commit.	1276	:see also: Graph.heads.
	1277	:param keys: The keys to calculate heads for.
	1278	:return: A set containing the heads, which may be mutated without
	1279	affecting future lookups.
	1280	"""
2911.4.2 by Robert Collins Make HeadsCache actually work.	1281	keys = frozenset(keys)
2911.4.1 by Robert Collins Factor out the Graph.heads() cache from _RevisionTextVersionCache for reuse, and use it in commit.	1282	try:
	1283	return set(self._heads[keys])
	1284	except KeyError:
	1285	heads = self.graph.heads(keys)
	1286	self._heads[keys] = heads
	1287	return set(heads)
	1288
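# A minimal standalone sketch of the HeadsCache contract described above. The
# class body mirrors the listing; CountingGraph is a hypothetical stand-in
# for Graph that simply treats every key as a head and counts its calls.

```python
class CountingGraph(object):
    """Hypothetical Graph substitute: every key is a head."""

    def __init__(self):
        self.calls = 0

    def heads(self, keys):
        self.calls += 1
        return set(keys)


class HeadsCache(object):
    """A cache of results for graph heads calls (as in the listing)."""

    def __init__(self, graph):
        self.graph = graph
        self._heads = {}

    def heads(self, keys):
        keys = frozenset(keys)
        try:
            return set(self._heads[keys])
        except KeyError:
            heads = self.graph.heads(keys)
            self._heads[keys] = heads
            return set(heads)


graph = CountingGraph()
cache = HeadsCache(graph)
first = cache.heads(['a', 'b'])
first.clear()                      # mutating the result is allowed
second = cache.heads(['b', 'a'])   # same key set, different input order
```

# Because the lookup key is a frozenset and each call returns a fresh copy,
# the second call is served from the cache and is unaffected by first.clear().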
	1289
3224.1.20 by John Arbash Meinel Reduce the number of cache misses by caching known heads answers	1290	class FrozenHeadsCache(object):
	1291	"""Cache heads() calls, assuming the caller won't modify them."""
	1292
	1293	def __init__(self, graph):
	1294	self.graph = graph
	1295	self._heads = {}
	1296
	1297	def heads(self, keys):
	1298	"""Return the heads of keys.
	1299
3224.1.24 by John Arbash Meinel Fix up docstring since FrozenHeadsCache doesn't let you mutate the result.	1300	Similar to Graph.heads(). The main difference is that the return value
	1301	is a frozen set which cannot be mutated.
3224.1.20 by John Arbash Meinel Reduce the number of cache misses by caching known heads answers	1302
	1303	:see also: Graph.heads.
	1304	:param keys: The keys to calculate heads for.
3224.1.24 by John Arbash Meinel Fix up docstring since FrozenHeadsCache doesn't let you mutate the result.	1305	:return: A frozenset containing the heads.
3224.1.20 by John Arbash Meinel Reduce the number of cache misses by caching known heads answers	1306	"""
	1307	keys = frozenset(keys)
	1308	try:
	1309	return self._heads[keys]
	1310	except KeyError:
	1311	heads = frozenset(self.graph.heads(keys))
	1312	self._heads[keys] = heads
	1313	return heads
	1314
	1315	def cache(self, keys, heads):
	1316	"""Store a known value."""
	1317	self._heads[frozenset(keys)] = frozenset(heads)
	1318
	1319
2490.2.22 by Aaron Bentley Rename GraphWalker -> Graph, _AncestryWalker -> _BreadthFirstSearcher	1320	class _BreadthFirstSearcher(object):
2921.3.4 by Robert Collins Review feedback.	1321	"""Parallel search breadth-first the ancestry of revisions.
2490.2.10 by Aaron Bentley Clarify text, remove unused _get_ancestry method	1322
	1323	This class implements the iterator protocol, but additionally
	1324	1. provides a set of seen ancestors, and
	1325	2. allows some ancestries to be unsearched, via stop_searching_any
	1326	"""
2490.2.7 by Aaron Bentley Start implementing mca that scales with number of uncommon ancestors	1327
2490.2.22 by Aaron Bentley Rename GraphWalker -> Graph, _AncestryWalker -> _BreadthFirstSearcher	1328	def __init__(self, revisions, parents_provider):
3177.3.1 by Robert Collins * New method ``next_with_ghosts`` on the Graph breadth-first-search objects	1329	self._iterations = 0
	1330	self._next_query = set(revisions)
	1331	self.seen = set()
3184.1.1 by Robert Collins Add basic get_recipe to the graph breadth first searcher.	1332	self._started_keys = set(self._next_query)
	1333	self._stopped_keys = set()
3099.3.1 by John Arbash Meinel Implement get_parent_map for ParentProviders	1334	self._parents_provider = parents_provider
3177.3.3 by Robert Collins Review feedback.	1335	self._returning = 'next_with_ghosts'
3184.1.2 by Robert Collins Add tests for starting and stopping searches in combination with get_recipe.	1336	self._current_present = set()
	1337	self._current_ghosts = set()
	1338	self._current_parents = {}
2490.2.7 by Aaron Bentley Start implementing mca that scales with number of uncommon ancestors	1339
	1340	def __repr__(self):
3177.3.1 by Robert Collins * New method ``next_with_ghosts`` on the Graph breadth-first-search objects	1341	if self._iterations:
	1342	prefix = "searching"
3099.3.1 by John Arbash Meinel Implement get_parent_map for ParentProviders	1343	else:
3177.3.1 by Robert Collins * New method ``next_with_ghosts`` on the Graph breadth-first-search objects	1344	prefix = "starting"
	1345	search = '%s=%r' % (prefix, list(self._next_query))
	1346	return ('_BreadthFirstSearcher(iterations=%d, %s,'
	1347	' seen=%r)' % (self._iterations, search, list(self.seen)))
2490.2.7 by Aaron Bentley Start implementing mca that scales with number of uncommon ancestors	1348
3184.1.6 by Robert Collins Create a SearchResult object which can be used as a replacement for sets.	1349	def get_result(self):
	1350	"""Get a SearchResult for the current state of this searcher.
3943.8.1 by Marius Kruger remove all trailing whitespace from bzr source	1351
3184.1.6 by Robert Collins Create a SearchResult object which can be used as a replacement for sets.	1352	:return: A SearchResult for this search so far. The SearchResult is
	1353	static - the search can be advanced and the search result will not
	1354	be invalidated or altered.
3184.1.1 by Robert Collins Add basic get_recipe to the graph breadth first searcher.	1355	"""
	1356	if self._returning == 'next':
	1357	# We have to know the current nodes' children to be able to list the
	1358	# exclude keys for them. However, while we could have a second
	1359	# look-ahead result buffer and shuffle things around, this method
	1360	# is typically only called once per search - when memoising the
3943.8.1 by Marius Kruger remove all trailing whitespace from bzr source	1361	# results of the search.
3184.1.1 by Robert Collins Add basic get_recipe to the graph breadth first searcher.	1362	found, ghosts, next, parents = self._do_query(self._next_query)
	1363	# pretend we didn't query: perhaps we should tweak _do_query to be
	1364	# entirely stateless?
	1365	self.seen.difference_update(next)
3184.1.3 by Robert Collins Automatically exclude ghosts.	1366	next_query = next.union(ghosts)
3184.1.1 by Robert Collins Add basic get_recipe to the graph breadth first searcher.	1367	else:
	1368	next_query = self._next_query
3184.1.5 by Robert Collins Record the number of found revisions for cross checking.	1369	excludes = self._stopped_keys.union(next_query)
3184.1.6 by Robert Collins Create a SearchResult object which can be used as a replacement for sets.	1370	included_keys = self.seen.difference(excludes)
	1371	return SearchResult(self._started_keys, excludes, len(included_keys),
	1372	included_keys)
3184.1.1 by Robert Collins Add basic get_recipe to the graph breadth first searcher.	1373
3377.3.1 by John Arbash Meinel Bring in some of the changes from graph_update and graph_optimization	1374	def step(self):
	1375	try:
	1376	return self.next()
	1377	except StopIteration:
	1378	return ()
	1379
2490.2.7 by Aaron Bentley Start implementing mca that scales with number of uncommon ancestors	1380	def next(self):
2490.2.10 by Aaron Bentley Clarify text, remove unused _get_ancestry method	1381	"""Return the next ancestors of this revision.
	1382
2490.2.12 by Aaron Bentley Improve documentation	1383	Ancestors are returned in the order they are seen in a breadth-first
3177.3.1 by Robert Collins * New method ``next_with_ghosts`` on the Graph breadth-first-search objects	1384	traversal. No ancestor will be returned more than once. Ancestors are
	1385	returned before their parentage is queried, so ghosts and missing
	1386	revisions (including the start revisions) are included in the result.
	1387	This can save a round trip in LCA style calculation by allowing
	1388	convergence to be detected without reading the data for the revision
	1389	the convergence occurs on.
	1390
	1391	:return: A set of revision_ids.
2490.2.10 by Aaron Bentley Clarify text, remove unused _get_ancestry method	1392	"""
3177.3.3 by Robert Collins Review feedback.	1393	if self._returning != 'next':
3177.3.1 by Robert Collins * New method ``next_with_ghosts`` on the Graph breadth-first-search objects	1394	# switch to returning the query, not the results.
3177.3.3 by Robert Collins Review feedback.	1395	self._returning = 'next'
3177.3.1 by Robert Collins * New method ``next_with_ghosts`` on the Graph breadth-first-search objects	1396	self._iterations += 1
2490.2.7 by Aaron Bentley Start implementing mca that scales with number of uncommon ancestors	1397	else:
3177.3.1 by Robert Collins * New method ``next_with_ghosts`` on the Graph breadth-first-search objects	1398	self._advance()
	1399	if len(self._next_query) == 0:
	1400	raise StopIteration()
3184.1.1 by Robert Collins Add basic get_recipe to the graph breadth first searcher.	1401	# We have seen what we're querying at this point as we are returning
	1402	# the query, not the results.
	1403	self.seen.update(self._next_query)
3177.3.1 by Robert Collins * New method ``next_with_ghosts`` on the Graph breadth-first-search objects	1404	return self._next_query
	1405
	1406	def next_with_ghosts(self):
	1407	"""Return the next found ancestors, with ghosts split out.
3943.8.1 by Marius Kruger remove all trailing whitespace from bzr source	1408
3177.3.1 by Robert Collins * New method ``next_with_ghosts`` on the Graph breadth-first-search objects	1409	Ancestors are returned in the order they are seen in a breadth-first
	1410	traversal. No ancestor will be returned more than once. Ancestors are
3177.3.3 by Robert Collins Review feedback.	1411	returned only after asking for their parents, which allows us to detect
3177.3.3 by Robert Collins Review feedback.	1412	which revisions are ghosts and which are not.
3177.3.1 by Robert Collins * New method ``next_with_ghosts`` on the Graph breadth-first-search objects	1413
	1414	:return: A tuple with (present ancestors, ghost ancestors) sets.
	1415	"""
3177.3.3 by Robert Collins Review feedback.	1416	if self._returning != 'next_with_ghosts':
3177.3.1 by Robert Collins * New method ``next_with_ghosts`` on the Graph breadth-first-search objects	1417	# switch to returning the results, not the current query.
3177.3.3 by Robert Collins Review feedback.	1418	self._returning = 'next_with_ghosts'
3177.3.1 by Robert Collins * New method ``next_with_ghosts`` on the Graph breadth-first-search objects	1419	self._advance()
	1420	if len(self._next_query) == 0:
	1421	raise StopIteration()
	1422	self._advance()
	1423	return self._current_present, self._current_ghosts
	1424
	1425	def _advance(self):
	1426	"""Advance the search.
	1427
	1428	Updates self.seen, self._next_query, self._current_present,
3177.3.3 by Robert Collins Review feedback.	1429	self._current_ghosts, self._current_parents and self._iterations.
3177.3.1 by Robert Collins * New method ``next_with_ghosts`` on the Graph breadth-first-search objects	1430	"""
	1431	self._iterations += 1
3177.3.2 by Robert Collins Update graph searchers stop_searching_any and start_searching for next_with_ghosts.	1432	found, ghosts, next, parents = self._do_query(self._next_query)
	1433	self._current_present = found
	1434	self._current_ghosts = ghosts
	1435	self._next_query = next
	1436	self._current_parents = parents
3184.1.3 by Robert Collins Automatically exclude ghosts.	1437	# ghosts are implicit stop points, otherwise the search cannot be
	1438	# repeated when ghosts are filled.
	1439	self._stopped_keys.update(ghosts)
3177.3.2 by Robert Collins Update graph searchers stop_searching_any and start_searching for next_with_ghosts.	1440
	1441	def _do_query(self, revisions):
	1442	"""Query for revisions.
	1443
3184.1.4 by Robert Collins Correctly exclude ghosts when ghosts are started on an existing search.	1444	Adds revisions to the seen set.
	1445
3177.3.2 by Robert Collins Update graph searchers stop_searching_any and start_searching for next_with_ghosts.	1446	:param revisions: Revisions to query.
	1447	:return: A tuple: (set(found_revisions), set(ghost_revisions),
	1448	set(parents_of_found_revisions), dict(found_revisions:parents)).
	1449	"""
3377.3.9 by John Arbash Meinel Small tweaks to _do_query	1450	found_revisions = set()
3177.3.2 by Robert Collins Update graph searchers stop_searching_any and start_searching for next_with_ghosts.	1451	parents_of_found = set()
3184.1.1 by Robert Collins Add basic get_recipe to the graph breadth first searcher.	1452	# revisions may contain nodes that point to other nodes in revisions:
	1453	# we want to filter them out.
6015.23.7 by John Arbash Meinel Run the search a 3rd time if we encounter heads.	1454	seen = self.seen
	1455	seen.update(revisions)
3177.3.2 by Robert Collins Update graph searchers stop_searching_any and start_searching for next_with_ghosts.	1456	parent_map = self._parents_provider.get_parent_map(revisions)
3377.3.9 by John Arbash Meinel Small tweaks to _do_query	1457	found_revisions.update(parent_map)
3177.3.1 by Robert Collins * New method ``next_with_ghosts`` on the Graph breadth-first-search objects	1458	for rev_id, parents in parent_map.iteritems():
3517.4.2 by Martin Pool Make simple-annotation and graph code more tolerant of knits with no graph	1459	if parents is None:
	1460	continue
6015.23.7 by John Arbash Meinel Run the search a 3rd time if we encounter heads.	1461	new_found_parents = [p for p in parents if p not in seen]
3377.3.9 by John Arbash Meinel Small tweaks to _do_query	1462	if new_found_parents:
	1463	# Calling set.update() with an empty generator is actually
	1464	# rather expensive.
	1465	parents_of_found.update(new_found_parents)
	1466	ghost_revisions = revisions - found_revisions
	1467	return found_revisions, ghost_revisions, parents_of_found, parent_map
2490.2.7 by Aaron Bentley Start implementing mca that scales with number of uncommon ancestors	1468
2490.2.8 by Aaron Bentley fix iteration stuff	1469	def __iter__(self):
2490.2.8 by Aaron Bentley fix iteration stuff	1470	return self
2490.2.7 by Aaron Bentley Start implementing mca that scales with number of uncommon ancestors	1471
3377.3.1 by John Arbash Meinel Bring in some of the changes from graph_update and graph_optimization	1472	def find_seen_ancestors(self, revisions):
3377.4.6 by John Arbash Meinel Lots of refactoring for find_unique_ancestors.	1473	"""Find ancestors of these revisions that have already been seen.
3943.8.1 by Marius Kruger remove all trailing whitespace from bzr source	1474
3377.4.6 by John Arbash Meinel Lots of refactoring for find_unique_ancestors.	1475	This function generally makes the assumption that querying for the
	1476	parents of a node that has already been queried is reasonably cheap.
	1477	(e.g., not a round trip to a remote host).
	1478	"""
	1479	# TODO: Often we might ask one searcher for its seen ancestors, and
	1480	# then ask another searcher the same question. This can result in
	1481	# searching the same revisions repeatedly if the two searchers
	1482	# have a lot of overlap.
3377.3.10 by John Arbash Meinel Tweak _BreadthFirstSearcher.find_seen_ancestors()	1483	all_seen = self.seen
	1484	pending = set(revisions).intersection(all_seen)
	1485	seen_ancestors = set(pending)
	1486
	1487	if self._returning == 'next':
	1488	# self.seen contains what nodes have been returned, not what nodes
	1489	# have been queried. We don't want to probe for nodes that haven't
	1490	# been searched yet.
	1491	not_searched_yet = self._next_query
	1492	else:
	1493	not_searched_yet = ()
3377.3.11 by John Arbash Meinel Committing a debug thunk that was very helpful	1494	pending.difference_update(not_searched_yet)
3377.3.10 by John Arbash Meinel Tweak _BreadthFirstSearcher.find_seen_ancestors()	1495	get_parent_map = self._parents_provider.get_parent_map
3377.3.12 by John Arbash Meinel Remove the helpful but ugly thunk	1496	while pending:
	1497	parent_map = get_parent_map(pending)
	1498	all_parents = []
	1499	# We don't care if it is a ghost, since it can't be seen if it is
	1500	# a ghost
	1501	for parent_ids in parent_map.itervalues():
	1502	all_parents.extend(parent_ids)
	1503	next_pending = all_seen.intersection(all_parents).difference(seen_ancestors)
	1504	seen_ancestors.update(next_pending)
	1505	next_pending.difference_update(not_searched_yet)
	1506	pending = next_pending
3377.3.10 by John Arbash Meinel Tweak _BreadthFirstSearcher.find_seen_ancestors()	1507
2490.2.7 by Aaron Bentley Start implementing mca that scales with number of uncommon ancestors	1508	return seen_ancestors
	1509
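The walk in find_seen_ancestors() can be tried outside bzrlib with a plain dict. The following is a minimal sketch, not the bzrlib implementation: it assumes Python 3, uses a dict in place of the parents provider, and the standalone function signature and the sample graph are made up for illustration.

```python
def find_seen_ancestors(parent_map, seen, revisions, not_searched_yet=()):
    """Return the ancestors of `revisions` that are members of `seen`.

    `parent_map` is a plain dict {key: tuple_of_parent_keys}; it stands in
    for the parents provider (an assumption of this sketch).
    """
    not_searched_yet = set(not_searched_yet)
    pending = set(revisions) & seen
    seen_ancestors = set(pending)
    # Never probe nodes whose parents have not been queried yet.
    pending -= not_searched_yet
    while pending:
        all_parents = []
        for key in pending:
            # Ghosts (keys absent from parent_map) contribute no parents.
            all_parents.extend(parent_map.get(key, ()))
        next_pending = seen.intersection(all_parents) - seen_ancestors
        seen_ancestors.update(next_pending)
        next_pending -= not_searched_yet
        pending = next_pending
    return seen_ancestors


# Hypothetical linear history d -> c -> b -> a, of which 'a' is unseen.
parent_map = {'d': ('c',), 'c': ('b',), 'b': ('a',), 'a': ()}
seen = {'d', 'c', 'b'}
result = find_seen_ancestors(parent_map, seen, ['d'])
```

A key listed in not_searched_yet is still reported as a seen ancestor, but the walk does not continue through it.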
2490.2.10 by Aaron Bentley Clarify text, remove unused _get_ancestry method	1510	def stop_searching_any(self, revisions):
	1511	"""
	1512	Remove any of the specified revisions from the search list.
	1513
	1514	None of the specified revisions are required to be present in the
3808.1.4 by John Arbash Meinel make _walk_to_common responsible for stopping ancestors	1515	search list.
3808.1.1 by Andrew Bennetts Possible fix for bug in new _walk_to_common_revisions.	1516
3808.1.4 by John Arbash Meinel make _walk_to_common responsible for stopping ancestors	1517	It is okay to call stop_searching_any() for revisions which were seen
	1518	in previous iterations. It is the caller's responsibility to call
	1519	find_seen_ancestors() to make sure that current search tips that are
	1520	ancestors of those revisions are also stopped. All explicitly stopped
	1521	revisions will be excluded from the search result's get_keys(), though.
2490.2.10 by Aaron Bentley Clarify text, remove unused _get_ancestry method	1522	"""
3377.4.6 by John Arbash Meinel Lots of refactoring for find_unique_ancestors.	1523	# TODO: does this help performance?
	1524	# if not revisions:
	1525	# return set()
3177.3.2 by Robert Collins Update graph searchers stop_searching_any and start_searching for next_with_ghosts.	1526	revisions = frozenset(revisions)
3177.3.3 by Robert Collins Review feedback.	1527	if self._returning == 'next':
3177.3.2 by Robert Collins Update graph searchers stop_searching_any and start_searching for next_with_ghosts.	1528	stopped = self._next_query.intersection(revisions)
	1529	self._next_query = self._next_query.difference(revisions)
	1530	else:
3184.2.1 by Robert Collins Handle stopping ghosts in searches properly.	1531	stopped_present = self._current_present.intersection(revisions)
	1532	stopped = stopped_present.union(
	1533	self._current_ghosts.intersection(revisions))
3177.3.2 by Robert Collins Update graph searchers stop_searching_any and start_searching for next_with_ghosts.	1534	self._current_present.difference_update(stopped)
	1535	self._current_ghosts.difference_update(stopped)
3943.8.1 by Marius Kruger remove all trailing whitespace from bzr source	1536	# stopping 'x' should stop returning parents of 'x', but
3177.3.2 by Robert Collins Update graph searchers stop_searching_any and start_searching for next_with_ghosts.	1537	# not if 'y' also references those same parents
	1538	stop_rev_references = {}
3184.2.1 by Robert Collins Handle stopping ghosts in searches properly.	1539	for rev in stopped_present:
3177.3.2 by Robert Collins Update graph searchers stop_searching_any and start_searching for next_with_ghosts.	1540	for parent_id in self._current_parents[rev]:
	1541	if parent_id not in stop_rev_references:
	1542	stop_rev_references[parent_id] = 0
	1543	stop_rev_references[parent_id] += 1
	1544	# if only the stopped revisions reference it, the ref count will be
	1545	# 0 after this loop
3177.3.3 by Robert Collins Review feedback.	1546	for parents in self._current_parents.itervalues():
3177.3.2 by Robert Collins Update graph searchers stop_searching_any and start_searching for next_with_ghosts.	1547	for parent_id in parents:
	1548	try:
	1549	stop_rev_references[parent_id] -= 1
	1550	except KeyError:
	1551	pass
	1552	stop_parents = set()
	1553	for rev_id, refs in stop_rev_references.iteritems():
	1554	if refs == 0:
	1555	stop_parents.add(rev_id)
	1556	self._next_query.difference_update(stop_parents)
3184.1.2 by Robert Collins Add tests for starting and stopping searches in combination with get_recipe.	1557	self._stopped_keys.update(stopped)
4053.2.2 by Andrew Bennetts Better fix, with test.	1558	self._stopped_keys.update(revisions)
2490.2.25 by Aaron Bentley Update from review	1559	return stopped
2490.2.17 by Aaron Bentley Add start_searching, tweak stop_searching_any	1560
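The reference counting used in the next_with_ghosts branch of stop_searching_any() can be isolated as follows. This is a sketch under assumptions: Python 3, a plain dict for the current parents, and a hypothetical helper name parents_to_stop that does not exist in bzrlib.

```python
def parents_to_stop(current_parents, stopped):
    """Return the parents referenced only by revisions in `stopped`.

    `current_parents` maps each current revision (stopped or not) to its
    parents.
    """
    refs = {}
    # Count how often each parent is referenced by a stopped revision.
    for rev in stopped:
        for parent in current_parents[rev]:
            refs[parent] = refs.get(parent, 0) + 1
    # Subtract one per reference from any current revision. A parent that
    # only stopped revisions point at ends up at exactly 0; one that a
    # still-active revision also points at goes negative.
    for parents in current_parents.values():
        for parent in parents:
            if parent in refs:
                refs[parent] -= 1
    return {parent for parent, count in refs.items() if count == 0}


# Hypothetical tips: 'x' is stopped, 'y' is still being searched.
current_parents = {'x': ('p', 'q'), 'y': ('q',)}
stop = parents_to_stop(current_parents, {'x'})
```

Here 'p' is dropped from the next query, while 'q' is kept because 'y' still references it.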
	1561	def start_searching(self, revisions):
3177.3.2 by Robert Collins Update graph searchers stop_searching_any and start_searching for next_with_ghosts.	1562	"""Add revisions to the search.
	1563
	1564	The parents of revisions will be returned from the next call to next()
	1565	or next_with_ghosts(). If next_with_ghosts was the most recently used
	1566	next* call then the return value is the result of looking up the
	1567	ghost/not ghost status of revisions. (A tuple (present, ghosted)).
	1568	"""
	1569	revisions = frozenset(revisions)
3184.1.2 by Robert Collins Add tests for starting and stopping searches in combination with get_recipe.	1570	self._started_keys.update(revisions)
3184.1.4 by Robert Collins Correctly exclude ghosts when ghosts are started on an existing search.	1571	new_revisions = revisions.difference(self.seen)
3177.3.3 by Robert Collins Review feedback.	1572	if self._returning == 'next':
3184.1.4 by Robert Collins Correctly exclude ghosts when ghosts are started on an existing search.	1573	self._next_query.update(new_revisions)
3377.3.30 by John Arbash Meinel Can we avoid the extra _do_query in start_searching?	1574	self.seen.update(new_revisions)
3177.3.2 by Robert Collins Update graph searchers stop_searching_any and start_searching for next_with_ghosts.	1575	else:
	1576	# perform a query on revisions
3377.3.30 by John Arbash Meinel Can we avoid the extra _do_query in start_searching?	1577	revs, ghosts, query, parents = self._do_query(revisions)
	1578	self._stopped_keys.update(ghosts)
3177.3.2 by Robert Collins Update graph searchers stop_searching_any and start_searching for next_with_ghosts.	1579	self._current_present.update(revs)
	1580	self._current_ghosts.update(ghosts)
	1581	self._next_query.update(query)
	1582	self._current_parents.update(parents)
	1583	return revs, ghosts
3184.1.6 by Robert Collins Create a SearchResult object which can be used as a replacement for sets.	1584
	1585
5539.2.19 by Andrew Bennetts Define SearchResult/Search interfaces with explicit abstract base classes, add some docstrings and change a method name.	1586	class AbstractSearchResult(object):
5535.3.50 by Andrew Bennetts Add docstrings to AbstractSearch and AbstractSearchResult.	1587	"""The result of a search, describing a set of keys.
	1588
	1589	Search results are typically used as the 'fetch_spec' parameter when
	1590	fetching revisions.
	1591
	1592	:seealso: AbstractSearch
	1593	"""
5539.2.19 by Andrew Bennetts Define SearchResult/Search interfaces with explicit abstract base classes, add some docstrings and change a method name.	1594
	1595	def get_recipe(self):
	1596	"""Return a recipe that can be used to replay this search.
	1597
	1598	The recipe allows reconstruction of the same results at a later date.
	1599
5891.1.2 by Andrew Bennetts Fix a bunch of docstring formatting nits, making pydoctor a bit happier.	1600	:return: A tuple of `(search_kind_str, *details)`. The details vary by
5539.2.19 by Andrew Bennetts Define SearchResult/Search interfaces with explicit abstract base classes, add some docstrings and change a method name.	1601	kind of search result.
	1602	"""
	1603	raise NotImplementedError(self.get_recipe)
	1604
	1605	def get_network_struct(self):
	1606	"""Return a tuple that can be transmitted via the HPSS protocol."""
	1607	raise NotImplementedError(self.get_network_struct)
	1608
	1609	def get_keys(self):
	1610	"""Return the keys found in this search.
	1611
	1612	:return: A set of keys.
	1613	"""
	1614	raise NotImplementedError(self.get_keys)
	1615
	1616	def is_empty(self):
	1617	"""Return false if the search lists 1 or more revisions."""
	1618	raise NotImplementedError(self.is_empty)
	1619
	1620	def refine(self, seen, referenced):
	1621	"""Create a new search by refining this search.
	1622
	1623	:param seen: Revisions that have been satisfied.
	1624	:param referenced: Revision references observed while satisfying some
	1625	of this search.
	1626	:return: A search result.
	1627	"""
	1628	raise NotImplementedError(self.refine)
	1629
	1630
	1631	class AbstractSearch(object):
5535.3.50 by Andrew Bennetts Add docstrings to AbstractSearch and AbstractSearchResult.	1632	"""A search that can be executed, producing a search result.
	1633
	1634	:seealso: AbstractSearchResult
	1635	"""
5539.2.19 by Andrew Bennetts Define SearchResult/Search interfaces with explicit abstract base classes, add some docstrings and change a method name.	1636
5536.3.1 by Andrew Bennetts Rename get_search_result to execute, require a SearchResult (and not a search) be passed to fetch.	1637	def execute(self):
5539.2.19 by Andrew Bennetts Define SearchResult/Search interfaces with explicit abstract base classes, add some docstrings and change a method name.	1638	"""Construct a network-ready search result from this search description.
	1639
	1640	This may take some time to search repositories, etc.
	1641
5536.3.1 by Andrew Bennetts Rename get_search_result to execute, require a SearchResult (and not a search) be passed to fetch.	1642	:return: A search result (an object that implements
	1643	AbstractSearchResult's API).
5539.2.19 by Andrew Bennetts Define SearchResult/Search interfaces with explicit abstract base classes, add some docstrings and change a method name.	1644	"""
5536.3.1 by Andrew Bennetts Rename get_search_result to execute, require a SearchResult (and not a search) be passed to fetch.	1645	raise NotImplementedError(self.execute)
5539.2.19 by Andrew Bennetts Define SearchResult/Search interfaces with explicit abstract base classes, add some docstrings and change a method name.	1646
	1647
	1648	class SearchResult(AbstractSearchResult):
3184.1.6 by Robert Collins Create a SearchResult object which can be used as a replacement for sets.	1649	"""The result of a breadth first search.
	1650
	1651	A SearchResult provides the ability to reconstruct the search or access a
	1652	set of the keys the search found.
	1653	"""
	1654
	1655	def __init__(self, start_keys, exclude_keys, key_count, keys):
	1656	"""Create a SearchResult.
	1657
	1658	:param start_keys: The keys the search started at.
	1659	:param exclude_keys: The keys the search excludes.
	1660	:param key_count: The total number of keys (from start to but not
	1661	including exclude).
	1662	:param keys: The keys the search found. Note that in future we may get
	1663	a SearchResult from a smart server, in which case the keys list is
	1664	not necessarily immediately available.
	1665	"""
4152.1.2 by Robert Collins Add streaming from a stacked branch when the sort order is compatible with doing so.	1666	self._recipe = ('search', start_keys, exclude_keys, key_count)
3184.1.6 by Robert Collins Create a SearchResult object which can be used as a replacement for sets.	1667	self._keys = frozenset(keys)
	1668
5539.2.8 by Andrew Bennetts Refactor call to search_missing_revision_ids out from RepoFetcher._revids_to_fetch into the 'fetch spec'.	1669	def __repr__(self):
	1670	kind, start_keys, exclude_keys, key_count = self._recipe
	1671	if len(start_keys) > 5:
	1672	start_keys_repr = repr(list(start_keys)[:5])[:-1] + ', ...]'
	1673	else:
	1674	start_keys_repr = repr(start_keys)
	1675	if len(exclude_keys) > 5:
	1676	exclude_keys_repr = repr(list(exclude_keys)[:5])[:-1] + ', ...]'
	1677	else:
	1678	exclude_keys_repr = repr(exclude_keys)
	1679	return '<%s %s:(%s, %s, %d)>' % (self.__class__.__name__,
	1680	kind, start_keys_repr, exclude_keys_repr, key_count)
	1681
3184.1.6 by Robert Collins Create a SearchResult object which can be used as a replacement for sets.	1682	def get_recipe(self):
	1683	"""Return a recipe that can be used to replay this search.
3943.8.1 by Marius Kruger remove all trailing whitespace from bzr source	1684
3184.1.6 by Robert Collins Create a SearchResult object which can be used as a replacement for sets.	1685	The recipe allows reconstruction of the same results at a later date
	1686	without knowing all the found keys. The essential elements are a list
4031.3.1 by Frank Aspell Fixing various typos	1687	of keys to start and to stop at. In order to give reproducible
3184.1.6 by Robert Collins Create a SearchResult object which can be used as a replacement for sets.	1688	results when ghosts are encountered by a search they are automatically
	1689	added to the exclude list (or else ghost filling may alter the
	1690	results).
	1691
4152.1.2 by Robert Collins Add streaming from a stacked branch when the sort order is compatible with doing so.	1692	:return: A tuple ('search', start_keys_set, exclude_keys_set,
	1693	revision_count). To recreate the results of this search, create a
	1694	breadth first searcher on the same graph starting at start_keys.
	1695	Then call next() (or next_with_ghosts()) repeatedly, and on every
	1696	result, call stop_searching_any on any keys from the exclude_keys
	1697	set. The revision_count value acts as a trivial cross-check - the
	1698	found revisions of the new search should have as many elements as
3184.1.6 by Robert Collins Create a SearchResult object which can be used as a replacement for sets.	1699	revision_count. If it does not, then additional revisions were
	1700	ghosted between the first execution of the search and the
	1701	replay.
	1702	"""
	1703	return self._recipe
	1704
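The replay procedure described in the get_recipe() docstring can be approximated with a hand-rolled breadth-first walk. This is a simplified sketch, assuming Python 3 and a dict-based graph; replay_recipe is a hypothetical name, and a real replay would use the breadth first searcher with stop_searching_any() instead.

```python
def replay_recipe(parent_map, recipe):
    """Recompute the keys described by a ('search', ...) recipe."""
    kind, start_keys, exclude_keys, key_count = recipe
    if kind != 'search':
        raise ValueError('unsupported recipe kind: %r' % (kind,))
    exclude_keys = set(exclude_keys)
    found = set()
    pending = set(start_keys) - exclude_keys
    while pending:
        # Ghosts (keys missing from parent_map) are never counted as found.
        present = set(key for key in pending if key in parent_map)
        found.update(present)
        next_pending = set()
        for key in present:
            next_pending.update(parent_map[key])
        # Stop at excluded keys and do not revisit found ones.
        pending = next_pending - found - exclude_keys
    if len(found) != key_count:
        # The cross-check from the docstring: the graph changed between
        # the original search and this replay.
        raise ValueError('expected %d keys, found %d'
                         % (key_count, len(found)))
    return found


# Hypothetical linear history d -> c -> b -> a, excluding 'b' and older.
parent_map = {'d': ('c',), 'c': ('b',), 'b': ('a',), 'a': ()}
keys = replay_recipe(parent_map, ('search', {'d'}, {'b'}, 2))
```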
5539.2.1 by Andrew Bennetts Start defining an 'everything' fetch spec.	1705	def get_network_struct(self):
	1706	start_keys = ' '.join(self._recipe[1])
	1707	stop_keys = ' '.join(self._recipe[2])
	1708	count = str(self._recipe[3])
	1709	return (self._recipe[0], '\n'.join((start_keys, stop_keys, count)))
	1710
3184.1.6 by Robert Collins Create a SearchResult object which can be used as a replacement for sets.	1711	def get_keys(self):
	1712	"""Return the keys found in this search.
	1713
	1714	:return: A set of keys.
	1715	"""
	1716	return self._keys
	1717
4152.1.2 by Robert Collins Add streaming from a stacked branch when the sort order is compatible with doing so.	1718	def is_empty(self):
4204.2.2 by Matt Nordhoff Fix docstrings on graph.py's is_empty methods that said they returned true when they were not empty.	1719	"""Return false if the search lists 1 or more revisions."""
4152.1.2 by Robert Collins Add streaming from a stacked branch when the sort order is compatible with doing so.	1720	return self._recipe[3] == 0
	1721
	1722	def refine(self, seen, referenced):
	1723	"""Create a new search by refining this search.
	1724
	1725	:param seen: Revisions that have been satisfied.
	1726	:param referenced: Revision references observed while satisfying some
	1727	of this search.
	1728	"""
	1729	start = self._recipe[1]
	1730	exclude = self._recipe[2]
	1731	count = self._recipe[3]
	1732	keys = self.get_keys()
	1733	# New heads = referenced + old heads - seen things - exclude
	1734	pending_refs = set(referenced)
	1735	pending_refs.update(start)
	1736	pending_refs.difference_update(seen)
	1737	pending_refs.difference_update(exclude)
	1738	# New exclude = old exclude + satisfied heads
	1739	seen_heads = start.intersection(seen)
	1740	exclude.update(seen_heads)
	1741	# keys gets seen removed
	1742	keys = keys - seen
	1743	# length is reduced by len(seen)
	1744	count -= len(seen)
	1745	return SearchResult(pending_refs, exclude, count, keys)
	1746
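The set arithmetic in SearchResult.refine() is easier to follow on a small example. The sketch below assumes Python 3 and works on bare sets; refine_recipe is a hypothetical helper. Unlike the method above, it builds a new exclude set instead of updating the one stored in the recipe.

```python
def refine_recipe(start, exclude, count, keys, seen, referenced):
    """Return (new_start, new_exclude, new_count, new_keys)."""
    # New heads = referenced + old heads - seen things - exclude.
    new_start = (set(referenced) | start) - seen - exclude
    # New exclude = old exclude + satisfied heads.
    new_exclude = exclude | (start & seen)
    # Seen keys are removed, and the count shrinks by len(seen).
    return new_start, new_exclude, count - len(seen), keys - seen


# Hypothetical search that found 'c' and 'b', started at 'c', excluding 'a'.
# The fetch has now satisfied 'c' and observed a reference to 'b'.
refined = refine_recipe(
    start={'c'}, exclude={'a'}, count=2, keys={'c', 'b'},
    seen={'c'}, referenced={'b'})
```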
3514.2.14 by John Arbash Meinel Bring in the code to collapse linear portions of the graph.	1747
5539.2.19 by Andrew Bennetts Define SearchResult/Search interfaces with explicit abstract base classes, add some docstrings and change a method name.	1748	class PendingAncestryResult(AbstractSearchResult):
4070.9.14 by Andrew Bennetts Tweaks requested by Robert's review.	1749	"""A search result that will reconstruct the ancestry for some graph heads.
4086.1.4 by Andrew Bennetts Fix whitespace nit.	1750
4070.9.14 by Andrew Bennetts Tweaks requested by Robert's review.	1751	Unlike SearchResult, this doesn't hold the complete search result in
	1752	memory, it just holds a description of how to generate it.
	1753	"""
	1754
	1755	def __init__(self, heads, repo):
	1756	"""Constructor.
	1757
	1758	:param heads: an iterable of graph heads.
	1759	:param repo: a repository to use to generate the ancestry for the given
	1760	heads.
	1761	"""
4152.1.2 by Robert Collins Add streaming from a stacked branch when the sort order is compatible with doing so.	1762	self.heads = frozenset(heads)
4070.9.2 by Andrew Bennetts Rough prototype of allowing a SearchResult to be passed to fetch, and using that to improve network conversations.	1763	self.repo = repo
	1764
5539.2.20 by Andrew Bennetts Add PendingAncestryResult.__repr__.	1765	def __repr__(self):
	1766	if len(self.heads) > 5:
5535.4.21 by Andrew Bennetts Slightly more informative __repr__.	1767	heads_repr = repr(list(self.heads)[:5])[:-1]
	1768	heads_repr += ', <%d more>...]' % (len(self.heads) - 5,)
5539.2.20 by Andrew Bennetts Add PendingAncestryResult.__repr__.	1769	else:
	1770	heads_repr = repr(self.heads)
	1771	return '<%s heads:%s repo:%r>' % (
	1772	self.__class__.__name__, heads_repr, self.repo)
	1773
4070.9.5 by Andrew Bennetts Better wire protocol: don't shoehorn MiniSearchResult serialisation into previous serialisation format.	1774	def get_recipe(self):
4152.1.2 by Robert Collins Add streaming from a stacked branch when the sort order is compatible with doing so.	1775	"""Return a recipe that can be used to replay this search.
	1776
	1777	The recipe allows reconstruction of the same results at a later date.
	1778
	1779	:seealso: SearchResult.get_recipe
	1780
	1781	:return: A tuple ('proxy-search', start_keys_set, set(), -1)
	1782	To recreate this result, create a PendingAncestryResult with the
	1783	start_keys_set.
	1784	"""
	1785	return ('proxy-search', self.heads, set(), -1)
4070.9.5 by Andrew Bennetts Better wire protocol: don't shoehorn MiniSearchResult serialisation into previous serialisation format.	1786
5539.2.1 by Andrew Bennetts Start defining an 'everything' fetch spec.	1787	def get_network_struct(self):
	1788	parts = ['ancestry-of']
	1789	parts.extend(self.heads)
	1790	return parts
	1791
4070.9.2 by Andrew Bennetts Rough prototype of allowing a SearchResult to be passed to fetch, and using that to improve network conversations.	1792	def get_keys(self):
4098.1.1 by Andrew Bennetts Fix a bug with how PendingAncestryResult.get_keys handles NULL_REVISION.	1793	"""See SearchResult.get_keys.
4098.1.2 by Andrew Bennetts Fix 'trailing' whitespace (actually just a blank line in an indented docstring).	1794
4098.1.1 by Andrew Bennetts Fix a bug with how PendingAncestryResult.get_keys handles NULL_REVISION.	1795	Returns all the keys for the ancestry of the heads, excluding
	1796	NULL_REVISION.
	1797	"""
	1798	return self._get_keys(self.repo.get_graph())
4098.1.3 by Andrew Bennetts Fix 'trailing' whitespace (actually just a blank line between methods).	1799
4098.1.1 by Andrew Bennetts Fix a bug with how PendingAncestryResult.get_keys handles NULL_REVISION.	1800	def _get_keys(self, graph):
	1801	NULL_REVISION = revision.NULL_REVISION
	1802	keys = [key for (key, parents) in graph.iter_ancestry(self.heads)
4343.3.11 by John Arbash Meinel Change PendingAncestryResult to strip ghosts from .get_keys()	1803	if key != NULL_REVISION and parents is not None]
4070.9.2 by Andrew Bennetts Rough prototype of allowing a SearchResult to be passed to fetch, and using that to improve network conversations.	1804	return keys
	1805
4152.1.2 by Robert Collins Add streaming from a stacked branch when the sort order is compatible with doing so.	1806	def is_empty(self):
4204.2.2 by Matt Nordhoff Fix docstrings on graph.py's is_empty methods that said they returned true when they were not empty.	1807	"""Return false if the search lists 1 or more revisions."""
4152.1.2 by Robert Collins Add streaming from a stacked branch when the sort order is compatible with doing so.	1808	if revision.NULL_REVISION in self.heads:
	1809	return len(self.heads) == 1
	1810	else:
	1811	return len(self.heads) == 0
	1812
	1813	def refine(self, seen, referenced):
	1814	"""Create a new search by refining this search.
	1815
	1816	:param seen: Revisions that have been satisfied.
	1817	:param referenced: Revision references observed while satisfying some
	1818	of this search.
	1819	"""
	1820	referenced = self.heads.union(referenced)
	1821	return PendingAncestryResult(referenced - seen, self.repo)
	1822
4070.9.2 by Andrew Bennetts Rough prototype of allowing a SearchResult to be passed to fetch, and using that to improve network conversations.	1823
5539.2.19 by Andrew Bennetts Define SearchResult/Search interfaces with explicit abstract base classes, add some docstrings and change a method name.	1824	class EmptySearchResult(AbstractSearchResult):
5539.2.15 by Andrew Bennetts Add a docstring.	1825	"""An empty search result."""
5539.2.8 by Andrew Bennetts Refactor call to search_missing_revision_ids out from RepoFetcher._revids_to_fetch into the 'fetch spec'.	1826
	1827	def is_empty(self):
	1828	return True
	1829
	1830
5539.2.19 by Andrew Bennetts Define SearchResult/Search interfaces with explicit abstract base classes, add some docstrings and change a method name.	1831	class EverythingResult(AbstractSearchResult):
5539.2.1 by Andrew Bennetts Start defining an 'everything' fetch spec.	1832	"""A search result that simply requests everything in the repository."""
	1833
	1834	def __init__(self, repo):
	1835	self._repo = repo
	1836
5535.3.33 by Andrew Bennetts Fix a bug.	1837	def __repr__(self):
	1838	return '%s(%r)' % (self.__class__.__name__, self._repo)
	1839
5539.2.1 by Andrew Bennetts Start defining an 'everything' fetch spec.	1840	def get_recipe(self):
	1841	raise NotImplementedError(self.get_recipe)
	1842
	1843	def get_network_struct(self):
	1844	return ('everything',)
	1845
	1846	def get_keys(self):
	1847	if 'evil' in debug.debug_flags:
	1848	from bzrlib import remote
	1849	if isinstance(self._repo, remote.RemoteRepository):
	1850	# warn developers (not users) not to do this
	1851	trace.mutter_callsite(
	1852	2, "EverythingResult(RemoteRepository).get_keys() is slow.")
5539.2.4 by Andrew Bennetts Add some basic tests for the new verb, fix some shallow bugs.	1853	return self._repo.all_revision_ids()
5539.2.1 by Andrew Bennetts Start defining an 'everything' fetch spec.	1854
	1855	def is_empty(self):
	1856	# It's ok for this to wrongly return False: the worst that can happen
	1857	# is that RemoteStreamSource will initiate a get_stream on an empty
	1858	# repository. And almost all repositories are non-empty.
	1859	return False
	1860
	1861	def refine(self, seen, referenced):
5539.3.3 by Andrew Bennetts Implement EverythingResult.refine.	1862	heads = set(self._repo.all_revision_ids())
	1863	heads.difference_update(seen)
	1864	heads.update(referenced)
	1865	return PendingAncestryResult(heads, self._repo)
5539.2.1 by Andrew Bennetts Start defining an 'everything' fetch spec.	1866
	1867
5539.2.19 by Andrew Bennetts Define SearchResult/Search interfaces with explicit abstract base classes, add some docstrings and change a method name.	1868	class EverythingNotInOther(AbstractSearch):
	1869	"""Find all revisions that are in one repo but not the other."""
5539.2.8 by Andrew Bennetts Refactor call to search_missing_revision_ids out from RepoFetcher._revids_to_fetch into the 'fetch spec'.	1870
	1871	def __init__(self, to_repo, from_repo, find_ghosts=False):
	1872	self.to_repo = to_repo
	1873	self.from_repo = from_repo
	1874	self.find_ghosts = find_ghosts
	1875
5536.3.1 by Andrew Bennetts Rename get_search_result to execute, require a SearchResult (and not a search) be passed to fetch.	1876	def execute(self):
5539.2.8 by Andrew Bennetts Refactor call to search_missing_revision_ids out from RepoFetcher._revids_to_fetch into the 'fetch spec'.	1877	return self.to_repo.search_missing_revision_ids(
	1878	self.from_repo, find_ghosts=self.find_ghosts)
	1879
	1880
5539.2.19 by Andrew Bennetts Define SearchResult/Search interfaces with explicit abstract base classes, add some docstrings and change a method name.	1881	class NotInOtherForRevs(AbstractSearch):
	1882	"""Find all revisions missing in one repo for some specific heads."""
5539.2.8 by Andrew Bennetts Refactor call to search_missing_revision_ids out from RepoFetcher._revids_to_fetch into the 'fetch spec'.	1883
5535.3.31 by Andrew Bennetts Cope with tags that reference missing revisions.	1884	def __init__(self, to_repo, from_repo, required_ids, if_present_ids=None,
5852.1.5 by Andrew Bennetts, Jelmer Vernooij Support limit= for fetching between Bazaar branches.	1885	find_ghosts=False, limit=None):
5539.2.19 by Andrew Bennetts Define SearchResult/Search interfaces with explicit abstract base classes, add some docstrings and change a method name.	1886	"""Constructor.
	1887
	1888	:param required_ids: revision IDs of heads that must be found, or else
	1889	the search will fail with NoSuchRevision. All revisions in their
	1890	ancestry not already in the other repository will be included in
	1891	the search result.
	1892	:param if_present_ids: revision IDs of heads that may be absent in the
	1893	source repository. If present, then their ancestry not already
	1894	found in other will be included in the search result.
5852.1.5 by Andrew Bennetts, Jelmer Vernooij Support limit= for fetching between Bazaar branches.	1895	:param limit: maximum number of revisions to fetch
5539.2.19 by Andrew Bennetts Define SearchResult/Search interfaces with explicit abstract base classes, add some docstrings and change a method name.	1896	"""
5539.2.8 by Andrew Bennetts Refactor call to search_missing_revision_ids out from RepoFetcher._revids_to_fetch into the 'fetch spec'.	1897	self.to_repo = to_repo
	1898	self.from_repo = from_repo
	1899	self.find_ghosts = find_ghosts
5535.3.31 by Andrew Bennetts Cope with tags that reference missing revisions.	1900	self.required_ids = required_ids
	1901	self.if_present_ids = if_present_ids
5852.1.5 by Andrew Bennetts, Jelmer Vernooij Support limit= for fetching between Bazaar branches.	1902	self.limit = limit
5535.3.31 by Andrew Bennetts Cope with tags that reference missing revisions.	1903
	1904	def __repr__(self):
	1905	if len(self.required_ids) > 5:
	1906	reqd_revs_repr = repr(list(self.required_ids)[:5])[:-1] + ', ...]'
	1907	else:
	1908	reqd_revs_repr = repr(self.required_ids)
	1909	if self.if_present_ids and len(self.if_present_ids) > 5:
	1910	ifp_revs_repr = repr(list(self.if_present_ids)[:5])[:-1] + ', ...]'
	1911	else:
	1912	ifp_revs_repr = repr(self.if_present_ids)
	1913
5852.1.5 by Andrew Bennetts, Jelmer Vernooij Support limit= for fetching between Bazaar branches.	1914	return ("<%s from:%r to:%r find_ghosts:%r req'd:%r if-present:%r"
	1915	" limit:%r>") % (
	1916	self.__class__.__name__, self.from_repo, self.to_repo,
	1917	self.find_ghosts, reqd_revs_repr, ifp_revs_repr,
	1918	self.limit)
5539.2.8 by Andrew Bennetts Refactor call to search_missing_revision_ids out from RepoFetcher._revids_to_fetch into the 'fetch spec'.	1919
5536.3.1 by Andrew Bennetts Rename get_search_result to execute, require a SearchResult (and not a search) be passed to fetch.	1920	def execute(self):
5539.2.8 by Andrew Bennetts Refactor call to search_missing_revision_ids out from RepoFetcher._revids_to_fetch into the 'fetch spec'.	1921	return self.to_repo.search_missing_revision_ids(
5535.3.31 by Andrew Bennetts Cope with tags that reference missing revisions.	1922	self.from_repo, revision_ids=self.required_ids,
5852.1.5 by Andrew Bennetts, Jelmer Vernooij Support limit= for fetching between Bazaar branches.	1923	if_present_ids=self.if_present_ids, find_ghosts=self.find_ghosts,
	1924	limit=self.limit)
5539.2.8 by Andrew Bennetts Refactor call to search_missing_revision_ids out from RepoFetcher._revids_to_fetch into the 'fetch spec'.	1925
	1926
6015.23.4 by John Arbash Meinel Prototype the walk-backwards-and-then-forwards code.	1927	def invert_parent_map(parent_map):
	1928	"""Given a map from child => parents, create a map of parent=>children"""
	1929	child_map = {}
	1930	for child, parents in parent_map.iteritems():
	1931	for p in parents:
	1932	# Any given parent is likely to have only a small handful
	1933	# of children, many will have only one. So we avoid mem overhead of
	1934	# a list, in exchange for extra copying of tuples
	1935	if p not in child_map:
	1936	child_map[p] = (child,)
	1937	else:
	1938	child_map[p] = child_map[p] + (child,)
	1939	return child_map
	1940
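A small usage example makes the shape of the inverted map concrete. This is a sketch assuming Python 3, so the function is restated with dict.items() in place of iteritems(); the sample graph is made up.

```python
def invert_parent_map(parent_map):
    """Given a map from child => parents, create a map of parent => children."""
    child_map = {}
    for child, parents in parent_map.items():
        for p in parents:
            # Children are stored as tuples, extended by copying.
            child_map[p] = child_map.get(p, ()) + (child,)
    return child_map


# Hypothetical diamond: 'a' is the root, 'd' merges 'b' and 'c'.
child_map = invert_parent_map({'b': ('a',), 'c': ('a',), 'd': ('b', 'c')})
```

Keys that have no children, such as 'd' here, do not appear in the result at all, which is what _find_possible_heads relies on when it catches KeyError.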
	1941
6015.23.7 by John Arbash Meinel Run the search a 3rd time if we encounter heads.	1942	def _find_possible_heads(parent_map, tip_keys, depth):
6015.23.10 by John Arbash Meinel Small tweaks to search performance, though still at depth=100 the primary time	1943	"""Walk backwards (towards children) through the parent_map.
	1944
	1945	This finds 'heads' that will hopefully succinctly describe our search
	1946	graph.
	1947	"""
6015.23.7 by John Arbash Meinel Run the search a 3rd time if we encounter heads.	1948	child_map = invert_parent_map(parent_map)
	1949	heads = set()
6015.23.10 by John Arbash Meinel Small tweaks to search performance, though still at depth=100 the primary time	1950	current_roots = tip_keys
6015.23.7 by John Arbash Meinel Run the search a 3rd time if we encounter heads.	1951	walked = set(current_roots)
6015.23.10 by John Arbash Meinel Small tweaks to search performance, though still at depth=100 the primary time	1952	while current_roots and depth > 0:
6015.23.7 by John Arbash Meinel Run the search a 3rd time if we encounter heads.	1953	depth -= 1
	1954	children = set()
6015.23.10 by John Arbash Meinel Small tweaks to search performance, though still at depth=100 the primary time	1955	children_update = children.update
6015.23.7 by John Arbash Meinel Run the search a 3rd time if we encounter heads.	1956	for p in current_roots:
	1957	# Is it better to pre- or post- filter the children?
	1958	try:
6015.23.10 by John Arbash Meinel Small tweaks to search performance, though still at depth=100 the primary time	1959	children_update(child_map[p])
6015.23.7 by John Arbash Meinel Run the search a 3rd time if we encounter heads.	1960	except KeyError:
	1961	heads.add(p)
6015.23.10 by John Arbash Meinel Small tweaks to search performance, though still at depth=100 the primary time	1962	# If we've seen a key before, we don't want to walk it again. Note that
	1963	# 'children' stays relatively small while 'walked' grows large. So
	1964	# don't use 'difference_update' here which has to walk all of 'walked'.
	1965	# '.difference' is smart enough to walk only children and compare it to
	1966	# walked.
	1967	children = children.difference(walked)
6015.23.7 by John Arbash Meinel Run the search a 3rd time if we encounter heads.	1968	walked.update(children)
	1969	current_roots = children
	1970	if current_roots:
6015.23.10 by John Arbash Meinel Small tweaks to search performance, though still at depth=100 the primary time	1971	# We walked to the end of depth, so these are the new tips.
6015.23.7 by John Arbash Meinel Run the search a 3rd time if we encounter heads.	1972	heads.update(current_roots)
	1973	return heads
	1974
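The effect of the depth limit in _find_possible_heads can be seen on a linear history. The following is a self-contained sketch assuming Python 3; it inlines the inversion step and uses a membership test instead of catching KeyError, so it is an illustration and not a drop-in replacement.

```python
def find_possible_heads(parent_map, tip_keys, depth):
    """Walk from tip_keys towards children, at most `depth` steps."""
    child_map = {}
    for child, parents in parent_map.items():
        for p in parents:
            child_map.setdefault(p, []).append(child)
    heads = set()
    current = set(tip_keys)
    walked = set(current)
    while current and depth > 0:
        depth -= 1
        children = set()
        for key in current:
            if key in child_map:
                children.update(child_map[key])
            else:
                # No children in this map, so this key is a head.
                heads.add(key)
        children -= walked
        walked.update(children)
        current = children
    # Anything still pending when depth ran out becomes a head as well.
    heads.update(current)
    return heads


# Hypothetical linear history d -> c -> b -> a, walking up from 'a'.
parent_map = {'d': ('c',), 'c': ('b',), 'b': ('a',), 'a': ()}
shallow = find_possible_heads(parent_map, {'a'}, 2)
deep = find_possible_heads(parent_map, {'a'}, 10)
```

With depth=2 the walk stops at 'c'. With a larger depth it reaches 'd', the true head.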
	1975
	1976	def _run_search(parent_map, heads, exclude_keys):
6015.23.15 by John Arbash Meinel Clean out more of the cruft that got left by accident.	1977	"""Given a parent map, run a _BreadthFirstSearcher on it.
	1978
	1979	Start at heads, walk until you hit exclude_keys. As a further improvement,
	1980	watch for any heads that turn up as parents while walking; they are
	1981	reachable from other heads, so they were not true heads of the search.
	1982
	1983	This is mostly used to generate a succinct recipe for how to walk through
	1984	most of parent_map.
	1985
	1986	:return: (_BreadthFirstSearcher, set(heads_encountered_by_walking))
	1987	"""
6015.23.7 by John Arbash Meinel Run the search a 3rd time if we encounter heads.	1988	g = Graph(DictParentsProvider(parent_map))
	1989	s = g._make_breadth_first_searcher(heads)
	1990	found_heads = set()
	1991	while True:
	1992	try:
	1993	next_revs = s.next()
	1994	except StopIteration:
	1995	break
	1996	for parents in s._current_parents.itervalues():
	1997	f_heads = heads.intersection(parents)
	1998	if f_heads:
	1999	found_heads.update(f_heads)
	2000	stop_keys = exclude_keys.intersection(next_revs)
	2001	if stop_keys:
	2002	s.stop_searching_any(stop_keys)
	2003	for parents in s._current_parents.itervalues():
	2004	f_heads = heads.intersection(parents)
	2005	if f_heads:
	2006	found_heads.update(f_heads)
	2007	return s, found_heads
	2008
	2009
6015.23.5 by John Arbash Meinel Implement something that seems to work for limited search recipies.	2010	def limited_search_result_from_parent_map(parent_map, missing_keys, tip_keys,
	2011	depth):
6015.23.4 by John Arbash Meinel Prototype the walk-backwards-and-then-forwards code.	2012	"""Transform a parent_map that is searching 'tip_keys' into an
	2013	approximate SearchResult.
	2014
	2015	We should be able to generate a SearchResult from a given set of starting
	2016	keys that covers a subset of parent_map and whose last step points at
	2017	tip_keys. This handles the case where a really long search shouldn't be
	2018	restarted from scratch on each get_parent_map request, but we still want
	2019	to filter out some of the keys that we've already seen, so that we don't
	2020	receive information we already know about on every request.
	2021
	2022	The server will validate the search (that starting at start_keys and
	2023	stopping at stop_keys yields the exact key_count), so we have to be careful
	2024	to give an exact recipe.
	2025
	2026	Basic algorithm is:
	2027	1) Invert parent_map to get child_map (todo: have it cached and pass it
	2028	in)
	2029	2) Starting at tip_keys, walk towards children for 'depth' steps.
	2030	3) At that point, we have the 'start' keys.
	2031	4) Start walking parent_map from 'start' keys, counting how many keys
	2032	are seen, and generating stop_keys for anything that would walk
	2033	outside of the parent_map.
	2034
	2035	:param parent_map: A map from {child_id: (parent_ids,)}
	2036	:param missing_keys: parent_ids that we know are unavailable
	2037	:param tip_keys: the revision_ids that we are searching
	2038	:param depth: How far back to walk.
	2039	"""
6015.23.17 by John Arbash Meinel Code was relying on an empty parent map to yield an empty search.	2040	if not parent_map:
	2041	# No search to send, because we haven't done any searching yet.
	2042	return [], [], 0
6015.23.7 by John Arbash Meinel Run the search a 3rd time if we encounter heads.	2043	heads = _find_possible_heads(parent_map, tip_keys, depth)
6015.23.13 by John Arbash Meinel Don't walk the graph a second time.	2044	s, found_heads = _run_search(parent_map, heads, set(tip_keys))
	2045	_, start_keys, exclude_keys, key_count = s.get_result().get_recipe()
6015.23.7 by John Arbash Meinel Run the search a 3rd time if we encounter heads.	2046	if found_heads:
6015.23.13 by John Arbash Meinel Don't walk the graph a second time.	2047	# Anything in found_heads is a redundant start key: we hit it while
	2048	# walking, so we can exclude it from the start list.
	2049	start_keys = set(start_keys).difference(found_heads)
	2050	return start_keys, exclude_keys, key_count
6015.23.4 by John Arbash Meinel Prototype the walk-backwards-and-then-forwards code.	2051
	2052
6015.23.3 by John Arbash Meinel Start refactoring code into graph.py code for easier testing.	2053	def search_result_from_parent_map(parent_map, missing_keys):
	2054	"""Transform a parent_map into SearchResult information."""
	2055	if not parent_map:
	2056	# parent_map is empty or None, simple search result
	2057	return [], [], 0
	2058	# start_set is all the keys in the cache
	2059	start_set = set(parent_map)
	2060	# result set is all the references to keys in the cache
	2061	result_parents = set()
	2062	for parents in parent_map.itervalues():
	2063	result_parents.update(parents)
	2064	stop_keys = result_parents.difference(start_set)
	2065	# We don't need to send ghosts back to the server as a position to
	2066	# stop either.
	2067	stop_keys.difference_update(missing_keys)
	2068	key_count = len(parent_map)
	2069	if (revision.NULL_REVISION in result_parents
	2070	and revision.NULL_REVISION in missing_keys):
	2071	# If we pruned NULL_REVISION from the stop_keys because it's also
	2072	# in our cache of "missing" keys we need to increment our key count
	2073	# by 1, because the reconstituted SearchResult on the server will
	2074	# still consider NULL_REVISION to be an included key.
	2075	key_count += 1
	2076	included_keys = start_set.intersection(result_parents)
	2077	start_set.difference_update(included_keys)
	2078	return start_set, stop_keys, key_count
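The recipe computed above can be checked by hand on a tiny map. The following is a standalone Python 3 sketch of the same set arithmetic; `search_recipe_sketch` is a hypothetical name, and `NULL_REVISION` here is a local stand-in for `revision.NULL_REVISION` rather than an import from bzrlib.

```python
NULL_REVISION = 'null:'  # local stand-in, assumed to match bzrlib's constant


def search_recipe_sketch(parent_map, missing_keys):
    """Return (start_keys, stop_keys, key_count) for a cached parent_map."""
    if not parent_map:
        return [], [], 0
    start_set = set(parent_map)
    result_parents = set()
    for parents in parent_map.values():
        result_parents.update(parents)
    # Parents we never fetched are where the server should stop walking.
    stop_keys = result_parents.difference(start_set)
    # Ghosts (known-missing keys) do not need to be sent as stop points.
    stop_keys.difference_update(missing_keys)
    key_count = len(parent_map)
    if NULL_REVISION in result_parents and NULL_REVISION in missing_keys:
        # The server still counts NULL_REVISION as an included key.
        key_count += 1
    # Keys that are somebody's parent are reached by walking, so they
    # are not start keys.
    start_set.difference_update(start_set.intersection(result_parents))
    return start_set, stop_keys, key_count


# C -> B -> A, where A's own parents were never fetched.
recipe = search_recipe_sketch({'C': ('B',), 'B': ('A',)}, set())

# A single revision whose parent is the null revision, cached as missing.
null_recipe = search_recipe_sketch({'A': (NULL_REVISION,)}, {NULL_REVISION})
```

In the first case only `C` is a start key, `A` is the stop key, and two keys are counted. In the second the stop set is empty but the count is bumped to 2 to account for the null revision.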
	2079
	2080
3514.2.14 by John Arbash Meinel Bring in the code to collapse linear portions of the graph.	2081	def collapse_linear_regions(parent_map):
	2082	"""Collapse regions of the graph that are 'linear'.
	2083
	2084	For example::
	2085
	2086	A:[B], B:[C]
	2087
	2088	can be collapsed by removing B and getting::
	2089
	2090	A:[C]
	2091
	2092	:param parent_map: A dictionary mapping children to their parents
	2093	:return: Another dictionary with 'linear' chains collapsed
	2094	"""
	2095	# Note: this isn't a strictly minimal collapse. For example:
	2096	#   A
	2097	#  / \
	2098	# B   C
	2099	#  \ /
	2100	#   D
	2101	#   |
	2102	#   E
	2103	# Will not have 'D' removed, even though 'E' could fit. Also:
	2104	#   A
	2105	#   |    A
	2106	#   B => |
	2107	#   |    C
	2108	#   C
	2109	# A and C are both kept because they are edges of the graph. We could get
	2110	# rid of A if we wanted.
	2111	#   A
	2112	#  / \
	2113	# B   C
	2114	# |   |
	2115	# D   E
	2116	#  \ /
	2117	#   F
	2118	# Will not have any nodes removed, even though you do have an
	2119	# 'uninteresting' linear D->B and E->C
	2120	children = {}
	2121	for child, parents in parent_map.iteritems():
	2122	children.setdefault(child, [])
	2123	for p in parents:
	2124	children.setdefault(p, []).append(child)
	2125
	2126	orig_children = dict(children)
	2127	removed = set()
	2128	result = dict(parent_map)
	2129	for node in parent_map:
	2130	parents = result[node]
	2131	if len(parents) == 1:
	2132	parent_children = children[parents[0]]
	2133	if len(parent_children) != 1:
	2134	# This is not the only child
	2135	continue
	2136	node_children = children[node]
	2137	if len(node_children) != 1:
	2138	continue
	2139	child_parents = result.get(node_children[0], None)
	2140	if len(child_parents) != 1:
	2141	# This is not its only parent
	2142	continue
	2143	# The child of this node only points at it, and the parent only has
	2144	# this as a child. remove this node, and join the others together
2145	result[node_children[0]] = parents
2146	children[parents[0]] = node_children
2147	del result[node]
2148	del children[node]
2149	removed.add(node)
2150
2151	return result
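To see the collapse on the docstring's own example, here is a standalone Python 3 sketch of the same procedure. `collapse_linear_sketch` is a hypothetical name; the logic follows the listing above with the unused bookkeeping left out, so treat it as an illustration rather than the bzrlib function.

```python
def collapse_linear_sketch(parent_map):
    """Remove nodes that have exactly one parent and one child in a chain."""
    children = {}
    for child, parents in parent_map.items():
        children.setdefault(child, [])
        for p in parents:
            children.setdefault(p, []).append(child)

    result = dict(parent_map)
    for node in parent_map:
        parents = result[node]
        if len(parents) != 1:
            continue
        if len(children[parents[0]]) != 1:
            # The parent has other children, so node is a branch point.
            continue
        node_children = children[node]
        if len(node_children) != 1:
            continue
        if len(result[node_children[0]]) != 1:
            # The child is a merge, so node must stay.
            continue
        # Splice node out: its child now points straight at its parent.
        result[node_children[0]] = parents
        children[parents[0]] = node_children
        del result[node]
        del children[node]
    return result


two_step = collapse_linear_sketch({'A': ['B'], 'B': ['C']})
three_step = collapse_linear_sketch({'A': ['B'], 'B': ['C'], 'C': ['D']})
merge = {'A': ['B', 'C'], 'B': ['D'], 'C': ['D'], 'D': []}
merge_result = collapse_linear_sketch(merge)
```

The two linear chains collapse to a single edge from `A` to the last parent, while the merge graph comes back unchanged because none of its nodes sits in a purely linear run.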
4371.3.18 by John Arbash Meinel Change VF.annotate to use the new KnownGraph code.	2152
	2153
4819.2.3 by John Arbash Meinel Add a GraphThunkIdsToKeys as a tested class.	2154	class GraphThunkIdsToKeys(object):
	2155	"""Forwards calls about 'ids' to be about keys internally."""
	2156
	2157	def __init__(self, graph):
	2158	self._graph = graph
	2159
4913.4.4 by Jelmer Vernooij Add test for Repository.get_known_graph_ancestry().	2160	def topo_sort(self):
	2161	return [r for (r,) in self._graph.topo_sort()]
	2162
4819.2.3 by John Arbash Meinel Add a GraphThunkIdsToKeys as a tested class.	2163	def heads(self, ids):
	2164	"""See Graph.heads()"""
	2165	as_keys = [(i,) for i in ids]
	2166	head_keys = self._graph.heads(as_keys)
	2167	return set([h[0] for h in head_keys])
	2168
4913.4.2 by Jelmer Vernooij Add Repository.get_known_graph_ancestry.	2169	def merge_sort(self, tip_revision):
5988.1.1 by Jelmer Vernooij Fix GraphThunkIdsToKeys.merge_sort	2170	nodes = self._graph.merge_sort((tip_revision,))
	2171	for node in nodes:
	2172	node.key = node.key[0]
	2173	return nodes
4913.4.2 by Jelmer Vernooij Add Repository.get_known_graph_ancestry.	2174
5559.3.1 by Jelmer Vernooij Add GraphThunkIdsToKeys.add_node.	2175	def add_node(self, revision, parents):
	2176	self._graph.add_node((revision,), [(p,) for p in parents])
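The thunk only wraps each id in a 1-tuple on the way in and unwraps it on the way out. A minimal Python 3 sketch of that round trip follows; `FakeKeyGraph` is a hypothetical stand-in for a key-based graph (it is not the KnownGraph API), and `IdsToKeysSketch` mirrors just the `add_node` and `heads` methods shown above.

```python
class FakeKeyGraph(object):
    """Hypothetical key-based graph, only here to exercise the thunk."""

    def __init__(self):
        self.parents = {}

    def add_node(self, key, parent_keys):
        self.parents[key] = list(parent_keys)

    def _ancestors(self, key):
        seen = set()
        todo = list(self.parents.get(key, ()))
        while todo:
            k = todo.pop()
            if k not in seen:
                seen.add(k)
                todo.extend(self.parents.get(k, ()))
        return seen

    def heads(self, keys):
        # A candidate is a head if no other candidate descends from it.
        keys = set(keys)
        covered = set()
        for k in keys:
            covered.update(self._ancestors(k))
        return keys - covered


class IdsToKeysSketch(object):
    """Forward calls about plain ids to a graph that works on 1-tuples."""

    def __init__(self, graph):
        self._graph = graph

    def add_node(self, revision, parents):
        self._graph.add_node((revision,), [(p,) for p in parents])

    def heads(self, ids):
        head_keys = self._graph.heads([(i,) for i in ids])
        return set(h[0] for h in head_keys)


fake = FakeKeyGraph()
thunk = IdsToKeysSketch(fake)
thunk.add_node('rev-1', [])
thunk.add_node('rev-2', ['rev-1'])
thunk.add_node('rev-3', ['rev-1'])
found = thunk.heads(['rev-1', 'rev-2', 'rev-3'])
```

Callers pass and receive plain revision ids, while the wrapped graph only ever sees tuples such as `('rev-2',)`.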
	2177
4819.2.3 by John Arbash Meinel Add a GraphThunkIdsToKeys as a tested class.	2178
4371.3.38 by John Arbash Meinel Add a failing test for handling nodes that are in the same linear chain.	2179	_counters = [0,0,0,0,0,0,0]
4371.3.18 by John Arbash Meinel Change VF.annotate to use the new KnownGraph code.	2180	try:
	2181	from bzrlib._known_graph_pyx import KnownGraph
4574.3.6 by Martin Pool More warnings when failing to load extensions	2182	except ImportError, e:
4574.3.8 by Martin Pool Only mutter extension load errors when they occur, and record for later	2183	osutils.failed_to_load_extension(e)
4371.3.18 by John Arbash Meinel Change VF.annotate to use the new KnownGraph code.	2184	from bzrlib._known_graph_py import KnownGraph