4679.3.62
by John Arbash Meinel
Move some of the information into the pxd header file. |
1 |
# Copyright (C) 2009 Canonical Ltd
|
2 |
#
|
|
3 |
# This program is free software; you can redistribute it and/or modify
|
|
4 |
# it under the terms of the GNU General Public License as published by
|
|
5 |
# the Free Software Foundation; either version 2 of the License, or
|
|
6 |
# (at your option) any later version.
|
|
7 |
#
|
|
8 |
# This program is distributed in the hope that it will be useful,
|
|
9 |
# but WITHOUT ANY WARRANTY; without even the implied warranty of
|
|
10 |
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
|
11 |
# GNU General Public License for more details.
|
|
12 |
#
|
|
13 |
# You should have received a copy of the GNU General Public License
|
|
14 |
# along with this program; if not, write to the Free Software
|
|
15 |
# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
|
|
16 |
||
4679.3.76
by John Arbash Meinel
Rename StaticTupleInterner => SimpleSet. |
17 |
"""Interface definition of a class like PySet but without caching the hash.
|
18 |
||
19 |
This is generally useful when you want to 'intern' objects, etc. Note that this
|
|
20 |
differs from Set in that we:
|
|
21 |
1) Don't have all of the .intersection, .difference, etc functions
|
|
22 |
2) Do return the object from the set via queries
|
|
23 |
eg. SimpleSet.add(key) => saved_key and SimpleSet[key] => saved_key
|
|
24 |
"""
|
|
4679.3.62
by John Arbash Meinel
Move some of the information into the pxd header file. |
25 |
|
26 |
cdef extern from "Python.h": |
|
27 |
ctypedef struct PyObject: |
|
28 |
pass
|
|
29 |
||
4679.3.86
by John Arbash Meinel
some whitespace and expose the rest of the C api in the .pxd file. |
30 |
|
4679.3.76
by John Arbash Meinel
Rename StaticTupleInterner => SimpleSet. |
31 |
cdef public api class SimpleSet [object SimpleSetObject, type SimpleSet_Type]: |
32 |
"""A class similar to PySet, but with simpler implementation.
|
|
33 |
||
34 |
The main advantage is that this class uses only 2N memory to store N
|
|
35 |
objects rather than 4N memory. The main trade-off is that we do not cache
|
|
36 |
the hash value of saved objects. As such, it is assumed that computing the
|
|
37 |
hash will be cheap (such as strings or tuples of strings, etc.)
|
|
38 |
||
39 |
This also differs in that you can get back the objects that are stored
|
|
40 |
(like a dict), but we also don't implement the complete list of 'set'
|
|
41 |
operations (difference, intersection, etc).
|
|
42 |
"""
|
|
4679.3.88
by John Arbash Meinel
Some review comments from Andrew. |
43 |
# Data structure definition:
|
44 |
# This is a basic hash table using open addressing.
|
|
45 |
# http://en.wikipedia.org/wiki/Open_addressing
|
|
46 |
# Basically that means we keep an array of pointers to Python objects
|
|
47 |
# (called a table). Each location in the array is called a 'slot'.
|
|
48 |
#
|
|
49 |
# An empty slot holds a NULL pointer, a slot where there was an item
|
|
50 |
# which was then deleted will hold a pointer to _dummy, and a filled slot
|
|
51 |
# points at the actual object which fills that slot.
|
|
52 |
#
|
|
53 |
# The table is always a power of two, and the default location where an
|
|
54 |
# object is inserted is at hash(object) & (table_size - 1)
|
|
55 |
#
|
|
56 |
# If there is a collision, then we search for another location. The
|
|
57 |
# specific algorithm is in _lookup. We search until we:
|
|
58 |
# find the object
|
|
59 |
# find an equivalent object (by tp_richcompare(obj1, obj2, Py_EQ))
|
|
60 |
# find a NULL slot
|
|
61 |
#
|
|
62 |
# When an object is deleted, we set its slot to _dummy. this way we don't
|
|
63 |
# have to track whether there was a collision, and find the corresponding
|
|
64 |
# keys. (The collision resolution algorithm makes that nearly impossible
|
|
65 |
# anyway, because it depends on the upper bits of the hash.)
|
|
66 |
# The main effect of this, is that if we find _dummy, then we can insert
|
|
67 |
# an object there, but we have to keep searching until we find NULL to
|
|
68 |
# know that the object is not present elsewhere.
|
|
4679.3.62
by John Arbash Meinel
Move some of the information into the pxd header file. |
69 |
|
4679.3.81
by John Arbash Meinel
Fix up _simple_set_pyx.pyx to be compatible with pyrex again. |
70 |
cdef Py_ssize_t _used # active |
71 |
cdef Py_ssize_t _fill # active + dummy |
|
4679.3.88
by John Arbash Meinel
Some review comments from Andrew. |
72 |
cdef Py_ssize_t _mask # Table contains (mask+1) slots, a power of 2 |
4679.3.81
by John Arbash Meinel
Fix up _simple_set_pyx.pyx to be compatible with pyrex again. |
73 |
cdef PyObject **_table # Pyrex/Cython doesn't support arrays to 'object' |
4679.3.62
by John Arbash Meinel
Move some of the information into the pxd header file. |
74 |
# so we manage it manually
|
75 |
||
76 |
cdef PyObject *_get(self, object key) except? NULL |
|
4679.3.81
by John Arbash Meinel
Fix up _simple_set_pyx.pyx to be compatible with pyrex again. |
77 |
cdef object _add(self, key) |
78 |
cdef int _discard(self, key) except -1 |
|
4679.3.63
by John Arbash Meinel
Implement resizing. |
79 |
cdef int _insert_clean(self, PyObject *key) except -1 |
4679.3.81
by John Arbash Meinel
Fix up _simple_set_pyx.pyx to be compatible with pyrex again. |
80 |
cdef Py_ssize_t _resize(self, Py_ssize_t min_unused) except -1 |
4679.3.73
by John Arbash Meinel
start trying to expose everything just from cython. |
81 |
|
4679.3.86
by John Arbash Meinel
some whitespace and expose the rest of the C api in the .pxd file. |
82 |
|
4679.3.74
by John Arbash Meinel
Revert back to before I started trying to move to pyrex/cython code. |
83 |
# TODO: might want to export the C api here, though it is all available from
|
84 |
# the class object...
|
|
4679.3.81
by John Arbash Meinel
Fix up _simple_set_pyx.pyx to be compatible with pyrex again. |
85 |
cdef api SimpleSet SimpleSet_New() |
4679.3.86
by John Arbash Meinel
some whitespace and expose the rest of the C api in the .pxd file. |
86 |
cdef api object SimpleSet_Add(object self, object key) |
87 |
cdef api int SimpleSet_Contains(object self, object key) except -1 |
|
88 |
cdef api int SimpleSet_Discard(object self, object key) except -1 |
|
89 |
cdef api PyObject *SimpleSet_Get(SimpleSet self, object key) except? NULL |
|
90 |
cdef api Py_ssize_t SimpleSet_Size(object self) except -1 |
|
4932.1.2
by John Arbash Meinel
Clean up _simple_set.pyx, and handle 'nogil'. |
91 |
cdef api int SimpleSet_Next(object self, Py_ssize_t *pos, PyObject **key) except -1 |