~bzr-pqm/bzr/bzr.dev

4679.3.62 by John Arbash Meinel
Move some of the information into the pxd header file.
1
# Copyright (C) 2009 Canonical Ltd
2
#
3
# This program is free software; you can redistribute it and/or modify
4
# it under the terms of the GNU General Public License as published by
5
# the Free Software Foundation; either version 2 of the License, or
6
# (at your option) any later version.
7
#
8
# This program is distributed in the hope that it will be useful,
9
# but WITHOUT ANY WARRANTY; without even the implied warranty of
10
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
11
# GNU General Public License for more details.
12
#
13
# You should have received a copy of the GNU General Public License
14
# along with this program; if not, write to the Free Software
15
# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
16
4679.3.76 by John Arbash Meinel
Rename StaticTupleInterner => SimpleSet.
17
"""Interface definition of a class like PySet but without caching the hash.
18
19
This is generally useful when you want to 'intern' objects, etc. Note that this
20
differs from Set in that we:
21
  1) Don't have all of the .intersection, .difference, etc functions
22
  2) Do return the object from the set via queries
23
     eg. SimpleSet.add(key) => saved_key and SimpleSet[key] => saved_key
24
"""
4679.3.62 by John Arbash Meinel
Move some of the information into the pxd header file.
25
26
cdef extern from "Python.h":
27
    ctypedef struct PyObject:
28
        pass
29
4679.3.86 by John Arbash Meinel
some whitespace and expose the rest of the C api in the .pxd file.
30
4679.3.76 by John Arbash Meinel
Rename StaticTupleInterner => SimpleSet.
31
cdef public api class SimpleSet [object SimpleSetObject, type SimpleSet_Type]:
32
    """A class similar to PySet, but with simpler implementation.
33
34
    The main advantage is that this class uses only 2N memory to store N
35
    objects rather than 4N memory. The main trade-off is that we do not cache
36
    the hash value of saved objects. As such, it is assumed that computing the
37
    hash will be cheap (such as strings or tuples of strings, etc.)
38
39
    This also differs in that you can get back the objects that are stored
40
    (like a dict), but we also don't implement the complete list of 'set'
41
    operations (difference, intersection, etc).
42
    """
4679.3.88 by John Arbash Meinel
Some review comments from Andrew.
43
    # Data structure definition:
44
    #   This is a basic hash table using open addressing.
45
    #       http://en.wikipedia.org/wiki/Open_addressing
46
    #   Basically that means we keep an array of pointers to Python objects
47
    #   (called a table). Each location in the array is called a 'slot'.
48
    #
49
    #   An empty slot holds a NULL pointer, a slot where there was an item
50
    #   which was then deleted will hold a pointer to _dummy, and a filled slot
51
    #   points at the actual object which fills that slot.
52
    #
53
    #   The table is always a power of two, and the default location where an
54
    #   object is inserted is at hash(object) & (table_size - 1)
55
    #
56
    #   If there is a collision, then we search for another location. The
57
    #   specific algorithm is in _lookup. We search until we:
58
    #       find the object
59
    #       find an equivalent object (by tp_richcompare(obj1, obj2, Py_EQ))
60
    #       find a NULL slot
61
    #
62
    #   When an object is deleted, we set its slot to _dummy. this way we don't
63
    #   have to track whether there was a collision, and find the corresponding
64
    #   keys. (The collision resolution algorithm makes that nearly impossible
65
    #   anyway, because it depends on the upper bits of the hash.)
66
    #   The main effect of this, is that if we find _dummy, then we can insert
67
    #   an object there, but we have to keep searching until we find NULL to
68
    #   know that the object is not present elsewhere.
4679.3.62 by John Arbash Meinel
Move some of the information into the pxd header file.
69
4679.3.81 by John Arbash Meinel
Fix up _simple_set_pyx.pyx to be compatible with pyrex again.
70
    cdef Py_ssize_t _used   # active
71
    cdef Py_ssize_t _fill   # active + dummy
4679.3.88 by John Arbash Meinel
Some review comments from Andrew.
72
    cdef Py_ssize_t _mask   # Table contains (mask+1) slots, a power of 2
4679.3.81 by John Arbash Meinel
Fix up _simple_set_pyx.pyx to be compatible with pyrex again.
73
    cdef PyObject **_table  # Pyrex/Cython doesn't support arrays to 'object'
4679.3.62 by John Arbash Meinel
Move some of the information into the pxd header file.
74
                            # so we manage it manually
75
76
    cdef PyObject *_get(self, object key) except? NULL
4679.3.81 by John Arbash Meinel
Fix up _simple_set_pyx.pyx to be compatible with pyrex again.
77
    cdef object _add(self, key)
78
    cdef int _discard(self, key) except -1
4679.3.63 by John Arbash Meinel
Implement resizing.
79
    cdef int _insert_clean(self, PyObject *key) except -1
4679.3.81 by John Arbash Meinel
Fix up _simple_set_pyx.pyx to be compatible with pyrex again.
80
    cdef Py_ssize_t _resize(self, Py_ssize_t min_unused) except -1
4679.3.73 by John Arbash Meinel
start trying to expose everything just from cython.
81
4679.3.86 by John Arbash Meinel
some whitespace and expose the rest of the C api in the .pxd file.
82
4679.3.74 by John Arbash Meinel
Revert back to before I started trying to move to pyrex/cython code.
83
# TODO: might want to export the C api here, though it is all available from
84
#       the class object...
4679.3.81 by John Arbash Meinel
Fix up _simple_set_pyx.pyx to be compatible with pyrex again.
85
cdef api SimpleSet SimpleSet_New()
4679.3.86 by John Arbash Meinel
some whitespace and expose the rest of the C api in the .pxd file.
86
cdef api object SimpleSet_Add(object self, object key)
87
cdef api int SimpleSet_Contains(object self, object key) except -1
88
cdef api int SimpleSet_Discard(object self, object key) except -1
89
cdef api PyObject *SimpleSet_Get(SimpleSet self, object key) except? NULL
90
cdef api Py_ssize_t SimpleSet_Size(object self) except -1
91
cdef api int SimpleSet_Next(object self, Py_ssize_t *pos, PyObject **key)