4679.3.82
|
|
|
John Arbash Meinel |
15 years ago
|
|
|
4679.3.81
|
|
|
John Arbash Meinel |
15 years ago
|
|
|
4679.3.80
|
|
|
John Arbash Meinel |
15 years ago
|
|
|
4679.3.79
|
|
|
John Arbash Meinel |
15 years ago
|
|
|
4679.3.78
|
|
|
John Arbash Meinel |
15 years ago
|
|
|
4679.3.77
|
|
|
John Arbash Meinel |
15 years ago
|
|
|
4679.3.76
|
|
|
John Arbash Meinel |
15 years ago
|
|
|
4679.3.75
|
|
|
John Arbash Meinel |
15 years ago
|
|
|
4679.3.74
|
|
|
John Arbash Meinel |
15 years ago
|
|
|
4679.3.73
|
|
|
John Arbash Meinel |
15 years ago
|
|
|
4679.3.72
|
|
Things are ~ working again.
Man this is getting messy, I may have to change my mind again.
I tried to export the StaticTuple type only through _static_tuple_pyx.pyx by using a little bit of trickery. However, you end up with a circular import issue, and also I forgot to track down one place where I needed to rename '_static_tuple_c' => '_static_tuple_type_c'.
The idea was that _static_tuple_type.c would *only* define the type, and not any extra info. This way the code could be compiled with either cython or pyrex and still get the 'better' StaticTuple object.
It ended up, overall, just being a multi-hour mess trying to get the dependencies sorted out. By using a .pxd file, at least the basic circular import problem was sorted out.
However at this point, you *have* to import _static_tuple_pyx before _static_tuple_type_c or you get a segfault, and you have to import the latter if you want to get direct access to the class.
So at this point I feel like I either need to:
1) Go back to the way it was, and get rid of the circular import
2) Finish the rest of the steps, bring everything into Cython and say 'if you want the memory improvements, then you have to compile with cython.'
|
John Arbash Meinel |
15 years ago
|
|
|
4679.3.71
|
|
|
John Arbash Meinel |
15 years ago
|
|
|
4679.3.70
|
|
|
John Arbash Meinel |
15 years ago
|
|
|
4679.3.69
|
|
|
John Arbash Meinel |
15 years ago
|
|
|
4679.3.68
|
|
Some more performance data.
#1 thing is that StaticTupleInterner type is not actually in the GC, because it doesn't have any pure 'object' attributes. As such it doesn't implement tp_traverse either. Which means we don't track the size via Meliae as well as I would like.
However, in an interesting result, memory consumption is down *and* speed is better. On LP:
  259460KB => 243936KB (15MB) and 0m30.525s => 0m26.457s
My guess is that by not walking the intern dict, gc has a lot less work to do when walking all those keys that we know it will just ignore next.
Adding the 'self.hash' attribute brings memory consumption back up a bit: 252724KB, and time down barely at all: 0m25.756s.
Probably we could get more time performance out of StaticTupleInterner by making the internal table only point to StaticTuple objects, and then we could access their self->hash directly.
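As a rough illustration of the "not actually in the GC" point, CPython's `gc.is_tracked()` reports whether the cyclic collector visits a given object. This sketch uses built-in types only as stand-ins; it does not touch StaticTupleInterner itself, and it assumes a CPython interpreter.

```python
import gc

# Objects that cannot hold references to other objects are not tracked
# by the cyclic garbage collector, so a collection pass never visits them.
int_tracked = gc.is_tracked(42)          # atomic: not tracked
str_tracked = gc.is_tracked("file-id")   # atomic: not tracked

# Containers that can hold references are tracked and get traversed.
list_tracked = gc.is_tracked([])

print(int_tracked, str_tracked, list_tracked)
```

A C extension type that does not set the GC flag or provide tp_traverse behaves like the atomic cases here: the collector skips it, which saves work but also hides its contents from tools that rely on traversal.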
However, seeing this means I can move StaticTuple itself into a pyrex function, since I'll only have C attributes and a PyObject* table. I just hope I can figure out how to get a tp_traverse that Meliae can use...
Note that as a reference point, w/ bzr.dev: 326984KB, 53.4s
So we now have a 1.34:1 memory savings and 2:1 speed improvement.
The other guess on why StaticTupleInterner works well is that the hash functionality really does spread things out evenly, so we rarely get collisions. And if you don't have a collision, then you don't care about the time for hash() because it is only computed for the new key (which you have to do anyway) and not for any of the entries already present.
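The collision argument can be sketched with a toy table that, like the one described, stores only object references and no cached hashes. `CountingKey` and `TinyInterner` are hypothetical names made up for this illustration, not the actual StaticTupleInterner code; the table is fixed-size and assumes it never fills up.

```python
class CountingKey(object):
    """Key that counts how many times its hash is computed."""

    def __init__(self, value):
        self.value = value
        self.hash_calls = 0

    def __hash__(self):
        self.hash_calls += 1
        return hash(self.value)

    def __eq__(self, other):
        return isinstance(other, CountingKey) and self.value == other.value


class TinyInterner(object):
    """Open-addressing table holding only references (no stored hashes)."""

    def __init__(self, size=8):
        # size must be a power of two; this sketch never resizes.
        self._table = [None] * size

    def intern(self, key):
        mask = len(self._table) - 1
        h = hash(key)            # always needed for the incoming key
        i = h & mask
        while True:
            slot = self._table[i]
            if slot is None:     # empty slot: no existing entry was hashed
                self._table[i] = key
                return key
            # Occupied slot: only now must the stored entry be re-hashed.
            if hash(slot) == h and slot == key:
                return slot
            i = (i + 1) & mask   # linear probing


interner = TinyInterner()
a, b = CountingKey(1), CountingKey(2)
interner.intern(a)
interner.intern(b)
calls_without_collision = (a.hash_calls, b.hash_calls)

c = CountingKey(9)               # 9 & 7 == 1, same slot as a
interner.intern(c)
calls_after_collision = (a.hash_calls, b.hash_calls)
```

With no collisions each key is hashed once, on its own insertion. Once `c` lands on an occupied slot, the entries it probes past get hashed again, which is the cost that storing or directly reading a cached hash would avoid.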
Time for 'bzr log -r -1 -n0 bzr.dev' is at 691ms, which is ~= bzr.dev.
'bzr log -v bzr.dev' is 19.1s => 18.7s and 169MB => 197MB, which is a net loss... :( May have to revisit the CHKMap internals.
'bzr log -n0 -v bzr.dev' is 2m25s => 2m21s and 235MB => 244MB, which isn't a great tradeoff.
Strange that things don't seem to be a win in 'real-life', I wonder if somewhere we are casting things back into regular tuples that I missed. (tuple(tpl) is tpl, but tuple(static_tpl) is not static_tpl).
There are also possibilities that tuples are special cased in more places, or the custom __iter__ functionality or... etc.
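The casting concern can be shown with a stand-in class. `FakeStaticTuple` below is a hypothetical pure-Python sequence used only for illustration, not the real StaticTuple; the identity result for plain tuples is CPython behaviour.

```python
class FakeStaticTuple(object):
    """Hypothetical stand-in: a sequence that is not a built-in tuple."""

    def __init__(self, *items):
        self._items = items

    def __len__(self):
        return len(self._items)

    def __getitem__(self, index):
        return self._items[index]


plain = ("file-id", "rev-id")
fake = FakeStaticTuple("file-id", "rev-id")

same = tuple(plain)       # CPython returns the very same object
converted = tuple(fake)   # builds a brand new regular tuple

print(same is plain)
print(type(converted).__name__)
```

Any code path that calls tuple() on its input therefore silently swaps the compact object for an ordinary tuple, which would undo the memory savings without raising an error.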
Next up, mem testing on 64-bit.
|
John Arbash Meinel |
15 years ago
|
|
|
4679.3.67
|
|
|
John Arbash Meinel |
15 years ago
|
|
|
4679.3.66
|
|
|
John Arbash Meinel |
15 years ago
|
|
|
4679.3.65
|
|
|
John Arbash Meinel |
15 years ago
|
|
|
4679.3.64
|
|
|
John Arbash Meinel |
15 years ago
|
|
|
4679.3.63
|
|
|
John Arbash Meinel |
15 years ago
|
|
|