This repository was archived by the owner on Jun 27, 2025. It is now read-only.

Commit b8b917e ("code cleanup")
Parent: 7fe65ad

3 files changed: 28 additions, 113 deletions

README.md (26 additions, 0 deletions)

```diff
@@ -5,4 +5,30 @@ arraymap
 The ArrayMap library provides dictionary-like lookup from NumPy array values to their integer positions. The hash table design and C implementation are based on [AutoMap](https://github.com/brandtbucher/automap), with extensive additions for direct support of NumPy arrays.
 
 
+Code: https://github.com/static-frame/arraymap
+
+Packages: https://pypi.org/project/arraymap
+
+
+
+Dependencies
+--------------
+
+ArrayMap requires the following:
+
+- Python >= 3.7
+- NumPy >= 1.18.5
+
+
+
+What is New in ArrayMap
+-------------------------
+
+
+
+
+0.1.0
+-------
+
+Initial release with NumPy integration.
 
```
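Conceptually, the lookup ArrayMap provides is the inverse of positional indexing: each array value maps back to its integer position. A pure-Python sketch of that idea (using a plain list in place of a NumPy array; the names here are illustrative, not ArrayMap's API):

```python
# Build a value -> position mapping, as ArrayMap does for a NumPy array.
values = ["2021-01", "2021-02", "2021-03"]
positions = {v: i for i, v in enumerate(values)}

assert positions["2021-03"] == 2  # lookup from value to integer position
assert positions["2021-01"] == 0
```

The C extension implements this mapping with the cache-friendly hash table described in the arraymap.c changes below.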

arraymap.c (2 additions, 107 deletions)

```diff
@@ -1,113 +1,8 @@
-// TODO: Rewrite performance tests using pyperf.
-// TODO: Group similar functionality.
-// TODO: Check refcounts when calling into hash and comparison functions.
-// TODO: Check allocation and cleanup.
-// TODO: Subinterpreter support.
-// TODO: Docstrings and stubs.
-// TODO: GC support.
 
+// For background on the hashtable design first implemented in AutoMap, see the following:
+// https://github.com/brandtbucher/automap/blob/b787199d38d6bfa1b55484e5ea1e89b31cc1fa72/automap.c#L12
 
-/*******************************************************************************
 
-Our use cases differ significantly from Python's general-purpose dict type, even
-when setting aside the whole immutable/grow-only and contiguous-integer-values
-stuff.
-
-What we don't care about:
-
-- Memory usage. Python's dicts are used literally everywhere, so a tiny
-  reduction in the footprint of the average dict results in a significant gain
-  for *all* Python programs. We are happy to instead trade a few extra bytes
-  of RAM for a more cache-friendly hash table design. Since we don't store
-  values, we are still close to the same size on average!
-
-- Worst-case performance. Again, Python's dicts are used for literally
-  everything, so they need to be able to gracefully handle lots of hash
-  collisions, whether resulting from bad hash algorithms, heterogeneous keys
-  with badly-combining hash algorithms, or maliciously-formed input. We can
-  safely assume that our use cases don't need to worry about these issues, and
-  instead choose lookup and collision resolution strategies that utilize cache
-  lines more effectively. This extends to the case of lookups for nonexistent
-  keys as well; we can assume that if our users are looking for something,
-  they know that it's probably there.
-
-What we do care about:
-
-- Creation and update time. This is *by far* the most expensive operation you
-  do on a mapping. More on this below.
-
-- The speed of lookups that result in hits. This is what the mapping is used
-  for, so it *must* be good. More on this below.
-
-- Iteration order and speed. You really can't beat a Python list or tuple
-  here, so we can just store the keys in one of them to avoid reinventing the
-  wheel. We use a list since it allows us to grow more efficiently.
-
-So what we need is a hash table that's easy to insert into and easy to scan.
-
-Here's how it works. A vanilla Python dict of the form:
-
-    {a: 0, b: 1, c: 2}
-
-...basically looks like this (assume the hashes are 3, 6, and 9):
-
-    Indices: [-, 2, -, 0, -, -, 1, -]
-
-    Hashes:  [3, 6, 9, -, -]
-    Keys:    [a, b, c, -, -]
-    Values:  [0, 1, 2, -, -]
-
-It's pretty standard; keys, values, and cached hashes are stored in sequential
-order, and their offsets are placed in the Indices table at position
-HASH % TABLE_SIZE. Though it's not used here, collisions are resolved by jumping
-around the table according to the following recurrence:
-
-    NEXT_INDEX = (5 * CURRENT_INDEX + 1 + (HASH >>= 5)) % TABLE_SIZE
-
-This is good in the face of bad hash algorithms, but is sorta expensive. It's
-also unable to utilize cache lines at all, since it's basically random (it's
-literally based on random number generation)!
-
```
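The CPython-style recurrence just quoted can be sketched in Python to make the probe order concrete (an illustrative model only; the function name is invented, and a real dict also requires the table size to be a power of two):

```python
def probe_cpython_style(hash_, table_size):
    """Yield slot indices in the order a CPython-style dict probes them.

    Models NEXT_INDEX = (5 * CURRENT_INDEX + 1 + (HASH >>= 5)) % TABLE_SIZE,
    starting from the initial slot HASH % TABLE_SIZE.
    """
    perturb = hash_
    index = hash_ % table_size
    while True:
        yield index
        perturb >>= 5  # mix in high bits of the hash, as the (HASH >>= 5) term does
        index = (5 * index + 1 + perturb) % table_size

probes = probe_cpython_style(6, 8)
assert next(probes) == 6  # initial slot: 6 % 8
assert next(probes) == 7  # (5*6 + 1 + (6 >> 5)) % 8
```

Once `perturb` has shifted down to zero, the sequence degenerates to `index = (5 * index + 1) % table_size`, which visits every slot when the table size is a power of two; those effectively random jumps are exactly what defeats cache locality.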
```diff
-To contrast, the same table looks something like this for us:
-
-    Indices: [-, -, -, 0, -, -, 1, -, -, 2, -, -, -, -, -, -, -, -, -]
-    Hashes:  [-, -, -, 3, -, -, 6, -, -, 9, -, -, -, -, -, -, -, -, -]
-
-    Keys: [a, b, c]
-
-Right away you can see that we don't need to store the values, because they
-match the indices (by design).
-
-Notice that even though we allocated enough space in our table for 19 entries,
-we still insert them into initial position HASH % 4. This leaves the whole
-15-element tail chunk of the table free for colliding keys. So, what's a good
-collision-resolution strategy?
-
-    NEXT_INDEX = CURRENT_INDEX + 1
-
-It's just a sequential scan! That means *every* collision-resolution lookup is
-hot in L1 cache (and can even be predicted and speculatively executed). The
-indices and hashes are actually interleaved for better cache locality as well.
-
-We repeat this scan 15 times. We don't even have to worry about wrapping around
-the edge of the table during this part, since we've left enough free space
-(equal to the number of scans) to safely run over the end. It's wasteful for a
-small example like this, but for more realistic sizes it's just about perfect.
-
-We then jump to another spot in the table using a version of the recurrence
-above:
-
-    NEXT_INDEX = (5 * (CURRENT_INDEX - 15) + 1 + (HASH >>= 1)) % TABLE_SIZE
-
-...and repeat the whole thing over again. This collision resolution strategy is
-similar to what Python's sets do, so we still handle some nasty collisions and
-missing keys well.
-
-There are a couple of other tricks that we use (like globally caching integer
-objects from value lookups), but the hardware-friendly hash table design is what
-really gives us our awesome performance.
-
-*******************************************************************************/
 # include <math.h>
 # define PY_SSIZE_T_CLEAN
 # include "Python.h"
```
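The scan-then-jump strategy described in the comment removed above can be sketched in Python (a model of the probe order only; `SCAN` and the function name are assumptions, and the real C implementation interleaves indices and hashes in a single array):

```python
SCAN = 15  # sequential probes per round, matching the comment's 15-element tail

def probe_automap_style(hash_, table_size):
    """Yield slot indices: SCAN sequential probes, then a perturbed jump.

    Models the removed comment's recurrence
        NEXT_INDEX = (5 * (CURRENT_INDEX - 15) + 1 + (HASH >>= 1)) % TABLE_SIZE
    The table is allocated with SCAN extra trailing slots, so the linear scan
    may safely run past TABLE_SIZE without wrapping.
    """
    index = hash_ % table_size
    while True:
        for slot in range(index, index + SCAN):
            yield slot  # sequential scan: stays hot in L1 cache
        hash_ >>= 1
        index = (5 * index + 1 + hash_) % table_size

probes = probe_automap_style(9, 4)
first_round = [next(probes) for _ in range(SCAN)]
assert first_round == list(range(1, 16))  # 9 % 4 == 1, then 14 sequential slots
assert next(probes) == 2                  # jump: (5*1 + 1 + (9 >> 1)) % 4
```

Because each round of `SCAN` probes is a plain linear walk, a miss costs contiguous cache-line reads rather than scattered ones; only the occasional jump pays the cost of a random access.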

tasks.py (0 additions, 6 deletions)

```diff
@@ -8,15 +8,12 @@
 
 @invoke.task
 def install(context):
-    # type: (invoke.Context) -> None
     run(context, f"{sys.executable} -m pip install --upgrade pip")
     run(context, f"{sys.executable} -m pip install --upgrade -r requirements.txt")
 
 
 @invoke.task()
 def clean(context):
-    # type: (invoke.Context) -> None
-    # run(context, f"{sys.executable} setup.py develop --uninstall")
     run(context, f"{sys.executable} -m pip uninstall --yes arraymap")
 
     for artifact in ("*.egg-info", "*.so", "build", "dist"):
@@ -26,12 +23,9 @@ def clean(context):
 
 @invoke.task(clean)
 def build(context):
-    # type: (invoke.Context) -> None
-    # run(context, f"{sys.executable} setup.py develop")
     run(context, f"{sys.executable} -m pip -v install .")
 
 
 @invoke.task(build)
 def test(context):
-    # type: (invoke.Context) -> None
     run(context, f"{sys.executable} -m pytest -v")
```
