2025, Nov 05 21:00

Fixing Python dict membership errors: implement hash to match eq and avoid KeyError

Why Python dict membership can differ from in checks when __eq__ and __hash__ mismatch. Learn fixes, hashable key rules, mutability caveats, and avoid KeyError.

When you rely on Python’s in operator to check membership, the documentation suggests a simple mental model: for container types, x in y behaves like any(x is e or x == e for e in y). That model holds nicely for sequences, but it can fall apart with mappings like dict if your keys are not truly hashable. Here is a minimal, reproducible walk-through that demonstrates why dictionary membership can differ from a naive any(...) over the same keys.

The problem, reproduced

class Box:
    def __init__(self, payload):
        self.payload = payload
    def __eq__(self, other):
        return self.payload == other.payload
    __hash__ = object.__hash__

x = Box(1)
y = Box(2)
z = Box(3)

seq_boxes = [x, y, z]

print("Box(2) in seq_boxes:")
print("   ", Box(2) in seq_boxes)

Box(2) in seq_boxes:
    True

The same object model as a key in a mapping suddenly behaves differently.

box_map = {x: "ekam", y: "dve", z: "trini"}

print("box_map[Box(2)]:")
try:
    print(box_map[Box(2)])
except KeyError:
    print("    KeyError")

print()
print("Box(2) in box_map:")
print("   ", Box(2) in box_map)

box_map[Box(2)]:
    KeyError

Box(2) in box_map:
    False

If the docs say it’s equivalent to any(x is e or x == e for e in y), why does a direct test appear to contradict what dict reports?

def contains_probe(item, container):
    return any(item is cand or item == cand for cand in container)

print("contains_probe(Box(2), box_map):")
print("   ", contains_probe(Box(2), box_map))

contains_probe(Box(2), box_map):
    True

Iterating the mapping’s keys shows that equality does hit:

def probe_trace(item, container):
    for cand in container:
        print(item is cand, item == cand)

print("probe_trace(Box(2), box_map):")
probe_trace(Box(2), box_map)

probe_trace(Box(2), box_map):
False False
False True
False False

What’s actually going on

The core issue is not the in operator. It is that dict is a mapping and mappings operate on hashable keys. The relevant definition is unambiguous: “Hashable objects which compare equal must have the same hash value.” If you define equality on your class but leave __hash__ pointing at object.__hash__, you get identity-based hashing while your equality compares contents. Two distinct Box(2) instances compare equal, but they do not share the same hash. That breaks the invariants dict depends on for membership and lookup.

Mappings are built on algorithms that assume equal keys share a hash. This assumption is what enables constant-time lookups. When that assumption is violated, membership tests and indexing against a new but equal instance will not behave as you expect.

For container types such as list, tuple, set, frozenset, dict, or collections.deque, the expression x in y is equivalent to any(x is e or x == e for e in y).

That equivalence is about semantics at the level of containment, not an implementation guarantee that bypasses hashing in mappings. With dict, the membership operation uses hashing as part of its normal algorithm. If equal objects do not share a stable hash, lookups fail even though a naive any(...) over the keys would find equality.

Visible implications in real code

The consequences show up immediately when you insert seemingly “equal” keys into a dict. Instead of updating an existing entry, you get multiple entries for the same logical key.

>>> m = {Box(1): "first time"}
>>> len(m)
1
>>> m[Box(1)] = "second time"
>>> len(m)
2

Indexing with a fresh but equal instance will not find the entry either, because the hash used to place the key in the table does not match that of the new instance. The mapping is allowed to assume that equal keys hash the same, and it uses that to decide where to look.

The fix

Make the hash consistent with equality. If your equality compares payload, your hash must derive from payload as well.

class Box:
    def __init__(self, payload):
        self.payload = payload
    def __eq__(self, other):
        return self.payload == other.payload
    def __hash__(self):
        return hash(self.payload)

With this change, as long as the payload values you store are themselves hashable, dict operations, including membership and indexing, will align with your equality. “A mapping object maps hashable values to arbitrary objects,” and by providing a valid hash you meet that contract.

A crucial caveat about mutability

Hashable also means the hash value never changes during the object’s lifetime. If you mutate the state that determines your hash after using the object as a key, the mapping can no longer find it. The behavior below is expected when the key’s hash changes post-insert.

>>> k = Box(1)
>>> store = {}
>>> store[k] = "one"
>>> k in store
True
>>> k.payload = 3
>>> k in store
False
>>> # Indexing and lookups will now fail with KeyError for any variant

Once a key’s hash-affecting state changes, the original entry becomes effectively unreachable through normal dict operations. The requirement that “a hash value never changes during its lifetime” is not optional if the object is used as a key.

Why this matters

Relying on equality without a matching hash undermines how Python’s mappings work. Code that looks correct at first glance will produce silent logic errors: duplicate keys for identical values, membership tests that fail, and lookups that misbehave. Understanding that dict membership is built on the hash-equality contract prevents hard-to-debug issues and preserves the constant-time performance characteristics mappings are designed to provide.

Takeaways

If you define __eq__ for a class intended as a dict key or a set element, you must also define __hash__ so that equal instances share the same hash. Do not mutate fields that participate in the hash while the object is used as a key. When those constraints are respected, the in operator, dictionary membership, and direct indexing all behave consistently with your equality semantics.

The article is based on a question from StackOverflow by jamadagni and an answer by joanis.

python

Fixing Python dict membership errors: implement __hash__ to match __eq__ and avoid KeyError