2025, Nov 05 13:00

Avoid shared state when storing custom Python objects with NumPy: use instance attributes and object arrays

Learn why updating a NumPy array on one object can change every object: class attributes are shared state. Fix it with instance attributes and a dtype=object container.

When building containers of custom Python objects alongside NumPy arrays, it’s easy to run into unexpected shared state. A common symptom is that updating one object appears to update all the others. The root cause isn’t in NumPy’s broadcasting, but in how attributes are defined on the class.

Problem setup

Consider a class that allocates a 3×2 NumPy array and an identifier, and a fixed-size container of four instances. After assigning a value into the array of just one instance, you discover that the change shows up in every instance.

import numpy as np

class Vertex():
    matrix = np.zeros((3, 2))
    tag = 0

verts = np.empty((4)).astype(Vertex)

for k in range(4):
    v = Vertex()
    v.tag = k
    verts[k] = v

# Later, a single update appears to affect all entries:
verts[1].matrix[0] = [5, 5]

By contrast, a pure NumPy approach behaves as expected: in a single 4×3×2 array, each arr[i] occupies its own region of the buffer, so writing to one block leaves the others untouched.

import numpy as np

arr = np.zeros((4, 3, 2))

def dump(x):
    # Print each 3×2 block of the array on its own line.
    for block in x:
        print("[", end="")
        for row in block:
            print(row, end="")
        print("] ")
    print()

dump(arr)
print("set arr[1][0] = [5, 5]...")
arr[1][0] = [5, 5]
dump(arr)

Why this happens

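The array and the identifier were placed on the class, not on each instance. Attributes defined at the class level are shared across all instances, so every object resolves matrix to the same underlying NumPy array. An in-place write such as verts[1].matrix[0] = [5, 5] looks up that one array and mutates it, which makes it appear as if all objects changed simultaneously. The tag does not show the same symptom because v.tag = k is an assignment: it creates a new instance attribute that shadows the class-level default instead of modifying it.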
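
The lookup behavior described above can be checked directly. The following is a minimal sketch that reuses the class from the problem setup; the names a, b, shared, a_attrs, b_attrs, and b_row are illustrative and not part of the original question.

```python
import numpy as np

class Vertex:
    matrix = np.zeros((3, 2))  # class attribute: one array for the whole class
    tag = 0                    # class attribute: a default value

a = Vertex()
b = Vertex()
a.tag = 1  # assignment creates an instance attribute on `a` only

# Both instances resolve `matrix` to the single array stored on the class.
shared = a.matrix is b.matrix and a.matrix is Vertex.matrix

# vars() lists attributes stored on the instance itself.
a_attrs = sorted(vars(a))  # `a` owns a `tag`
b_attrs = sorted(vars(b))  # `b` owns nothing and falls back to the class

a.matrix[0] = [5, 5]          # in-place write through `a` ...
b_row = b.matrix[0].tolist()  # ... is visible through `b`

print(shared, a_attrs, b_attrs, b_row)
```

Neither instance ever stores a matrix of its own, which is why the write through one is visible through the other.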

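In addition, a NumPy array intended to hold Python objects should be created with dtype=object. A numeric dtype cannot store object references, and np.empty((4)).astype(Vertex) works only indirectly: NumPy treats a Python class it does not recognize as the object dtype, so the cast produces an object array filled with leftover floats that are then overwritten. Requesting dtype=object up front states the intent directly.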
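
To see what the two ways of building the container actually produce, here is a small sketch. The empty Vertex class and the names cast and explicit are placeholders for illustration. It relies on NumPy mapping Python types it does not recognize to the object dtype, so both arrays should report dtype object; they differ in what the slots hold before you fill them.

```python
import numpy as np

class Vertex:
    pass

# A float array cast with the class used as a dtype.
cast = np.empty(4).astype(Vertex)

# An object array requested directly.
explicit = np.empty(4, dtype=object)

cast_is_object = cast.dtype == object
explicit_is_object = explicit.dtype == object

# The cast array holds Python floats converted from uninitialized memory;
# the explicit array starts out holding None in every slot.
cast_holds_floats = all(isinstance(item, float) for item in cast)
explicit_holds_none = all(item is None for item in explicit)

print(cast_is_object, explicit_is_object, cast_holds_floats, explicit_holds_none)
```

Either array accepts Vertex instances afterwards, but the explicit form avoids allocating and converting a float array that is immediately discarded.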

Fix and working example

Move the array and the identifier into the constructor so each instance owns its own state, and make the container an object array. After that, an in-place update on one instance remains local to that instance.

import numpy as np

class Vertex:
    def __init__(self):
        self.matrix = np.zeros((3, 2))
        self.tag = 0

# (4,) is a one-element shape tuple; dtype=object makes each slot hold a reference to a Python object.
verts = np.empty((4,), dtype=object)

for k in range(4):
    v = Vertex()
    v.tag = k
    verts[k] = v

# Now a single update affects only one instance:
verts[1].matrix[0] = [5, 5]

This approach ensures that each object has its own NumPy array and prevents cross-talk between instances. The object dtype on the outer container makes it a proper holder for arbitrary Python objects.
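
One way to verify that the instances are now independent is to ask NumPy whether any two of their arrays share memory. This is a sketch built on the fixed class; the names independent and changed are illustrative.

```python
import numpy as np

class Vertex:
    def __init__(self):
        self.matrix = np.zeros((3, 2))
        self.tag = 0

verts = np.empty((4,), dtype=object)
for k in range(4):
    v = Vertex()
    v.tag = k
    verts[k] = v

verts[1].matrix[0] = [5, 5]

# No pair of instances shares an underlying buffer.
independent = not any(
    np.shares_memory(verts[i].matrix, verts[j].matrix)
    for i in range(4)
    for j in range(i + 1, 4)
)

# Collect the tags of instances whose matrix contains any non-zero value.
changed = [v.tag for v in verts if v.matrix.any()]

print(independent, changed)
```

Only the instance with tag 1 should appear in changed, since the write touched its array alone.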

Why it matters

Accidental shared state can silently corrupt data flows in numerical code and simulations, where mutable arrays are frequently updated in place. Recognizing the difference between class attributes and instance attributes, and using object arrays when you actually want a collection of Python objects, helps avoid subtle bugs that are hard to trace.

Takeaways

Place per-object data in the constructor so each instance has its own copy. When you need a NumPy container of Python objects, explicitly allocate it with dtype=object and a proper shape. With these two practices, updating one object won’t unexpectedly ripple through the rest.
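
Both practices can be combined in a few lines. The sketch below is one possible arrangement, not the form used in the original answer: the tag and shape constructor parameters and the make_vertices helper are hypothetical names introduced here for illustration.

```python
import numpy as np

class Vertex:
    def __init__(self, tag=0, shape=(3, 2)):
        self.tag = tag
        # Allocated inside __init__, so every instance gets a fresh array.
        self.matrix = np.zeros(shape)

def make_vertices(n):
    """Hypothetical helper: an object array of n independent Vertex instances."""
    out = np.empty((n,), dtype=object)
    for k in range(n):
        out[k] = Vertex(tag=k)
    return out

verts = make_vertices(4)
verts[1].matrix[0] = [5, 5]

tags = [v.tag for v in verts]
sums = [float(v.matrix.sum()) for v in verts]

print(tags, sums)
```

Passing the tag to the constructor removes the separate assignment step, and keeping the allocation inside __init__ (and not in a default argument value) guarantees that no array is reused between instances.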

The article is based on a question from StackOverflow by user1069353 and an answer by Uchenna Adubasim.