2025, Sep 19 09:00

Sharing Python objects across processes: why raw shared memory fails and the correct pickle-based approach

Learn why Python's multiprocessing.shared_memory can't hold live objects, what breaks with pointer-style casts, and how to share data safely via pickle.

Sharing a complex Python object across processes via multiprocessing.shared_memory looks tempting, especially if you are used to C++-style pointer casting. But Python does not allow you to reinterpret arbitrary bytes in shared memory as a live instance of an arbitrary class. Here is why, what goes wrong with the naive approach, and what the canonical solution looks like.

Problem setup

Imagine you create a shared memory block, define a generic class, and then try to put an instance into that shared memory by simply assigning it to the buffer reference. The code below illustrates the idea that does not work.

from multiprocessing import shared_memory
mem_region = shared_memory.SharedMemory(create=True, size=1024)
view = mem_region.buf
class Payload:
    def __init__(self, x, y):
        self.x = x
        self.y = y
item_a = Payload(1, 6)
view = item_a  # This does not place the object into shared memory

And in another process you try to open that region by name and recover the object:

from multiprocessing import shared_memory
opened_region = shared_memory.SharedMemory(name='psm_21467_46075')

The question is how to get a variable, say item_b, to point to the shared Payload object, as you might do in C++ by casting a void * to the desired type.
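Before getting to the answer, note that the assignment above never writes to the shared buffer at all; it only rebinds the local name view. A minimal check makes this visible (a freshly created block is zero-filled):

```python
from multiprocessing import shared_memory

mem_region = shared_memory.SharedMemory(create=True, size=1024)
view = mem_region.buf

class Payload:
    def __init__(self, x, y):
        self.x = x
        self.y = y

view = Payload(1, 6)              # rebinds the name 'view'; the buffer is untouched
head = bytes(mem_region.buf[:4])  # still b'\x00\x00\x00\x00'
print(head)
mem_region.close()
mem_region.unlink()
```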

What actually goes wrong

In Python, objects are managed by the runtime. A Python instance does have a concrete byte representation in memory, but those bytes alone are not enough to reconstruct a working object in a different process. Instances carry references to other Python objects, including a reference to their class object. Even if both processes import the same module, the class object will live at a different address in the other interpreter. You might hit the same address by luck, but you cannot rely on it. Any internal references would break when blindly reinterpreting raw bytes.
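One way to see this concretely, in a CPython-specific sketch that assumes a standard (non-free-threaded) build: the header of every object is a reference count followed by a raw pointer to the class object, an address that is only meaningful inside the current interpreter.

```python
import ctypes

class Payload:
    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Payload(1, 6)
# In a standard CPython build the object header is ob_refcnt (a Py_ssize_t)
# followed by ob_type, a raw pointer to the class object.
offset = ctypes.sizeof(ctypes.c_ssize_t)
type_ptr = ctypes.c_void_p.from_address(id(p) + offset).value
print(type_ptr == id(Payload))  # the instance's bytes embed an
                                # interpreter-local address
```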

There is also reference counting to consider. If a second interpreter suddenly started treating those bytes as a live object, its reference count would diverge from the original interpreter's accounting, corrupting object lifetime management in subtle ways.

In static languages like C++ the compiler bakes in the memory layout of a type. The runtime does not need to know about types to fetch a field at a fixed offset. In Python the class an object belongs to is stored in the instance itself as a reference, and the layout for attributes is discovered dynamically.
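This dynamism is easy to observe: the class is reached through a reference stored in the instance, and regular attributes live in a per-instance dict rather than at compiler-fixed offsets.

```python
class Payload:
    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Payload(1, 6)
print(type(p).__name__)  # the class is found via a reference held by the instance
print(p.__dict__)        # {'x': 1, 'y': 6}: attribute access is a dict lookup
```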

About subinterpreters

From Python 3.12 onward, subinterpreters are accessible from Python code. Within the same process, directly accessing the same memory can sometimes appear to work; there is code demonstrating this approach at https://github.com/jsbueno/extrainterpreters (not updated for Python 3.14). Even then, this is not the recommended way to share live objects: reference counting remains a problem, and attributes that reference other containers or instances can be accessed in parallel without the protection of the GIL (or of the finer-grained locks used in free-threaded builds).

The approach that works: serialize

The canonical path is to serialize the object in one process, copy the serialized bytes into shared memory, and deserialize in the other process. Python’s built-in pickle handles this.

Place your class definition in a module importable by both processes. For example, in a file typespec.py:

class Payload:
    def __init__(self, x, y):
        self.x = x
        self.y = y

In the writer process:

from multiprocessing import shared_memory
from typespec import Payload
import pickle
region = shared_memory.SharedMemory(create=True, size=1024)
print(region.name)
obj_writer = Payload(5, 6)
blob = pickle.dumps(obj_writer)
region.buf[0:len(blob)] = blob

In the reader process:

from multiprocessing import shared_memory
import pickle
region_opened = shared_memory.SharedMemory("psm_ff9c5e26")  # the name printed by the writer
obj_reader = pickle.loads(region_opened.buf)

You do not need to explicitly import the module in the reader process: the serialized payload records the class's __module__, and pickle imports it as needed before re-instantiating the object. Note also that pickle.loads ignores any bytes past the end of the pickled stream, so the unused tail of the shared block is harmless.
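For a self-contained sketch, the whole round trip can be exercised in a single process, attaching to the block by name exactly as a second process would (doing both sides in one process only keeps the example runnable as-is):

```python
from multiprocessing import shared_memory
import pickle

class Payload:
    def __init__(self, x, y):
        self.x = x
        self.y = y

# Writer side: serialize and copy the bytes into the shared block.
writer = shared_memory.SharedMemory(create=True, size=1024)
blob = pickle.dumps(Payload(5, 6))
writer.buf[:len(blob)] = blob

# Reader side: attach by name and deserialize. pickle stops at the end of
# the stream, so the zero-filled tail of the block is ignored.
reader = shared_memory.SharedMemory(name=writer.name)
obj = pickle.loads(bytes(reader.buf))
print(obj.x, obj.y)  # 5 6

reader.close()
writer.close()
writer.unlink()  # free the block once both sides are done
```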

There is overhead compared to using an object in place (roughly a couple of orders of magnitude), but this is the standard, supported way.

Practical angle

If your objects combine regular Python attributes with large numeric buffers (for example, a dataframe’s underlying data), it may be possible to serialize the lightweight structure while sharing large buffers without copying for speed. Achieving that is non-trivial. As a starting point, see PEP 574. Projects like Dask implement similar ideas and can be faster than naive pickle-only flows.
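A rough sketch of the PEP 574 idea using only the standard library: with pickle protocol 5, a buffer_callback hands large buffers out of band, so the pickle stream itself stays tiny and the payload need not be copied into it. Here a plain bytearray stands in for a large numeric buffer.

```python
import pickle

payload = bytearray(b"x" * 1_000_000)  # stand-in for a large numeric buffer

buffers = []
# Protocol 5 (PEP 574): the callback receives each large buffer separately
# instead of having its contents embedded in the pickle stream.
meta = pickle.dumps(pickle.PickleBuffer(payload), protocol=5,
                    buffer_callback=buffers.append)
print(len(meta))  # only lightweight metadata in the stream itself

# Deserialization pairs the stream back up with the out-of-band buffers.
restored = pickle.loads(meta, buffers=buffers)
print(bytes(restored) == bytes(payload))
```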

Sharing live state instead of bytes

If you need to observe and mutate object attributes across processes, so that setting obj.a = 5 in one process is safely visible in another, look at multiprocessing.Manager and the multiprocessing.managers module. They provide a higher-level mechanism for cross-process object access, brokered through an additional server process.

Why this matters

Treating shared memory as a universal container for arbitrary Python instances is unreliable because of interpreter-managed object identity, internal references, and reference counting. The supported path—serialization—keeps processes independent, avoids undefined behavior, and integrates with Python’s multiprocessing and concurrent.futures primitives, which already use serialization under the hood. The trade-off is overhead, but correctness wins here.

Conclusion

Do not attempt to cast raw shared memory to a Python class instance as you would with void * in C++. Use pickle to serialize into shared memory and deserialize on the other side. When you need live, coordinated state across processes, consider multiprocessing.Manager. If your workloads mix metadata with large binary payloads, explore approaches in the spirit of PEP 574 or adopt tooling that already optimizes this path. Above all, lean on the mechanisms Python provides rather than fighting the runtime's object model.

The article is based on a question from StackOverflow by Geremia and an answer by jsbueno.