2025, Nov 24 21:00

How to Rotate Coordinates from a NumPy Structured Array and Avoid the ufunc matmul Error

Learn why NumPy matmul fails on structured arrays and how to rotate (x,y,z) coordinates: convert with structured_to_unstructured, multiply, then write back.

Rotating particle coordinates straight from a NumPy structured array looks tempting, especially when the data mirrors LAMMPS output. But trying to apply matrix multiplication directly across fields usually ends with an unpleasant surprise: NumPy’s math ufuncs don’t operate across structured fields the way you might expect. Here’s a clean way to handle it without mangling the data layout.

Reproducing the issue

The setup uses a structured dtype for x, y, z, and attempts a rotation via matrix multiplication on the field subset. The code below mirrors that pattern.

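import numpy as np
# Identity matrix as a stand-in; any 3x3 rotation matrix fits here
rot_mtx = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=np.float64)
# One record per particle, with named float64 fields x, y, z
coord_sig = np.dtype([("x", np.float64), ("y", np.float64), ("z", np.float64)])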
cloud = np.array(
    [
        (0.0, 0.0, 0.0),
        (1.0, 0.0, 0.0),
        (0.0, 1.0, 0.0),
        (1.0, 1.0, 1.0),
    ],
    dtype=coord_sig,
)
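# Attempt to rotate in place; this line raises a TypeError because the
# multi-field selection is still a structured array, not an (n, 3) float array
cloud[["x", "y", "z"]] = cloud[["x", "y", "z"]] @ rot_mtx.T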

This raises a TypeError from the matmul ufunc:

ufunc 'matmul' did not contain a loop with signature matching types (dtype([('x', '<f8'), ('y', '<f8'), ('z', '<f8')]), dtype('float64')) -> None

What’s actually wrong

NumPy doesn’t support doing math across fields of a structured array. Selecting several fields still yields a one-dimensional array of records: its shape is (n,), and x, y, z live inside the dtype rather than along a second axis. A fielded record is great for organizing heterogeneous data in a single container, but it is not a good fit for multidimensional linear algebra. That’s why matrix multiplication across a selection of fields fails: the operation expects a regular float array of shape (n, 3), and matmul has no loop defined for a structured dtype with named fields.
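To make the shape mismatch concrete, the short sketch below inspects a multi-field selection and compares it with the unstructured version. The two-row array and the names subset and plain are illustrative choices, not part of the original setup.

```python
import numpy as np
from numpy.lib.recfunctions import structured_to_unstructured

coord_sig = np.dtype([("x", np.float64), ("y", np.float64), ("z", np.float64)])
cloud = np.array([(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)], dtype=coord_sig)

# Selecting several fields still gives a 1D array of records
subset = cloud[["x", "y", "z"]]
subset_shape = subset.shape        # one axis only: one record per particle
subset_names = subset.dtype.names  # the fields live in the dtype, not in an axis

try:
    subset @ np.eye(3)
    matmul_failed = False
except TypeError:
    # matmul has no loop registered for a structured dtype
    matmul_failed = True

# After conversion the same data is a regular 2D float array
plain = structured_to_unstructured(subset)
plain_shape = plain.shape

print(subset_shape, subset_names, matmul_failed, plain_shape)
```

The structured selection reports a single axis, while the converted array has the (n, 3) layout that matmul needs.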

There’s also a practical performance angle. Applying matmul to a plain 2D float array is typically the fastest path because it can go straight into compiled kernels without the overhead of field handling. With structured arrays, there is a trade-off between organizational convenience and calculation speed.
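Performance claims like this are best checked on your own data and hardware. The sketch below is one rough way to do that: it compares a hand-written per-field rotation against the convert-then-matmul path and confirms both give the same numbers. The function names, the array size, and the 30-degree angle are arbitrary choices for illustration, and the timings it prints will vary from machine to machine.

```python
import timeit

import numpy as np
from numpy.lib.recfunctions import structured_to_unstructured

coord_sig = np.dtype([("x", np.float64), ("y", np.float64), ("z", np.float64)])

# Arbitrary test size; scale it toward your real workload
n_points = 100_000
rng = np.random.default_rng(0)
cloud = np.zeros(n_points, dtype=coord_sig)
for name in coord_sig.names:
    cloud[name] = rng.random(n_points)

# Rotation by 30 degrees about the z axis
theta = np.pi / 6
c, s = np.cos(theta), np.sin(theta)
rot_mtx = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


def rotate_per_field(records, rotation):
    """Rotate by combining the x, y, z fields one output component at a time."""
    x, y, z = records["x"], records["y"], records["z"]
    rows = [rotation[i, 0] * x + rotation[i, 1] * y + rotation[i, 2] * z
            for i in range(3)]
    return np.stack(rows, axis=1)


def rotate_matmul(records, rotation):
    """Convert the coordinate fields to an (n, 3) float array, then matmul."""
    pts = structured_to_unstructured(records[["x", "y", "z"]], dtype=np.float64)
    return pts @ rotation.T


# Both paths should agree numerically before their speed is compared
same_result = np.allclose(rotate_per_field(cloud, rot_mtx),
                          rotate_matmul(cloud, rot_mtx))

t_fields = timeit.timeit(lambda: rotate_per_field(cloud, rot_mtx), number=5)
t_matmul = timeit.timeit(lambda: rotate_matmul(cloud, rot_mtx), number=5)
print(f"same result: {same_result}")
print(f"per-field: {t_fields:.4f} s, matmul: {t_matmul:.4f} s")
```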

And if your real-world dataset contains additional columns beyond coordinates (integers, booleans, even strings), keeping a structured layout is reasonable. You just don’t want to run the numerical kernels directly on the structured container.
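As an illustration of that mixed case, here is a minimal sketch with a hypothetical per-atom record. The field names id, element, and fixed are invented for the example and do not come from any particular LAMMPS dump format. Only the coordinate fields are pulled out for the math; copy=True is passed so the rotation works on a fresh, contiguous float array regardless of how the fields are laid out in the record.

```python
import numpy as np
from numpy.lib.recfunctions import structured_to_unstructured

# Hypothetical record layout: bookkeeping fields plus coordinates
atom_sig = np.dtype([
    ("id", np.int64),
    ("element", "U2"),
    ("fixed", np.bool_),
    ("x", np.float64),
    ("y", np.float64),
    ("z", np.float64),
])
atoms = np.array(
    [
        (1, "Fe", False, 1.0, 0.0, 0.0),
        (2, "O", True, 0.0, 1.0, 0.0),
    ],
    dtype=atom_sig,
)

# 180-degree rotation about z: (x, y, z) -> (-x, -y, z)
rot_mtx = np.diag([-1.0, -1.0, 1.0])

# Extract only the float fields into a plain (n, 3) array
pts = structured_to_unstructured(atoms[["x", "y", "z"]], dtype=np.float64, copy=True)
pts = pts @ rot_mtx.T

# Write the rotated coordinates back; the other fields are never touched
for col, name in enumerate(("x", "y", "z")):
    atoms[name] = pts[:, col]

print(atoms)
```

The integer, string, and boolean fields stay exactly as they were, since they never enter the numerical step.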

Solution: convert positions to an unstructured float array, rotate, then write back

A direct and readable approach is to convert the coordinate view to an unstructured (n, 3) float array, perform the rotation, and copy the results back into the fields. The code below makes that flow explicit.

import numpy as np
from numpy.lib.recfunctions import structured_to_unstructured as to_raw
rot_mtx = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=np.float64)
coord_sig = np.dtype([("x", np.float64), ("y", np.float64), ("z", np.float64)])
cloud = np.array(
    [
        (0.0, 0.0, 0.0),
        (1.0, 0.0, 0.0),
        (0.0, 1.0, 0.0),
        (1.0, 1.0, 1.0),
    ],
    dtype=coord_sig,
)
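# Expose the three float64 fields as a plain (n, 3) float array
pts = to_raw(cloud[["x", "y", "z"]], dtype=np.float64, copy=False)
# matmul returns a new array, so the rotated values must be written back
pts = pts @ rot_mtx.T
cloud["x"] = pts[:, 0]
cloud["y"] = pts[:, 1]
cloud["z"] = pts[:, 2]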

This keeps the data organized as a structured array for everything else you store alongside coordinates, while letting the math run on a plain float matrix where NumPy shines. The transformation step is explicit and easy to reason about.
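Because the example uses an identity matrix, its output looks the same as its input. A variant with a visible effect is sketched below: the same convert, rotate, write-back flow wrapped in a small helper and applied with a 90-degree rotation about z. The helper names rotate_xyz and rotation_about_z are made up for this sketch and are not NumPy functions.

```python
import numpy as np
from numpy.lib.recfunctions import structured_to_unstructured


def rotation_about_z(angle_rad):
    """Return the 3x3 matrix for a counterclockwise rotation about the z axis."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


def rotate_xyz(records, rotation, fields=("x", "y", "z")):
    """Rotate the named coordinate fields of a structured array in place."""
    # Pull the coordinate fields out as an independent (n, 3) float array
    pts = structured_to_unstructured(records[list(fields)], dtype=np.float64, copy=True)
    rotated = pts @ np.asarray(rotation, dtype=np.float64).T
    # Copy each rotated column back into its field
    for col, name in enumerate(fields):
        records[name] = rotated[:, col]
    return records


coord_sig = np.dtype([("x", np.float64), ("y", np.float64), ("z", np.float64)])
cloud = np.array(
    [
        (1.0, 0.0, 0.0),
        (0.0, 1.0, 0.0),
        (1.0, 1.0, 1.0),
    ],
    dtype=coord_sig,
)

rotate_xyz(cloud, rotation_about_z(np.pi / 2))
# Up to floating-point rounding:
# (1, 0, 0) -> (0, 1, 0), (0, 1, 0) -> (-1, 0, 0), (1, 1, 1) -> (-1, 1, 1)
print(cloud)
```

Passing the field names as a parameter is a design choice for this sketch; it lets the same helper serve records that name their coordinates differently.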

Why this matters

When working with large simulations of tens of millions of atoms, it pays to keep data organization separate from computation. Structured arrays remain a solid choice for heterogeneous records, but the numerical kernels should operate on homogeneous arrays. In practice, matrix multiplication is typically fastest on a standalone (n, 3) float array, and converting the coordinate fields to such an array is a clear and maintainable way to achieve that. It also aligns with how NumPy is designed: records for storage, ndarrays for math.

Wrap-up

If you hit ufunc 'matmul' errors with structured arrays, don’t fight the type system. Extract the coordinate block with structured_to_unstructured, apply the transform on a regular float array, and write the results back into the fields. You’ll keep your data model intact, the code remains clear, and the math runs efficiently. If your dataset grows with more non-float fields, this pattern scales without changing the surrounding structure.