2026, Jan 06 21:00

NumPy Boolean Masking Meets Slicing: Understanding Advanced Indexing Shape Reordering and Keeping Axis Order

Learn why NumPy boolean indexing with slicing reorders shapes and how advanced indexing behaves, with ways to keep axis order using indexing or numpy.compress

When boolean indexing in NumPy meets slicing, the resulting shape can look like it has been silently transposed. Indexing a 3D array with an integer, a slice, and a mask can return a result whose dimensions are reordered, even though nothing in the code explicitly asks for a transpose. Understanding why this happens saves time when building array pipelines and prevents subtle shape bugs.

Reproducing the surprise

import numpy as np
data3d = np.empty((2, 10, 5))
print(data3d.shape)            # (2, 10, 5)
print(data3d[0].shape)         # (10, 5)
print(data3d[0, :, :].shape)   # (10, 5)
mask_vec = [True, True, True, False, False]
print(data3d[0, :, mask_vec].shape)  # (3, 10)  <-- looks "transposed"

It’s natural to expect a shape of (10, 3) after applying the mask to the last axis of a (2, 10, 5) array. Instead, the result reports (3, 10). With another array, the same mask can look perfectly intuitive:

grid2d = np.empty((2, 5))
print(grid2d.shape)            # (2, 5)
print(grid2d[0].shape)         # (5,)
print(grid2d[0, :].shape)      # (5,)
print(grid2d[:, mask_vec].shape)  # (2, 3)

What’s actually happening

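NumPy boolean masks do more than “filter” elements. A boolean mask is an advanced index, and advanced indexing can restructure the output array. In data3d[0, :, mask_vec], the integer 0 is also treated as an advanced index because it appears alongside the mask, so the expression contains two advanced indices separated by a slice. In that documented case, NumPy has no unambiguous position for the advanced-index dimensions, so it places them first in the result and appends the sliced dimensions after them. The scalar 0 and the mask broadcast together into a single dimension of length 3, which leads the result, and the sliced axis of length 10 follows. That’s why the (10, 3) you might expect shows up as (3, 10). In grid2d[:, mask_vec] the mask is the only advanced index, so its dimension stays where it was and the result is (2, 3).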
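To see where the advanced-index dimension lands, the sketch below compares three index expressions on an array of the same shape. The names arr and mask are chosen for illustration, and the mask is written as an explicit boolean array rather than a list.

```python
import numpy as np

arr = np.arange(100).reshape(2, 10, 5)
mask = np.array([True, True, True, False, False])

# Scalar and mask separated by a slice: the advanced dimension moves to the front.
separated = arr[0, :, mask]

# Scalar and mask next to each other: the advanced dimension stays in place.
adjacent = arr[:, 0, mask]

# A length-1 slice instead of a scalar: the mask is the only advanced index,
# so nothing is reordered and the first axis is kept with length 1.
only_mask = arr[0:1, :, mask]

print(separated.shape)  # (3, 10)
print(adjacent.shape)   # (2, 3)
print(only_mask.shape)  # (1, 10, 3)
```

Only the first expression reorders axes, because it is the only one in which a slice sits between two advanced indices.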

This behavior matches selecting the last axis via explicit integer indices. The following expressions are equivalent in result and order:

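# np.empty leaves values uninitialized (possibly NaN), so treat NaNs as equal
check = np.array_equal(data3d[0, :, [0, 1, 2]], data3d[0, :, mask_vec], equal_nan=True)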
print(check)  # True
print(data3d[0, :, [0, 1, 2]].shape)  # (3, 10)

There’s a simple way to avoid the reordering in this case. Performing the indexing in two steps keeps the axis order as you’d intuitively expect:

print(data3d[0][:, mask_vec].shape)  # (10, 3)
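As a quick check that the two forms differ only in axis order, the following sketch compares the one-step and two-step results on an array with known values. The names arr and mask are illustrative.

```python
import numpy as np

arr = np.arange(100).reshape(2, 10, 5)
mask = np.array([True, True, True, False, False])

one_step = arr[0, :, mask]   # advanced dimension first: shape (3, 10)
two_step = arr[0][:, mask]   # original axis order kept: shape (10, 3)

# The same elements are selected; one result is the transpose of the other.
same_values = np.array_equal(one_step.T, two_step)
print(same_values)  # True
```

Because arr[0] is basic indexing and returns a view, the two-step form does not add an extra copy before the mask is applied.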

How to keep the original axis order

If you want to preserve the original dimensions while applying a boolean condition along a specific axis, use numpy.compress with an explicit axis. This selects elements along the chosen axis and maintains the rest of the layout as is.

preserved = np.compress(mask_vec, data3d[0], axis=1)
print(preserved.shape)  # (10, 3)
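numpy.compress can also be applied to the full 3D array before picking the first block. The sketch below, with illustrative names arr and mask, checks that this matches both plain mask indexing on the last axis and the per-block call.

```python
import numpy as np

arr = np.arange(100).reshape(2, 10, 5)
mask = np.array([True, True, True, False, False])

# Filter the last axis of the whole array; axes 0 and 1 are left untouched.
kept = np.compress(mask, arr, axis=2)
print(kept.shape)  # (2, 10, 3)

# With the mask as the only advanced index, plain indexing gives the same result.
matches_indexing = np.array_equal(kept, arr[:, :, mask])

# Taking the first block afterwards matches compressing arr[0] along axis 1.
matches_block = np.array_equal(kept[0], np.compress(mask, arr[0], axis=1))
print(matches_indexing, matches_block)  # True True
```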

Why this matters

Shape reordering during mixed basic and advanced indexing is subtle and easy to miss. It can ripple through code that assumes a particular axis order, leading to mismatched dimensions in matrix operations, incorrect broadcasting, or silently incorrect aggregations. Recognizing that a boolean mask may restructure the result, and knowing how to keep axis order stable when needed, makes multidimensional indexing predictable.
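The reordering is hardest to spot when the mask selects as many elements as the sliced axis is long, because both results then have the same shape. The sketch below uses a small illustrative array, arr, with shape (2, 3, 5) to show a row sum that runs without error yet aggregates the wrong axis.

```python
import numpy as np

arr = np.arange(30).reshape(2, 3, 5)
mask = np.array([True, True, True, False, False])

one_step = arr[0, :, mask]   # masked axis first
two_step = arr[0][:, mask]   # original axis order

# Both results are (3, 3), so a shape check alone does not reveal the difference.
print(one_step.shape == two_step.shape)  # True

# Summing "per row" gives different numbers because the axes are swapped.
print(two_step.sum(axis=1).tolist())  # [3, 18, 33]
print(one_step.sum(axis=1).tolist())  # [15, 18, 21]
```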

Takeaways

Boolean masks participate in advanced indexing and can change the order of dimensions when mixed with slicing. When a slice separates the advanced indices, as in data3d[0, :, mask_vec] where the integer 0 also counts as an advanced index, the masked dimension leads the result and the sliced dimension follows, which explains shapes like (3, 10) instead of (10, 3). A mask that is the only advanced index, as in grid2d[:, mask_vec], leaves the axis order unchanged. If the goal is to preserve the axis order, either index step by step, as in data3d[0][:, mask_vec], or use numpy.compress with the axis specified. Keeping this mental model in mind helps avoid shape surprises in real-world NumPy code.