2025, Dec 09 09:00

Nearest displacement per index between two arrays in Python: avoid naive NumPy broadcasting, use a simple loop and optional Numba JIT

Learn how to compute nearest displacement per index between two arrays in Python. See why NumPy broadcasting fails and implement a correct loop with Numba JIT.

Aligning two integer arrays by computing, for each position, how far you need to move to hit the closest equal value in the other array sounds straightforward until you try to vectorize it. A real-world case is one million elements with a small value space limited to integers in the range [0, 10], no missing values, and a strict requirement to compute the smallest displacement per position. Below is a concise walk-through of why a naïve NumPy approach fails and how to implement a correct, loop-based solution that you can later JIT-compile if needed.

Reproducing the problem

Consider two arrays and the expected per-index displacement to reach the nearest match in the other array. The first position in the left array is 1; in the right array the closest 1 is three steps away, so the result begins with 3. Zeroes at aligned positions have displacement 0. The goal is to compute such a displacement for every index.

import numpy as np

arr_u = np.array([1, 0, 2, 0, 0, 1, 0, 2, 0, 0, 1, 0, 2, 0, 0])
arr_v = np.array([2, 0, 0, 1, 0, 2, 0, 0, 1, 0, 2, 0, 0, 1, 0])

expected = np.array([3, 0, 2, 1, 0, 2, 0, 2, 1, 0, 2, 0, 2, 1, 0])

A tempting approach is to compute all indices for each unique label and subtract them. The following function demonstrates that pitfall. It looks vectorized and concise but produces incorrect distances because it pairs occurrences by order rather than by proximity.

def offsets_broadcast_fail(seq_x, seq_y):
    assert len(seq_x) == len(seq_y)
    out = np.zeros(len(seq_x), dtype=np.int64)
    for tag in np.unique(seq_x):
        idx_x = np.flatnonzero(seq_x == tag)
        idx_y = np.flatnonzero(seq_y == tag)
        # Pairs the k-th occurrence of `tag` in seq_x with the k-th in seq_y.
        # This aligns by occurrence order, not by proximity -- the bug.
        # (It also silently assumes each value occurs equally often in both arrays.)
        out[idx_x] = idx_x - idx_y
    return np.abs(out)

bad = offsets_broadcast_fail(arr_u, arr_v)

This produces a result matching order-aligned occurrences, not nearest ones. In the example above it yields [3, 0, 2, 1, 0, 3, 0, 2, 1, 0, 3, 0, 2, 1, 0] instead of the expected [3, 0, 2, 1, 0, 2, 0, 2, 1, 0, 2, 0, 2, 1, 0].
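To pinpoint the divergence, compare the two outputs quoted above (hardcoded here so the snippet stands alone):

```python
import numpy as np

bad = np.array([3, 0, 2, 1, 0, 3, 0, 2, 1, 0, 3, 0, 2, 1, 0])
expected = np.array([3, 0, 2, 1, 0, 2, 0, 2, 1, 0, 2, 0, 2, 1, 0])
# The order-aligned result overshoots exactly where a nearer match exists.
print(np.flatnonzero(bad != expected))  # → [ 5 10]
```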

Why the vectorized attempt fails

For each value, the approach subtracts one entire set of indices from another. That implicitly pairs the kth occurrence in one array with the kth occurrence in the other, regardless of where the nearest same-valued element actually is. Aligning by occurrence order is not the same as searching for the closest match around each position. As a result, some distances are too large when a nearer identical value exists at a different ordinal position.
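Concretely, for the value 1 in the sample arrays, the index sets pair up ordinally:

```python
import numpy as np

arr_u = np.array([1, 0, 2, 0, 0, 1, 0, 2, 0, 0, 1, 0, 2, 0, 0])
arr_v = np.array([2, 0, 0, 1, 0, 2, 0, 0, 1, 0, 2, 0, 0, 1, 0])

idx_u = np.flatnonzero(arr_u == 1)    # [ 0  5 10]
idx_v = np.flatnonzero(arr_v == 1)    # [ 3  8 13]
# Ordinal pairing: 0-3, 5-8, 10-13, each a distance of 3 ...
print(np.abs(idx_u - idx_v))          # [3 3 3]
# ... but for position 5 the nearest 1 in arr_v is at index 3, distance 2.
```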

A simple, correct approach

Searching is inherently local and conditional. Instead of forcing a global pairing, step outwards from each index to the left and right until you discover the same value in the other array. If the values already match at the same index, the displacement is zero. If no match exists at all, record -1. The following implementation stays faithful to that logic and is easy to reason about and test.

def nearest_shift_offsets(seq_a, seq_b):
    assert len(seq_a) == len(seq_b)
    n = len(seq_a)
    out = np.zeros(n, np.int64)  # explicit dtype keeps the function Numba-friendly
    for pos, val in enumerate(seq_a):
        if seq_b[pos] == val:
            out[pos] = 0  # exact alignment: no displacement
            continue
        # Expand outward; the farthest useful step is max(pos, n - 1 - pos),
        # so range(1, n) covers every index, and the loop breaks on first hit.
        for step in range(1, n):
            if (pos - step >= 0) and (seq_b[pos - step] == val):
                out[pos] = step
                break
            elif (pos + step < n) and (seq_b[pos + step] == val):
                out[pos] = step
                break
        else:
            out[pos] = -1  # the value never occurs in seq_b
    return out

good = nearest_shift_offsets(arr_u, arr_v)

This yields the expected displacement array for the sample inputs.
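Before trusting the loop at scale, it helps to validate against an obviously correct reference. The following quadratic cross-check (the name brute_force_nearest is illustrative, not from the original) computes, for each index, the minimum distance to any equal value in the other array:

```python
import numpy as np

def brute_force_nearest(seq_a, seq_b):
    # Reference implementation: for each position, look at every index in
    # seq_b holding the same value and take the minimum absolute distance.
    n = len(seq_a)
    out = np.full(n, -1, dtype=np.int64)  # -1 marks "no match anywhere"
    for pos in range(n):
        matches = np.flatnonzero(seq_b == seq_a[pos])
        if matches.size:
            out[pos] = np.abs(matches - pos).min()
    return out

arr_u = np.array([1, 0, 2, 0, 0, 1, 0, 2, 0, 0, 1, 0, 2, 0, 0])
arr_v = np.array([2, 0, 0, 1, 0, 2, 0, 0, 1, 0, 2, 0, 0, 1, 0])
print(brute_force_nearest(arr_u, arr_v))
# [3 0 2 1 0 2 0 2 1 0 2 0 2 1 0]
```

It is far too slow for a million elements, but agreement on small random samples gives strong confidence in the outward-search version.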

What to expect at scale

The input constraints matter: arrays on the order of one million elements, values that are dense integers in the range [0, 10], and no missing values. With such data, the looped search is the correct baseline. If its runtime is acceptable, you are done; if not, the same function can be JIT-compiled. One concrete detail: allocating the output buffer explicitly as np.int64 keeps the function Numba-compatible, since Numba needs a concrete dtype. Reported profiling suggests that a JIT-compiled version is very fast on evenly distributed data but extremely slow in pathological cases such as sorted inputs, where matching values can sit far from a given position. Finally, treat implausibly short timings with skepticism: microsecond results for million-element inputs are not credible and usually signal a measurement artifact.
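A minimal sketch of the JIT route, assuming Numba is installed (the try/except fallback simply runs the plain-Python version when it is not):

```python
import numpy as np

try:
    from numba import njit  # optional dependency
except ImportError:          # fall back to plain Python if Numba is absent
    def njit(func):
        return func

@njit
def nearest_shift_offsets_jit(seq_a, seq_b):
    n = len(seq_a)
    out = np.zeros(n, np.int64)  # explicit dtype: Numba needs a concrete type
    for pos in range(n):
        val = seq_a[pos]
        if seq_b[pos] == val:
            continue  # out[pos] is already 0
        found = False
        for step in range(1, n):  # expand outward until the first hit
            if (pos - step >= 0 and seq_b[pos - step] == val) or \
               (pos + step < n and seq_b[pos + step] == val):
                out[pos] = step
                found = True
                break
        if not found:
            out[pos] = -1  # the value never occurs in seq_b
    return out

arr_u = np.array([1, 0, 2, 0, 0, 1, 0, 2, 0, 0, 1, 0, 2, 0, 0])
arr_v = np.array([2, 0, 0, 1, 0, 2, 0, 0, 1, 0, 2, 0, 0, 1, 0])
print(nearest_shift_offsets_jit(arr_u, arr_v))
# [3 0 2 1 0 2 0 2 1 0 2 0 2 1 0]
```

The first call pays a compilation cost; benchmark subsequent calls, on realistic data distributions, to get honest numbers.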

Why this is worth knowing

Not every task with arrays benefits from vectorization. Search and alignment often demand control flow that does not map well to broadcasting and reductions. Recognizing where to stop chasing vectorized one-liners and falling back to straightforward loops saves you from subtle correctness bugs and wasted optimization effort. In this case, there is no ready-made NumPy or SciPy primitive targeted at this specific nearest-match displacement, and the simplest correct approach is also the most maintainable starting point.

Conclusion

When the goal is nearest displacement per index between two arrays, avoid global index-set arithmetic that pairs occurrences by order. Implement a direct, index-local search that expands outward until it finds a match. Validate it on small samples, then scale up. If runtime becomes an issue, JIT-compiling the same logic can provide a practical speedup, with the caveat that performance depends on data distribution. Keep the output dtype compatible for compilation, and treat unbelievably short timings with skepticism.