2025, Oct 21 04:00

Generate buildup-peak-runoff accumulation matrices using NumPy masks and diagonals, with a complete Pandas example

Learn how to build N by 3N-2 buildup-peak-runoff matrices using NumPy masks and diagonals, then export to a Pandas DataFrame. Includes code and explanation.

When a sequence of weights A1, A2, ..., AN must be combined into a rectangular structure that builds up, peaks, and then tails off while cycling through the weights, it’s easy to overcomplicate the logic. A clean way to generate this structure is to build one NumPy mask per weight, marking the cells where that weight contributes, and then sum the masks scaled by their weights. Because the masks shift by one column per weight, the result has a banded, diagonal look. The end result can be turned into a Pandas DataFrame if needed.

What we want to generate

For N numbers A1..AN, construct an array with N rows and 3N−2 columns. Row i (counting from 0) starts in column i and walks through the weights cyclically, beginning with A(i+1): the running total first grows as each next weight is added, peaks once all N weights are included, then shrinks as the earliest weights are dropped one by one. Each row therefore occupies 2N−1 columns, and the last row starts in column N−1, which gives 3N−2 columns in total. For example, with N = 3 and weights 1, 2, 3 this produces:

     0    1    2    3    4    5    6
0  1.0  3.0  6.0  5.0  3.0  0.0  0.0
1  0.0  2.0  5.0  6.0  4.0  1.0  0.0
2  0.0  0.0  3.0  4.0  6.0  3.0  2.0

This pattern is useful, for instance, in reinsurance with risks attaching during the year, where A1..AN can represent the monthly shares of written business, and each row shows how written amounts accumulate and then decay as contracts run off.
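To make the definition above concrete, here is a minimal brute-force sketch that fills the array row by row, straight from the description. The helper name accumulation_bruteforce is an assumption made for illustration and is not part of the original solution; it is meant as a reference to compare other constructions against, not as the recommended approach.

```python
import numpy as np

def accumulation_bruteforce(weights):
    """Reference construction straight from the definition (hypothetical helper)."""
    n = len(weights)
    out = np.zeros((n, 3 * n - 2))
    for row in range(n):
        # Weights in the order this row meets them: start at index `row`, wrap around.
        seq = list(weights[row:]) + list(weights[:row])
        for step in range(2 * n - 1):
            # Buildup: steps 0..n-1 sum the first step+1 terms.
            # Run-off: later steps drop the earliest terms one by one.
            first = max(0, step - n + 1)
            last = min(step, n - 1)
            out[row, row + step] = sum(seq[first:last + 1])
    return out

result = accumulation_bruteforce([1, 2, 3])
print(result)
```

For the weights 1, 2, 3 this should reproduce the table shown above.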

Code example: the core pattern

The key is to build, for each shift, a mask of ones that marks where that weight contributes, then multiply each mask by its weight and sum the results. For a given shift ofs, the mask has two rectangular blocks of ones, each dim columns wide: the first covers rows 0 through ofs and starts at column ofs; the second covers the remaining rows and starts dim columns later, at column dim + ofs. For the last shift the second block is empty. Here is the minimal construction of these masks:

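import numpy as np
dim = 3
for ofs in range(dim):  # ofs stands for the current shift
    mask = np.zeros((dim, 3*dim - 2))
    left = dim + ofs     # end of the first block, start of the second
    right = 2*dim + ofs  # end of the second block
    mask[:ofs+1, ofs:left] = 1    # rows 0..ofs
    mask[ofs+1:, left:right] = 1  # rows ofs+1..dim-1 (empty for the last shift)
    print(mask)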

For dim = 3, these masks look like this:

[[1. 1. 1. 0. 0. 0. 0.]
 [0. 0. 0. 1. 1. 1. 0.]
 [0. 0. 0. 1. 1. 1. 0.]]
[[0. 1. 1. 1. 0. 0. 0.]
 [0. 1. 1. 1. 0. 0. 0.]
 [0. 0. 0. 0. 1. 1. 1.]]
[[0. 0. 1. 1. 1. 0. 0.]
 [0. 0. 1. 1. 1. 0. 0.]
 [0. 0. 1. 1. 1. 0. 0.]]

Why this works

Each mask records where one weight is active. In the rows up to and including the shift, the weight is met before the sequence wraps around, so it enters at the column equal to the shift and stays for dim columns. In the rows below, the weight is only reached after wrapping around, so it enters dim columns later, at column dim + shift, and again stays for dim columns. Summing the masks, each scaled by its weight, gives a matrix in which every entry is the sum of a run of cyclically consecutive weights, and row i peaks at column i + N − 1 with the sum of all N weights. The shape is N by 3N−2, matching the buildup-through-peak-and-decay span.
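The reasoning above boils down to one rule: weight k enters row i at column k when i <= k, and at column k + N otherwise, and then stays for N columns. As a rough sketch of that rule (not the original answer's code), the loop over masks can be replaced by NumPy broadcasting. The variable names here are illustrative assumptions.

```python
import numpy as np

weights = np.array([1, 2, 3])
n = len(weights)

rows = np.arange(n)[:, None]  # shape (n, 1): row index i
ks = np.arange(n)[None, :]    # shape (1, n): weight index k

# Column where weight k enters row i: k for rows at or above k, k + n below.
start = np.where(rows <= ks, ks, ks + n)  # shape (n, n)

cols = np.arange(3 * n - 2)[None, None, :]  # shape (1, 1, 3n-2)

# Weight k is active in row i for exactly n columns from its start column.
active = (cols >= start[:, :, None]) & (cols < start[:, :, None] + n)

# Scale each weight's activity pattern and sum over the weight axis.
vectorized = (active * weights[None, :, None]).sum(axis=1)
print(vectorized)
```

The intermediate boolean array has N × N × (3N−2) entries, so this form trades memory for the absence of a Python loop; for small N either version is fine.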

Complete solution for any N

Here is the full procedure that multiplies each mask by its weight and aggregates the result. It builds a NumPy array first and then converts to a Pandas DataFrame.

import numpy as np
import pandas as pd
dim = 3
weights = [1, 2, 3]
accum = np.zeros((dim, 3*dim - 2))
for ofs in range(dim):
    mask = np.zeros((dim, 3*dim - 2))
    left = dim + ofs
    right = 2*dim + ofs
    mask[:ofs+1, ofs:left] = 1
    mask[ofs+1:, left:right] = 1
    accum += weights[ofs] * mask  # add this weight wherever its mask is 1
df = pd.DataFrame(accum)
print(df)

For dim = 3 and weights [1, 2, 3], the resulting DataFrame is:

     0    1    2    3    4    5    6
0  1.0  3.0  6.0  5.0  3.0  0.0  0.0
1  0.0  2.0  5.0  6.0  4.0  1.0  0.0
2  0.0  0.0  3.0  4.0  6.0  3.0  2.0
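To try other values of N, the same loop can be wrapped in a function. The sketch below assumes a function name, buildup_runoff, that does not appear in the original solution, and adds two sanity checks that follow from the construction: every row peaks at the sum of all weights (for positive weights), and every row sums to N times that total, because each weight is active for exactly N columns in each row.

```python
import numpy as np
import pandas as pd

def buildup_runoff(weights):
    """Mask-based construction wrapped as a function (the name is illustrative)."""
    weights = np.asarray(weights, dtype=float)
    n = len(weights)
    accum = np.zeros((n, 3 * n - 2))
    for ofs in range(n):
        mask = np.zeros_like(accum)
        mask[:ofs + 1, ofs:n + ofs] = 1          # rows up to the shift
        mask[ofs + 1:, n + ofs:2 * n + ofs] = 1  # remaining rows, one window later
        accum += weights[ofs] * mask
    return accum

table = buildup_runoff([4, 3, 2, 1])
total = 4 + 3 + 2 + 1

# Every row peaks at the sum of all weights (weights are positive here).
assert np.allclose(table.max(axis=1), total)
# Each weight is active for N columns per row, so every row sums to N * total.
assert np.allclose(table.sum(axis=1), 4 * total)

frame = pd.DataFrame(table)
print(frame.shape)  # (4, 10)
```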

Why it’s worth internalizing

This construction aligns directly with how such schedules behave in practice, for example with risks attaching during the year in reinsurance. The data is homogeneous and numeric, and the blocks are plain slices, so NumPy is a natural fit. If you prefer to think in diagonals, NumPy’s tooling, such as diag, is also available to assemble diagonal structures. The mask approach is more robust than mutating columns by repeatedly adding and removing terms, and it generalizes to any N without changes.
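For readers who do prefer the diagonal view, one possible sketch follows. It is a different technique from the mask loop and it uses np.tri rather than np.diag: it marks the diagonal band where a weight enters a row and the band where a weight leaves, then takes a cumulative sum along each row. It relies on the observation that the weight entering or leaving at column c is always weights[c % N]. The variable names are assumptions made for illustration.

```python
import numpy as np

weights = np.array([1, 2, 3])
n = len(weights)
width = 3 * n - 2

# np.tri(n, width, k) is 1 where column <= row + k, so the difference of two
# such matrices is a diagonal band of ones.
entering = np.tri(n, width, n - 1) - np.tri(n, width, -1)        # row <= col < row + n
leaving = np.tri(n, width, 2 * n - 1) - np.tri(n, width, n - 1)  # row + n <= col < row + 2n

# The weight that enters or leaves at column c is weights[c % n].
per_column = weights[np.arange(width) % n]
increments = (entering - leaving) * per_column

# Running totals along each row turn the increments into the accumulation matrix.
banded = np.cumsum(increments, axis=1)
print(banded)
```

For the weights 1, 2, 3 this should match the DataFrame produced by the mask-based solution.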

Takeaways

Model the buildup and run-off using masks and aggregate with the given weights. Work in NumPy to construct the N by 3N−2 array, then convert to a DataFrame for presentation. Think in terms of shifted windows that become diagonals, and let those masks do the bookkeeping. This keeps the logic clear and makes it easy to scale to different N without special cases.

The article is based on a question from StackOverflow by Matta and an answer by Jokilos.