2025, Oct 01 09:00

Fast eigenvectors for upper triangular matrices: why numpy.linalg.eig beats hand-rolled solves

Compute eigenvectors of upper triangular matrices efficiently in Python. See why numpy.linalg.eig uses LAPACK fast paths and beats per-eigenvalue solves.

When your input matrix is triangular, eigenvalues come “for free” from the diagonal. It is tempting to bypass general-purpose routines and hand-roll something faster for diagonalization or for extracting eigenvectors. The catch is that diagonalization is not always possible for a triangular matrix, and even when you only need eigenvectors, the off-the-shelf path can already be the fastest.

Problem setup

The task is to compute eigenvectors for many triangular matrices without paying unnecessary overhead. The diagonal gives the eigenvalues immediately, so solving (A − λI)x = 0 for each λ might look attractive. But if you have on the order of a few hundred matrices and similarly many eigenvalues per matrix, per-eigenvalue solves quickly add up. The practical question is whether numpy.linalg.eig is “too generic” here or whether it already optimizes for triangular input.
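
For concreteness, here is a minimal sketch of the hand-rolled per-eigenvalue approach (the function name and structure are illustrative, not taken from the original question). It exploits the triangular structure: for the eigenvalue at diagonal position k, the eigenvector can be taken to have a 1 at position k and zeros below, and the entries above follow from one triangular solve. It assumes the diagonal entries are distinct; a repeated eigenvalue makes the shifted block singular.

import numpy as np
from scipy.linalg import solve_triangular

def eigvecs_by_solves(A):
    """Eigenvectors of upper triangular A via per-eigenvalue solves.

    Illustrative sketch; assumes the diagonal entries are distinct.
    """
    n = A.shape[0]
    vecs = np.zeros((n, n))
    for k in range(n):
        lam = A[k, k]
        v = np.zeros(n)
        v[k] = 1.0
        if k > 0:
            # Rows 0..k-1 of (A - lam*I) v = 0 reduce to the k-by-k
            # upper triangular system (A[:k,:k] - lam*I) y = -A[:k, k].
            shifted = A[:k, :k] - lam * np.eye(k)
            v[:k] = solve_triangular(shifted, -A[:k, k])
        vecs[:, k] = v / np.linalg.norm(v)
    return vecs

Even with cheap triangular solves, the Python-level loop over eigenvalues adds overhead that the benchmark below makes visible.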

Code example: benchmarking the approaches

The following snippet compares the cost of eigendecomposition against QR and LU factorization on dense random matrices versus their upper-triangular counterparts. It uses IPython's %timeit magic, so run it in IPython or Jupyter.

import numpy as np
import scipy.linalg as sla

# %timeit is an IPython magic; run this in IPython or Jupyter.
np.random.seed(42)
for dim in [100, 1000]:
    print(f"benchmark on size {dim}")
    mat = np.random.normal(size=(dim, dim))
    print("dense input")
    %timeit np.linalg.eig(mat)
    %timeit np.linalg.qr(mat)
    %timeit sla.lu_factor(mat)
    print("upper triangular input")
    # Zero out everything below the diagonal to get triangular input.
    tri = np.triu(np.random.normal(size=(dim, dim)))
    %timeit np.linalg.eig(tri)
    %timeit np.linalg.qr(tri)
    %timeit sla.lu_factor(tri)
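
If you want the same comparison in a plain Python script rather than a notebook, the standard-library timeit module works too. The snippet below is a sketch that mirrors the benchmark above, timing only eig for brevity.

import timeit
import numpy as np

np.random.seed(42)
for dim in [100, 1000]:
    mat = np.random.normal(size=(dim, dim))
    tri = np.triu(np.random.normal(size=(dim, dim)))
    for label, m in [("dense", mat), ("upper triangular", tri)]:
        # Average over a few runs; eig dominates the loop body.
        t = timeit.timeit(lambda: np.linalg.eig(m), number=5) / 5
        print(f"size {dim}, {label} input: eig takes {t:.4f} s per call")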

What is actually happening

Efficient algorithms that compute eigenvectors typically produce eigenvalues as part of the same pipeline, so you lose nothing by calling a single routine. In practice, numpy.linalg.eig is well tuned even when the input is already upper triangular. Reading the implementation shows that the underlying LAPACK driver (as built into OpenBLAS) reduces the matrix to upper triangular Schur form via Hessenberg reduction and QR iteration, then calls the trevc3 LAPACK routine (strevc3_ in single precision, dtrevc3_ in double) to obtain the eigenvectors. If you feed it an upper triangular matrix from the start, the reduction step has almost no work to do and is especially fast.

Empirical timing backs this up. On upper triangular input of size 100, np.linalg.eig completes in roughly hundreds of microseconds, while for size 1000 it finishes in roughly hundreds of milliseconds. In the same setup, running eig on a full dense matrix takes around 11.7 ms for size 100 and about 1.31 s for size 1000. In other words, eig enjoys a large speedup on triangular inputs relative to dense ones.

Solution: just use numpy.linalg.eig on the triangular input

If you already have an upper triangular matrix, call eig directly. You avoid the overhead of per-eigenvalue solves and leverage the fast path. Here is a minimal usage pattern.

import numpy as np

np.random.seed(42)
n = 1000
# Eigenvalues of an upper triangular matrix sit on its diagonal.
tri = np.triu(np.random.normal(size=(n, n)))
vals, vecs = np.linalg.eig(tri)  # hits the fast path for triangular input
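
Continuing from the snippet above, a quick sanity check (mine, not from the original answer) confirms that the eigenvalues are the diagonal entries up to ordering and that each column of vecs is a genuine eigenvector.

# Eigenvalues of a triangular matrix are its diagonal entries.
assert np.allclose(np.sort(vals), np.sort(np.diag(tri)))
# Column j of vecs satisfies tri @ v = vals[j] * v.
assert np.allclose(tri @ vecs, vecs * vals)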

In broader testing, this approach has been about 10x–20x faster for upper triangular matrices compared to processing dense matrices of the same size. The earlier benchmark illustrates the same trend, with eig on triangular inputs completing in about 438 µs versus 11.7 ms for size 100, and about 157 ms versus 1.31 s for size 1000.

If you are considering calling strevc3_ directly from Python, there isn't a simple route. Although SciPy and NumPy ship trevc3 variants (strevc3_, dtrevc3_) internally, neither library exposes a Python-accessible wrapper for them.
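
You can check this against your installed version; on the SciPy builds I am aware of, the following prints an empty list.

from scipy.linalg import lapack

# No trevc/trevc3 wrappers are exposed, so the list comes back empty.
print([name for name in dir(lapack) if "trevc" in name])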

A necessary caveat about diagonalization

There is an important distinction between computing eigenvectors and diagonalizing a matrix. Not every triangular matrix is diagonalizable. A simple counterexample is [[1, 1], [0, 1]]: it has the single eigenvalue 1 with algebraic multiplicity 2 but only a one-dimensional eigenspace, so there are not enough linearly independent eigenvectors to form a basis and it cannot be diagonalized. If your end goal is strictly diagonalization, make sure it is theoretically possible; otherwise, any attempt to assemble a diagonal form will fail regardless of how quickly you compute eigenvectors.
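
A short check (illustrative, not from the original discussion) makes the defect visible: eig still returns two columns, but they are numerically parallel, so they cannot form an eigenbasis.

import numpy as np

A = np.array([[1.0, 1.0], [0.0, 1.0]])
vals, vecs = np.linalg.eig(A)
print(vals)                         # [1. 1.]: one eigenvalue, multiplicity 2
print(np.linalg.matrix_rank(vecs))  # 1: the eigenvector columns are parallel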

Why this matters

When processing many matrices, shaving constant factors changes runtimes dramatically. Using a well-optimized eigen-decomposition leverages fast paths for triangular inputs and avoids re-implementing per-eigenvalue solves that add overhead. It also keeps your code shorter and more maintainable, without sacrificing performance.

Takeaways

If your matrix is upper triangular, call numpy.linalg.eig directly and pass the triangular data as-is. Do not burn cycles on separate solves for each eigenvalue. Be aware that triangular does not imply diagonalizable; the example [[1, 1], [0, 1]] highlights why diagonalization can be impossible even when eigenvalues are trivial to read off the diagonal. If you were hoping to wire up strevc3_ from Python, note that while the routine exists internally, there is no straightforward public wrapper in NumPy or SciPy.

The article is based on a question from StackOverflow by Nim and an answer by Nick ODell.