2026, Jan 06 01:00

How to Model RMS in Linear Programs: Piecewise-Linear MILP Approximation with Binaries and Big-M

Learn why RMS cannot be linearised exactly in an LP and how to build a piecewise-linear MILP with binaries and Big-M that approximates sqrt(x^2 + y^2) to a controllable accuracy.

Linearising RMS inside an LP sounds tempting until you try to square a decision variable. The short version: exact linearisation is not possible. You can, however, build a mixed-integer linear approximation that is often accurate enough for practice. The catch is that the upper side of the quadratic is non-convex, so a continuous solver will not do; you need a MILP with binaries that select linear segments.

Where the problem starts

The attempt usually looks like this: define S as the root mean square of two decision variables and optimise a linear objective. With a simple objective like maximising one variable, the optimum is trivial and masks the real issue, but the non-linearity is still there.

import numpy as np
import pulp
# Attempt to use RMS inside a linear model
mdl = pulp.LpProblem("rms_linearisation", pulp.LpMaximize)
xA = pulp.LpVariable("xA", lowBound=-10, upBound=10, cat=pulp.LpContinuous)
xB = pulp.LpVariable("xB", lowBound=-10, upBound=10, cat=pulp.LpContinuous)
xR = pulp.LpVariable("xR", lowBound=0, upBound=10, cat=pulp.LpContinuous)
# This is non-linear and outside LP algebra: PuLP variables define no
# power operator, so the next line raises a TypeError if uncommented.
# xR_expr = np.sqrt(xA**2 + xB**2)  # sqrt(xA^2 + xB^2) is not an LP construct
mdl.setObjective(100 * xA)
# mdl.solve()  # even without the invalid line, xR is never tied to xA and xB
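The failure is immediate rather than a silent mis-model: PuLP variables define no power operator, so squaring one raises a TypeError before any solve is attempted. A minimal check (assuming a standard PuLP install; the exact message varies by version):

```python
import pulp

x = pulp.LpVariable("x", lowBound=-10, upBound=10)
try:
    x ** 2  # LpVariable has no power operator
except TypeError as exc:
    print("squaring fails:", exc)
```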

Why this cannot be linearised exactly

The RMS embeds squared terms, and squaring a decision variable is quadratic. There is no exact linearisation of x^2 or sqrt(x^2 + y^2) within a linear program. Local linear approximations exist, and convex (non-LP) optimisation methods can handle the convex side, but a pure LP cannot represent the full equality. The reason is the geometry of the parabola: the region above it (the constraint u_sq >= u^2) is convex and can be outer-approximated by tangent lines, while the region below it (u_sq <= u^2) is non-convex and requires binary selectors to choose which piecewise-linear segment applies. That is why a mixed-integer formulation is necessary for a practical approximation.
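The non-convexity of the lower region is easy to exhibit: two points can satisfy s <= x^2 while their midpoint does not, and no collection of linear inequalities alone can describe a set like that. A quick sketch:

```python
# Two points in the region s <= x^2 (the non-convex side of the parabola)
p = (-1.0, 1.0)
q = (1.0, 1.0)
assert p[1] <= p[0] ** 2 and q[1] <= q[0] ** 2
# Their midpoint (0.0, 1.0) leaves the region, so the set is not convex
mid = ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)
print(mid[1] <= mid[0] ** 2)  # False
```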

In many tasks you can drop one half of the parabolic constraint depending on the context, but when the direction is ambiguous you need to enforce both sides. Also note that trivial objectives can produce corner solutions where most of the constructed constraints never bind, which makes the example look simpler than the general case.

A workable MILP approximation

The idea is to approximate v^2 by a set of line segments over the bounded domain of v. The lower envelope is convex and can be enforced by linear inequalities without binaries. The upper envelope is non-convex; it is enforced with a disjunctive formulation that uses one binary per segment and a Big-M term to deactivate non-selected segments. The segment count controls the accuracy; more segments give a tighter fit at the cost of more binaries.

import numpy as np
import pulp
from matplotlib import pyplot as plt
def piecewise_quad(
    u: pulp.LpVariable,
    prob: pulp.LpProblem,
    parts: int = 5,
    show: bool = False,
):
    # A square is non-negative, so bound the auxiliary variable below by zero
    u_sq = pulp.LpVariable(name=f"{u.name}_sq", lowBound=0, cat=u.cat)
    grid = np.linspace(u.lowBound, u.upBound, parts)
    vals = grid ** 2
    tangents = 2 * grid
    ax = None
    dense = np.linspace(u.lowBound, u.upBound, 201)
    if show:
        fig, ax = plt.subplots()
        ax.plot(dense, dense ** 2, label=f"{u.name}^2")
        ax.set_xlabel(u.name)
        ax.set_ylabel(u_sq.name)
    # Lower convex envelope: tangents u_sq >= 2*xi*u - xi^2
    for k, (xi, yi, dyi) in enumerate(zip(grid, vals, tangents)):
        prob.addConstraint(
            name=f"hull_lo_{u.name}_{k}",
            constraint=u_sq >= dyi * u - yi,
        )
        if show and ax is not None:
            ax.plot(dense, dyi * dense - yi, linestyle=":", color="gray")
    # Upper envelope segments: require binaries
    flags = pulp.LpVariable.matrix(
        name=f"flag_{u.name}", cat=pulp.LpBinary, indices=range(parts - 1)
    )
    prob.addConstraint(
        name=f"choose_one_{u.name}",
        constraint=pulp.lpSum(flags) == 1,
    )
    seg_slopes = (vals[1:] - vals[:-1]) / (grid[1:] - grid[:-1])
    seg_off = vals[:-1] - seg_slopes * grid[:-1]
    # Big-M chosen to dominate any possible violation over the bounds;
    # check all slope/bound corners, with a slack factor of 2 for safety
    corner_vals = [
        s * v
        for s in (seg_slopes.min(), seg_slopes.max())
        for v in (u.lowBound, u.upBound)
    ]
    M = 2 * (
        max(abs(u.lowBound), abs(u.upBound)) ** 2
        - seg_off.min()
        - min(corner_vals)
    )
    for i, (x0, x1, m_i, b_i, z_i) in enumerate(
        zip(grid[:-1], grid[1:], seg_slopes, seg_off, flags)
    ):
        if show and ax is not None:
            ax.plot([x0, x1], [m_i * x0 + b_i, m_i * x1 + b_i], color="orange")
        if x0 > u.lowBound:
            prob.addConstraint(
                name=f"flag_left_{u.name}_{i}",
                constraint=z_i <= (u - u.lowBound) / (x0 - u.lowBound),
            )
        if x1 < u.upBound:
            prob.addConstraint(
                name=f"flag_right_{u.name}_{i}",
                constraint=z_i <= (u.upBound - u) / (u.upBound - x1),
            )
        prob.addConstraint(
            name=f"hull_hi_{u.name}_{i}",
            constraint=u_sq <= m_i * u + b_i + M * (1 - z_i),
        )
    return u_sq, flags, ax
def report(u: pulp.LpVariable, u_sq: pulp.LpVariable, bits, ax=None):
    print(f"{u.name} = {u.value()} ~ ±{np.sqrt(u_sq.value())}")
    print(f"{u_sq.name} = {u_sq.value()} ~ {u.value() ** 2}")
    # solvers may report binaries as floats, so compare against 0.5
    pick = next(idx for idx, v in enumerate(bits) if v.value() > 0.5)
    print("segment chosen", pick)
    if ax is not None:
        ax.scatter(u.value(), u_sq.value(), color="red")
def run_demo():
    a = pulp.LpVariable(name="a", lowBound=-10, upBound=2.5, cat=pulp.LpContinuous)
    b = pulp.LpVariable(name="b", lowBound=-3, upBound=0.5, cat=pulp.LpContinuous)
    r = pulp.LpVariable(name="r", lowBound=0, upBound=5, cat=pulp.LpContinuous)
    prob = pulp.LpProblem(name="rms_piecewise", sense=pulp.LpMaximize)
    prob.setObjective(a)
    a_sq, a_flags, ax_a = piecewise_quad(a, prob, parts=5, show=True)
    b_sq, b_flags, ax_b = piecewise_quad(b, prob, parts=5, show=True)
    r_sq, r_flags, ax_r = piecewise_quad(r, prob, parts=5, show=True)
    prob.addConstraint(name="euclid", constraint=r_sq == a_sq + b_sq)
    prob.solve()
    if prob.status != pulp.LpStatusOptimal:
        raise RuntimeError(pulp.LpStatus[prob.status])
    report(a, a_sq, a_flags, ax_a)
    report(b, b_sq, b_flags, ax_b)
    report(r, r_sq, r_flags, ax_r)
    plt.show()
if __name__ == "__main__":
    run_demo()

What the approximation is doing

The squared variable is represented by a lower set of tangent lines and an upper set of chord lines. The lower side is convex, so a single set of inequalities is enough to ensure u_sq stays above all tangents. The upper side is non-convex and cannot be captured by a single convex constraint. Instead, one and only one segment is activated using binary selectors; a Big-M term deactivates the others. This is why a continuous solver is not suitable, but a mixed integer solver is.
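The sandwich can be checked numerically outside the solver: on a dense grid, the pointwise maximum of the tangents never exceeds v^2, the chord of the segment containing v never falls below it, and the worst chord gap is (width/2)^2. A standalone sketch using the bounds of variable a from the demo:

```python
import numpy as np

lb, ub, parts = -10.0, 2.5, 5
grid = np.linspace(lb, ub, parts)
dense = np.linspace(lb, ub, 1001)

# Lower bound: pointwise maximum of the tangents 2*xi*u - xi^2
lower = np.max([2 * xi * dense - xi ** 2 for xi in grid], axis=0)

# Upper bound: chord of the segment containing u; for u^2 the chord
# through (x0, x0^2) and (x1, x1^2) is (x0 + x1) * u - x0 * x1
idx = np.clip(np.searchsorted(grid, dense, side="right") - 1, 0, parts - 2)
upper = (grid[idx] + grid[idx + 1]) * dense - grid[idx] * grid[idx + 1]

true = dense ** 2
assert np.all(lower <= true + 1e-9) and np.all(true <= upper + 1e-9)
width = (ub - lb) / (parts - 1)
print("worst chord gap:", (upper - true).max(), "theory:", (width / 2) ** 2)
```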

Regarding the Big-M, it should be large enough to exceed any potential violation of the upper constraint over the variable bounds, but not excessively large because that degrades solver performance. The code computes M from the variable bounds, segment slopes and intercepts to dominate the worst-case violation while keeping it reasonably tight.
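A per-segment Big-M is a tightening worth knowing (a variant, not the formula used in the code above). Whichever segment is active caps u_sq by its chord, and no chord exceeds max(lb^2, ub^2) over the bounds, so subtracting the minimum of segment i's chord gives a valid M_i. A sketch with the bounds of variable a:

```python
import numpy as np

lb, ub, parts = -10.0, 2.5, 5
grid = np.linspace(lb, ub, parts)
slopes = grid[:-1] + grid[1:]   # chord slopes of u^2
offs = -grid[:-1] * grid[1:]    # chord intercepts

# Any feasible u_sq is capped by its active chord, hence by max(lb^2, ub^2);
# M_i then dominates the worst violation of a deactivated segment i.
cap = max(lb ** 2, ub ** 2)
M_i = cap - np.minimum(slopes * lb + offs, slopes * ub + offs)
print(M_i)
```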

The minimal objective in the demo is intentionally simple and picks a corner solution, so many of the constructed constraints never bind. In real models with non-trivial objectives, both sides of the approximation matter. Depending on the application, the lower or upper half of the parabola can sometimes be omitted entirely; when that is unclear, keep both sides.

Why it matters

Knowing the difference between exact linearisation and mixed-integer approximation avoids dead ends. If you truly need exact RMS, you must switch to a non-linear formulation outside of pulp, or replace RMS with an alternative the model can accept. If an approximation is acceptable, a MILP with piecewise-linear envelopes is a practical path that integrates with standard MILP solvers. Accuracy is under your control via the number of segments, and the model often scales to higher resolution; for example, increasing the segmentation to fifty pieces can still solve quickly on small problems.
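The accuracy claim can be made concrete: with segments of width Δ over the bounds, the worst-case chord error is (Δ/2)^2, so doubling the segment count roughly quarters the error. A quick sketch over the bounds of a:

```python
# Worst-case chord error (width / 2)^2 as the grid is refined
lb, ub = -10.0, 2.5
for parts in (5, 10, 50):  # grid points, i.e. parts - 1 segments
    width = (ub - lb) / (parts - 1)
    print(parts, "points ->", (width / 2) ** 2)
```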

Takeaways

Do not attempt to square LP variables directly and expect an LP to handle it. Build a MILP with tangent and chord approximations of v^2, enforce the lower convex side with linear inequalities, and use binary selectors plus a carefully chosen Big-M for the upper side. Keep variable bounds tight to improve M, add segments where you need precision, and simplify by dropping one side of the envelope when the modelling context allows. When exact RMS is a hard requirement, move to an appropriate non-linear optimisation approach.