2025, Nov 22 09:00
Stop Gekko ARX simulations diverging at na>1: steady-state initialization and sysid shift='init'
Learn why Gekko ARX simulations diverge when na>1 and how to fix them using steady-state initialization (IMODE=1) and sysid shift='init' for stable dynamic runs.
When simulating temperature dynamics with ARX models in Gekko, it’s common to get good results with na=1 and then see the simulation blow up or drift to implausible values as soon as na>1. The puzzling part is that the first simulated point matches the data, but everything after that diverges. The root cause is not the order itself but how the model is initialized for dynamic simulation.
Minimal example that reproduces the issue
The script below runs identification on the TCLab step test dataset and then simulates the ARX model directly with IMODE=4, setting only the first output value as the initial condition. That works for na=1 but leads to nonsense for higher orders.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from gekko import GEKKO
# Load data
src = 'http://apmonitor.com/dde/uploads/Main/tclab_step_test.txt'
df_raw = pd.read_csv(src)
t_idx = df_raw['Time']
u_mat = df_raw[['Q1', 'Q2']]
y_mat = df_raw[['T1', 'T2']]
# System identification
gm = GEKKO(remote=True)
na_ord, nb_ord = 2, 6
y_hat, theta, kval = gm.sysid(t_idx, u_mat, y_mat, na_ord, nb_ord, diaglevel=1, scale=False)
# Plot input and output data with predictions
plt.figure(figsize=(8, 5))
plt.subplot(2, 1, 1)
plt.plot(t_idx, u_mat)
plt.legend([r'$Q_1$', r'$Q_2$'])
plt.ylabel('Inputs')
plt.subplot(2, 1, 2)
plt.plot(t_idx, y_mat)
plt.plot(t_idx, y_hat)
plt.legend([r'$T_{1,meas}$', r'$T_{2,meas}$',
r'$T_{1,pred}$', r'$T_{2,pred}$'])
plt.ylabel('Outputs')
plt.xlabel('Time')
# ARX model for simulation
y_comp, u_comp = gm.arx(theta)
gm.time = np.linspace(0, 600, 601)
gm.options.IMODE = 4
# Set input profiles
u_comp[0].value = df_raw['Q1']
u_comp[1].value = df_raw['Q2']
# Set initial output values
y_comp[0].value = df_raw['T1'][0]
y_comp[1].value = df_raw['T2'][0]
# Solve and plot simulation results
gm.solve(disp=False)
plt.figure(figsize=(10, 6))
plt.subplot(2, 2, 1)
plt.title('Step Test')
plt.plot(gm.time, u_comp[0].value, 'b-', label=r'$Q_1$')
plt.plot(gm.time, u_comp[1].value, 'r-', label=r'$Q_2$')
plt.ylabel('Heater (%)')
plt.legend()
plt.subplot(2, 2, 3)
plt.plot(gm.time, y_comp[0].value, 'b--', label=r'$T_1$')
plt.plot(gm.time, y_comp[1].value, 'r--', label=r'$T_2$')
plt.plot(t_idx, y_mat, 'k-', label='Measured')
plt.ylabel('Temperature (K)')
plt.xlabel('Time (sec)')
plt.legend()
plt.tight_layout()
plt.show()
Why the simulation goes off track
ARX models need to be initialized for dynamic simulation. Without explicit initialization, the internal history required by the autoregressive part isn’t in a consistent steady state, so higher-order dynamics don’t have a valid starting point. Simply assigning the first output sample as the initial value doesn’t populate the dynamic history that na>1 expects. Attempting to pass multiple initial outputs as short arrays won’t fix it either because the model requires a proper steady-state setup before switching to a dynamic run.
The fix and a working script
Two changes make the simulation consistent. First, add a steady-state initialization run with IMODE=1 using the first input values. Second, when your first data points represent a good steady-state, call sysid with shift='init'. The script below applies both steps.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from gekko import GEKKO
# Load data
src = 'http://apmonitor.com/dde/uploads/Main/tclab_step_test.txt'
df_raw = pd.read_csv(src)
t_idx = df_raw['Time']
u_mat = df_raw[['Q1', 'Q2']]
y_mat = df_raw[['T1', 'T2']]
# System identification
gm = GEKKO(remote=True)
na_ord, nb_ord = 2, 6
y_hat, theta, kval = gm.sysid(t_idx, u_mat, y_mat, na_ord, nb_ord, diaglevel=1, scale=False, shift='init')
# Plot input and output data with predictions
plt.figure(figsize=(8, 5))
plt.subplot(2, 1, 1)
plt.plot(t_idx, u_mat)
plt.legend([r'$Q_1$', r'$Q_2$'])
plt.ylabel('Inputs')
plt.subplot(2, 1, 2)
plt.plot(t_idx, y_mat)
plt.plot(t_idx, y_hat)
plt.legend([r'$T_{1,meas}$', r'$T_{2,meas}$',
r'$T_{1,pred}$', r'$T_{2,pred}$'])
plt.ylabel('Outputs')
plt.xlabel('Time')
# ARX model for simulation
y_comp, u_comp = gm.arx(theta)
# steady-state initialization
gm.options.IMODE = 1
u_comp[0].value = df_raw['Q1'].values[0]
u_comp[1].value = df_raw['Q2'].values[0]
gm.solve(disp=False)
# dynamic simulation
gm.time = np.linspace(0, 600, 601)
gm.options.IMODE = 4
u_comp[0].value = df_raw['Q1']
u_comp[1].value = df_raw['Q2']
y_comp[0].value = df_raw['T1'][0]
y_comp[1].value = df_raw['T2'][0]
gm.solve(disp=False)
plt.figure(figsize=(10, 6))
plt.subplot(2, 1, 1)
plt.title('Step Test')
plt.plot(gm.time, u_comp[0].value, 'b-', label=r'$Q_1$')
plt.plot(gm.time, u_comp[1].value, 'r-', label=r'$Q_2$')
plt.ylabel('Heater (%)')
plt.legend()
plt.subplot(2, 1, 2)
plt.plot(gm.time, y_comp[0].value, 'b--', label=r'$T_1$')
plt.plot(gm.time, y_comp[1].value, 'r--', label=r'$T_2$')
plt.plot(t_idx, y_mat, 'k-', label='Measured')
plt.ylabel('Temperature (K)')
plt.xlabel('Time (sec)')
plt.legend()
plt.tight_layout()
plt.savefig('results.png', dpi=300)
plt.show()
Why this matters
Order selection in ARX models directly affects how much past information the model uses. If the simulation doesn’t start from a consistent steady state, higher-order history terms immediately pollute the trajectory. Proper initialization ensures the dynamic history aligns with the identified model, so step tests and other scenarios behave sensibly across different na values.
Wrap-up
If a Gekko ARX simulation explodes when na>1, initialize the model in steady state before the dynamic run and let sysid align the series with shift='init' when the first samples are steady. This keeps the identified dynamics and the simulation start aligned, so you can push na higher without destabilizing the results, and run step tests that reflect your real initial conditions.