2026, Jan 06 19:00

How to Map LightGBM Booster Importances to Lagged Targets and Covariates in Darts Time Series

Interpret LightGBM feature importances in Darts: map generic booster columns to lagged time series targets and covariates using lagged_feature_names.

LightGBM via Darts is a convenient way to model time series with lagged targets and covariates, but interpretability suffers right after training. If you read feature importances directly from the underlying LightGBM boosters, you'll see generic names like Column_0 or Column_1. The question is how to map those back to meaningful, lag-aware identifiers tied to your TimeSeries components and covariates.

Reproducing the setup

The following snippet builds a small forecasting setup with past and future covariates and trains a LightGBMModel. It mirrors a canonical example and focuses on atmospheric pressure as the target, rainfall as past covariate, and temperature as future covariate.

from darts.datasets import WeatherDataset
from darts.models import LightGBMModel
# load demo data
ts_data = WeatherDataset().load()
# select target and covariates
y_series = ts_data['p (mbar)'][:100]
x_past = ts_data['rain (mm)'][:100]
x_future = ts_data['T (degC)'][:106]
# configure and fit LightGBMModel with lags
gbm = LightGBMModel(
    lags=12,
    lags_past_covariates=12,
    lags_future_covariates=[0, 1, 2, 3, 4, 5],
    output_chunk_length=6,
    verbose=-1,
)
gbm.fit(y_series, past_covariates=x_past, future_covariates=x_future)

What’s behind the generic feature names

During training, Darts expands your target and covariates into a design matrix of lagged features. The internal LightGBM learners do not preserve your original TimeSeries component names and lags as readable feature labels. As a result, inspecting the booster after fitting returns generic names such as Column_0, Column_1, and so on, which are not directly informative for analysis.
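To make the expansion concrete, here is a toy, dependency-free sketch of how a single series becomes a design matrix of lagged columns. The column names are illustrative only and are not guaranteed to match Darts' internal format:

```python
# Toy sketch of lag expansion: each row holds the values at t-2 and t-1
# used to predict the value at t. Names are illustrative, not Darts' own.
series = [10.0, 11.0, 12.0, 13.0, 14.0]
lags = [-2, -1]                                  # use t-2 and t-1 to predict t
names = [f"target_lag{lag}" for lag in lags]
rows = [[series[t + lag] for lag in lags] for t in range(2, len(series))]
print(names)   # ['target_lag-2', 'target_lag-1']
print(rows)    # [[10.0, 11.0], [11.0, 12.0], [12.0, 13.0]]
```

The booster only ever sees the positional columns of such a matrix, which is why its own labels are positional too.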

If you iterate over the underlying estimators and query feature_name() and feature_importance(), you’ll see those generic labels paired with importance scores.

for idx, est in enumerate(gbm.model.estimators_):  # one estimator per forecast step
    print(f"Target step {idx} importance (gain):")
    booster = est.booster_
    lgbm_feats = booster.feature_name()  # generic labels: Column_0, Column_1, ...
    gains = booster.feature_importance(importance_type='gain')
    mapping = dict(zip(lgbm_feats, gains))
    print(mapping)

That gets you the numbers, but not the meaning.

The missing link: lagged feature names from Darts

Darts exposes the names of the features that were actually fed into the model. The list is available on the trained model as model.lagged_feature_names. This is the authoritative source to connect LightGBM’s importances back to the lagged target and covariates.

Feature importances for regression models were discussed by a Darts author in Issue#1826, with an additional note about feature names in a comment under Issue#2125.
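As a sanity check, the length of that list should match the number of lagged features implied by the model configuration. A back-of-the-envelope count for the setup above, assuming single-component target and covariates:

```python
# Expected feature count for the configuration above; if correct,
# len(gbm.lagged_feature_names) should equal this total.
n_target_lags = 12                       # lags=12
n_past_lags = 12                         # lags_past_covariates=12
n_future_lags = len([0, 1, 2, 3, 4, 5])  # lags_future_covariates
n_features = n_target_lags + n_past_lags + n_future_lags
print(n_features)  # 30
```

If the counts disagree, the zip-based mapping below would silently mispair names and scores, so this check is worth a line of code.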

Mapping importance to meaningful names

Once the model is fitted, pair the gain-based importances from the LightGBM booster with the list of lagged feature names provided by Darts. This yields human-readable attributions per forecasting step.

# per-horizon booster importance (gain), mapped to Darts' lagged feature names
readable_names = gbm.lagged_feature_names  # all lagged features passed into the model
for step_id, est in enumerate(gbm.model.estimators_):
    booster = est.booster_
    gains = booster.feature_importance(importance_type='gain')
    # pair generic booster columns with their lag-aware names
    importance_by_name = dict(zip(readable_names, gains))
    print(f"Step {step_id} importance (gain) with labels:")
    print(importance_by_name)

This produces a dictionary where each feature is identified by a clear, lag-aware name coming from Darts, aligned with the corresponding importance score reported by LightGBM.
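From there it is a one-liner to rank features by gain. The dictionary below uses hypothetical gain values, and its keys mimic the lag-aware style of Darts' labels without claiming to reproduce the exact format:

```python
# Rank a (toy) importance_by_name dict by gain, largest first.
# Keys mimic lag-aware labels; values are made-up gains for illustration.
importance_by_name = {
    "p (mbar)_target_lag-1": 152.3,
    "T (degC)_futcov_lag0": 37.8,
    "rain (mm)_pastcov_lag-1": 4.1,
}
ranked = sorted(importance_by_name.items(), key=lambda kv: kv[1], reverse=True)
for name, gain in ranked:
    print(f"{name}: {gain:.1f}")
```

Sorting per forecast step makes it easy to spot when, say, a future covariate dominates early horizons but fades for later ones.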

Why this matters

When you introduce multiple future covariates and rich datetime attributes, you need to know which lags and which variables actually drive each forecasting step. Without readable names, iterative feature engineering, debugging, and model review become guesswork. Using the feature names surfaced by Darts lets you trace every importance score back to the specific lagged target or covariate component you engineered.

Conclusion

If LightGBM’s boosters return generic Column_* labels, don’t try to reverse-engineer them. Read the authoritative list of feature names from model.lagged_feature_names and pair it with the importance scores from each booster. This provides a clean, step-wise view of which lagged targets and covariates the model relies on, making interpretation and iteration significantly more straightforward.