2025, Oct 18 17:00
How to Fix Category Order in plotnine Bar Charts with polars: Use scale_x_discrete Limits
Learn why polars unpivot scrambles category order in plotnine bar charts and how to fix it by setting explicit discrete axis limits with scale_x_discrete.
Getting the order of discrete categories wrong can ruin an otherwise clear chart. When reshaping data with polars and plotting with plotnine, a common pitfall is that Category labels end up scrambled. In a bar chart like the one below, the bucket labeled 5-10 should appear in the third position but doesn’t. The fix is simple once you control the discrete axis order explicitly.
Repro: polars + plotnine bar chart with misordered categories
import polars as po
import polars.selectors as sl
from plotnine import ggplot, geom_bar, aes, scale_fill_manual, coord_flip
frame_wide = po.read_csv('https://raw.githubusercontent.com/Brinkhuis/Zorguitgaven/refs/heads/master/zorguitgaven.csv')
frame_long = frame_wide.unpivot(sl.numeric(), index='Category')
(
    ggplot() +
    geom_bar(
        data=frame_long.filter(po.col('variable').is_in(['Mannen 2040', 'Vrouwen 2040'])),
        mapping=aes(x='Category', y='value', fill='variable'),
        stat='identity'
    ) +
    scale_fill_manual(values=['#007BC7', '#CA005D']) +
    coord_flip()
)
The plot renders, but the Category order is not what you expect, and 5-10 does not land where it should.
Why the order goes wrong
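After unpivot, the Category column repeats the same labels for each numeric column that was melted, and it is still a plain string column with no ordering information attached. Without an explicit order, plotnine falls back to its default layout for discrete values, which for strings is alphabetical. A label such as 5-10 then sorts after any label that starts with a smaller digit, such as 10-15, so the sequence no longer matches the intended one. To solve this, the axis needs an explicit ordering.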
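To see how a default ordering can diverge from the intended one, consider plain alphabetical sorting of string labels, which appears to be what plotnine falls back to for non-categorical values. The labels below are hypothetical age bands chosen for illustration, not values read from the dataset.

```python
# Hypothetical age-band labels, listed in the order a reader would expect.
intended = ['0-5', '5-10', '10-15', '15-20']

# String sorting compares character by character, so '10-15' comes before '5-10'.
alphabetical = sorted(intended)
print(alphabetical)  # ['0-5', '10-15', '15-20', '5-10']
```

The numeric meaning of the labels is invisible to a string sort, which is why the order has to be supplied separately.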
Direct fix in plotnine: set discrete limits
The simplest way to enforce the x-axis order is to pass an ordered list of categories to scale_x_discrete. That list should represent the intended sequence. A quick, pragmatic variant is to take the first block of Category values from the long table and slice to the expected length.
import polars as po
import polars.selectors as sl
from plotnine import ggplot, geom_bar, aes, scale_fill_manual, scale_x_discrete, coord_flip
wide_tbl = po.read_csv('https://raw.githubusercontent.com/Brinkhuis/Zorguitgaven/refs/heads/master/zorguitgaven.csv')
long_tbl = wide_tbl.unpivot(sl.numeric(), index='Category')
(
    ggplot() +
    geom_bar(
        data=long_tbl.filter(po.col('variable').is_in(['Mannen 2040', 'Vrouwen 2040'])),
        mapping=aes(x='Category', y='value', fill='variable'),
        stat='identity'
    ) +
    scale_x_discrete(limits=long_tbl['Category'].to_list()[:21]) +
    scale_fill_manual(values=['#007BC7', '#CA005D']) +
    coord_flip()
)
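The slice works because unpivot stacks the melted columns one after another: assuming the wide table has 21 rows, the first 21 rows of the long table hold each Category exactly once, in the original order. Slicing to that length sidesteps the duplicates that appear after the reshape.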
Cleaner approach: capture the category order before unpivot
To avoid slicing, capture the unique Category sequence from the wide data first, then unpivot and feed that sequence to the scale limits. This keeps the discrete order unambiguous and tied to the original data.
import polars as po
import polars.selectors as sl
from plotnine import ggplot, geom_bar, aes, scale_fill_manual, scale_x_discrete, coord_flip
source_df = po.read_csv('https://raw.githubusercontent.com/Brinkhuis/Zorguitgaven/refs/heads/master/zorguitgaven.csv')
# Capture the category order from the wide table as a plain Python list
cat_order = source_df['Category'].to_list()
reshaped_df = source_df.unpivot(sl.numeric(), index='Category')
(
    ggplot() +
    geom_bar(
        data=reshaped_df.filter(po.col('variable').is_in(['Mannen 2040', 'Vrouwen 2040'])),
        mapping=aes(x='Category', y='value', fill='variable'),
        stat='identity'
    ) +
    scale_x_discrete(limits=cat_order) +
    scale_fill_manual(values=['#007BC7', '#CA005D']) +
    coord_flip()
)
This assumes the categories are already in the correct order in the initial dataframe. If they are not, you will have to sort them by some criterion first, for example by the first number that appears in each label.
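Sorting by the first number in each label could look roughly like the sketch below. The helper leading_number and the sample labels are hypothetical, and the approach assumes every label contains at least one run of digits.

```python
import re

def leading_number(label: str) -> int:
    """Return the first run of digits in a label as an integer."""
    match = re.search(r'\d+', label)
    if match is None:
        raise ValueError(f'no number found in {label!r}')
    return int(match.group())

# Hypothetical labels in a scrambled order.
shuffled = ['10-15', '95+', '0-5', '5-10']

# Sort numerically on the lower bound of each bucket.
numeric_order = sorted(shuffled, key=leading_number)
print(numeric_order)  # ['0-5', '5-10', '10-15', '95+']
```

The sorted list can then be passed to scale_x_discrete(limits=numeric_order) in the same way as cat_order.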
Why this detail matters
Discrete category order carries meaning. In age bands, cohorts, or ordered buckets, a shuffled axis distorts interpretation, makes comparisons harder, and invites wrong conclusions. When data is reshaped, duplicates can obscure the natural order unless the plotting layer is instructed how to lay out categories. Controlling the discrete scale avoids silent reordering and keeps your bar charts faithful to the source.
Conclusion
When working with polars and plotnine, reshape operations such as unpivot can duplicate category labels and lead to unexpected ordering on discrete axes. The reliable way to keep categories in the intended sequence is to pass explicit limits to scale_x_discrete. Grabbing the Category series before unpivot gives you a clean, unique order; using that sequence in the scale ensures the chart renders exactly as designed.