2025, Oct 19 02:00
Altair: Prevent bar aggregation with duplicate categories using a unique x key and readable axis labels
Altair tip: prevent bar aggregation with duplicate categories. Get one bar per row using a unique x key, transform_calculate, and labelExpr to keep labels.
Altair draws rows that share a category at the same x-axis position. If a value repeats, the bars for those rows stack on top of each other, so their y-values appear summed and you end up with fewer visible bars than rows. When your source data intentionally contains duplicates and you want one bar per row, you need a unique x-channel key while still showing the original label.
Reproducing the issue
The dataset has four rows with two entries sharing the same value in the field a. The straightforward Altair encoding collapses these into three bars because both rows with B are added together.
import altair as alt
import pandas as pd

data_tbl = pd.DataFrame({
    'a': ['A', 'B', 'C', 'B'],
    'b': [10, 20, 30, 15]
})
alt.Chart(data_tbl).mark_bar().encode(
    x='a',
    y='b'
)
By contrast, plotting with pandas draws one bar per row because it treats each record independently, even when labels repeat.
data_tbl.plot(kind='bar', x='a', y='b')
Why you see fewer bars
In this setup, both B rows map to the same x position. Altair stacks bar marks that share a position by default, so the two values (20 and 15) read as a single bar reaching 35. The result is a three-bar chart for A, B, and C, not the expected four bars matching the row count.
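To check the numbers outside the chart, a minimal pandas sketch can compare the row count with the number of distinct categories and compute the per-category totals. The variable name totals is only illustrative.

```python
import pandas as pd

data_tbl = pd.DataFrame({
    'a': ['A', 'B', 'C', 'B'],
    'b': [10, 20, 30, 15],
})

# Sum of b per distinct category: the height each visible bar reaches.
totals = data_tbl.groupby('a')['b'].sum()

print(len(data_tbl))  # rows in the source data
print(len(totals))    # distinct x positions on the chart
print(totals)
```

Four rows collapse to three x positions, and the B total is 20 + 15 = 35, which matches the bar height seen in the chart.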
Naive workaround and its limitation
Using the DataFrame index as the x-channel produces four bars, but labels switch to numeric indices and you lose the original a values on the axis.
alt.Chart(data_tbl.reset_index()).mark_bar().encode(
    x='index:O',
    y='b'
)
This preserves the row count but not the desired category labels.
The clean fix: unique x keys, original labels
Create a unique x key by combining the row index with the a value, and then display only the a portion on the axis. This preserves one bar per row while keeping the expected text on the x-axis.
alt.Chart(data_tbl.reset_index()).transform_calculate(
    x_token=alt.datum['a'] + '_' + alt.datum['index']
).mark_bar().encode(
    alt.X('x_token:N', title='A', axis=alt.Axis(labelExpr='split(datum.value, "_")[0]')),
    alt.Y('b')
)
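The transform creates a composite key that is unique per row. The axis labelExpr splits that key on the underscore and shows only the left part, effectively restoring the original a values on the axis while keeping one bar per row. This assumes the a values themselves contain no underscore; if they can, choose a separator that never occurs in the data.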
Why this matters
When you rely on duplicate categories to represent distinct records, silently combining rows changes both the count of marks and the magnitude of bars. Ensuring a one-to-one mapping between rows and marks makes the chart reflect the underlying data without unintended consolidation. At the same time, preserving the original labels keeps the chart readable and faithful to the semantics of the data.
Takeaways
When you expect one bar per row but categories repeat, construct a unique x-channel key and control the axis labels to display the original categories. This approach avoids aggregation, keeps the number of bars equal to the number of rows, and maintains meaningful labels.