2025, Nov 02 21:00
Grouped, Clustered, and Stacked Bar Charts for Categorical Survey Responses by Subgroup using Seaborn countplot or pandas crosstab
Learn how to plot categorical survey responses by subgroup with grouped and stacked bars: a quick Seaborn countplot, or a flexible pandas crosstab that can also be normalized to percentages
Visualizing categorical survey responses split by a subgroup sounds simple until you try to make it look like clustered or stacked columns. The key is to count occurrences of each category and then choose a plotting path that aligns with the format you want. Below is a compact walkthrough using either Seaborn for a direct grouped plot or pure pandas for clustered and stacked views.
Sample data that illustrates the task
We’ll work with letter-based answers and a gender split. The goal is to plot the frequency of each answer grouped by gender, with the option to switch between clustered and stacked columns.
import pandas as pd
choices = ['A', 'B', 'A', 'B', 'A', 'A', 'B', 'B', 'B', 'A', 'C', 'B', 'A', 'C']
sexes = ['M', 'M', 'F', 'M', 'F', 'M', 'F', 'M', 'M', 'F', 'M', 'M', 'F', 'M']
tbl = pd.DataFrame({'Choice': choices, 'Sex': sexes})
What’s really going on
The task is to count occurrences of each categorical value and break those counts down by a second categorical field, a classic “counts by group” problem. You can either use a plot that aggregates counts for you on the fly, or first compute a frequency table and then plot it. Both routes avoid manual looping or custom grouping: the first yields clustered columns directly, while the second handles clustered and stacked layouts alike.
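To make the "count first, then plot" idea concrete, here is a minimal sketch of the frequency table that sits behind both solutions. It rebuilds the sample data so it runs on its own; the variable names `counts` and `counts_alt` are illustrative and do not come from the original answer.

```python
import pandas as pd

choices = ['A', 'B', 'A', 'B', 'A', 'A', 'B', 'B', 'B', 'A', 'C', 'B', 'A', 'C']
sexes = ['M', 'M', 'F', 'M', 'F', 'M', 'F', 'M', 'M', 'F', 'M', 'M', 'F', 'M']
tbl = pd.DataFrame({'Choice': choices, 'Sex': sexes})

# Contingency table: one row per answer, one column per subgroup.
counts = pd.crosstab(tbl['Choice'], tbl['Sex'])

# Equivalent groupby formulation. unstack(fill_value=0) fills the
# combination that never occurs in the data (C answered by F) with zero.
counts_alt = tbl.groupby(['Choice', 'Sex']).size().unstack(fill_value=0)

print(counts)
```

For this sample, answer A was chosen by 4 F and 2 M respondents, B by 1 F and 5 M, and C by 0 F and 2 M. Inspecting the table before plotting is a cheap way to confirm the chart will show what you expect.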
Solution 1: grouped columns with Seaborn
For a direct grouped plot, a simple countplot does the job. It computes the counts for each category and groups bars by the hue.
import seaborn as sbn
sbn.countplot(data=tbl, x='Choice', hue='Sex')
This produces grouped columns per answer, split by gender.
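In practice you will usually want a fixed category order and readable axis labels. The following is a sketch under a few assumptions: the `order` and `hue_order` values, the figure size, and the label texts are illustrative choices, not part of the original answer.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sbn

choices = ['A', 'B', 'A', 'B', 'A', 'A', 'B', 'B', 'B', 'A', 'C', 'B', 'A', 'C']
sexes = ['M', 'M', 'F', 'M', 'F', 'M', 'F', 'M', 'M', 'F', 'M', 'M', 'F', 'M']
tbl = pd.DataFrame({'Choice': choices, 'Sex': sexes})

fig, ax = plt.subplots(figsize=(6, 4))

# order / hue_order pin the bar sequence instead of relying on
# the order in which values first appear in the data.
sbn.countplot(data=tbl, x='Choice', hue='Sex',
              order=['A', 'B', 'C'], hue_order=['F', 'M'], ax=ax)

ax.set_xlabel('Answer')
ax.set_ylabel('Number of responses')
ax.set_title('Responses by answer and gender')
fig.tight_layout()
```

When running this as a script rather than in a notebook, finish with `plt.show()` or `fig.savefig(...)` to see the result.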
Solution 2: clustered or stacked columns with pandas crosstab
If you prefer not to rely on Seaborn, compute a contingency table first and then plot it. This makes it trivial to switch between clustered and stacked layouts.
# Clustered columns
pd.crosstab(tbl['Choice'], tbl['Sex']).plot.bar()
To stack the columns, flip a single parameter:
# Stacked columns
pd.crosstab(tbl['Choice'], tbl['Sex']).plot.bar(stacked=True)
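To compare the two layouts directly, the same table can be drawn twice on a shared figure. This is a minimal sketch: the axes names, titles, and figure size are illustrative assumptions, and `rot=0` simply keeps the answer labels horizontal.

```python
import matplotlib.pyplot as plt
import pandas as pd

choices = ['A', 'B', 'A', 'B', 'A', 'A', 'B', 'B', 'B', 'A', 'C', 'B', 'A', 'C']
sexes = ['M', 'M', 'F', 'M', 'F', 'M', 'F', 'M', 'M', 'F', 'M', 'M', 'F', 'M']
tbl = pd.DataFrame({'Choice': choices, 'Sex': sexes})

# Compute the frequency table once and reuse it for both charts.
counts = pd.crosstab(tbl['Choice'], tbl['Sex'])

fig, (ax_grouped, ax_stacked) = plt.subplots(1, 2, figsize=(9, 4), sharey=True)
counts.plot.bar(ax=ax_grouped, rot=0, title='Clustered')
counts.plot.bar(ax=ax_stacked, rot=0, stacked=True, title='Stacked')
ax_grouped.set_ylabel('Number of responses')
fig.tight_layout()
```

Storing the crosstab in a variable avoids recomputing it for every chart and lets you check the numbers before plotting.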
If you need the stacked chart normalized by row, pass normalize='index' to crosstab. This returns proportions between 0 and 1 within each answer, so multiply by 100 to show them as percent:
# Stacked columns normalized as percent
pd.crosstab(tbl['Choice'], tbl['Sex'], normalize='index').mul(100).plot.bar(stacked=True)
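Which axis you normalize over changes the question the chart answers. The sketch below computes both variants and plots the row-normalized one on a percent scale; the variable names, axis label, and legend placement are illustrative assumptions rather than part of the original answer.

```python
import matplotlib.pyplot as plt
import pandas as pd

choices = ['A', 'B', 'A', 'B', 'A', 'A', 'B', 'B', 'B', 'A', 'C', 'B', 'A', 'C']
sexes = ['M', 'M', 'F', 'M', 'F', 'M', 'F', 'M', 'M', 'F', 'M', 'M', 'F', 'M']
tbl = pd.DataFrame({'Choice': choices, 'Sex': sexes})

# Share of each sex within every answer: each row sums to 100.
pct_by_choice = pd.crosstab(tbl['Choice'], tbl['Sex'], normalize='index') * 100

# Share of each answer within every sex: each column sums to 100.
pct_by_sex = pd.crosstab(tbl['Choice'], tbl['Sex'], normalize='columns') * 100

ax = pct_by_choice.plot.bar(stacked=True, rot=0)
ax.set_ylabel('Share of responses (%)')
# Place the legend outside the axes so it does not cover the full-height bars.
ax.legend(title='Sex', bbox_to_anchor=(1.02, 1), loc='upper left')
ax.figure.tight_layout()
```

With `normalize='index'` every stacked bar reaches 100, which answers "who chose this answer?". With `normalize='columns'` you would instead transpose the table before plotting (`pct_by_sex.T.plot.bar(stacked=True)`) to get one full-height bar per sex, which answers "what did this group choose?".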
Why this matters
Being able to jump between direct aggregation in the plotting library and an explicit frequency table streamlines work with categorical data. Seaborn’s countplot is a fast path to grouped bars, while pandas crosstab gives full control over the frequency table, stacking, and normalization in one place. That flexibility saves time when switching chart styles or when you need normalized views.
Takeaways
When you need categorical counts by subgroup, go straight to a count-oriented approach. Use a countplot for quick grouped results. If you want clustered, stacked, or normalized columns with minimal friction, compute a crosstab and plot it. With these two options, you’ll cover the common “A/B/C by M/F” scenarios without writing manual grouping code.
The article is based on a question from StackOverflow by ArgumentClinician and an answer by mozway.