https://pytroubles.com/en/posts/id1641-visualize-categorical-survey-responses-by-subgroup-grouped-and-stacked-bars-with-seaborn-and-pandas

Visualize Categorical Survey Responses by Subgroup: Grouped and Stacked Bars with Seaborn and pandas

Grouped, Clustered, and Stacked Bar Charts for Categorical Survey Responses by Subgroup using Seaborn countplot or pandas crosstab


Learn how to plot categorical survey responses by subgroup with grouped and stacked bars: quick Seaborn countplot or flexible pandas crosstab as percent

2025-11-02T21:00:10+03:00

Visualizing categorical survey responses split by a subgroup sounds simple until you try to make it look like clustered or stacked columns. The key is to count occurrences of each category and then choose a plotting path that aligns with the format you want. Below is a compact walkthrough using either Seaborn for a direct grouped plot or pure pandas for clustered and stacked views.




Sample data that illustrates the task

We’ll work with letter-based answers and a gender split. The goal is to plot the frequency of each answer grouped by gender, with the option to switch between clustered and stacked columns.

import pandas as pd
choices = ['A', 'B', 'A', 'B', 'A', 'A', 'B', 'B', 'B', 'A', 'C', 'B', 'A', 'C']
sexes = ['M', 'M', 'F', 'M', 'F', 'M', 'F', 'M', 'M', 'F', 'M', 'M', 'F', 'M']
tbl = pd.DataFrame({'Choice': choices, 'Sex': sexes})

What’s really going on

The task is to count how often each categorical value occurs and break those counts down by a second categorical field. This is a classic “counts by group” problem. You can either use a plot that aggregates the counts for you on the fly, or first compute a frequency table and then plot it. The first route gives grouped (clustered) columns directly; the second gives clustered or stacked columns. Neither needs manual looping or custom grouping.
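To make the “counts by group” idea concrete, here is a minimal sketch that builds the same kind of table by hand with groupby. It is only an illustration of what gets counted, not a step the article's solutions require; the variable name counts is an assumption introduced here.

```python
import pandas as pd

choices = ['A', 'B', 'A', 'B', 'A', 'A', 'B', 'B', 'B', 'A', 'C', 'B', 'A', 'C']
sexes = ['M', 'M', 'F', 'M', 'F', 'M', 'F', 'M', 'M', 'F', 'M', 'M', 'F', 'M']
tbl = pd.DataFrame({'Choice': choices, 'Sex': sexes})

# Count the rows for every (Choice, Sex) pair, then pivot Sex into columns.
# fill_value=0 keeps pairs that never occur (here: answer C given by F).
counts = tbl.groupby(['Choice', 'Sex']).size().unstack(fill_value=0)
print(counts)
# Counts per answer: A has F=4, M=2; B has F=1, M=5; C has F=0, M=2
```

Each row of the table is one answer and each column one subgroup, which is the shape both plotting routes work from.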

Solution 1: grouped columns with Seaborn

For a direct grouped plot, a simple countplot does the job. It computes the counts for each category and groups bars by the hue.

import seaborn as sbn
sbn.countplot(data=tbl, x='Choice', hue='Sex')

This produces grouped columns per answer, split by gender.
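If the bar order or axis labels matter, the same call can be pinned down a little further. The sketch below assumes you want answers ordered A, B, C and the hue ordered F, M; the label text and the headless Agg backend are choices made for this example, not part of the original answer.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for scripts; omit in a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sbn

choices = ['A', 'B', 'A', 'B', 'A', 'A', 'B', 'B', 'B', 'A', 'C', 'B', 'A', 'C']
sexes = ['M', 'M', 'F', 'M', 'F', 'M', 'F', 'M', 'M', 'F', 'M', 'M', 'F', 'M']
tbl = pd.DataFrame({'Choice': choices, 'Sex': sexes})

fig, ax = plt.subplots()
# order / hue_order fix the bar sequence instead of relying on data order.
sbn.countplot(data=tbl, x='Choice', hue='Sex',
              order=['A', 'B', 'C'], hue_order=['F', 'M'], ax=ax)
ax.set_xlabel("Answer")
ax.set_ylabel("Number of responses")
fig.tight_layout()
```

Passing ax explicitly keeps the figure handle available, so you can call fig.savefig(...) or plt.show() afterwards as needed.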

Solution 2: clustered or stacked columns with pandas crosstab

If you prefer not to rely on Seaborn, compute a contingency table first and then plot it. This makes it trivial to switch between clustered and stacked layouts.

# Clustered columns
pd.crosstab(tbl['Choice'], tbl['Sex']).plot.bar()

To stack the columns, flip a single parameter:

# Stacked columns
pd.crosstab(tbl['Choice'], tbl['Sex']).plot.bar(stacked=True)
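Because both layouts come from the same table, it can help to compute the crosstab once and reuse it. The following sketch draws the clustered and stacked versions side by side; the name freq, the figure size, and the titles are assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for scripts; omit in a notebook
import matplotlib.pyplot as plt
import pandas as pd

choices = ['A', 'B', 'A', 'B', 'A', 'A', 'B', 'B', 'B', 'A', 'C', 'B', 'A', 'C']
sexes = ['M', 'M', 'F', 'M', 'F', 'M', 'F', 'M', 'M', 'F', 'M', 'M', 'F', 'M']
tbl = pd.DataFrame({'Choice': choices, 'Sex': sexes})

# Rows are answers, columns are subgroups, cells are counts.
freq = pd.crosstab(tbl['Choice'], tbl['Sex'])

fig, (ax_left, ax_right) = plt.subplots(1, 2, figsize=(9, 4))
freq.plot.bar(ax=ax_left, rot=0, title="Clustered")
freq.plot.bar(ax=ax_right, rot=0, stacked=True, title="Stacked")
ax_left.set_ylabel("Number of responses")
fig.tight_layout()
```

Keeping freq in a variable also lets you inspect or export the numbers behind the chart.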

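If you need the stacked chart normalized by row, pass normalize='index' so that each answer's bar sums to 1. The values are fractions; multiply the table by 100 if you want the axis to read in percent:

# Stacked columns normalized to row fractions (each bar sums to 1)
pd.crosstab(tbl['Choice'], tbl['Sex'], normalize='index').plot.bar(stacked=True)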
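Note that normalize='index' returns row fractions between 0 and 1 rather than percent values. A minimal sketch of scaling them to percent before plotting follows; the names share and pct and the axis label are assumptions introduced for this example.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for scripts; omit in a notebook
import pandas as pd

choices = ['A', 'B', 'A', 'B', 'A', 'A', 'B', 'B', 'B', 'A', 'C', 'B', 'A', 'C']
sexes = ['M', 'M', 'F', 'M', 'F', 'M', 'F', 'M', 'M', 'F', 'M', 'M', 'F', 'M']
tbl = pd.DataFrame({'Choice': choices, 'Sex': sexes})

# Each row is divided by its own total, so every row sums to 1.
share = pd.crosstab(tbl['Choice'], tbl['Sex'], normalize='index')

# Scale to percent so the y-axis runs from 0 to 100.
pct = share.mul(100)
ax = pct.plot.bar(stacked=True, rot=0)
ax.set_ylabel("Percent of responses")
```

With the sample data, answer C comes out as 100 percent M because no F respondent chose it, which is worth remembering when a subgroup is small.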

Why this matters

Being able to jump between direct aggregation in the plotting library and an explicit frequency table streamlines work with categorical data. Seaborn’s countplot is a fast path to grouped bars, while pandas crosstab gives full control over the frequency table, stacking, and normalization in one place. That flexibility saves time when switching chart styles or when you need normalized views.

Takeaways

When you need categorical counts by subgroup, go straight to a count-oriented approach. Use a countplot for quick grouped results. If you want clustered, stacked, or normalized columns with minimal friction, compute a crosstab and plot it. With these two options, you’ll cover the common “A/B/C by M/F” scenarios without manual grouping.

The article is based on a question from StackOverflow by ArgumentClinician and an answer by mozway.

histogram python seaborn statistics