2025, Sep 27 05:00

How to Sort Column-Wise Maxima in Pandas: Use Series.sort_values(ascending=False), Not sort_index

Learn to rank column-wise maxima in pandas: compute DataFrame.max(), then sort the Series by values with sort_values(ascending=False) instead of sort_index.

Sorting cumulative statistics in pandas is straightforward once you distinguish between sorting by index and sorting by values. A common stumbling block appears when you compute column-wise maxima and try to rank them, only to discover that the object you’re sorting is a Series, not a DataFrame, and that the sorting call targets the wrong axis. Below is a compact guide using a real-world setup with cumulative rainfall totals by year.

Problem overview

There is a multi-column pandas DataFrame where each column represents a year, and each column contains cumulative rainfall values for sequential days. The last entry in every column is the maximum for that column. The task is to extract the maximum for each year and sort these maxima in descending order. The initial attempt used sorting by index, which doesn’t change the order as needed, and trying to sort along axis=1 fails because the result is a Series.

Code example that reproduces the issue

import pandas as pd
import os
rain_df = pd.read_csv('myfile.txt', sep=' ', skipinitialspace=True)
col_max = rain_df.max()
print(col_max)
# Attempt that sorts labels, not the numeric values
sorted_by_labels = col_max.sort_index()
print(sorted_by_labels)

This produces a Series of maxima per column. Sorting via sort_index reorders by labels such as Avge, 1945, 1946, and so on, rather than by the numeric maxima. Trying to sort along axis=1 triggers the error “ValueError: No axis named 1 for object type Series”, indicating there is no second axis to select on a one-dimensional Series.

What’s really happening

Calling DataFrame.max() across columns returns a pandas Series: each index label is the column name (e.g., a year), and each value is the maximum for that column. Sorting this Series by index reshuffles the labels, not the numeric maxima. Because a Series is one-dimensional, it only has a single axis; axis=1 simply does not exist for that object, which explains the error.

Fix: sort by values in descending order

To rank the maxima from highest to lowest, sort the Series by its values. The key is to call sort_values with ascending=False.

import pandas as pd
import os
rain_df = pd.read_csv('myfile.txt', sep=' ', skipinitialspace=True)
col_max = rain_df.max()
ranked_max = col_max.sort_values(ascending=False)
print(ranked_max)

This preserves the mapping between year labels and their maximum cumulative rainfall, while ordering the results from the largest to the smallest value.

Why this detail matters

When you aggregate columns, the output type drives what you can do next. With a Series, sort_index and sort_values act on different dimensions of the same data: labels versus numbers. Using the wrong method can leave results in an unintuitive order or raise errors about non-existent axes. Understanding this distinction ensures that ranking metrics, selecting top performers, or producing leaderboards behaves as expected.

Takeaways

After aggregations like max across DataFrame columns, remember you’re holding a Series. If the goal is to order by magnitude, use sort_values with ascending=False. If you instead need alphabetical or chronological ordering by labels, use sort_index. Keeping the object type and the sort target aligned saves time and prevents avoidable errors.

The article is based on a question from StackOverflow by Zilore Mumba and an answer by 0ro2.