2025, Nov 26 23:00

Polars read_excel with Calamine: why a tuple in columns fails and the right ways to select columns

Learn why Polars read_excel with the Calamine/fastexcel engine throws TypeError when columns is a tuple, and how to fix it using a list, ranges, or a callable.

Polars read_excel, Calamine, and the not-so-sequential columns parameter

Sometimes the devil is in the details of an engine. A seemingly valid use of the columns argument in polars.read_excel fails with a TypeError when using the Calamine engine, even though the documentation suggests a generic “sequence” should work. Here’s what happens, why it happens, and how to load only the columns you need without detours.

The minimal failing example

The call below attempts to read specific columns by index via a tuple. It looks legitimate given the docs, yet it raises an exception.

import polars as pl
sheet_df = pl.read_excel(
    "/Volumes/Spare/foo.xlsx",
    engine="calamine",
    sheet_name="natsav",
    read_options={"header_row": 2},
    columns=(1, 2, 4, 5, 6, 7),
)
print(sheet_df.head())

The parameter is documented as follows:

Columns to read from the sheet; if not specified, all columns are read. Can be given as a sequence of column names or indices.

However, the call fails with:

_fastexcel.InvalidParametersError: invalid parameters: `use_columns` callable could not be called (TypeError: 'tuple' object is not callable)

What’s actually going on

With engine="calamine" (which is also the default), the call flows into the fastexcel module. The documentation explicitly notes this:

this engine can be used for reading all major types of Excel Workbook (.xlsx, .xlsb, .xls) and is dramatically faster than the other options, using the fastexcel module to bind the Rust-based Calamine parser.

The exception class, _fastexcel.InvalidParametersError, makes that clear as well. The columns argument from Polars is passed through as fastexcel’s use_columns, which is defined as:

use_columns: Union[list[str], list[int], str, Callable[[ColumnInfoNoDtype], bool], NoneType] = None,

And described this way:

Specifies the columns to use. Can either be None to select all columns; a list of strings and ints, the column names and/or indices (starting at 0); a string, a comma separated list of Excel column letters and column ranges (e.g. “A:E” or “A,C,E:F”); or a callable, a function that takes a column and returns a boolean.

In other words, the Calamine/fastexcel path does not accept a generic “sequence”: a tuple is not one of the types listed for use_columns. The error message suggests that an input which is not a list, a string, or None is treated as a callable and invoked, and invoking a tuple raises TypeError: 'tuple' object is not callable. That also explains why a callable does work, and why the argument it receives is of type builtins.ColumnInfoNoDtype, a type that belongs to fastexcel rather than to Polars itself.
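The type mismatch can be reproduced without Excel or fastexcel at all. The sketch below uses only the standard library to show that a tuple satisfies a generic “sequence” description yet is neither a list nor a callable. The describe_spec helper is hypothetical and is not a description of how fastexcel dispatches internally.

```python
from collections.abc import Sequence


def describe_spec(spec):
    """Report which simple type checks a column spec passes (hypothetical helper)."""
    return {
        "is_sequence": isinstance(spec, Sequence),
        "is_list": isinstance(spec, list),
        "is_callable": callable(spec),
    }


tuple_spec = (1, 2, 4, 5, 6, 7)
print(describe_spec(tuple_spec))
print(describe_spec(list(tuple_spec)))

# Calling a tuple as if it were a predicate raises the TypeError
# that is quoted inside the fastexcel error message.
try:
    tuple_spec("some column")
except TypeError as exc:
    tuple_call_error = str(exc)
    print(tuple_call_error)
```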

Two ways to load specific columns correctly

If you prefer a predicate approach, you can provide a function that returns bool given a column object. This runs without exception and shows the incoming type.

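import polars as pl

def pick_any(col_obj):
    # Print the type of the object passed in for each column.
    print(type(col_obj))
    # Returning True keeps the column, so this predicate selects everything.
    return True
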
excel_df = pl.read_excel(
    "/Volumes/Spare/foo.xlsx",
    engine="calamine",
    sheet_name="natsav",
    read_options={"header_row": 2},
    columns=pick_any,
)
print(excel_df.head())
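A predicate becomes more useful once it actually filters. The following sketch builds one from a collection of zero-based indices. It assumes the object handed to the callable exposes an integer index attribute (alongside name); check that against your installed fastexcel version. FakeColumn is a hypothetical stand-in so the logic can be exercised without a workbook.

```python
from dataclasses import dataclass


def make_index_picker(wanted):
    """Return a predicate that keeps columns whose zero-based index is in `wanted`."""
    wanted_set = frozenset(wanted)

    def picker(col_obj):
        # Assumption: col_obj has an integer `.index` attribute.
        return col_obj.index in wanted_set

    return picker


@dataclass
class FakeColumn:
    """Hypothetical stand-in for the object passed to the callable."""
    name: str
    index: int


pick = make_index_picker((1, 2, 4, 5, 6, 7))
columns = [FakeColumn(f"col{i}", i) for i in range(8)]
kept = [c.index for c in columns if pick(c)]
print(kept)
```

If the attribute assumption holds, passing columns=pick in the read_excel call above should select the same columns as the equivalent index list.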

If you just want a fixed subset by index, pass a list rather than a tuple. The list form is explicitly supported by fastexcel.

import polars as pl
subset_df = pl.read_excel(
    "/Volumes/Spare/foo.xlsx",
    engine="calamine",
    sheet_name="natsav",
    read_options={"header_row": 2},
    columns=[1, 2, 4, 5, 6, 7],
)
print(subset_df.head())

This avoids the error because use_columns accepts list[int] and list[str] but not tuple.
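When the selection arrives from elsewhere (a config file, a function argument) and may be a tuple or a range, a small guard can coerce it before it reaches read_excel. This is a minimal sketch; as_column_list is a hypothetical helper name, not part of Polars or fastexcel.

```python
from collections.abc import Iterable


def as_column_list(spec):
    """Coerce tuples and other iterables to a list.

    None, strings, lists and callables are returned unchanged.
    """
    if spec is None or isinstance(spec, (str, list)) or callable(spec):
        return spec
    if isinstance(spec, Iterable):
        return list(spec)
    raise TypeError(f"unsupported column spec: {type(spec).__name__}")


print(as_column_list((1, 2, 4, 5, 6, 7)))
print(as_column_list(range(3)))
```

The result can then be passed as columns=as_column_list(selection), which keeps the call site tolerant of tuples without changing what is ultimately handed to the engine.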

Why this discrepancy matters

Higher-level documentation can mislead when backend behavior diverges from it. In this case, the calamine engine routes through fastexcel, which narrows the accepted column spec to a list, a string, or a callable. The discrepancy has already been reported as a bug, and it highlights a broader point: what looks like a generic “sequence” in one layer can become a stricter contract in another. Knowing the engine’s expectations saves time and prevents puzzling errors.

Practical takeaways

When using polars.read_excel with engine="calamine", provide the columns selection in a format understood by fastexcel. Use a list of indices or names, a string with Excel-style ranges, or a callable that returns a boolean for each ColumnInfoNoDtype. If a tuple was your first instinct based on a generic “sequence” description, switch to a list for predictable behavior. And if you need introspection or filtering logic, the callable route is available and works as demonstrated.
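To make the Excel-style string notation concrete, the sketch below expands a spec such as "A,C,E:F" into the zero-based indices it denotes, reading ranges as inclusive in the usual Excel sense. It illustrates the notation only; both function names are hypothetical and this is not fastexcel's parser.

```python
def letters_to_index(letters):
    """Convert an Excel column label such as 'A' or 'AB' to a zero-based index."""
    index = 0
    for ch in letters.strip().upper():
        if not "A" <= ch <= "Z":
            raise ValueError(f"invalid column label: {letters!r}")
        # Column labels are a base-26 system without a zero digit.
        index = index * 26 + (ord(ch) - ord("A") + 1)
    if index == 0:
        raise ValueError("empty column label")
    return index - 1


def expand_column_spec(spec):
    """Expand 'A,C,E:F' style specs into a list of zero-based indices."""
    indices = []
    for part in spec.split(","):
        start, _, end = part.partition(":")
        first = letters_to_index(start)
        last = letters_to_index(end) if end else first
        indices.extend(range(first, last + 1))
    return indices


print(expand_column_spec("A:E"))
print(expand_column_spec("A,C,E:F"))
```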