Detect and cast Decimal-backed object columns to float64 in Pandas DataFrames automatically

Learn how to detect Decimal columns in Pandas DataFrames and convert them to float64 automatically, without hardcoded names. Use first-row probes or thresholds.

Working with financial or measurement data often brings decimal.Decimal into the mix. The moment such values land in a Pandas DataFrame, they tend to become object-typed columns, which isn’t ideal if you expect numeric behavior out of the box. The goal is to automatically coerce all Decimal-backed columns to float64 at creation time without hardcoding column names.

The setup

Consider a dictionary that includes a list of decimal.Decimal values. After building a DataFrame, the column becomes object-typed, not float64:

import pandas as pd
from decimal import Decimal

payload = {
    'Item': ['Apple', 'Banana', 'Orange'],
    'Price': [Decimal('1.25'), Decimal('0.75'), Decimal('2.00')],
    'Quantity': [10, 20, 15]
}

frame = pd.DataFrame(payload)
print(frame.dtypes)

# Output:
# Item        object
# Price       object
# Quantity     int64
# dtype: object

Even pd.DataFrame.from_records(..., coerce_float=True) won’t change the underlying Decimal values. And while .astype(float) works for a known column, it doesn’t help when you don’t know the column names ahead of time.
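For comparison, here is the hardcoded cast that works when the name is known; it is exactly the manual step the rest of the article automates:

# Manual cast; requires knowing 'Price' ahead of time
converted = frame.astype({'Price': float})
print(converted.dtypes)

# Output:
# Item        object
# Price      float64
# Quantity     int64
# dtype: object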

Why Decimal ends up as object

When a column contains Decimal instances, Pandas infers a generic object dtype. This preserves the original Python objects but doesn’t provide native numeric behavior. The fix is to detect which columns actually contain Decimal values and convert them to floats in one pass.
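To make "no native numeric behavior" concrete: mixed Decimal/float arithmetic raises a TypeError, and dtype-based selection skips the column entirely:

# frame['Price'] + 0.5  # would raise TypeError: Decimal and float don't mix

print(frame.select_dtypes('number').columns.tolist())

# Output:
# ['Quantity']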

A reliable conversion pattern

One pragmatic approach is to infer the target columns from the data itself, build a mapping, and pass it to .astype. If you can trust the first row to be representative, you can use it to detect Decimal-typed columns and cast them:

from decimal import Decimal

# Map each value in the first row to its type and flag Decimal columns
probe = frame.iloc[0].map(type).eq(Decimal)
# Build a column -> float mapping for the flagged columns
cast_map = dict.fromkeys(frame.columns[probe], float)
# Example: {'Price': float}
converted = frame.astype(cast_map)

This keeps the logic data-driven and avoids hardcoding column names.
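Printing the dtypes confirms that only the probed column changed:

print(converted.dtypes)

# Output:
# Item        object
# Price      float64
# Quantity     int64
# dtype: object

Since the probe inspects a single row, it stays cheap even on wide or long frames.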

Choosing the detection threshold

If relying solely on the first row is too fragile, you can define a threshold over the entire column. Depending on your tolerance, convert a column when all values are Decimal, when at least one is Decimal, or when more than 90% are Decimal:

# Note: DataFrame.map requires Pandas >= 2.1; on older versions use .applymap
# Convert a column if all values are Decimal
cast_map = dict.fromkeys(frame.columns[frame.map(type).eq(Decimal).all()], float)

# Convert if at least one value is Decimal
cast_map = dict.fromkeys(frame.columns[frame.map(type).eq(Decimal).any()], float)

# Convert if more than 90% of the values are Decimal
cast_map = dict.fromkeys(
    frame.columns[frame.map(type).eq(Decimal).mean().gt(0.9)],
    float
)
converted = frame.astype(cast_map)
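If this casting step runs in more than one place, the threshold logic folds naturally into a small helper. The function below is a sketch, not a Pandas API; its name and signature are illustrative:

def coerce_decimal_columns(df, threshold=1.0):
    """Cast to float64 every column whose share of Decimal values
    is at least `threshold` (1.0 means all values must be Decimal)."""
    share = df.map(type).eq(Decimal).mean()
    cast_map = dict.fromkeys(df.columns[share.ge(threshold)], float)
    return df.astype(cast_map)

converted = coerce_decimal_columns(frame)                 # all-values rule
converted = coerce_decimal_columns(frame, threshold=0.9)  # at-least-90% rule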

After conversion, you get native numeric behavior on the affected columns. The dtypes of the untouched columns depend on your Pandas version and settings (the nullable string[python] and Int64 dtypes below come from Pandas' extension types, e.g. after a convert_dtypes() call); the important part is that Price is now float64:

Item        string[python]
Price              float64
Quantity             Int64
dtype: object

Why this matters

Type discipline in DataFrames is essential for predictable numeric operations, aggregations, and interoperability with other libraries. Letting Decimal-rich columns linger as object invites subtle issues, from performance penalties to unexpected behavior in downstream computations. A data-driven casting step at creation ensures consistent numeric dtypes without maintaining a hardcoded list of column names.

Takeaways

When your input includes decimal.Decimal, let the data tell you which columns to cast. Detect columns with Decimal values, build a mapping to float, and apply .astype. Whether you trust the first row or prefer a threshold across the full column, the approach remains the same: avoid hardcoding, keep the logic declarative, and ensure your numeric columns behave like numeric columns.

The article is based on a question from Stack Overflow by Gino and an answer by mozway.