2025, Dec 25 05:00

Python Protocols and Dataclasses: why mutable attributes trigger invariance errors and how to fix them

Learn why Python Protocols and dataclasses clash on mutable attributes, causing invariance type checker errors, and see practical fixes and cross-module tips.

When mixing Python Protocols with dataclasses, it’s easy to end up with a type checker complaining about incompatibility, even if instances look perfectly valid at runtime. The core friction appears when a dataclass uses a concrete type while the corresponding Protocol expects a more general structural interface, and that attribute is mutable.

Minimal reproduction

Consider a straightforward attempt to use structural typing:

import typing
import dataclasses

class SettingsProto(typing.Protocol):
    label: str

@dataclasses.dataclass
class Settings:
    label: str

class HolderProto(typing.Protocol):
    cfg: SettingsProto

@dataclasses.dataclass
class Holder:
    cfg: Settings

def consume_holder(h: HolderProto):
    pass

consume_holder(Holder(cfg=Settings(label="")))

A type checker like Pylance can flag this as incompatible, pointing out that the attribute is invariant because it is mutable and that the concrete type does not match the protocol requirement. The report typically references the attribute being mutable and therefore not safely substitutable.

What’s actually going on

The dataclass Holder accepts Settings (a concrete class) for cfg, while the protocol HolderProto requires cfg to be any implementation of SettingsProto. This clashes only if cfg is mutable. If it is, any consumer holding a HolderProto could legally replace cfg with something that conforms to SettingsProto but is not an instance of Settings and might lack extra members or behavioral guarantees that code elsewhere assumes exist on Settings.

A worked pitfall illustrating why the checker is strict

The problem becomes obvious when the protocol is a subset of fields offered by a richer concrete class. A mutable attribute typed as the protocol lets other code swap in a different implementation, breaking assumptions about the richer class:

from dataclasses import dataclass
from typing import Protocol

class MetricsLike(Protocol):
    value: int

@dataclass
class Metrics:
    value: int
    extra: float

@dataclass
class AltMetrics:
    value: int
    items: list[int]

class BucketLike(Protocol):
    stats: MetricsLike

@dataclass
class Bucket:
    stats: Metrics

b: Bucket = Bucket(stats=Metrics(value=0, extra=1.0))
assert b.stats.extra == 1.0
b_like: BucketLike = b
b_like.stats = AltMetrics(0, [])  # allowed by BucketLike, not by Bucket
assert b.stats.extra == 1.0       # AttributeError at runtime: AltMetrics has no "extra"

The type checker prevents this by requiring the mutable attribute stats in Bucket to accept the same type the protocol promises: a protocol, not the concrete Metrics.

The fix

Adjust the dataclass so that the mutable attribute is typed as the protocol, not the concrete implementation. That’s enough for the static checker:

import typing
import dataclasses

class SettingsProto(typing.Protocol):
    label: str

@dataclasses.dataclass
class Settings:
    label: str

class HolderProto(typing.Protocol):
    cfg: SettingsProto

@dataclasses.dataclass
class Holder:
    cfg: SettingsProto  # accept the protocol here

def consume_holder(h: HolderProto):
    pass

consume_holder(Holder(cfg=Settings(label="")))

This aligns the mutability contract: code using HolderProto can assign any SettingsProto-compatible object to cfg, and the Holder instance agrees to the same capability.

Cross-module reuse and protocol shape

If you maintain two standalone modules that need to share instances based on a common structural interface, duplicate the same Protocol definitions in both places. For this to work predictably, the Protocols must be identical. A property-only protocol surface is a practical pattern in such setups. In an alternative design, a class can keep a “private” concrete attribute and expose a public getter returning the protocol view; downstream code then operates on the protocol-returning accessor without depending on the concrete class.

Be mindful that trying to auto-generate dataclasses from a Protocol using typing.get_type_hints can lead to fields typed as Any in this scenario. That undermines type safety and defeats the purpose of the protocol boundary.

Why this matters

Structural typing promises flexibility, but mutability introduces variance constraints. If a mutable attribute is typed concretely, substituting a broader protocol in the interface becomes unsafe. The checker is preventing a very real class of runtime errors, as the pitfall shows. Making the field accept the protocol aligns the type surface with how the object will be used.

Takeaways

When a Protocol requires a mutable attribute of type SomeProto, ensure your dataclass uses the same protocol type for that attribute rather than a concrete implementation. Do this especially when you expect cross-module reuse based on identical Protocol shapes. If you need to expose a protocol-only view while keeping richer internals, route consumers through a getter that returns the protocol interface.