2025, Oct 29 17:00

Make a dataclass field optional only in __init__ and non-optional on instances

Make a Python dataclass accept an optional constructor argument while the field stays non-optional on instances. Use a custom __init__ for clear typing.

Making a field optional only at construction time, while keeping it non-optional on the instance, is a common need when modeling simple value objects. In plain Python classes this is trivial, but with @dataclass it can lead to type warnings if you are not careful. Here is a concise walkthrough of the problem and a clean way to solve it.

Baseline without @dataclass

A straightforward class sets the second coordinate to the first one when it is omitted:

class Coord:
    def __init__(self, a: int, b: int | None = None):
        self.a = a
        self.b = b if b is not None else a

This keeps the runtime type of b as int for all constructed instances, because None is replaced during initialization.

Attempt with @dataclass that triggers warnings

Rewriting this with @dataclass seems simple at first:

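from dataclasses import dataclass

@dataclass
class Coord:
    a: int
    b: int | None = None  # the default lets callers omit b

    def __post_init__(self):
        # Runs after the generated __init__; fills in the missing value
        if self.b is None:
            self.b = self.a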

The code works at runtime, but now b is annotated as optional. Static analysis can flag harmless code like Coord(1, 2).b + 0 as if it could be operating on None, producing a false positive warning. One way around this is to split the public field and the constructor-only input into separate attributes, but that feels unnecessarily verbose.

Why it happens

There is no built-in @dataclass feature to say “this field is optional for __init__ only, but non-optional on the instance.” Using __post_init__ lets you tweak values after the generated constructor runs, yet it does not change the declared type of the field, so tooling will continue to treat it as optional.

The practical fix

The simplest and most predictable solution is to keep using @dataclass, but provide your own __init__. The decorator will not generate one if you define it, and you still get the rest of dataclasses’ conveniences.

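from dataclasses import dataclass

@dataclass
class Coord:
    a: int
    b: int  # declared non-optional, so instances always expose an int

    # A hand-written __init__ is kept; @dataclass does not replace it
    def __init__(self, a: int, b: int | None = None):
        self.a = a
        self.b = self.a if b is None else b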

This retains the desired construction behavior, avoids an optional type on the instance, and preserves the other benefits of @dataclass: the auto-generated __eq__ and __repr__, plus opt-in features such as rich comparisons (order=True), hashing, and automatic slot generation (slots=True). No special flags are needed; the decorator leaves an __init__ defined in the class body untouched.

Why this is worth knowing

It is easy to assume __post_init__ can handle all constructor-shaping needs, but it is meant for small adjustments when the generated __init__ mostly fits. When the constructor’s signature or semantics need to diverge, writing __init__ directly is both minimal and clear, while keeping the rest of the dataclass machinery intact.

Writing __init__ by hand is easy to overlook, yet it is a perfectly workable solution: the class still has advantages over a raw class, such as the generated comparisons and string representation mentioned above.

Conclusion

If a field should behave as non-optional on the instance but be optional at construction time, don’t force it through __post_init__. Define __init__ yourself inside an @dataclass and let the decorator handle everything else. You get precise typing, no false positives from optional annotations, and all the ergonomic advantages that dataclasses provide.

The article is based on a question from StackOverflow by Dominik Kaszewski and an answer by ShadowRanger.