https://pytroubles.com/en/posts/id2349-python-main-vs-import-fixing-class-identity-and-dataclass-equality-breaks-duplicate-modules

Python __main__ vs import: fixing class identity and dataclass equality breaks, duplicate modules

Why Python loads the same module twice (__main__ vs A) and how it breaks class and dataclass equality

Understand Python’s __main__ vs import pitfall, which loads the same module twice and causes class identity mismatches and broken dataclass equality. Learn fixes using entrypoints.

2025-12-08T17:00:11+03:00

When the same class appears to come from two different modules, equality breaks in surprising ways. This shows up with dataclasses and with plain classes alike: you compare two instances that should match, but they don’t, because their types are two distinct class objects that happen to share a name and source code. The root cause is not in dataclasses, but in how Python treats the entrypoint module versus an imported module.

Reproducing the issue

Consider two files. The first one is used as the entrypoint and also defines the class. The second imports the first and constructs an instance.

module A:

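import B


class RecordBox:
    pass


if __name__ == '__main__':
    # Runs only when A.py is executed directly, i.e. as the module __main__.
    first = RecordBox()       # instance of __main__.RecordBox
    second = B.build_obj()    # instance of A.RecordBox, created inside B
    print(type(first))
    print(type(second))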

module B:

import A


def build_obj():
    # Looks up RecordBox on the module that was imported under the name A.
    return A.RecordBox()

Running A.py prints two different origins for what should be the same class, for example:

<class '__main__.RecordBox'>
<class 'A.RecordBox'>
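The same layout can be tried with a dataclass to see the equality failure directly. The following is a minimal sketch, not part of the original example: it assumes a `value` field on RecordBox, writes A.py and B.py into a temporary directory, and runs A.py as a script in a child interpreter. The helper name `run_as_script` is made up for illustration.

```python
import subprocess
import sys
import tempfile
import textwrap
from pathlib import Path

# Dataclass variant of A.py; the `value` field is an assumption for this demo.
A_SOURCE = textwrap.dedent("""\
    from dataclasses import dataclass

    import B


    @dataclass
    class RecordBox:
        value: int


    if __name__ == "__main__":
        first = RecordBox(1)      # instance of __main__.RecordBox
        second = B.build_obj()    # instance of A.RecordBox
        print(first == second)
        print(isinstance(second, RecordBox))
""")

B_SOURCE = textwrap.dedent("""\
    import A


    def build_obj():
        return A.RecordBox(1)
""")


def run_as_script():
    """Write both files to a temp directory and execute A.py directly."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "A.py").write_text(A_SOURCE)
        Path(tmp, "B.py").write_text(B_SOURCE)
        completed = subprocess.run(
            [sys.executable, str(Path(tmp, "A.py"))],
            capture_output=True,
            text=True,
            check=True,
        )
    return completed.stdout.split()


output = run_as_script()
print(output)  # ['False', 'False']
```

Both instances hold the same field value, yet the comparison and the isinstance check fail, because one instance belongs to the class defined in __main__ and the other to the class defined in A.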

What’s really happening

The core of the problem is that A.py is executed as the entrypoint and is therefore the module named __main__, not the module named A.

When you start the program by executing A.py directly, Python runs that file as the module named __main__ and registers it in the import cache (sys.modules) under that name. Later, when B imports A, Python looks for a module named A. Nothing is cached under that name, so the import system finds A.py on disk and executes it a second time as a separate module object. As a result, there are two distinct module objects in memory: one bound to __main__ and another bound to A. Each defines its own RecordBox class, and these class objects are different. A dataclass-generated __eq__ only compares fields when both operands have exactly the same class, so instances created from the two copies never compare equal, isinstance checks across the copies fail, and the type() representations reveal the mismatch.
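The effect of executing one source file under two module names can be simulated in a single process. This is only an illustrative sketch: it uses `types.ModuleType` and `exec` to stand in for the two loads rather than going through the real import system, and the helper name `execute_as` is hypothetical.

```python
import types

# The class definition from A.py, reduced to the part that matters here.
SOURCE = "class RecordBox:\n    pass\n"


def execute_as(module_name):
    """Run SOURCE in a fresh module object with the given __name__."""
    module = types.ModuleType(module_name)
    exec(compile(SOURCE, "A.py", "exec"), module.__dict__)
    return module


as_main = execute_as("__main__")  # stands in for `python A.py`
as_a = execute_as("A")            # stands in for `import A` inside B

print(as_main.RecordBox)                                # <class '__main__.RecordBox'>
print(as_a.RecordBox)                                   # <class 'A.RecordBox'>
print(as_main.RecordBox is as_a.RecordBox)              # False
print(isinstance(as_a.RecordBox(), as_main.RecordBox))  # False
```

Each execution of the class statement creates a new class object, so the two results are unrelated types even though they come from identical source text.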

The layout is also a circular import (A imports B, and B imports A), which adds to the confusion around type identity and equality. The duplicate load is expected behavior stemming from the import mechanism, not a dataclass-specific bug.

How to fix it

The fix is to avoid executing the file that defines the class as the entrypoint. Instead, structure it so it can be imported first, and only then invoke the runtime logic. One simple way is to move the runtime code into a function and call that function after importing the module.

Revised A.py:

import B


class RecordBox:
    pass


def bootstrap():
    # Runtime logic lives in a function so it can be called after import.
    x = RecordBox()
    y = B.build_obj()
    print(type(x))
    print(type(y))


if __name__ == '__main__':
    bootstrap()

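Now import the module and call the function, so that the file is only ever loaded under the name A. Note that running python A.py directly would still execute the file as __main__ and reproduce the mismatch; the __main__ guard by itself does not prevent the second load.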

python -c "import A; A.bootstrap()"

This yields the expected, matching types:

<class 'A.RecordBox'>
<class 'A.RecordBox'>

A practical refinement is to use a separate entrypoint script that imports your library-like module and calls its bootstrap function, or to place the class in a dedicated module and keep execution logic elsewhere. The key is to ensure the class is defined only once under a single module name before any code instantiates it.
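One way to picture the dedicated-module option is a three-file layout. The sketch below is an assumption-laden illustration, not the article's own code: the file names models.py and main.py and the helper `run_layout` are hypothetical, and B.py is adjusted to import the class from models instead of A. The files are written to a temporary directory and main.py is run as the entrypoint.

```python
import subprocess
import sys
import tempfile
import textwrap
from pathlib import Path

FILES = {
    # The class lives in a module that is never run as a script.
    "models.py": "class RecordBox:\n    pass\n",
    # B now depends on models rather than on the entrypoint file.
    "B.py": textwrap.dedent("""\
        import models


        def build_obj():
            return models.RecordBox()
    """),
    # The entrypoint only wires things together and defines no classes.
    "main.py": textwrap.dedent("""\
        import B
        from models import RecordBox

        if __name__ == "__main__":
            first = RecordBox()
            second = B.build_obj()
            print(type(first))
            print(type(second))
            print(type(first) is type(second))
    """),
}


def run_layout(files, entrypoint):
    """Write the files to a temp directory and run the chosen entrypoint."""
    with tempfile.TemporaryDirectory() as tmp:
        for name, source in files.items():
            Path(tmp, name).write_text(source)
        completed = subprocess.run(
            [sys.executable, str(Path(tmp, entrypoint))],
            capture_output=True,
            text=True,
            check=True,
        )
    return completed.stdout.splitlines()


lines = run_layout(FILES, "main.py")
print("\n".join(lines))
# <class 'models.RecordBox'>
# <class 'models.RecordBox'>
# True
```

Because main.py is the only file executed as __main__ and it defines no classes, RecordBox is created once, under the name models, no matter which module asks for it. The circular import between A and B also disappears in this layout.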

Why this matters

Class identity underpins equality checks, isinstance tests, and dataclass comparison semantics. If the same source file is loaded twice under different names, you effectively get two different classes with the same code. That undermines comparisons, hides subtle bugs, and makes behavior depend on how the program is launched. Keeping a single import path for a module means each class is created once, so types remain identical and cross-module interactions behave predictably.
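When a mismatch like this is suspected, the import cache can be inspected for one file that backs more than one module object. The helper below is a hypothetical diagnostic sketch, not a standard library feature: `find_modules_loaded_twice` is an invented name, and it relies only on `sys.modules` and each module's `__file__` attribute.

```python
import os
import sys
from collections import defaultdict


def find_modules_loaded_twice(modules=None):
    """Return {file path: [module names]} for files loaded as 2+ module objects."""
    modules = sys.modules if modules is None else modules
    objects_by_file = defaultdict(dict)
    for name, module in list(modules.items()):
        path = getattr(module, "__file__", None)
        if isinstance(path, str):
            # Key by object identity so aliases of one module are not reported.
            objects_by_file[os.path.realpath(path)].setdefault(id(module), name)
    return {
        path: sorted(named.values())
        for path, named in objects_by_file.items()
        if len(named) > 1
    }


if __name__ == "__main__":
    # In the A.py/B.py example, calling this after the imports would be
    # expected to list A.py with the names 'A' and '__main__'.
    print(find_modules_loaded_twice())
```

Grouping by module object rather than by name avoids false positives for modules that are deliberately registered under several names, which is an assumption about how aliases usually appear in the cache.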

Takeaways

Treat files that define classes and reusable logic as importable modules, not as entrypoints. Ensure they are imported exactly once under a consistent name, and invoke runtime behavior only after import. This avoids duplicate module loads, prevents class identity mismatches, and keeps equality comparisons reliable.
