2025, Oct 21 08:00

Python testing without surprises: avoid monkeypatching builtins.open, use pytest tmp_path or inject an opener

Learn why monkeypatching builtins.open triggers pdb TypeError on encoding, and how to test file I/O using pytest tmp_path or by injecting an open function.

Monkeypatching builtins like open looks tempting when a function reads a file, but it often backfires during debugging. A common symptom is a TypeError about an unexpected keyword argument encoding when pdb kicks in. Here’s what’s going on, and how to test file I/O without tripping over the debugger.

The minimal repro

The function reads a file and counts unique alphabetic characters. The test replaces builtins.open with a lambda returning an in-memory stream. As soon as pdb is involved, the test explodes with a TypeError.

import builtins
import io
import pytest


def tally_unique_alpha_from_file(p: str) -> int:
    with open(p) as fh:
        return len(set(ch.lower() for ch in fh.read() if ch.isalpha()))


def test_counts(monkeypatch: pytest.MonkeyPatch):
    src = "Quick frog or something"
    monkeypatch.setattr(builtins, "open", lambda _p: io.StringIO(src))
    assert tally_unique_alpha_from_file("blah.txt") == 15

Why this breaks

The lambda replacement only accepts a single positional argument. The real open supports more parameters, including encoding. When you step into pdb (for example via pytest --pdb or pdb.set_trace()), pdb itself calls open to read its rc file as in ~/.pdbrc with an explicit encoding. That call lands in the lambda, which doesn’t accept encoding, and Python raises TypeError. In short, the test replaced a global primitive used by the debugger, so unrelated tooling started failing.

A safer approach: use a real temporary file

You don’t need to mock or monkeypatch anything to test code that reads a file. Pytest provides tmp_path for working with temporary directories. Create a real file, write the content you need, and pass its path to the function. This keeps the debugger and the rest of the runtime intact.

import pathlib


def tally_unique_alpha_from_file(p: str) -> int:
    with open(p) as fh:
        return len(set(ch.lower() for ch in fh.read() if ch.isalpha()))


def test_counts(tmp_path: pathlib.Path):
    fpath = tmp_path / "blah.txt"
    fpath.write_text("Quick frog or something")
    assert tally_unique_alpha_from_file(str(fpath)) == 15

Pytest keeps directories from the last three test runs by default before it starts cleaning them up, and if a test fails you’ll see the fixture values in the output, which makes debugging straightforward. If you prefer standard library tools, an integration test with a temporary file is also viable.

Alternative: inject the opener

If you want to control how files are opened without touching builtins, invert the dependency and accept an opener function. The function still uses open by default, but tests can supply a different callable. This makes the contract explicit and avoids global side effects.

from collections.abc import Callable
from io import StringIO
from typing import TextIO
from unittest.mock import Mock


def tally_unique_alpha_from_file(
    p: str, *, open_fn: Callable[[str], TextIO] = open
) -> int:
    with open_fn(p) as fh:
        return len(set(ch.lower() for ch in fh.read() if ch.isalpha()))


def test_counts_with_mock():
    fake_open = Mock(spec=open)
    fake_open.return_value = StringIO("Quick frog or something")
    assert tally_unique_alpha_from_file("blah.txt", open_fn=fake_open) == 15

This model inverts the dependency and also segregates the interface by stating exactly how the function intends to use the opener. No monkeypatching, no surprises for pdb or other tooling.

Why this matters

Mocking what you don’t own, especially builtins, can break unrelated parts of the runtime such as debuggers, test harnesses, or libraries that rely on standard behavior. Tests that use real files via temporary paths are more robust and easier to reason about. One caveat is that temporary files won’t tell you how many times the file was opened, which can be relevant for verifying caching, but they do make most I/O tests simpler and more reliable.

Takeaways

If file content is all you need, prefer real temporary files through pytest’s tmp_path or an actual temporary file. If you need to observe or steer how a file is opened, inject an opener instead of altering builtins. Both approaches let you debug with pdb without colliding with the encoding-aware calls it makes internally.

The article is based on a question from StackOverflow by Dominik Kaszewski and an answer by jonrsharpe.

debugging monkeypatching pytest python