2025, Nov 16 21:00

Speed Up Large Pytest Runs: Use pytest_collection_finish to Disable Costly DataFrame Logging When Over 100 Tests Are Selected

Speed up pytest by disabling verbose DataFrame logging based on finalized test selection via pytest_collection_finish. Prevent slowdowns in large test runs.

Large test suites often accumulate helpful but heavy diagnostics. A classic example is verbose DataFrame logging: it’s great when you’re chasing a flaky case or drilling into one or two tests, yet it becomes a bottleneck when hundreds of tests run. If that diagnostic stream eats 30% of the wall clock, you want a simple switch that disables it automatically whenever a run includes more than, say, 100 tests.

Problem setup

The idea is straightforward: determine how many tests are about to run, and if that number exceeds a threshold, set an environment variable and make the logging function a no-op. An initial attempt wires this into pytest’s collection phase via a hook, and the logging function checks the flag.

import os

from pandas import DataFrame  # assuming pandas is the DataFrame in question


# First attempt: a hook implementation in conftest.py that counts the collected items.
def pytest_collection_modifyitems(config, items):
    total_tests = len(items)
    if total_tests > 100:
        print(f'Running {total_tests} tests (more than 100 tests), disabling dataframe logging.')
        os.environ['RUNNING_MORE_THAN_100_TESTS'] = '1'
    else:
        print(f'Running {total_tests} tests.')


# The logging helper checks the flag and skips the expensive dump when it is set.
def dump_dataframe(tbl: DataFrame, sink):
    if os.environ.get('RUNNING_MORE_THAN_100_TESTS'):
        sink.info('RUNNING_MORE_THAN_100_TESTS is set. dump_dataframe() skipped.')
        return
    # log the DataFrame as usual (illustrative):
    sink.info(tbl.to_string())

However, invoking pytest with a selection expression shows a mismatch between what the hook reports and what will actually execute. The output reveals that hundreds of items are seen, but most are then deselected, leaving a much smaller active set. The costly logging still gets disabled as if all of them were going to run.
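For example, a keyword-filtered run might produce output along these lines (the keyword, the counts, and the exact formatting are illustrative and vary by pytest version):

$ pytest -k smoke
Running 512 tests (more than 100 tests), disabling dataframe logging.
collected 512 items / 498 deselected / 14 selected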

What’s really happening

The hook above executes before filtering and deselection have been applied. pytest performs -k and -m deselection inside its own implementation of the same hook, which is arranged to run after conftest implementations, so the hook above typically sees the full list of discovered tests rather than the narrowed selection. As a result, the decision uses the pre-filter count, which can be much larger than the number of items that will actually run.

The right hook for the job

The fix is to move the decision to the point where collection has finished and selection has been applied. pytest exposes pytest_collection_finish, a hook that runs after collection is complete and session.items holds the finalized list of tests to execute. Basing the count on that list guarantees it reflects the tests that will actually run.

import os


# conftest.py -- decide after deselection, when session.items is final.
def pytest_collection_finish(session):
    selected_total = len(session.items)
    if selected_total > 100:
        print(f'Running {selected_total} tests (more than 100 tests), disabling dataframe logging.')
        os.environ['RUNNING_MORE_THAN_100_TESTS'] = '1'
    else:
        print(f'Running {selected_total} tests.')

With this change, the environment variable is set only when the finalized selection exceeds the threshold. The DataFrame logger’s check stays the same and correctly becomes a no-op only in those larger runs.
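Calling code does not need to know about any of this. As a minimal sketch, a test might use the helper like so; the import path, test data, and logger choice are hypothetical stand-ins for whatever your suite uses:

import logging

import pandas as pd

from myproject.debug_utils import dump_dataframe  # hypothetical import path


def test_summary_has_expected_columns():
    frame = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})
    # In runs with more than 100 selected tests, this call returns immediately.
    dump_dataframe(frame, logging.getLogger(__name__))
    assert list(frame.columns) == ['a', 'b']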

Why this detail matters

On big suites, diagnostics that are negligible in isolation can dominate total runtime. Tying toggles to the correct lifecycle event prevents accidental slowdowns. In this case, the difference between “all discovered tests” and “tests that will run” is crucial when selectors and deselection are in play. The correct hook ensures the decision aligns with what’s actually executed.

Takeaways

When you need to condition behavior on the size of the active test set, base your logic on the finalized items rather than the initially discovered pool. Using the post-collection hook keeps expensive features enabled for focused debugging runs and automatically dials them down when the suite grows beyond your threshold. The end result is faster feedback cycles without sacrificing the depth of diagnostics when you truly need them.