2025, Nov 29 07:00

Accurate Python Tracing with sys.settrace: Ignore exec Wrapper Frames by Using Stack Depth

Learn why filename filters fail when debugging Python with sys.settrace under exec, and how using call stack depth isolates real user code from wrapper noise.

Building a Python debugger on top of sys.settrace is a powerful way to observe execution, but there’s a subtle trap when you load user code via exec. The trace hook starts firing for both the compiled user file and the wrapper that prepares and executes it. If you only care about what actually runs inside the input file’s logic, the mixed stream of events becomes noise.

Reproducing the issue

Consider a minimal runner that compiles a .py file and executes it with a trace hook. The execution events come from two places: the outer context that calls exec and the user code you’re trying to inspect.

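import sys

# pack_source, SRC_PATH and trace_cb are defined elsewhere in the runner;
# pack_source is assumed to return the file's source as text.
src_text = pack_source(SRC_PATH)
compiled_mod = compile(src_text, SRC_PATH, 'exec')

sys.settrace(trace_cb)
try:
    exec(compiled_mod)
finally:
    sys.settrace(None)  # always remove the hook, even if the user code raises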

A straightforward attempt to filter by filename looks reasonable at first glance, because the compiled object carries the input path in its metadata. However, this does not separate the wrapper from the actual code inside the file.

if frm.f_code.co_filename != SRC_PATH:
    return None

The snag is that frm.f_code.co_filename equals the input path even for the top-level frame that exec creates to run the file. That means the filename check passes for both layers, so the tracer still reports wrapper activity as if it were user logic.
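The overlap is easy to observe. The sketch below is a minimal, self-contained reproduction; the inline source string, the path /tmp/user_script.py, and the names seen and filename_filter are illustrative assumptions, not part of the runner above. It records the code-object name of every frame that passes the filename check.

```python
import sys

# Hypothetical stand-ins for the real input file and its path.
SRC_PATH = "/tmp/user_script.py"
SRC_TEXT = "def greet():\n    return 'hi'\n\ngreet()\n"

seen = []  # co_name of every frame that survives the filename filter

def filename_filter(frame, event, arg):
    # Same check as above: drop anything not compiled from SRC_PATH.
    if frame.f_code.co_filename != SRC_PATH:
        return None
    if event == "call":
        seen.append(frame.f_code.co_name)
    return filename_filter

compiled_mod = compile(SRC_TEXT, SRC_PATH, "exec")
sys.settrace(filename_filter)
try:
    exec(compiled_mod, {})
finally:
    sys.settrace(None)

print(seen)  # ['<module>', 'greet']
```

The top-level <module> frame shows up next to greet, so the filter cannot tell the entry frame apart from the functions defined in the file.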

Why it happens

When you compile and exec the input, exec runs the compiled code object in a new top-level frame named <module>, and that frame’s co_filename is whatever path you passed to compile, here SRC_PATH. This entry frame is shallow in terms of the interpreter stack, but it reports the same filename as every function defined in the file. Because the filename is identical across the entry point and the code inside the file, it isn’t a reliable discriminator for your use case.

The practical fix: use stack depth

Instead of relying on filenames, look at the current interpreter stack depth. The wrapper path is shallow, while calls that descend into user-defined functions are deeper. Filtering by depth cleanly ignores the outer exec context and starts tracing only when execution enters real user logic.

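def trace_gate(frm, evt, payload=None):
    # Count the frames from the current one back to the bottom of the stack.
    lvl = 0
    ptr = frm
    while ptr is not None:
        lvl += 1
        ptr = ptr.f_back
    if lvl > 3:  # deeper than the wrapper layers: treat as user code
        # ... inspect/log as needed ...
        return trace_gate  # keep receiving line/return events for this frame
    return None  # shallow frame: skip it; new frames still reach the hook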

This keeps the trace hook dormant for the shallow wrapper frames and activates it once the call stack crosses the chosen threshold.
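The gate works because returning None from a 'call' event only declines to trace that one frame; the hook installed with sys.settrace is still consulted for every frame created afterwards. The following is a minimal sketch of the whole flow under a few assumptions: the source string, the path, and the names frame_depth and run_gated are hypothetical, and the cutoff is derived from the runner’s own depth instead of being hard-coded, so the result does not depend on how the script is launched. It also relies on sys._getframe, which is not guaranteed to exist in every Python implementation.

```python
import sys

# Hypothetical stand-ins for the real input file and its path.
SRC_PATH = "/tmp/user_script.py"
SRC_TEXT = "x = 1\ndef work():\n    return x + 1\nwork()\n"

def frame_depth(frame):
    """Count frames from `frame` back to the bottom of the stack."""
    depth = 0
    while frame is not None:
        depth += 1
        frame = frame.f_back
    return depth

def run_gated(code):
    """Run `code` under a depth gate; return the (co_name, event) pairs seen."""
    # The <module> frame created by exec sits one level deeper than this
    # function's frame, so anything deeper than that is a call made by user code.
    cutoff = frame_depth(sys._getframe()) + 1
    events = []

    def gate(frame, event, arg):
        if frame_depth(frame) > cutoff:
            events.append((frame.f_code.co_name, event))
            return gate   # keep tracing this frame
        return None       # skip this frame only; later calls still reach gate

    sys.settrace(gate)
    try:
        exec(code, {})
    finally:
        sys.settrace(None)
    return events

events = run_gated(compile(SRC_TEXT, SRC_PATH, "exec"))
print(events)
```

Every recorded event belongs to work; the module-level frame is skipped even though it shares the same filename.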

How to choose the cutoff

The threshold of 3 works as a practical baseline because a typical flow has these layers: the main debug script, the exec entry point (the runner function that calls exec; the built-in exec itself adds no Python frame of its own), the compiled <module>, and then the user’s functions. Module-level code therefore sits at depth 3, and once you cross into those functions the depth grows beyond that baseline. If you add more wrappers, for example by moving the logic into a class or importing through an additional file, the depth increases accordingly; if the runner calls exec directly at its own top level, it drops by one. In other words, the exact number depends on how many layers sit between your runner and the user code.
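Rather than guessing, the depths can be measured once for a given runner. The probe below is a sketch under assumptions: the sample source, the path, and the names frame_depth and probe_depths are hypothetical. It records how deep each new frame sits relative to the function that calls exec; adding the runner’s own absolute depth to those offsets gives the number to compare against in a gate like trace_gate.

```python
import sys

# Hypothetical stand-ins for the real input file and its path.
SRC_PATH = "/tmp/user_script.py"
SRC_TEXT = (
    "def inner():\n"
    "    pass\n"
    "def outer():\n"
    "    inner()\n"
    "outer()\n"
)

def frame_depth(frame):
    """Count frames from `frame` back to the bottom of the stack."""
    depth = 0
    while frame is not None:
        depth += 1
        frame = frame.f_back
    return depth

def probe_depths(code):
    """Map each called code object's name to its depth relative to this frame."""
    base = frame_depth(sys._getframe())
    depths = {}

    def probe(frame, event, arg):
        if event == "call":
            depths[frame.f_code.co_name] = frame_depth(frame) - base
        return None  # per-frame line events are not needed for this measurement

    sys.settrace(probe)
    try:
        exec(code, {})
    finally:
        sys.settrace(None)
    return depths

depths = probe_depths(compile(SRC_TEXT, SRC_PATH, "exec"))
print(depths)  # {'<module>': 1, 'outer': 2, 'inner': 3}
```

The <module> frame is always one level below the function that calls exec, so the cutoff that excludes it is that function’s depth plus one.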

Why this matters

Precision in tracing is the difference between actionable logs and noise. If wrapper frames are mixed into your event stream, you’ll misattribute calls, inflate line events with non-user activity, and complicate any downstream analysis. Using depth to separate concerns gives you a stable boundary between orchestration and behavior.

Takeaways

If a filename check can’t distinguish between the exec wrapper and the input file’s logic, fall back to the interpreter’s call stack depth. Count frames via f_back, ignore shallow events, and start tracing only after the stack gets deeper than your observed baseline. The exact cutoff is empirical and reflects your wrapper layers, so re-measure it whenever those layers change; the approach itself carries over as your loader evolves.