2025, Sep 28 01:00

Reading lines from a Python pipe: clean EOF detection with readline, no select, no hangs

Learn how to read lines from a Python pipe and detect end-of-stream reliably: use readline EOF instead of select. Avoid blocked threads and shut down readers cleanly.

Reading lines from a pipe and knowing when the writer has shut down seems straightforward, yet it’s easy to overengineer. A common trap is to mix low-level readiness checks with high-level file-like I/O and end up missing the clean end-of-stream signal. The good news: the file interface already gives you the answer, and it’s simpler than it looks.

Problem setup

The goal is to read text line-by-line from a pipe while another part of the program writes small fragments to it. The tricky part is detecting when the writing end is closed so the read loop can exit cleanly. In the example below, the read loop never terminates and the join blocks until a manual interrupt.

from datetime import datetime
from itertools import batched  # requires Python 3.12+
import os
from select import select
from threading import Thread
from time import sleep
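# In UTF-8, ¹ ² ³ take 2 bytes each and the other superscript digits take 3,
# so the 4-byte writes below can split a character across two writes.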
sample_txt = '⁰¹²\n³\n⁴\n⁵⁶\n⁷⁸⁹⁰¹²\n³'
sample_buf = bytes(sample_txt, 'utf8')
r_fd, w_fd = os.pipe()
bin_out = open(w_fd, 'wb', buffering=0)
txt_in = open(r_fd, 'r', encoding='utf8')  # decode explicitly; the default depends on the locale
collected = []
def pump_reader():
    t_prev = datetime.now()
    while True:
        sleep(1)
        t_now = datetime.now()
        print('A', (t_now - t_prev).total_seconds())
        t_prev = t_now
        r_ready, w_ready, e_ready = select([txt_in], [txt_in], [txt_in], 0)
        if txt_in.closed:
            break
        if e_ready:
            break
        if not r_ready:
            continue
        piece = txt_in.readline()
        t_now = datetime.now()  # re-sample so 'B' shows how long readline blocked
        print('B', (t_now - t_prev).total_seconds())
        t_prev = t_now
        if piece:
            print('got chunk', repr(piece))
            collected.append(piece)
worker = Thread(target=pump_reader)
worker.start()
for segment in batched(sample_buf, 4):
    payload = bytes(segment)
    sleep(1.6)
    bin_out.write(payload)
bin_out.close()
worker.join()
print(repr(collected))

The observed behavior: all expected lines arrive, including the last one without a trailing newline, but the reader loop never exits and the thread remains alive.

What’s really going on

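When the writing end of the pipe is closed and no data remains, readline returns an empty string. For text-mode file objects that empty string is the end-of-file signal: a blank line in the data still comes back as "\n", so "" can only mean the stream has ended. Once the final partial line is delivered, every subsequent readline call yields "". The loop above never breaks on that value; the if piece: test merely skips the append. The other exit conditions do not help either. txt_in.closed reports whether the reader’s own file object has been closed, not whether the writer has gone away, and, as the observed behavior shows, select does not report the closed writer through its exceptional-conditions list. So the loop keeps spinning and never terminates.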

The clean signal is already available at the file-like layer. There’s no need to combine select with closed flags or exception lists in this scenario. The key is to treat an empty string from readline as the end of the stream.
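
The end-of-file contract can be checked in isolation. The following is a minimal sketch rather than the original program: it writes a small payload to a pipe in one go (small enough to fit in the pipe buffer, so no thread is needed), closes the write end, and then calls readline more often than there are lines. The payload and the name results are illustrative.

```python
import os

r_fd, w_fd = os.pipe()

# Write everything up front; the payload is tiny, so it fits in the pipe buffer.
with open(w_fd, 'wb', buffering=0) as bin_out:
    bin_out.write('first\nsecond\nlast without newline'.encode('utf8'))
# Leaving the with-block closed the only write end of the pipe.

results = []
with open(r_fd, 'r', encoding='utf8') as txt_in:
    # Call readline more times than there are lines to see what EOF looks like.
    for _ in range(5):
        results.append(txt_in.readline())

print(results)
# ['first\n', 'second\n', 'last without newline', '', '']
```

The final partial line arrives without a newline, and every call after it returns the empty string.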

The fix

Use the truthiness of the value returned by readline. As soon as it returns an empty string, the loop naturally ends. This results in a compact and correct read loop.

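def pump_reader():
    # readline returns '' only at end-of-stream, which ends the loop.
    # The := operator requires Python 3.8+.
    while line := txt_in.readline():
        print('got chunk', repr(line))
        collected.append(line)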

This form makes the EOF contract explicit and reliable. An equivalent approach is to iterate over the file object directly, which yields lines until the stream ends.
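
As a sketch of the iteration variant, here is a self-contained version of the example with the select logic removed. It reuses the sample text from above; the shortened sleep, the slicing in place of batched, and the join timeout are assumptions made to keep the sketch quick to run and to guard against a hang, not part of the original program.

```python
import os
from threading import Thread
from time import sleep

sample_txt = '⁰¹²\n³\n⁴\n⁵⁶\n⁷⁸⁹⁰¹²\n³'
sample_buf = sample_txt.encode('utf8')

r_fd, w_fd = os.pipe()
bin_out = open(w_fd, 'wb', buffering=0)
txt_in = open(r_fd, 'r', encoding='utf8')
collected = []

def pump_reader():
    # Iterating over the file yields lines and stops at end-of-stream.
    for line in txt_in:
        collected.append(line)

worker = Thread(target=pump_reader)
worker.start()

# Write 4-byte fragments; some of them split a multi-byte character.
for start in range(0, len(sample_buf), 4):
    bin_out.write(sample_buf[start:start + 4])
    sleep(0.05)

bin_out.close()         # closing the only write end produces EOF for the reader
worker.join(timeout=5)  # the timeout is only a safety net for this sketch
txt_in.close()

print(repr(collected))
print('reader still alive:', worker.is_alive())
```

The reader thread finishes on its own once the writer closes, and the last line is delivered even though it has no trailing newline.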

Why this matters

When working with subprocess pipes or any producer-consumer pipeline, a robust shutdown path is as important as steady-state throughput. Relying on the file API’s EOF behavior keeps the code simple and avoids subtle hangs where threads wait forever and joins block until forcibly interrupted. If you already read text via readline, let it communicate the end-of-stream for you.
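
The same loop applies to a child process’s output. The sketch below assumes a hypothetical child (a one-line Python script started via sys.executable) that writes two lines and a final fragment without a newline, then exits. When the child exits, its end of the pipe is closed and readline returns the empty string.

```python
import subprocess
import sys

# Hypothetical child: two complete lines, then a fragment with no newline.
child_code = "import sys; sys.stdout.write('one\\ntwo\\nthree'); sys.stdout.flush()"

proc = subprocess.Popen(
    [sys.executable, '-c', child_code],
    stdout=subprocess.PIPE,
    text=True,  # wrap stdout in a text-mode file object
)

lines = []
# The loop ends when the child has closed its stdout and all data is consumed.
while line := proc.stdout.readline():
    lines.append(line)

proc.stdout.close()
exit_code = proc.wait()
print(lines, exit_code)
```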

Takeaways

Don’t fight the file interface. If the reader uses readline, treat an empty string as the end, and the loop will exit without auxiliary checks or timeouts. This minimizes complexity, reduces edge cases around partial lines, and keeps your thread lifecycle predictable.

The article is based on a question from StackOverflow by Steve Jorgensen and an answer by 0ro2.