2025, Nov 21 15:00
Fix Python file I/O cursor bugs: stop re-seeking, normalize newlines, and count lines correctly
Learn how to fix Python file I/O cursor bugs: avoid seek(0) loops, handle CRLF vs LF newlines, use read-until logic, and get accurate line counts with EOFError.
Building a small helper over Python file I/O is a common way to learn about cursors, reads, and line boundaries. But there’s a subtle pitfall: if you re-seek to the beginning on every operation, you’ll keep rereading the same content and inflate your line counts. Below is a minimal reproduction of the problem and a fix that doesn’t introduce external modules or change the core idea.
Reproducing the issue
The example below writes two lines into a file and then tries to iterate line by line until it hits EOFError. The class keeps two file handles, one for writing and one for reading, and maintains a shared cursor.
def run():
f = QuickIO("numbers.txt")
f.put("1, -2, 5, 0, 19, -7, end\n5, 5, -1, -10, 9, end")
rows = 0
while(True):
try:
print("Line #"+str(rows+1))
f.jump_to_line(rows)
except EOFError:
break
rows += 1
print("The amount of lines are: "+str(rows + 1))
class QuickIO:
# members: path, out, inp, pos
def __init__(self, path):
self.path = path
self.pos = 0
self.out = open(path, "w")
self.inp = open(path, "r")
self.out.seek(self.pos)
self.inp.seek(self.pos)
def jump_to(self, char_idx):
self.pos = char_idx
self.out.seek(char_idx)
self.inp.seek(char_idx)
tmp = open(self.path, "r")
tmp.seek(char_idx)
if tmp.read(1) == "":
tmp.seek(char_idx - 1)
if tmp.read(1) == "":
tmp.close()
raise EOFError
tmp.close()
def jump_to_line(self, n):
self.jump_to(0)
for i in range(0, n):
print(repr(self.read_until()))
def put(self, s):
self.out.write(s)
self.out.flush()
self.jump_to(self.pos + len(s))
def read_until(self, token="\n"):
data = self.inp.read()
self.jump_to(self.pos)
end = 0
while len(data) > end:
if data[end:end+len(token)] == token:
break
if len(data[end:end+len(token)]) != len(token):
self.jump_to(self.pos + len(data))
return data
end += 1
self.jump_to(self.pos + end + len(token))
return data[0:end]
run()
What actually goes wrong
The repeated lines and wrong totals stem from resetting the file position. The method that is supposed to move to a specific line begins with a hard reset: jump_to(0). That means every iteration starts from the beginning and walks over the same content again. As a result, the loop never advances through the file the way you expect: the output repeats the first line while the counter drifts out of sync.
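You can reproduce the core mistake with nothing but a plain file object. This is a minimal sketch, assuming numbers.txt already holds the two lines written by the example above:

with open("numbers.txt", newline="\n") as fh:
    for _ in range(3):
        fh.seek(0)                    # the hard reset: back to the start every time
        print(repr(fh.readline()))    # prints the first line three times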
There’s also a platform-sensitive wrinkle: when you write text with newlines on Windows, the default text mode translates each \n into \r\n on disk, so the file is longer than the string you wrote and every offset computed from len(...) is off. If your logic assumes a plain \n delimiter and character-based positions, you can observe “skipped” lines or mismatched reads. Opening both reader and writer with newline="\n" disables the translation and keeps the on-disk content consistent with the delimiter you search for.
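If you want to see the translation happen, write the same text with and without newline="\n" and inspect the raw bytes. A small sketch; the file name newline_demo.txt is only illustrative:

# Default text mode: "\n" may become the platform line ending (\r\n on Windows).
with open("newline_demo.txt", "w") as fh:
    fh.write("a\nb\n")
with open("newline_demo.txt", "rb") as fh:
    print(fh.read())    # b'a\r\nb\r\n' on Windows, b'a\nb\n' elsewhere

# newline="\n" writes "\n" verbatim, so the bytes match the delimiter you search for.
with open("newline_demo.txt", "w", newline="\n") as fh:
    fh.write("a\nb\n")
with open("newline_demo.txt", "rb") as fh:
    print(fh.read())    # b'a\nb\n' on every platform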
Fixing the logic
The first change is to stop seeking to the start of the file on every line jump. Read lines relative to where the cursor currently is. The second change is to make read_until raise EOFError when nothing more can be read, advance the cursor when a delimiter is found, and otherwise move to the end. Finally, normalize line endings by setting newline="\n" on both the writer and the reader.
class QuickIO:
    # members: path, out, inp, pos
    def __init__(self, path):
        self.path = path
        self.pos = 0
        # newline="\n" turns off newline translation on both handles, so the
        # content on disk always matches the "\n" delimiter read_until scans for.
        self.out = open(path, "w", newline="\n")
        self.inp = open(path, "r", newline="\n")
        self.out.seek(self.pos)
        self.inp.seek(self.pos)
    def jump_to(self, char_idx):
        # Move the shared cursor and keep both handles in sync with it.
        self.pos = char_idx
        self.out.seek(char_idx)
        self.inp.seek(char_idx)
        # Peek with a temporary handle: jumping beyond the last character means EOF.
        tmp = open(self.path, "r")
        tmp.seek(char_idx)
        if tmp.read(1) == "":
            tmp.seek(char_idx - 1)
            if tmp.read(1) == "":
                tmp.close()
                raise EOFError
        tmp.close()
    def jump_to_line(self, n):
        # Skip n lines relative to the current cursor, then return the next one.
        for _ in range(n):
            self.read_until('\n')
        return self.read_until('\n')
    def put(self, s):
        self.out.write(s)
        self.out.flush()
        self.jump_to(self.pos + len(s))
    def read_until(self, token="\n"):
        data = self.inp.read()
        if data == "":
            raise EOFError          # nothing left after the current cursor
        self.jump_to(self.pos)      # read() moved inp to EOF; put it back
        end = 0
        while len(data) > end:
            if data[end:end+len(token)] == token:
                # Delimiter found: advance past it and return the segment before it.
                self.jump_to(self.pos + end + len(token))
                return data[0:end]
            end += 1
        # No delimiter left: consume the rest of the file.
        self.jump_to(self.pos + len(data))
        return data[0:end]
Drive it like this. Because put advances the shared cursor to the end of the text it just wrote, rewind once with jump_to(0) after writing, then ask for the “next” line on each iteration by passing 0:
def run():
    f = QuickIO("numbers.txt")
    f.put("1, -2, 5, 0, 19, -7, end\n5, 5, -1, -10, 9, end")
    f.jump_to(0)  # put left the cursor at the end of the written text; rewind once
    line_count = 0
    while True:
        try:
            print(f"Line #{line_count + 1}")
            print(repr(f.jump_to_line(0)))
        except EOFError:
            break
        line_count += 1
    print("The amount of lines are:", line_count)
run()
This yields the expected, non-duplicated result and the accurate line count.
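Assuming numbers.txt contains exactly the two lines written by put, the run should print something like:

Line #1
'1, -2, 5, 0, 19, -7, end'
Line #2
'5, 5, -1, -10, 9, end'
Line #3
The amount of lines are: 2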
Why this matters
File iteration correctness depends on one invariant: the read position must move forward deterministically. Any unconditional seek to the beginning in a per-line operation breaks that invariant and leads to rereads and inflated counters. Consistent newline handling is equally important; searching for \n while the file contains \r\n introduces off-by-one behavior and “missing” delimiters that are hard to spot.
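For comparison, here is what that invariant looks like when you lean on the standard file iterator instead of a hand-rolled cursor; a short sketch, separate from the QuickIO class:

count = 0
with open("numbers.txt", newline="\n") as fh:
    for _ in fh:        # iteration only moves forward; no line is visited twice
        count += 1
print("lines:", count)  # 2 for the example file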
Takeaways
Advance relative to the current cursor instead of re-seeking to zero, normalize newline behavior by opening both ends with newline="\n", and have your read-until logic either return the segment and advance past the delimiter or raise EOFError if nothing remains. If you later decide to streamline this further, you can also open the file once in r+ mode and rely on readline(), but the adjustments above are enough to make the current approach correct and predictable.
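For reference, here is a minimal sketch of that streamlined variant: a single handle opened in r+ mode, readline() for line boundaries, and EOFError when nothing remains. The method names mirror the driver above so it can be dropped in, but it is an illustrative rewrite, not the implementation shown earlier:

class QuickIO:
    def __init__(self, path):
        open(path, "w").close()                 # create/truncate so "r+" can open the file
        self.f = open(path, "r+", newline="\n")
    def jump_to(self, char_idx):
        self.f.seek(char_idx)                   # one shared cursor for reads and writes
    def put(self, s):
        self.f.write(s)
        self.f.flush()
    def jump_to_line(self, n):
        line = None
        for _ in range(n + 1):                  # skip n lines, keep the (n+1)-th
            line = self.f.readline()
            if line == "":                      # readline() returns "" only at end of file
                raise EOFError
        return line.rstrip("\n")

Running the same run() driver against this class produces the same two lines and the same count of 2.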