https://pytroubles.com/en/posts/id1350-fixing-byte-for-byte-binary-set-file-sync-in-python-update-in-memory-bytes-then-write-once

Fixing Byte-for-Byte Binary .SET File Sync in Python: Update In-Memory Bytes, Then Write Once

Byte-for-Byte Alignment of Binary .SET Files in Python: Update a Bytearray in Memory, Then Write and Truncate

Learn to sync binary .SET files byte-for-byte in Python: update in memory, then write once and truncate. Avoid in-loop writes and offset issues for stable sync

2025-10-28T11:00:08+03:00

2025-10-28T11:00:09+03:00

When you need to align the contents of two binary .SET files byte-for-byte, the instinct to patch differences in place is understandable. In this case, the goal was to copy differing bytes from a working GPS receiver’s SET file into the one from a receiver stuck in a boot loop. The initial Python approach ran without errors on Windows 11, but the target file didn’t change. The fix turned out to be about when and how the data is written.

Problem setup and failing example

The intent was to compare two files and write differences directly into the target file while iterating:

from itertools import zip_longest

counter = 0

with open(r"C:\Users\jimal\Desktop\APP2\mgnShell.set", "r+b") as bad_fp, open(r"E:\APP\mgnShell.set", "rb") as good_fp:
    bad_blob = bad_fp.read()
    bad_bytes = bytearray(bad_blob)
    good_blob = good_fp.read()
    good_bytes = bytearray(good_blob)
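    for b_bad, b_good in zip_longest(bad_bytes, good_bytes):
        counter += 1
        if b_bad != b_good:
            # index() finds the first byte with this value, not the current offset
            bad_fp.seek(bad_bytes.index(b_bad))
            if b_good == None:
                # bytes(0x00) is an empty bytes object, so nothing is written
                bad_fp.write(bytes(0x00))
            else:
                # bytes(n) is n zero bytes, not a single byte with value n
                bad_fp.write(bytes(b_good))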
            
print(f"Done! Count was {counter}")

What actually went wrong

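The core issue wasn’t OS permissions or folder attributes; it was the write strategy. Inside the loop, bad_bytes.index(b_bad) returns the position of the first byte with that value rather than the offset being compared, and bytes(n) called with an integer returns n zero bytes rather than a single byte of value n, so bytes(0x00) is empty and writes nothing. Each in-loop write therefore either does nothing or puts zero bytes at an offset that may not be the one being compared. The whole set of changes needs to be assembled first, and only then should the write happen. In other words, update the in-memory bytearray, and after the loop completes, write that finalized buffer back to disk.

There’s a second part to the fix: directly modifying the file while you iterate over the data complicates control over offsets and final size. Updating the bytearray first, then writing once, ensures the file receives a coherent, complete result.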
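
Two calls in the failing loop, bytes(...) with an integer argument and bytearray.index(...), are easy to misread. The following is a minimal sketch of their behavior in isolation; the byte values are made up for illustration and are not taken from the SET files.

```python
# Illustrative values only; not data from the actual .SET files.
data = bytearray(b"\x10\x20\x10\x30")

# bytes(n) with an integer builds n zero bytes, not one byte of value n.
empty = bytes(0x00)       # empty: writing it changes nothing
zeros = bytes(3)          # three zero bytes
single = bytes([0x41])    # one byte holding the intended value

# index() reports the first occurrence of a value, not the loop position.
first_hit = data.index(0x10)

# enumerate() carries the real offset alongside each byte.
offsets_of_0x10 = [i for i, b in enumerate(data) if b == 0x10]

print(empty, zeros, single, first_hit, offsets_of_0x10)
```

Wrapping the value in a list, or tracking the offset with enumerate(), avoids both pitfalls, which is part of why patching a bytearray by index is the simpler route.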

Working approach

The corrected flow reads both files, computes the differences, updates the target buffer in memory, and only after the loop writes the final buffer to the file and truncates it:

counter = 0

with open(r"C:\Users\jimal\Desktop\APP2\mgnShell.set", "rb+") as dst_fp, open(r"E:\APP\mgnShell.set", "rb") as src_fp:
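    dst_arr = bytearray(dst_fp.read())
    src_arr = src_fp.read()
    # Patch the overlapping region in memory; zip stops at the shorter input.
    for idx, (b_dst, b_src) in enumerate(zip(dst_arr, src_arr)):
        counter += 1
        if b_dst != b_src:
            dst_arr[idx] = b_src
    # Match the source length: drop extra target bytes, or append the
    # source tail when the source is longer than the target.
    data_out = dst_arr[:len(src_arr)] + src_arr[len(dst_arr):]
    # Single write from the start, then cut off anything left past the end.
    dst_fp.seek(0)
    dst_fp.write(data_out)
    dst_fp.truncate()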
                
print(f"Done! Count was {counter}")

Why this matters

Binary file edits are unforgiving. A small misstep in write timing or offset management can lead to no-ops or partial writes that silently miss the intended effect. Consolidating all modifications in memory and committing once gives you a clean boundary: differences are computed first; persistence happens second. It also lines up with a straightforward mental model—read, transform, write—which is easier to validate and reason about.
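
One way to make that boundary explicit is to keep the comparison in a function that never touches the disk, and persist its result in a separate step. The sketch below assumes this split; sync_buffer and the temporary file name are hypothetical and exist only for the demonstration.

```python
import tempfile
from pathlib import Path


def sync_buffer(dst, src):
    """Return (patched, changed): patched equals src, changed counts
    the bytes overwritten in the overlapping region."""
    out = bytearray(dst)
    changed = 0
    for idx, (b_dst, b_src) in enumerate(zip(out, src)):
        if b_dst != b_src:
            out[idx] = b_src
            changed += 1
    # Drop extra target bytes, or append the source tail if src is longer.
    out = out[:len(src)] + src[len(out):]
    return bytes(out), changed


# Demo on a throwaway file; the name and contents are placeholders.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "target.set"
    target.write_bytes(b"\x01\xff\x03\x04\x05")

    # Step 1: compute the result purely in memory.
    patched, changed = sync_buffer(target.read_bytes(), b"\x01\x02\x03")

    # Step 2: persist once, then truncate to the new length.
    with open(target, "rb+") as fp:
        fp.seek(0)
        fp.write(patched)
        fp.truncate()

    on_disk = target.read_bytes()

print(changed, on_disk)
```

Because the function works on plain bytes, it can be checked against small hand-written inputs before it is pointed at a real file.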

If the goal is to make two files identical, sometimes the simplest path is to replace one with the other. When you do need to transform data, it can be simpler and safer to open files for reading, prepare the result in memory, and then open for writing to persist the final buffer. If something appears off, basic print debugging—printing types, lengths, and progress through the loop—helps confirm what the code is really doing at each step.
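
When identical content is all that is needed, the replacement route can be sketched with the standard library alone. This is an illustration rather than the original author's method; replace_with_copy and the file names are hypothetical, and the demo runs on temporary files instead of the receiver's data.

```python
import filecmp
import shutil
import tempfile
from pathlib import Path


def replace_with_copy(src_path, dst_path):
    """Overwrite dst_path with the contents of src_path, then verify."""
    shutil.copyfile(src_path, dst_path)
    # shallow=False compares the file contents, not just the stat info.
    return filecmp.cmp(src_path, dst_path, shallow=False)


# Demo on temporary files; names and contents are placeholders.
with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "good.set"
    bad = Path(tmp) / "bad.set"
    good.write_bytes(b"\x01\x02\x03\x04")
    bad.write_bytes(b"\x01\xff\x03\x04\x05\x06")

    identical = replace_with_copy(good, bad)
    copied = bad.read_bytes()

print(identical, copied)
```

Keeping a backup of the target before overwriting it is a sensible precaution with either approach.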

Takeaways

For byte-level synchronization between two files, compute all changes in memory, then write once at the end. Seek to the start, write the prepared buffer, and truncate to ensure the on-disk file matches the source length. If you only need identical content, consider a direct replacement. And when diagnosing, make the invisible visible: instrument the code with prints to verify assumptions about data and control flow.
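
For the diagnostic side, a small helper that lists mismatching offsets can show what a sync would change before anything is written. The sketch below is one possible shape for such a helper; diff_offsets and its limit parameter are hypothetical names, and the sample bytes are made up.

```python
from itertools import zip_longest


def diff_offsets(dst, src, limit=10):
    """Return up to `limit` (offset, dst_byte, src_byte) mismatches.

    A byte of None means that side ended before this offset.
    """
    found = []
    for idx, (b_dst, b_src) in enumerate(zip_longest(dst, src)):
        if b_dst != b_src:
            found.append((idx, b_dst, b_src))
            if len(found) == limit:
                break
    return found


# Placeholder buffers standing in for the two files' contents.
dst = b"\x01\xff\x03"
src = b"\x01\x02\x03\x04"

print(type(dst), len(dst), len(src))
for idx, b_dst, b_src in diff_offsets(dst, src):
    print(f"offset {idx:#06x}: dst={b_dst!r} src={b_src!r}")
```

Running the same helper after the write, on freshly read file contents, should return an empty list if the sync worked.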

The article is based on a question from StackOverflow by jacob malu and an answer by jacob malu.
