2025, Dec 23 17:00

Match MurmurHash3 x64 128-bit output in Python: correct hi/lo word order with mmh3 unsigned results

Learn why your MurmurHash3 x64 128-bit hex differs from a tool and how to fix it in Python with mmh3: use unsigned 64-bit halves, then concat hi/lo to match.

When you try to mirror the MurmurHash3 output from an online tool set to “Murmur Hash 3 x64 128 bit,” a straightforward call to a Python binding can yield a hex string with the right length and the right digits, but with its two 64-bit halves swapped. The output looks close enough to be confusing, yet it does not match byte-for-byte. The fix is simple once you compute the same variant and treat the 64-bit words the same way the tool does.

Reproducing the mismatch

The following snippet computes a 128-bit MurmurHash3 with pymmh3's x64 implementation and prints it as a zero-padded, 32-digit hex string. The result contains the same digits as the online tool’s output, but with the two 64-bit words in the opposite order.

import pymmh3

def demo_mmh128(txt):
    # x64arch=True selects the x64 128-bit variant; the result is a single Python int.
    hashed_val = pymmh3.hash128(txt, seed=0, x64arch=True)
    # Mask to 128 bits and format as 32 zero-padded hex digits.
    hex_out = format(hashed_val & ((1 << 128) - 1), '032x')
    print(hex_out)

ua = "5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/135.0.0.0 Safari/537.36"
demo_mmh128(ua)

Observed output: 4dc49759f17d090f97fca4dadead58f9. Expected output: 97fca4dadead58f94dc49759f17d090f.
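To confirm that the two strings differ only in word order, a small standard-library sketch can swap the 64-bit halves of the observed value and compare the result with the expected one. The helper name swap_words_128 is made up for this illustration and is not part of pymmh3 or mmh3.

```python
MASK64 = (1 << 64) - 1


def swap_words_128(hex_digest: str) -> str:
    """Swap the upper and lower 64-bit words of a 32-digit hex string."""
    value = int(hex_digest, 16)
    upper = value >> 64       # leftmost 16 hex digits
    lower = value & MASK64    # rightmost 16 hex digits
    return f"{lower:016x}{upper:016x}"


observed = "4dc49759f17d090f97fca4dadead58f9"
expected = "97fca4dadead58f94dc49759f17d090f"
print(swap_words_128(observed) == expected)  # prints True
```

Because the swap alone turns one string into the other, the hash computation itself is correct and only the presentation of the two words differs.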

What actually happens

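MurmurHash3 has separate x86 and x64 implementations, and they do not produce the same values, so a tool set to “x64 128 bit” can only be matched through the x64 path (x64arch=True). The second detail is how the result is represented. The x64 128-bit algorithm produces two 64-bit words. pymmh3.hash128 packs them into one integer with the second word in the upper 64 bits, so formatting that integer as hex prints the second word first, whereas the tool prints the first word first. Some bindings also return the words as signed integers, so a word with its top bit set surfaces as a negative number. Treating both words as unsigned and concatenating them in the tool's order reproduces its output.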
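The signed-versus-unsigned point can be illustrated without any hashing library. The sketch below masks each word to 64 bits before formatting; the names to_unsigned64 and words_to_hex128 are hypothetical helpers, not functions from mmh3 or pymmh3.

```python
MASK64 = (1 << 64) - 1


def to_unsigned64(word: int) -> int:
    """Reinterpret a signed 64-bit integer as unsigned (two's complement)."""
    return word & MASK64


def words_to_hex128(first: int, second: int) -> str:
    """Format two 64-bit words as 32 hex digits, first word leftmost."""
    return f"{to_unsigned64(first):016x}{to_unsigned64(second):016x}"


# A negative word formats with a minus sign, which breaks the fixed-width layout.
print(format(-1, "016x"))       # -000000000000001
# Masking first gives the expected 16 digits per word.
print(words_to_hex128(-1, 1))   # ffffffffffffffff0000000000000001
```

In this article's example the first word, 0x97fca4dadead58f9, has its top bit set, so a binding that reports signed values would show it as a negative number.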

The fix

Use mmh3.hash64 with x64arch=True, which returns the two 64-bit words of the x64 128-bit hash as a tuple, and pass signed=False so both words are unsigned. Then join them so that the first word forms the high (leftmost) 16 hex digits and the second word forms the low 16. The following code does exactly that and matches the online tool:

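import mmh3

def murmur3_x64_hex128(s):
    # hash64 returns the two 64-bit words as a tuple;
    # signed=False keeps both in the range 0 .. 2**64 - 1.
    hi64, lo64 = mmh3.hash64(s, seed=0, x64arch=True, signed=False)
    # First word fills the leftmost 16 hex digits, second word the rightmost 16.
    return f"{hi64:016x}{lo64:016x}"

ua = "5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/135.0.0.0 Safari/537.36"
print(murmur3_x64_hex128(ua))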

This yields 97fca4dadead58f94dc49759f17d090f, which matches the site’s “Murmur Hash 3 x64 128 bit” output. The key points are selecting the x64 path, requesting unsigned results, and concatenating the two 64-bit integers in the order hash64 returns them to form the 128-bit digest.

Why this detail matters

Hash-based logic is often used for deduplication, partitioning, caching, or telemetry aggregation. A silent mismatch between environments, whether caused by mixing x86 and x64 variants, signed and unsigned word representations, or different word orders, breaks parity and leads to hard-to-spot inconsistencies. Knowing which Murmur3 variant you compute and how you format the output ensures your pipelines produce identical values across languages and tools.
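One way to catch such drift early is a small parity check against known input/digest pairs. The sketch below is an assumption about how such a check might look, not an established tool: find_mismatches and KNOWN_VECTORS are hypothetical names, and the only vector included is the pair quoted in this article.

```python
from typing import Callable, Iterable, List, Tuple

# The single input/digest pair quoted in this article.
UA = ("5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
      "(KHTML, like Gecko) Chrome/135.0.0.0 Safari/537.36")
KNOWN_VECTORS = [(UA, "97fca4dadead58f94dc49759f17d090f")]


def find_mismatches(
    hex_fn: Callable[[str], str],
    vectors: Iterable[Tuple[str, str]],
) -> List[Tuple[str, str, str]]:
    """Return (input, expected, actual) for every vector hex_fn gets wrong."""
    mismatches = []
    for text, expected in vectors:
        actual = hex_fn(text)
        if actual != expected:
            mismatches.append((text, expected, actual))
    return mismatches
```

Passing murmur3_x64_hex128 from the fix above as hex_fn should return an empty list, while a function that emits the words in the opposite order would be reported together with both hex strings.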

Takeaways

If you need to match the output from that online tool, compute MurmurHash3 with the x64 implementation, keep the 64-bit words unsigned, and concatenate them in the tool's order to build the 128-bit hex string. The example above does that using mmh3, a C extension that is also recommended over the pure-Python pymmh3 for performance. When reporting issues or validating behavior, provide a minimal, runnable Python snippet and the exact input string to make verification straightforward.