2025, Oct 17 06:00

Compute a dictionary's Cartesian product of keys and multiply Fraction values: one-pass itertools, recursion, performance and memory trade-offs

Learn how to build a dict of key tuples to Fraction products in Python using one-pass itertools or recursion, with benchmarks and memory considerations included.

Computing the Cartesian product of a dictionary with itself is a common pattern when you want tuples of keys and a numeric aggregation over the corresponding values. In this case the goal is to enumerate all key tuples of length repeat and, for each tuple, multiply the matched Fraction values. A straightforward approach builds two independent products—one for keys and one for values—and zips them by index. It works, but it forces two passes and allocates intermediate lists that can be avoided.

Baseline: two products, one for keys and one for values

The following example demonstrates the pattern that loops twice: once to build all key tuples and once to build all value tuples, using math.prod to multiply values. It stays within the standard library.

import itertools as tools
import math as m
from fractions import Fraction as F
weights = {
    'P1': F(7, 120), 'P2': F(20, 120), 'P3': F(6, 120),
    'P4': F(10, 120), 'P6': F(7, 120), 'P6': F(18, 120),
    'P7': F(16, 120), 'P8': F(19, 120), 'P9': F(17, 120)
}
repeat_count = 2  # variable; can be any non-negative integer
key_tuples = list(tools.product(weights, repeat=repeat_count))
value_products = list(map(m.prod, tools.product(weights.values(), repeat=repeat_count)))
result = {key_tuples[i]: value_products[i] for i in range(len(value_products))}
print(result)

This produces the expected mapping of tuple-of-keys to the multiplicative product of their Fraction values. The drawback is the duplicated effort and extra memory from maintaining two parallel lists.

What is actually needed

The keys must be emitted as tuples and the values must be the regular product over Fractions. There is no dependency on any non-standard packages, and repeat is a variable, so solutions tied to a fixed number of nested loops do not generalize. With these constraints, the core need is a single pass that emits both the tuple of keys and its multiplied value at the same time, or a way to build the same mapping without constructing two separate Cartesian products.

Solution 1: one-pass dictionary comprehension with itertools

A concise way to avoid the double pass is to iterate the Cartesian product of the keys only and compute the product of the corresponding values on the fly. This keeps the logic in one place and eliminates the index-based coupling between two lists.

import itertools as tools
import math as m
from fractions import Fraction as F
pool = {
    'P1': F(7, 120), 'P2': F(20, 120), 'P3': F(6, 120),
    'P4': F(10, 120), 'P6': F(7, 120), 'P6': F(18, 120),
    'P7': F(16, 120), 'P8': F(19, 120), 'P9': F(17, 120)
}
def cart_dict_onepass(bucket, times=2):
    return {
        keys: m.prod(bucket[k] for k in keys)
        for keys in tools.product(bucket, repeat=times)
    }
print(cart_dict_onepass(pool, times=2))

This approach uses a single comprehension that iterates over tuple-of-keys and multiplies their values with math.prod. It adheres to the standard library and scales with any repeat value.

Solution 2: no-itertools recursive builder

You can avoid itertools entirely by building the mapping recursively. Start from the neutral element of the operation—an empty tuple paired with 1—and cross-multiply one dimension at a time. This is essentially “the cross-product of a cross-product” applied repeat times.

from fractions import Fraction as F
pool = {
    'P1': F(7, 120), 'P2': F(20, 120), 'P3': F(6, 120),
    'P4': F(10, 120), 'P6': F(7, 120), 'P6': F(18, 120),
    'P7': F(16, 120), 'P8': F(19, 120), 'P9': F(17, 120)
}
def unfold(state, times=2):
    if times == 0:
        return state
    return unfold(
        {
            prefix + (k2,): state[prefix] * pool[k2]
            for prefix in state for k2 in pool
        },
        times - 1,
    )
def build_map(times=2):
    return unfold({(): 1}, times)
# If times is guaranteed to be > 0, you can skip one level
# by seeding with one dimension already expanded.
def build_map_nonempty(times=2):
    return unfold({(k,): pool[k] for k in pool}, times - 1)
print(build_map(times=2))

This yields the same dictionary as the itertools version, with the advantage that it does not depend on itertools at all. It is worth noting that Python does not optimize tail recursion; extremely large repeat values would hit recursion limits long before they become computationally feasible for a Cartesian product of this size.

Performance notes

On the example dataset, the direct nested iteration specialized to repeat equal to two was measured at about 121 μs. The recursive approach without itertools ran in the range of roughly 136–167 μs, while a one-pass itertools comprehension clocked around 398 μs. The two-pass baseline that builds separate key and value products took approximately 346 μs. These figures were observed on the same machine and input and illustrate that the no-itertools recursive approach can be faster in practice on this workload, while the fixed two-loop form remains the quickest when repeat is known to be exactly two.

Further measurements explored how repeat impacts wall-clock time. As expected for Cartesian products, time grows exponentially with repeat, and the ratios between methods stayed roughly constant across different repeat values. Increasing the dictionary size showed a similar picture: with repeat equal to two, costs grew roughly quadratically with the number of entries, and the relative ordering between methods did not change. Put differently, for this task the asymptotic behavior is O(n^repeat) and the practical choice between the outlined implementations is mostly about constant factors and dependencies rather than big-O differences.

Why it is worth knowing

Cartesians blow up quickly, so shaving intermediate allocations and redundant passes matters. A one-pass formulation avoids coupling two lists by index and keeps the computation local to the keys being iterated. The recursive formulation replaces library iteration with a simple state expansion step and, in the measurements above, delivered the best time among the general solutions. Understanding these trade-offs helps choose an approach that balances readability, dependencies and speed for your constraints.

Wrap-up

If repeat is fixed and small, a direct double comprehension such as {(k1, k2): pool[k1] * pool[k2] for k1 in pool for k2 in pool} is hard to beat. When repeat is a variable, choose between a one-pass itertools comprehension that computes values on the fly or a no-itertools recursive expansion seeded with {(): 1}. Both stay in the standard library, both produce the same mapping of key tuples to Fraction products, and both avoid the extra pass and memory overhead of building separate key and value products. As a stylistic note, prefer clear import names over terse aliases unless the alias is widely recognized; clarity pays off when you revisit the code under pressure.

The article is based on a question from StackOverflow by Ameya and an answer by chrslg.