2025, Oct 19 10:00
gdal2tiles.py Hangs in Conda/Docker? Stop Capturing Output to Fix Deadlocks and Make Timeouts Work
Fix gdal2tiles.py freezes in a Conda-based Docker image: drop capture_output=True from the Python subprocess call so GDAL tiling streams its logs and timeouts work again.
When a Python pipeline inside a Conda-based Docker image runs GDAL utilities, it’s reasonable to expect that a timeout will stop anything that stalls. But gdal2tiles.py can appear to freeze so deeply that even subprocess timeouts never fire. Below is a practical look at why this happens and how to fix it without changing your workflow.
Repro: a two-step GDAL workflow that freezes on gdal2tiles.py
The script first colorizes a GeoTIFF with gdaldem color-relief, then generates tiles with gdal2tiles.py. The first step always succeeds; the second step hangs with no error output and ignores the timeout.
# Excerpt from a function body; the handlers that close this try block are not shown.
try:
    logging.info("starting gdaldem color-relief")
    palette_data = build_palette_map(prod_cfg['cmap'], prod_cfg['vmin'], prod_cfg['vmax'])
    with open(palette_path, 'w') as fh:
        fh.write(palette_data)
    dem_cmd = ['gdaldem', 'color-relief', src_tiff, palette_path, tinted_tiff, '-alpha']
    subprocess.run(dem_cmd, check=True, capture_output=True, text=True, timeout=60)
    logging.info(f"colorized to {tinted_tiff}")
    logging.info(f"tiling {tinted_tiff} with gdal2tiles.py")
    tiles_cmd = [
        'gdal2tiles.py',
        '--profile=raster',
        '--zoom=5-12',
        '--webp-quality=90',
        tinted_tiff,
        tiles_dir
    ]
    subprocess.run(tiles_cmd, check=True, capture_output=True, text=True, timeout=180)
    logging.info(f"done -> {tiles_dir}")
    return tiles_dir
The environment and options were already validated. The correct gdal2tiles.py was discovered via a which check inside the Conda env (/opt/conda/envs/radar-env/bin/gdal2tiles.py). Removing --processes didn’t help, and the issue persisted regardless of tile format options. Swapping subprocess.run for Popen with communicate(timeout=) still wouldn’t raise TimeoutExpired.
What actually goes wrong
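The explanation given in the answer this article draws on is an I/O deadlock. gdal2tiles.py can emit a lot of output. When that output is captured by Python (capture_output=True), it flows through OS pipes with fixed-size buffers. If a pipe fills because nothing is draining it, the child process blocks while trying to write more, and the parent process blocks waiting for the child to exit. No error is emitted, and the Python-side timeout never appears to trigger because both sides are stuck waiting on each other. Python's documentation describes this failure mode for Popen.wait() with stdout=PIPE or stderr=PIPE, whereas subprocess.run reads the pipes through communicate(), so treat the diagnosis as a working explanation consistent with the observed fix rather than a proven root cause.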
That explains why gdaldem color-relief succeeds while gdal2tiles.py does not. The former emits comparatively little output; the latter can be very verbose while generating many tiles, easily saturating the buffers inside a non-interactive container run.
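The blocking half of this mechanism can be reproduced without GDAL. The sketch below is an illustration under stated assumptions, not the code from the question: a stand-in child (a Python one-liner, hypothetical) writes about 1 MiB to stdout while the parent holds a pipe open and does not read from it. The function name demo_full_pipe and the sizes are made up for the demo. Note that in this simplified setup wait(timeout=...) does expire, so it shows only why a chatty child stalls on a full pipe, not the complete hang described above.

```python
import subprocess
import sys

# Stand-in for a chatty tool: writes ~1 MiB to stdout, far more than a
# typical OS pipe buffer can hold (assumption: the buffer is well under 1 MiB).
CHILD = "import sys; sys.stdout.write('x' * (1 << 20)); sys.stdout.flush()"


def demo_full_pipe(wait_seconds=2.0):
    """Show that a child blocks on a full pipe until the parent drains it."""
    proc = subprocess.Popen([sys.executable, "-c", CHILD], stdout=subprocess.PIPE)
    try:
        # Nothing reads proc.stdout here, so the child stalls once the pipe is full.
        proc.wait(timeout=wait_seconds)
        blocked = False
    except subprocess.TimeoutExpired:
        blocked = True
    # communicate() drains the pipe, which lets the child finish and exit.
    out, _ = proc.communicate()
    return blocked, len(out), proc.returncode


if __name__ == "__main__":
    blocked, size, code = demo_full_pipe()
    print(f"blocked={blocked} bytes={size} returncode={code}")
```

Once the parent drains the pipe, the child finishes immediately, which is the same effect as never routing the output through a pipe in the first place.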
The fix: stop capturing the output
The solution is simple and effective: let gdal2tiles.py write directly to the container’s stdout/stderr instead of capturing it. Removing capture_output=True breaks the deadlock, allowing the process to stream its logs and complete normally, and the Python timeout will work as expected if needed.
# Excerpt from a function body; the handlers that close this try block are not shown.
try:
    logging.info("starting gdaldem color-relief")
    palette_data = build_palette_map(prod_cfg['cmap'], prod_cfg['vmin'], prod_cfg['vmax'])
    with open(palette_path, 'w') as fh:
        fh.write(palette_data)
    dem_cmd = ['gdaldem', 'color-relief', src_tiff, palette_path, tinted_tiff, '-alpha']
    # No capture_output: gdaldem writes straight to the container's stdout/stderr.
    subprocess.run(dem_cmd, check=True, text=True, timeout=60)
    logging.info(f"colorized to {tinted_tiff}")
    logging.info(f"tiling {tinted_tiff} with gdal2tiles.py")
    tiles_cmd = [
        'gdal2tiles.py',
        '--profile=raster',
        '--zoom=5-12',
        '--webp-quality=90',
        tinted_tiff,
        tiles_dir
    ]
    # No capture_output: gdal2tiles.py streams its progress output instead of filling a pipe.
    subprocess.run(tiles_cmd, check=True, text=True, timeout=180)
    logging.info(f"done -> {tiles_dir}")
    return tiles_dir
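If the tool's output still needs to be kept, one option that avoids pipes entirely is to redirect it to a file. This is not part of the original answer; it is a minimal sketch, and the helper name run_logged and the log path are hypothetical. It relies only on the standard stdout= and stderr= arguments of subprocess.run.

```python
import subprocess
import sys
from pathlib import Path


def run_logged(cmd, log_path, timeout):
    """Run cmd with stdout and stderr written to a file instead of a pipe."""
    log_path = Path(log_path)
    with log_path.open("w") as log_fh:
        completed = subprocess.run(
            cmd,
            stdout=log_fh,              # the child writes directly to the file
            stderr=subprocess.STDOUT,   # merge stderr into the same file
            check=True,                 # raise CalledProcessError on non-zero exit
            timeout=timeout,            # raise TimeoutExpired if it runs too long
        )
    return completed.returncode


if __name__ == "__main__":
    # Stand-in command; in the pipeline this would be the tiles_cmd list.
    run_logged([sys.executable, "-c", "print('tile ok')"], "tiles.log", timeout=30)
    print(Path("tiles.log").read_text())
```

Because the child writes to a regular file, there is no buffer for the parent to drain, and the log can be inspected after the run or attached to an error report.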
Why this matters
Geospatial tooling is often chatty by design. In containerized, non-interactive runs, capturing that output can inadvertently create a hard deadlock that looks like an application hang and masks any timeout management you’ve set up. Letting output stream to the container’s stdout/stderr sidesteps the bottleneck and gives you live insight into progress messages such as “Generating Base Tiles,” which helps when you need to confirm that the job is making progress rather than guess from silence.
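With output streaming instead of being captured, failure and timeout handling can stay in Python. The following is a hedged sketch of how a step might be wrapped; the helper name run_step and its string return values are hypothetical and not taken from the original question. It uses only subprocess.run with check=True and timeout=, plus the two exceptions those options raise.

```python
import logging
import subprocess
import sys


def run_step(cmd, timeout):
    """Run cmd without capturing output; return 'ok', 'timeout', or 'failed'."""
    try:
        # Output goes to the inherited stdout/stderr, so no pipe can fill up.
        subprocess.run(cmd, check=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before re-raising this exception.
        logging.error("%s exceeded %ss and was killed", cmd[0], timeout)
        return "timeout"
    except subprocess.CalledProcessError as exc:
        logging.error("%s exited with status %s", cmd[0], exc.returncode)
        return "failed"
    return "ok"


if __name__ == "__main__":
    # Stand-in command that sleeps longer than the timeout allows.
    print(run_step([sys.executable, "-c", "import time; time.sleep(30)"], timeout=1))
```

Returning a status rather than re-raising is a design choice for the sketch; a pipeline that should abort on the first failed step can simply let the exceptions propagate.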
Conclusion
If gdal2tiles.py freezes inside a Dockerized Python process and ignores your timeouts, avoid capturing its stdout/stderr. Remove capture_output=True so logs are streamed. This small change typically unblocks long-running tile generation while keeping the rest of the pipeline intact, including the ability to detect failures via check=True and enforce timeouts if they’re legitimately reached.
The article is based on a question from StackOverflow by mrotskcud and an answer by Abhay Jain.