2025, Oct 29 01:00

Transfer Topography to a Land-Use Grid: Nearest-Neighbor Lon/Lat Resampling of NetCDF4 in Python

Learn how to resample elevation from a topography grid onto a land-use grid using netCDF4 and pyresample. Nearest-neighbor matching by lon/lat in Python keeps the two datasets aligned.

When two NetCDF4 datasets cover the same region but live on different grids, a direct copy of values from one to the other won’t line up. A common case looks like this: a land-use file with its own lon/lat grid and a topography file with another lon/lat layout. The task is to transfer elevation values onto the land-use grid using nearest-neighbor matching by geographic coordinates. Below is a clear way to do that in Python while leaving the land-use grid itself unchanged.

What goes wrong with a direct write

Reading an elevation array from the topography file and assigning it straight into the land-use file either fails because the array shapes differ or, if the shapes happen to match, puts values into the wrong cells. The coordinates behind each index differ, so the assignment isn’t meaningful even if the files cover roughly the same geographic extent.

from netCDF4 import Dataset
import numpy as np

lu_ds = Dataset('LUindexHCLIM3.nc', mode='a')
z_ds = Dataset('topography.nc', mode='r')

lon_target = np.array(lu_ds.variables['lon'])
lat_target = np.array(lu_ds.variables['lat'])

z_src = np.array(z_ds.variables['HGT_M'])

# Wrong idea: the grids are not the same, so this is not a valid assignment
z_on_lu = lu_ds.createVariable('HGT_M', 'f4', ('y', 'x'))
z_on_lu[:] = z_src  # mismatched grid/shape

lu_ds.close()
z_ds.close()

The elevation values belong to a different grid. Without mapping coordinates from the source grid to the destination grid, the data will be misplaced.

Why this happens

Longitude and latitude arrays encode the actual geographic location of each cell. Two variables that refer to different lon/lat arrays are on different grids, even if their bounding boxes overlap. Assigning one array to the other by index doesn’t respect geographic proximity. What you need is resampling: for each target cell, find the nearest source coordinate and take its value.
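To make "find the nearest source coordinate" concrete, here is a minimal sketch that uses only the standard library. The sample points, the elevations, and the helper names `haversine_km` and `nearest_value` are made up for illustration. It checks every source point for every target, so it only suits tiny inputs; a real resampler would use a spatial index rather than this brute-force loop.

```python
import math

def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in kilometres between two lon/lat points (degrees)."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def nearest_value(target_lon, target_lat, src_points):
    """Return the value of the source point closest to the target.

    src_points is a list of (lon, lat, value) tuples.
    """
    best = min(src_points, key=lambda p: haversine_km(target_lon, target_lat, p[0], p[1]))
    return best[2]

# Hypothetical source cells: (lon, lat, elevation in metres)
source = [(10.0, 60.0, 120.0), (10.5, 60.0, 340.0), (10.0, 60.5, 95.0)]

# Hypothetical target cell centres on a different grid
targets = [(10.1, 60.05), (10.45, 60.1), (10.05, 60.4)]

# Each target takes the elevation of its geographically closest source cell
resampled = [nearest_value(lon, lat, source) for lon, lat in targets]
print(resampled)
```

The result has one value per target cell, in target order, regardless of how the source points were indexed. That is the property a direct index-based assignment lacks.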

The practical fix: resample by nearest lon/lat with pyresample

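A concise way to transfer values from one irregular or regular grid to another is to use pyresample. The idea is simple: define the source swath using the topography lon/lat, define the destination swath using the land-use lon/lat, then call nearest-neighbor resampling to produce an array shaped like the destination grid. SwathDefinition expects lons and lats arrays of the same shape, so if a file stores its coordinates as 1D vectors, expand them to 2D first (for example with np.meshgrid). The data array must match the source coordinates as well, so if HGT_M carries an extra dimension such as a length-one time axis, remove it before resampling.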

from netCDF4 import Dataset
import numpy as np
from pyresample import geometry, kd_tree

lu_nc = Dataset('LUindexHCLIM3.nc', mode='a')
ztop_nc = Dataset('topography.nc', mode='r')

# Pull fields from the land-use file
landcover_main = np.array(lu_nc.variables['Main_Nature_Cover'])
lon_dest = np.array(lu_nc.variables['lon'])
lat_dest = np.array(lu_nc.variables['lat'])

# Pull fields from the topography file
z_vals = np.array(ztop_nc.variables['HGT_M'])
lon_src = np.array(ztop_nc.variables['lon'])
lat_src = np.array(ztop_nc.variables['lat'])

# Build source and destination definitions for pyresample
src_swath = geometry.SwathDefinition(lons=lon_src, lats=lat_src)
dst_swath = geometry.SwathDefinition(lons=lon_dest, lats=lat_dest)

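# Nearest-neighbor resampling to the land-use grid.
# radius_of_influence is in metres: 500000 lets a source point up to 500 km
# away be used, so reduce it to a few grid spacings if target cells outside
# the topography coverage should stay empty.
# fill_value=None makes pyresample return a masked array.
z_on_landuse = kd_tree.resample_nearest(
    src_swath,
    z_vals,
    dst_swath,
    radius_of_influence=500000,
    fill_value=None
)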

# Convert masked output to NaN-filled ndarray for convenience
z_on_landuse = z_on_landuse.filled(np.nan)

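# Store the resampled heights into the land-use file.
# ('y', 'x') must be the names of the existing 2D dimensions in that file,
# and the file must not already contain a variable called HGT_M.
z_var = lu_nc.createVariable('HGT_M', 'f4', ('y', 'x'))
z_var[:] = z_on_landuse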

lu_nc.close()
ztop_nc.close()

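This assigns each target grid cell the elevation of the nearest source coordinate within the chosen radius. Target cells with no source point inside that radius come out masked and end up as NaN after the conversion. If nearest-neighbor output looks too blocky, pyresample also offers other schemes, such as Gaussian-weighted resampling via kd_tree.resample_gauss.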
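The effect of the search radius is easy to miss, so here is a small standard-library sketch of a nearest-neighbor lookup with a distance cutoff. It illustrates the concept only and is not pyresample's implementation; the helper names `distance_m` and `nearest_within` and all sample coordinates are hypothetical.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in metres

def distance_m(lon1, lat1, lon2, lat2):
    """Great-circle (haversine) distance in metres between two lon/lat points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def nearest_within(target, src_points, radius_m):
    """Value of the closest source point, or NaN if none lies within radius_m."""
    lon, lat = target
    dist, value = min(
        (distance_m(lon, lat, s_lon, s_lat), val) for s_lon, s_lat, val in src_points
    )
    return value if dist <= radius_m else math.nan

# Hypothetical source cells: (lon, lat, elevation in metres)
source = [(10.0, 60.0, 120.0), (10.5, 60.0, 340.0), (10.0, 60.5, 95.0)]

inside = (10.1, 60.05)   # a few kilometres from the nearest source cell
outside = (15.0, 60.0)   # roughly 250 km east of the source cells

tight_inside = nearest_within(inside, source, 50_000)     # found within 50 km
tight_outside = nearest_within(outside, source, 50_000)   # nothing within 50 km
loose_outside = nearest_within(outside, source, 500_000)  # 500 km reaches a cell

print(tight_inside, tight_outside, loose_outside)
```

With the 50 km radius the distant target stays empty, while the 500 km radius fills it with the elevation of a cell about 250 km away. A generous radius can therefore hide the fact that part of the target grid lies outside the source coverage.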

Why it’s worth doing right

Working directly with lon/lat avoids brittle assumptions about array shapes. Resampling puts the resulting field on exactly the same lon/lat points as the land-use data, so every downstream computation can combine the two cell by cell. This is particularly important when mixing datasets that were produced by different models or preprocessing pipelines.

Takeaways

When two NetCDF4 variables live on different grids, treat them as separate spatial layers and bring them together via resampling. Nearest-neighbor resampling with pyresample is a straightforward way to transfer topography onto a land-use grid based on actual coordinates. Verify that both datasets cover the same area, resample by lon/lat, and then write the result into the target file. If the output needs tuning, adjust radius_of_influence or try a different resampling method in the same workflow.
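One rough way to verify coverage before resampling is to compare bounding boxes. The sketch below works on plain nested lists and uses hypothetical helper names (`bounding_box`, `covers`) and made-up coordinates. It assumes longitudes do not wrap across the dateline, and a bounding box is only a coarse test: a rotated or curvilinear source grid may have no data in the corners of its box.

```python
def bounding_box(lons, lats):
    """Return (lon_min, lon_max, lat_min, lat_max) for 2D coordinate lists."""
    flat_lon = [v for row in lons for v in row]
    flat_lat = [v for row in lats for v in row]
    return min(flat_lon), max(flat_lon), min(flat_lat), max(flat_lat)

def covers(src_box, dst_box, margin=0.0):
    """True if src_box fully contains dst_box, allowing a margin in degrees."""
    return (
        src_box[0] - margin <= dst_box[0]
        and src_box[1] + margin >= dst_box[1]
        and src_box[2] - margin <= dst_box[2]
        and src_box[3] + margin >= dst_box[3]
    )

# Hypothetical topography (source) grid
src_lon = [[9.0, 10.0, 11.0], [9.0, 10.0, 11.0]]
src_lat = [[59.0, 59.0, 59.0], [61.0, 61.0, 61.0]]

# Hypothetical land-use (target) grid fully inside the source extent
dst_lon = [[9.5, 10.5], [9.5, 10.5]]
dst_lat = [[59.5, 59.5], [60.5, 60.5]]

# Same target grid shifted one degree east, so it sticks out of the source
shifted_lon = [[10.5, 11.5], [10.5, 11.5]]

src_box = bounding_box(src_lon, src_lat)
fully_covered = covers(src_box, bounding_box(dst_lon, dst_lat))
partly_covered = covers(src_box, bounding_box(shifted_lon, dst_lat))

print(fully_covered, partly_covered)
```

With NumPy arrays read from the NetCDF files, the same box can be obtained from the arrays' min() and max() methods instead of flattening lists.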

The article is based on a question from StackOverflow by Elsri and an answer by PhoenixFire1081.