https://pytroubles.com/en/posts/id2879-openeo-python-how-to-extract-datacube-timestamps-earliest-latest-without-downloading-rasters

openEO Python: How to Extract DataCube Timestamps (Earliest/Latest) Without Downloading Rasters

Extracting real timestamps from an openEO DataCube in Python: avoid min_time/max_time traps and keep downloads light


Get actual openEO DataCube timestamps in Python—earliest and latest—without downloading rasters. Avoid min_time/max_time confusion with spatial aggregation

2026-01-04T09:00:12+03:00

When exploring multi-temporal satellite cubes with the openEO Python client, it’s tempting to ask the cube for “the minimum time” or “the maximum time” and expect dates in return. In practice, those calls don’t yield timestamps, and the results can be counterintuitive when visualized. Here’s a clean way to extract the actual time steps present in your DataCube, or derive the earliest/latest date, without pulling down full raster data.

openEO, Python client, DataCube timestamps, earliest date, latest date, min_time, max_time, temporal reduction, extract timestamps, aggregate_spatial, SCL band, cloud mask, Sentinel-2


Reproducing the confusion

It looks straightforward to compute temporal extrema directly on the cube. However, these operations collapse the temporal dimension and return a new DataCube, not the time coordinates you’re after.

# These reduce along the temporal axis and yield DataCubes, not timestamps
min_dc = dc.min_time()
max_dc = dc.max_time()

Reading the global metadata extent does not help either, as it only exposes the collection’s declared time bounds and not the actual dates present in your filtered subset.

# This often reflects collection bounds like ['2015-07-04T00:00:00Z', None]
# rather than the real timestamps of your cube selection
dc.metadata.extent["temporal"]

What’s really happening

In openEO, load_collection assembles a process graph (a declarative recipe) without touching the data. Nothing is computed until the graph is sent to the backend, for example via execute, download, or a batch job; save_result only adds an output node to the graph. The min_time and max_time operations collapse the temporal dimension by taking value-wise extrema over time, so they don’t return temporal coordinates. That explains why a “min” visualization can show more vegetation than a “max”: those are value extrema over time, not chronological minimum or maximum dates.
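The difference between a value-wise reduction and a chronological question can be illustrated without any backend. The following is a minimal pure-Python sketch for a single pixel; the dates and NDVI-like values are made up for illustration and do not come from any real cube.

```python
# Toy time series for one pixel: ISO date -> NDVI-like value (made-up numbers).
pixel_series = {
    "2025-02-05": 0.62,
    "2025-04-11": 0.35,
    "2025-09-22": 0.81,
}

# What a value-wise temporal reduction computes: extrema of the VALUES.
value_min = min(pixel_series.values())
value_max = max(pixel_series.values())

# What a chronological question asks for: extrema of the TIME LABELS.
# ISO 8601 date strings sort lexicographically in chronological order.
earliest_date = min(pixel_series)
latest_date = max(pixel_series)

# The value observed on the earliest date is unrelated to the value extrema.
value_at_earliest = pixel_series[earliest_date]

print("value extrema:", value_min, value_max)
print("date extrema:", earliest_date, latest_date)
print("value on earliest date:", value_at_earliest)
```

In this toy series the smallest value falls on the middle date, so reducing over values says nothing about which date came first or last.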

A minimal, timestamp-first approach

An efficient way to list timestamps without downloading heavy imagery is to request a single lightweight band and spatially aggregate it down to one number per time step. The backend still reads that band, but the response shrinks to a small dictionary keyed by timestamps after execute. Using the SCL band and a simple cloud mask as an example, you get one mean value per date, and the timestamps are exactly the keys you need.

# Example: enumerate timestamps for scenes in 2025
# with less than 10% cloud coverage over a given area.
# Assumes `conn` is an authenticated openeo connection, `bbox_rect` holds the
# bounds in (west, north, east, south) order, and `region_geom` is the
# geometry to aggregate over.
from openeo.processes import constant

year_str = "2025"
s2_scl_cube = conn.load_collection(
    "SENTINEL2_L2A",
    temporal_extent=year_str,
    spatial_extent=dict(zip(["west", "north", "east", "south"], bbox_rect)),
    bands="SCL",
    max_cloud_cover=10,
)

# SCL classes that indicate clouds or shadows
CLOUD_FLAGS = [3, 8, 9, 10, 11]

def flag_cloud(px):
    # Start from 0 and OR in one equality test per cloud/shadow class
    m = constant(0)
    for c in CLOUD_FLAGS:
        m = m.or_(px.eq(c))
    # Cast to integer so the spatial mean becomes a cloud fraction
    return m.int()

# One mean value per timestamp after spatial aggregation
cloud_series = (
    s2_scl_cube
    .apply(flag_cloud)
    .aggregate_spatial(geometries=region_geom, reducer="mean")
)

# Execute: returns a small dict with timestamps as keys
series_dict = cloud_series.execute()
print(series_dict)

Example result

{
 '2025-02-05T00:00:00Z': [[0.0]],
 '2025-02-10T00:00:00Z': [[0.0]],
 '2025-02-20T00:00:00Z': [[0.0]],
 '2025-03-19T00:00:00Z': [[0.0]],
 '2025-04-03T00:00:00Z': [[0.0]],
 '2025-04-11T00:00:00Z': [[0.1250]],
 '2025-09-22T00:00:00Z': [[0.0171]],
 '2025-10-02T00:00:00Z': [[0.6767]]
}
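The nested lists in the result can be unpacked into plain numbers. The sketch below assumes the layout shown above, with one inner list per geometry and one value per band; the helper name cloud_fraction_by_date and the 0.05 threshold are hypothetical and only serve the illustration.

```python
# A subset of the example result shown above.
series_dict = {
    "2025-02-05T00:00:00Z": [[0.0]],
    "2025-04-11T00:00:00Z": [[0.1250]],
    "2025-09-22T00:00:00Z": [[0.0171]],
    "2025-10-02T00:00:00Z": [[0.6767]],
}


def cloud_fraction_by_date(result):
    """Flatten {timestamp: [[value]]} into {timestamp: value}.

    Assumes a single geometry and a single band; entries with no
    value are mapped to None instead of raising.
    """
    flat = {}
    for ts, per_geometry in result.items():
        first_geom = per_geometry[0] if per_geometry else []
        flat[ts] = first_geom[0] if first_geom else None
    return flat


# Hypothetical threshold: keep dates with at most 5% flagged pixels.
MAX_CLOUD_FRACTION = 0.05

fractions = cloud_fraction_by_date(series_dict)
clear_dates = sorted(
    ts for ts, value in fractions.items()
    if value is not None and value <= MAX_CLOUD_FRACTION
)
print(clear_dates)
```

If the aggregation covers several geometries or bands, the indexing would need to change accordingly; the sketch reads only the first value.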

The keys are the timestamps available in the cube selection. If you only need the date list, extract the keys.

timestamps = list(series_dict.keys())
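From the key list, the earliest and latest dates follow from parsing and sorting. This is a small sketch using only the standard library; the sample list repeats a few keys from the example result in shuffled order, and parse_utc is a hypothetical helper name.

```python
from datetime import datetime, timezone

# Sample keys from the example result, deliberately out of order.
timestamps = [
    "2025-04-11T00:00:00Z",
    "2025-10-02T00:00:00Z",
    "2025-02-05T00:00:00Z",
    "2025-09-22T00:00:00Z",
]


def parse_utc(ts):
    """Parse an ISO 8601 timestamp with a trailing 'Z' as a UTC datetime.

    Replacing 'Z' keeps this working on Python versions older than 3.11,
    where datetime.fromisoformat does not accept the 'Z' suffix.
    """
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


parsed = sorted(parse_utc(ts) for ts in timestamps)
earliest, latest = parsed[0], parsed[-1]

print("earliest:", earliest.date().isoformat())
print("latest:", latest.date().isoformat())
```

Parsing into datetime objects rather than comparing raw strings keeps the ordering correct even if a backend returns timestamps with differing UTC offsets or precision.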

Why this matters

This pattern leverages openEO’s process-graph model properly. You keep the data flow light, avoid downloading full raster tiles, and still obtain authoritative time steps from the backend. It also resolves the mismatch between chronological questions and value-based temporal reductions, which are different operations with different results.

Practical wrap-up

Don’t expect min_time or max_time to return dates; they compute value extrema over time and collapse the temporal dimension. When you need the actual timestamps in your subset, request a lean band, aggregate spatially to a single value per time slice, and execute to receive a compact dictionary keyed by timestamps. From there, you can use the keys as your complete list of dates or derive the earliest and latest as needed, all without pulling heavy imagery.

python python-3.x sentinel2