5 Python Libraries for Earth Observation
Posted on January 14, 2025
8 minutes
rioxarray
rioxarray is a powerful Python library that extends the capabilities of xarray by adding geospatial support for working with raster datasets. It is especially useful for Earth Observation (EO) workflows where spatial metadata and coordinate reference systems (CRS) are crucial.
Purpose:
Handling raster data with geospatial metadata (e.g., CRS, bounds, resolution). Simplifying raster operations by combining geospatial features of Rasterio with the multi-dimensional array handling of xarray.
Key Features:
- Geospatial Metadata Handling Automatically reads CRS, resolution, and bounding boxes from raster files. It also supports CRS transformations and re-projection.
- Seamless Integration with xarray Works with xarray’s labeled data model, enabling easy combination of raster data with other geospatial or temporal datasets.
- Raster I/O Reads and writes GeoTIFFs and other raster formats supported by GDAL, including cloud-optimized GeoTIFFs (COGs).
- Spatial Operations Allows cropping rasters to bounding boxes or shapes, reprojecting rasters to different CRS, and masking rasters using vector geometries.
- Advanced Data Manipulation Performs raster math with labeled dimensions (e.g., band arithmetic) and generates statistics and histograms.
Example Workflow:
import rioxarray
# Open a GeoTIFF file
raster = rioxarray.open_rasterio("example.tif")
print(raster)
# Reproject raster to a new CRS
reprojected_raster = raster.rio.reproject("EPSG:4326")
# Crop raster to bounding box
cropped_raster = raster.rio.clip_box(minx, miny, maxx, maxy)
# Save cropped raster to a new file
cropped_raster.rio.to_raster("cropped_output.tif")
pytstac-client
pystac-client is a Python library designed for interacting with STAC (SpatioTemporal Asset Catalog) APIs. It is particularly useful for Earth Observation (EO) workflows where managing and querying large catalogs of geospatial assets (e.g., satellite imagery, climate data) is essential.
Purpose:
- Accessing and searching STAC-compliant APIs.
- Simplifying the discovery and filtering of EO datasets.
- Fetching metadata and assets for further processing and analysis.
Key Features:
- STAC API Support Fully compliant with the STAC API specifications, it allows querying of public and private STAC catalogs.
- Flexible Search Capabilities Users can filter data by date, bounding box, collection, or other properties. Advanced search parameters like intersects and temporal ranges are also supported.
- Integration with pystac It seamlessly integrates with the pystac library for managing STAC items, collections, and catalogs. Users can easily access metadata and links for related resources.
- Ease of Use The library offers a user-friendly Python interface for querying and fetching assets, making it accessible to developers and data scientists alike.
- Cloud-Native Optimized for accessing cloud-hosted geospatial data, it works well with modern formats like Cloud-Optimized GeoTIFFs (COGs) and Zarr.
Example Workflow:
from pystac_client import Client
# Connect to a public STAC API
stac_api_url = "https://planetarycomputer.microsoft.com/api/stac/v1"
catalog = Client.open(stac_api_url)
print(catalog.description)
# Define search parameters
search = catalog.search(
collections=["sentinel-2-l2a"], # Collection name
bbox=[-120.0, 35.0, -119.0, 36.0], # Bounding box
datetime="2023-01-01/2023-01-31", # Temporal range
limit=10 # Number of results
)
# Retrieve items
items = list(search.get_items())
print(f"Found {len(items)} items.")
# Access metadata for the first item
first_item = items[0]
print(first_item.properties)
# Access an asset (e.g., a Cloud-Optimized GeoTIFF)
asset_href = first_item.assets["B04"].href
print("Asset URL:", asset_href)
import rasterio
# Open and process the asset
with rasterio.open(asset_href) as dataset:
data = dataset.read(1) # Read the first band
print(data.shape)
Earth Engine Python API
The Earth Engine Python API is a Python library for accessing Google Earth Engine (GEE), a cloud-based platform for planetary-scale geospatial analysis. It provides powerful tools for working with satellite imagery, vector data, and geospatial datasets directly in Python.
Purpose:
- Accessing and analyzing large-scale geospatial datasets in the cloud.
- Performing computationally intensive geospatial analyses without the need for local resources.
- Simplifying workflows for satellite imagery processing and time-series analysis.
Key Features:
- Access to Global Datasets Provides access to an extensive catalog of geospatial datasets, including Landsat, Sentinel, MODIS, and many others.
- Cloud-Based Processing Eliminates the need for local computation by leveraging Google’s cloud infrastructure for data processing and storage.
- Flexible Geospatial Analysis Supports advanced geospatial operations such as mosaicking, masking, resampling, and calculating spectral indices.
- Integration with Python Ecosystem Works seamlessly with Python libraries like NumPy, Pandas, Matplotlib, and GeoPandas for advanced analysis and visualization.
- Interactive Visualization Integrates with Jupyter notebooks and supports exporting data for use in other GIS platforms.
Example Workflow:
import ee
# Initialize the Earth Engine API
ee.Initialize()
# Print some information to confirm
print("Earth Engine initialized successfully!")
# Load a Landsat 8 collection
landsat = ee.ImageCollection("LANDSAT/LC08/C01/T1_SR").filterDate("2023-01-01", "2023-12-31")
# Define a region of interest (ROI) and filter the dataset
roi = ee.Geometry.Rectangle([-120.0, 35.0, -119.0, 36.0])
filtered = landsat.filterBounds(roi)
# Compute NDVI for the filtered images
def compute_ndvi(image):
return image.normalizedDifference(["B5", "B4"]).rename("NDVI")
ndvi_collection = filtered.map(compute_ndvi)
# Export the first image to Google Drive
ndvi_image = ndvi_collection.first()
task = ee.batch.Export.image.toDrive(
image=ndvi_image,
description="NDVI_Image",
region=roi,
scale=30
)
task.start()
odc-stac
odc-stac is a Python library for integrating Open Data Cube (ODC) workflows with SpatioTemporal Asset Catalogs (STAC). It enables users to load and manipulate geospatial data from STAC-compliant APIs in a way that is compatible with ODC, providing a seamless bridge between modern geospatial data sources and powerful analysis pipelines.
Purpose:
- Accessing STAC-compliant geospatial data catalogs for analysis in ODC.
- Simplifying the integration of cloud-hosted datasets into geospatial workflows.
- Leveraging the multi-dimensional analysis power of Open Data Cube with STAC data.
Key Features:
- STAC Integration with ODC Allows direct loading of STAC items into ODC-compatible data structures, enabling seamless access to geospatial datasets.
- Support for Cloud-Native Formats Optimized for Cloud-Optimized GeoTIFFs (COGs) and other modern geospatial data formats, ensuring efficient access and processing.
- Flexible Querying Provides powerful filters for spatial, temporal, and property-based searches to retrieve datasets that match specific criteria.
- Lazy Loading Supports deferred data loading, enabling users to work with metadata and fetch data only when needed, optimizing memory and performance.
- Compatibility with Data Cubes Facilitates the creation of xarray-based data cubes, enabling advanced analysis, visualization, and modeling.
Example Workflow:
from odc.stac import stac_load
import pystac_client
# Connect to a STAC API
stac_api_url = "https://planetarycomputer.microsoft.com/api/stac/v1"
catalog = pystac_client.Client.open(stac_api_url)
# Search for Sentinel-2 data
search = catalog.search(
collections=["sentinel-2-l2a"],
bbox=[-120.0, 35.0, -119.0, 36.0],
datetime="2023-01-01/2023-01-31",
limit=10,
)
# Get the STAC items
items = list(search.get_items())
# Load STAC data into an xarray dataset
ds = stac_load(
items,
bands=["B04", "B08"], # Load red and NIR bands
crs="EPSG:4326",
resolution=(-0.0001, 0.0001), # Approx. 10m resolution
)
print(ds)
# Compute NDVI
ndvi = (ds["B08"] - ds["B04"]) / (ds["B08"] + ds["B04"])
ndvi.plot()
# Export the NDVI dataset to a GeoTIFF
ndvi.rio.to_raster("ndvi_output.tif")
titiler
titiler is a modern dynamic tile server built on FastAPI and Rasterio. It specializes in serving Cloud-Optimized GeoTIFFs (COGs) and other raster data as map tiles and web services. titiler allows users to interact with raster data dynamically, making it ideal for web mapping and visualization workflows.
Purpose:
- Dynamically serve raster data as map tiles, imagery, and metadata over the web.
- Support visualization and interaction with geospatial data in browsers and GIS applications.
- Facilitate lightweight web services for Earth Observation (EO) and geospatial workflows.
Key Features:
- COG Support Efficiently serves Cloud-Optimized GeoTIFFs, enabling fast and scalable visualization of raster datasets directly from cloud storage.
- Dynamic Tiling Supports dynamic tile generation for map previews, multi-band compositions, and subsetting data without preprocessing.
- FastAPI Integration Built on FastAPI, ensuring high performance, asynchronous processing, and easy customization of endpoints.
- Advanced Rendering Options Supports customization of color maps, band combinations, resampling methods, and scaling for visualizing raster data.
- Standards-Compliant Services Implements OGC-compliant endpoints such as WMTS and XYZ tiles, making it compatible with web mapping libraries like Leaflet and OpenLayers.
Example Workflow:
1. Installing titiler:
pip install titiler
2. Serving a COG as Map Tiles
from titiler.application import app
from fastapi import FastAPI
# Create a FastAPI app
my_app = FastAPI()
# Mount titiler's application
my_app.mount("/", app)
Run the application:
uvicorn my_app:app --host 0.0.0.0 --port 8000
Access tiles at: `http://localhost:8000/cog/tilejson.json?url=<URL_to_COG>`
Let’s Connect
Are you working on an interesting project involving earth observation processing? We’d love to hear about it! Our team of GIS professionals is always eager to collaborate and share insights. Connect with us at https://rottengrapes.tech/contact