Module 6: Flood Mapping with SAR

Synthetic Aperture Radar (SAR) is the primary tool for flood mapping because it penetrates clouds and operates day and night. This module covers the physics of radar backscatter over floodwater, the Otsu automatic thresholding method, and a complete flood detection pipeline from SAR imagery to GeoJSON polygon export.

1. SAR Backscatter & Flood Physics

When microwave radiation from a SAR sensor strikes a smooth water surface, specular reflection directs most energy away from the satellite — producing very low backscatter (dark pixels in SAR imagery). In contrast, rough land surfaces scatter energy in all directions (diffuse scattering), producing moderate to high backscatter.

This contrast is the basis for SAR-based flood detection. The key observable is the change in backscatter coefficient between pre-flood and flood acquisitions:

$$\Delta\sigma^0_{dB} = \sigma^0_{flood,dB} - \sigma^0_{pre,dB} \ll 0$$

A large negative change (typically −3 to −10 dB) indicates flooding. Sentinel-1 C-band (5.4 GHz) VV polarization is most commonly used because it is most sensitive to the water surface roughness contrast. Co-polarized (VV) backscatter drops more dramatically over open water than cross-polarized (VH).
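As a small worked example of this change metric, the sketch below converts linear backscatter to decibels and takes the difference. The two sample values and the helper name `to_db` are illustrative assumptions, not measurements from a real scene.

```python
import numpy as np

def to_db(sigma0_linear):
    """Convert linear backscatter (a power ratio) to decibels."""
    return 10.0 * np.log10(sigma0_linear)

# Hypothetical linear sigma0 values for a single pixel (illustrative only)
sigma0_pre = 0.1       # -10 dB, a plausible rough-land value
sigma0_flood = 0.0158  # roughly -18 dB once the pixel is covered by smooth water

# Change in dB: a difference of logs, i.e. the log of the linear ratio
delta_db = to_db(sigma0_flood) - to_db(sigma0_pre)
print(f"pre: {to_db(sigma0_pre):.1f} dB, flood: {to_db(sigma0_flood):.1f} dB")
print(f"change: {delta_db:.1f} dB")
```

With these assumed inputs the change falls inside the −3 to −10 dB range quoted above.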

Specular Reflection

Smooth open water acts as a mirror. SAR signal reflects away from the sensor. Backscatter: −20 to −30 dB. Appears dark in SAR imagery.

Double-Bounce

Flooded urban areas or forests: signal bounces off water surface and vertical structures (walls, tree trunks). Produces bright returns (−5 to 0 dB).

Volume Scattering

Dry vegetation canopy: multiple internal reflections within the canopy volume. Moderate backscatter (−10 to −15 dB). Unchanged during flood if water is below the canopy.

Key Challenge: Radar Shadow & Layover

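In mountainous terrain, radar shadow (no signal return) is as dark as open water and can be mistaken for flooding, while layover on sensor-facing slopes displaces and brightens returns. Wind roughening of water surfaces has the opposite effect: it increases backscatter, reducing the water–land contrast and causing missed detections. Terrain correction using a DEM and multi-temporal compositing help mitigate these issues.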

2. Otsu Automatic Thresholding

Flood detection requires separating SAR pixels into two classes: water and non-water. Otsu's method finds the optimal threshold that maximizes the between-class variance, equivalent to minimizing the within-class variance. For a bimodal histogram (the typical case for SAR flood imagery), this yields an optimal separation.

$$t^* = \arg\max_t \left[\omega_0(t)\omega_1(t)\left(\mu_0(t) - \mu_1(t)\right)^2\right]$$

where $\omega_0(t)$ and $\omega_1(t)$ are the probabilities of the two classes separated by threshold $t$, and $\mu_0(t)$ and $\mu_1(t)$ are their respective mean intensities. The algorithm exhaustively evaluates all possible thresholds and selects the one maximizing this criterion.
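The equivalence with minimizing the within-class variance follows from the standard decomposition of the total variance. Writing $\sigma_B^2(t)$ for the bracketed criterion above, and $\sigma_0^2(t)$, $\sigma_1^2(t)$ for the variances of the two classes (notation introduced here for illustration):

$$\sigma_T^2 = \sigma_W^2(t) + \sigma_B^2(t), \quad \sigma_W^2(t) = \omega_0(t)\sigma_0^2(t) + \omega_1(t)\sigma_1^2(t)$$

The total variance $\sigma_T^2$ is fixed by the histogram and does not depend on $t$, so any threshold that raises $\sigma_B^2(t)$ lowers $\sigma_W^2(t)$ by the same amount.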

Algorithm Steps

Compute the histogram of the SAR backscatter image (typically 256 bins).
For each possible threshold $t$, compute the class probabilities $\omega_0, \omega_1$ and class means $\mu_0, \mu_1$.
Compute the between-class variance $\sigma_B^2(t) = \omega_0\omega_1(\mu_0 - \mu_1)^2$.
Select $t^*$ that maximizes $\sigma_B^2$.
Classify all pixels with backscatter $\leq t^*$ as water.
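The steps above can be sketched in vectorized form with cumulative sums instead of an explicit loop over thresholds. This is a minimal illustration rather than a reference implementation: the function name `otsu_threshold` and the synthetic dB samples are assumptions made for the example.

```python
import numpy as np

def otsu_threshold(values, nbins=256):
    """Return the threshold (in the units of `values`) maximizing between-class variance."""
    hist, edges = np.histogram(values, bins=nbins)
    p = hist / hist.sum()                      # normalized histogram
    centers = 0.5 * (edges[:-1] + edges[1:])   # representative value of each bin

    # Candidate split k puts bins 0..k in class 0 and bins k+1.. in class 1
    w0 = np.cumsum(p)[:-1]
    w1 = 1.0 - w0
    cum_mean = np.cumsum(p * centers)
    total_mean = cum_mean[-1]
    cum_mean = cum_mean[:-1]

    sigma_b2 = np.zeros_like(w0)
    ok = (w0 > 0) & (w1 > 0)                   # skip splits with an empty class
    mu0 = cum_mean[ok] / w0[ok]
    mu1 = (total_mean - cum_mean[ok]) / w1[ok]
    sigma_b2[ok] = w0[ok] * w1[ok] * (mu0 - mu1) ** 2

    k = int(np.argmax(sigma_b2))
    return edges[k + 1]                        # upper edge of the last class-0 bin

# Illustrative bimodal sample in dB (values are assumptions, not real data)
rng = np.random.default_rng(0)
samples = np.concatenate([
    rng.normal(-21.0, 2.0, 3000),   # water-like pixels
    rng.normal(-8.0, 2.5, 7000),    # land-like pixels
])
t_star = otsu_threshold(samples)
water = samples <= t_star
print(f"threshold: {t_star:.1f} dB, water fraction: {water.mean():.2f}")
```

Working directly on dB values avoids the 8-bit rescaling step, at the cost of choosing the number of histogram bins explicitly.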

Limitations & Refinements

Otsu assumes a bimodal distribution, which may fail when the flood extent is very small (the water peak is negligible) or when there are multiple land cover types. Common refinements include: (1) tile-based adaptive thresholding, where Otsu is applied independently to spatial subsets; (2) region-growing from seed pixels below a conservative threshold; (3) Expectation-Maximization fitting of a Gaussian mixture model to the bimodal histogram.
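Refinement (3) can be illustrated with a plain NumPy Expectation-Maximization loop for a two-component Gaussian mixture. This is a sketch under stated assumptions: the function names, the percentile-based starting guess, the fixed iteration count, and the synthetic sample are all choices made for the example, and an operational system would more likely rely on a tested library implementation.

```python
import numpy as np

def fit_two_gaussians(x, n_iter=200):
    """Fit a two-component 1-D Gaussian mixture with plain EM (no convergence test)."""
    x = np.asarray(x, dtype=float)
    # Starting guess (an assumption): means at the 5th and 95th percentiles
    mu = np.percentile(x, [5, 95])
    sigma = np.full(2, x.std())
    weight = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: responsibility of each component for each sample
        pdf = np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
        resp = weight * pdf
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, and standard deviations
        nk = resp.sum(axis=0)
        weight = nk / x.size
        mu = (resp * x[:, None]).sum(axis=0) / nk
        sigma = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk)
    order = np.argsort(mu)          # component 0 = darker (water-like) class
    return weight[order], mu[order], sigma[order]

def crossing_threshold(weight, mu, sigma, n_grid=2001):
    """First point between the two means where the brighter component dominates."""
    grid = np.linspace(mu[0], mu[1], n_grid)
    # The common 1/sqrt(2*pi) factor is dropped because only the comparison matters
    dens = weight * np.exp(-0.5 * ((grid[:, None] - mu) / sigma) ** 2) / sigma
    return grid[np.argmax(dens[:, 1] > dens[:, 0])]

# Illustrative bimodal sample in dB (values are assumptions, not real data)
rng = np.random.default_rng(1)
samples = np.concatenate([rng.normal(-21.0, 2.0, 3000), rng.normal(-8.0, 2.5, 7000)])
weight, mu, sigma = fit_two_gaussians(samples)
threshold_db = crossing_threshold(weight, mu, sigma)
print(f"means: {mu.round(1)}, weights: {weight.round(2)}, threshold: {threshold_db:.1f} dB")
```

Unlike Otsu, the fitted mixture accounts for unequal class sizes and variances, which tends to matter when the water class is a small fraction of the scene.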

3. Complete Flood Detection Pipeline

This pipeline generates synthetic Sentinel-1 SAR imagery (pre-flood and flood), applies Otsu thresholding, performs morphological refinement to remove noise, estimates the flooded area, and exports flood polygons as GeoJSON. The 3-panel visualization shows the pre-flood baseline, the flood image, and the detection overlay.

SAR Flood Detection Pipeline with Otsu Thresholding

import numpy as np
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
from scipy import ndimage
import json

np.random.seed(42)

# Image dimensions (simulating ~10km x 10km at 10m resolution, downsampled)
H, W = 300, 300
pixel_size = 33.3  # meters per pixel (10km / 300)

# --- Generate synthetic SAR backscatter (dB) ---

# Pre-flood: land with varying roughness
land_base = np.random.normal(-8, 2.5, (H, W))  # typical land backscatter
# Add spatial correlation (smoothing simulates terrain)
land_base = ndimage.gaussian_filter(land_base, sigma=3)

# Add a river channel (permanent water)
river_y = np.zeros(W, dtype=int)
for x in range(W):
    river_y[x] = int(H * 0.45 + 15 * np.sin(2 * np.pi * x / W * 2))
river_mask = np.zeros((H, W), dtype=bool)
for x in range(W):
    river_mask[max(0, river_y[x]-3):min(H, river_y[x]+4), x] = True

# Pre-flood image
pre_flood = land_base.copy()
pre_flood[river_mask] = np.random.normal(-22, 2, np.sum(river_mask))  # water is dark

# --- Generate flood event ---
# Flood zone: river expands + low-lying areas on south side
flood_zone = np.zeros((H, W), dtype=bool)
for x in range(W):
    # Expanded river
    flood_zone[max(0, river_y[x]-25):min(H, river_y[x]+35), x] = True
# Add irregular flood patches (low-lying areas)
patches = ndimage.gaussian_filter(np.random.randn(H, W), sigma=20)
flood_zone |= (patches > 0.3) & (np.arange(H)[:, None] > H * 0.35) & (np.arange(H)[:, None] < H * 0.7)
# Exclude high areas (simulated elevation)
elevation = ndimage.gaussian_filter(np.random.randn(H, W), sigma=15) * 20
flood_zone &= elevation < 8

# Flood image
flood_img = land_base.copy()
flood_img[flood_zone] = np.random.normal(-21, 2.0, np.sum(flood_zone))
flood_img[river_mask] = np.random.normal(-23, 1.5, np.sum(river_mask))

# --- Otsu thresholding on flood image ---
# Convert to 8-bit for Otsu
img_min, img_max = flood_img.min(), flood_img.max()
img_8bit = ((flood_img - img_min) / (img_max - img_min) * 255).astype(np.uint8)

# Manual Otsu implementation
# One histogram bin per 8-bit gray level (0..255)
hist = np.bincount(img_8bit.ravel(), minlength=256)
hist_norm = hist / hist.sum()

best_t, best_var = 0, 0
for t in range(1, 255):
    w0 = hist_norm[:t].sum()
    w1 = hist_norm[t:].sum()
    if w0 == 0 or w1 == 0:
        continue
    mu0 = np.sum(np.arange(t) * hist_norm[:t]) / w0
    mu1 = np.sum(np.arange(t, 256) * hist_norm[t:]) / w1
    var_between = w0 * w1 * (mu0 - mu1) ** 2
    if var_between > best_var:
        best_var = var_between
        best_t = t

threshold_dB = img_min + (best_t / 255) * (img_max - img_min)
print(f"Otsu threshold: {best_t}/255 = {threshold_dB:.1f} dB")

# Binary flood mask
flood_mask_raw = flood_img < threshold_dB

# Morphological refinement
flood_mask = ndimage.binary_opening(flood_mask_raw, structure=np.ones((3,3)))
flood_mask = ndimage.binary_closing(flood_mask, structure=np.ones((5,5)))
flood_mask = ndimage.binary_fill_holes(flood_mask)
# Remove small objects
labeled, n_features = ndimage.label(flood_mask)
sizes = ndimage.sum(flood_mask, labeled, range(1, n_features + 1))
min_size = 50  # minimum pixel count (about 5.5 ha at 33.3 m pixels)
for i, s in enumerate(sizes):
    if s < min_size:
        flood_mask[labeled == (i + 1)] = False

# Area estimation
flood_pixels = np.sum(flood_mask)
flood_area_km2 = flood_pixels * (pixel_size ** 2) / 1e6
total_area_km2 = H * W * (pixel_size ** 2) / 1e6
print(f"Flood pixels: {flood_pixels} / {H*W}")
print(f"Flooded area: {flood_area_km2:.2f} km2 out of {total_area_km2:.2f} km2")
print(f"Flood extent: {100*flood_pixels/(H*W):.1f}%")

# --- Generate simplified GeoJSON polygon ---
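# Find flood boundary pixels: the one-pixel ring just outside the mask
boundary = ndimage.binary_dilation(flood_mask) & ~flood_mask
by, bx = np.where(boundary)
# NOTE: np.where returns pixels in raster (row-major) order, not in order
# along the outline, so the ring built below is a placeholder and not a
# topologically valid polygon. A real export would trace ordered contours
# for each connected component.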
# Convert to mock geographic coordinates
lon_min, lat_min = 28.0, -2.0  # example: Central Africa
lons = lon_min + bx / W * 0.1
lats = lat_min + (H - by) / H * 0.1
# Subsample for GeoJSON
step = max(1, len(lons) // 50)
coords = [[round(float(lons[i]), 5), round(float(lats[i]), 5)] for i in range(0, len(lons), step)]
if len(coords) > 2:
    coords.append(coords[0])  # close polygon

geojson = {
    "type": "FeatureCollection",
    "features": [{
        "type": "Feature",
        "properties": {"area_km2": round(flood_area_km2, 2), "threshold_dB": round(threshold_dB, 1)},
        "geometry": {"type": "Polygon", "coordinates": [coords]}
    }]
}
print(f"\nGeoJSON polygon exported with {len(coords)} vertices")
print(json.dumps(geojson['features'][0]['properties'], indent=2))

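# --- 3-Panel Visualization ---
# Use light text everywhere (axis labels, ticks, colorbars) so it stays
# readable on the dark background applied when the figure is saved
plt.rcParams.update({'text.color': 'white', 'axes.labelcolor': 'white',
                     'xtick.color': 'white', 'ytick.color': 'white'})
fig, axes = plt.subplots(1, 3, figsize=(16, 5.5))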

# Panel 1: Pre-flood
ax = axes[0]
im = ax.imshow(pre_flood, cmap='gray', vmin=-28, vmax=0)
ax.set_title('Pre-Flood SAR (dB)', fontsize=13, fontweight='bold', color='white')
ax.set_xlabel('Range (pixels)')
ax.set_ylabel('Azimuth (pixels)')
plt.colorbar(im, ax=ax, shrink=0.8, label='sigma0 (dB)')

# Panel 2: Flood SAR
ax2 = axes[1]
im2 = ax2.imshow(flood_img, cmap='gray', vmin=-28, vmax=0)
ax2.set_title('Flood SAR (dB)', fontsize=13, fontweight='bold', color='white')
ax2.set_xlabel('Range (pixels)')
plt.colorbar(im2, ax=ax2, shrink=0.8, label='sigma0 (dB)')

# Panel 3: Detection overlay
ax3 = axes[2]
ax3.imshow(flood_img, cmap='gray', vmin=-28, vmax=0, alpha=0.6)
flood_overlay = np.ma.masked_where(~flood_mask, flood_mask.astype(float))
ax3.imshow(flood_overlay, cmap='cool', alpha=0.6, vmin=0, vmax=1)
ax3.contour(flood_mask.astype(float), levels=[0.5], colors='cyan', linewidths=1.5)
ax3.set_title(f'Flood Detection (area={flood_area_km2:.1f} km2)', fontsize=13, fontweight='bold', color='white')
ax3.set_xlabel('Range (pixels)')

for ax in axes:
    ax.tick_params(colors='white')
    for spine in ax.spines.values():
        spine.set_color('#334155')

plt.tight_layout()
plt.savefig('output.png', dpi=130, bbox_inches='tight', facecolor='#0a0a0a')
plt.close()
print("\n3-panel SAR flood detection figure saved.")

4. Post-Processing & Accuracy Assessment

Raw binary flood masks require post-processing to remove false positives (radar shadow, dark agricultural fields) and false negatives (wind-roughened water, flooded vegetation). The following techniques are standard in operational systems:

Morphological Operations

Binary opening (erosion + dilation) removes isolated noise pixels. Binary closing (dilation + erosion) fills small holes within flood patches. Connected-component labeling followed by size filtering removes clusters smaller than a minimum area threshold (typically 0.5–1 ha).

Terrain Masking

A Digital Elevation Model (e.g., Copernicus DEM at 30 m) is used to mask out areas with slope greater than 5–10 degrees, where standing water is unlikely to accumulate. Radar shadow and layover masks derived from the DEM and sensor geometry further reduce false positives in mountainous terrain.
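A slope mask of this kind can be sketched with finite differences on a DEM grid. The synthetic DEM, the 5-degree cutoff, and the function name `slope_degrees` below are assumptions for illustration; a real workflow would read a projected DEM and resample it to the SAR grid first.

```python
import numpy as np

def slope_degrees(dem, pixel_size_m):
    """Slope in degrees from a DEM, using finite differences along both axes."""
    dz_dy, dz_dx = np.gradient(dem, pixel_size_m)
    return np.degrees(np.arctan(np.hypot(dz_dx, dz_dy)))

# Hypothetical 30 m DEM: a flat floodplain on the left rising into a hillside
pixel_size_m = 30.0
x = np.arange(100) * pixel_size_m
ramp = np.where(x > 1500.0, (x - 1500.0) * 0.3, 0.0)   # 30% grade beyond 1.5 km
dem = np.tile(ramp, (100, 1))

slope = slope_degrees(dem, pixel_size_m)
max_slope_deg = 5.0                        # assumed cutoff from the 5-10 degree range
floodable = slope <= max_slope_deg

# Pretend the detector flagged every pixel, then keep only low-slope terrain
flood_mask = np.ones(dem.shape, dtype=bool)
flood_mask_terrain = flood_mask & floodable
print(f"pixels kept after slope masking: {flood_mask_terrain.sum()} of {flood_mask.size}")
```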

Permanent Water Mask

The JRC Global Surface Water dataset or Copernicus Water Bodies product provides a reference mask of permanent water. Subtracting this mask from the flood detection yields only the new flood extent, excluding rivers, lakes, and reservoirs that were already present before the event.
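In raster form the subtraction is a single boolean operation, provided both masks share the same grid. The tiny arrays below are made up for illustration and do not come from either reference dataset.

```python
import numpy as np

# Hypothetical 6x6 masks on a common grid (True = water)
detected_water = np.zeros((6, 6), dtype=bool)
detected_water[1:5, :] = True      # rows 1-4 appear as water during the event

permanent_water = np.zeros((6, 6), dtype=bool)
permanent_water[2:4, :] = True     # rows 2-3 are the pre-existing river

# New flood extent: water now, but not water in the reference mask
new_flood = detected_water & ~permanent_water
print(f"detected: {detected_water.sum()} px, new flood: {new_flood.sum()} px")
```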

Accuracy Metrics

Flood maps are validated against optical imagery (when cloud-free), aerial survey, or crowdsourced observations. Standard metrics include:

$$\text{Precision} = \frac{TP}{TP + FP}, \quad \text{Recall} = \frac{TP}{TP + FN}, \quad F_1 = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$$

Operational systems like the Copernicus Emergency Management Service (CEMS) typically achieve F1 scores of 0.85–0.95 for open water flooding but 0.60–0.75 for flooded vegetation, where the double-bounce signal complicates detection.
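The metrics can be computed directly from a predicted mask and a reference mask. The helper name `flood_scores` and the eight-pixel example are assumptions chosen so the arithmetic is easy to follow by hand.

```python
import numpy as np

def flood_scores(predicted, reference):
    """Precision, recall, and F1 for boolean flood masks of the same shape."""
    predicted = np.asarray(predicted, dtype=bool)
    reference = np.asarray(reference, dtype=bool)
    tp = np.sum(predicted & reference)      # flooded in both
    fp = np.sum(predicted & ~reference)     # predicted flood, reference dry
    fn = np.sum(~predicted & reference)     # missed flood
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Eight hypothetical pixels: 3 true positives, 1 false positive, 1 false negative
reference = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0]
precision, recall, f1 = flood_scores(predicted, reference)
print(f"precision={precision:.2f}, recall={recall:.2f}, F1={f1:.2f}")
```

Here TP = 3, FP = 1, and FN = 1, so precision, recall, and F1 all equal 0.75.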

Operational Flood Mapping Services

Major operational services include: Copernicus EMS (Rapid Mapping activation within hours), UNOSAT (UN Satellite Centre, flood analysis for humanitarian response), NASA LANCE (near-real-time MODIS/VIIRS flood products), and the Global Flood Monitoring System (GFMS) at the University of Maryland. Sentinel-1's nominal 6-day repeat cycle (12 days per satellite, halved when two satellites are operating in the constellation) makes it the primary SAR data source for such services in Europe and globally.

5. Change Detection & Multi-Temporal Analysis

While single-image thresholding works well when a clear bimodal distribution exists, change detection between pre-event and co-event imagery is more robust. The log-ratio approach is widely used:

$$R_{dB} = \sigma^0_{flood,dB} - \sigma^0_{ref,dB} = 10\log_{10}\left(\frac{\sigma^0_{flood}}{\sigma^0_{ref}}\right)$$

The reference image is ideally a temporal median of multiple pre-event acquisitions (e.g., 10–20 Sentinel-1 scenes over 2–3 months), which reduces speckle noise. Pixels where $R_{dB} < -3$ dB are classified as newly flooded. This threshold can be refined per-region using local statistics.
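The log-ratio workflow can be sketched on synthetic data as follows. The stack size, noise levels, and flooded rows are assumptions for illustration; real inputs would be co-registered, calibrated Sentinel-1 scenes in dB.

```python
import numpy as np

rng = np.random.default_rng(0)
n_scenes, H, W = 12, 50, 50

# Hypothetical pre-event stack in dB: land near -8 dB with speckle-like noise
pre_stack_db = rng.normal(-8.0, 1.5, size=(n_scenes, H, W))
reference_db = np.median(pre_stack_db, axis=0)   # temporal median reference

# Hypothetical co-event scene: rows 20-29 drop to water-like backscatter
event_db = rng.normal(-8.0, 1.5, size=(H, W))
event_db[20:30, :] = rng.normal(-20.0, 1.5, size=(10, W))

# In dB the log-ratio is simply a difference
ratio_db = event_db - reference_db
newly_flooded = ratio_db < -3.0

print(f"flagged inside flood rows:  {newly_flooded[20:30].mean():.1%}")
print(f"flagged outside flood rows: {newly_flooded[:20].mean():.1%}")
```

A small share of unflooded pixels still crosses the −3 dB threshold because the single event scene keeps its full noise, which is one reason the threshold is often tuned locally or followed by morphological cleanup.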

For time-critical applications, a Bayesian approach models the probability of flooding given the observed backscatter:

$$P(flood|\sigma^0) = \frac{P(\sigma^0|flood) \cdot P(flood)}{P(\sigma^0)}$$

where the prior $P(flood)$ can incorporate hydrological model predictions or proximity to rivers. The likelihood $P(\sigma^0|flood)$ is modeled from the backscatter distribution of known water bodies in the scene. This probabilistic framework provides uncertainty estimates alongside the binary classification.
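A minimal per-pixel version of this posterior is sketched below, assuming Gaussian likelihoods for both classes. The class parameters (means and standard deviations in dB), the prior values, and the function names are assumptions for illustration; in practice they would be estimated from known water bodies and surrounding land in the scene.

```python
import numpy as np

def gaussian_pdf(x, mean, std):
    """Normal probability density."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

def flood_posterior(sigma0_db, prior_flood, water=(-21.0, 2.0), land=(-8.0, 2.5)):
    """P(flood | sigma0) from Bayes' rule with assumed Gaussian class likelihoods."""
    like_flood = gaussian_pdf(sigma0_db, *water)
    like_dry = gaussian_pdf(sigma0_db, *land)
    # Evidence P(sigma0): total probability over the two classes
    evidence = like_flood * prior_flood + like_dry * (1.0 - prior_flood)
    return like_flood * prior_flood / evidence

sigma0_db = np.array([-21.0, -15.0, -8.0])
p_far_from_river = flood_posterior(sigma0_db, prior_flood=0.1)
p_near_river = flood_posterior(sigma0_db, prior_flood=0.5)
print("prior 0.1:", p_far_from_river.round(3))
print("prior 0.5:", p_near_river.round(3))
```

The ambiguous −15 dB pixel is where the prior matters most: the same observation yields a noticeably higher flood probability near a river than far from one.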

6. Key Takeaways

✓ SAR flood detection exploits the specular reflection of smooth water, producing dark pixels with very low backscatter ($\sigma^0 < -20$ dB).

✓ Otsu thresholding automatically separates the bimodal SAR histogram into water and land classes without manual parameter tuning.

✓ Morphological operations, terrain masking, and permanent water subtraction are essential post-processing steps for operational accuracy.

✓ Change detection (log-ratio of pre/post images) is more robust than single-image thresholding, especially in complex landscapes.

✓ Double-bounce scattering in flooded urban areas and forests produces bright returns, requiring specialized detection algorithms.
