Bayesian MCMC for Urban Model Calibration
Every urban simulation carries free parameters—growth rates, carrying capacities, diffusion coefficients. Bayesian inference with Markov Chain Monte Carlo (MCMC) lets us turn observed city data into full posterior distributions over those parameters, quantifying not just best-fit values but our uncertainty about them.
1. Bayes' Theorem
We begin with the foundational identity of Bayesian statistics. Given observed data \(D\) and a parameter vector \(\boldsymbol{\theta}\), the posterior distribution combines our prior beliefs with the information in the data:
$$P(\boldsymbol{\theta} \mid D) = \frac{P(D \mid \boldsymbol{\theta})\,P(\boldsymbol{\theta})}{P(D)}$$
Since the evidence \(P(D)\) is a normalising constant that does not depend on \(\boldsymbol{\theta}\), we write the unnormalised posterior as:
$$P(\boldsymbol{\theta} \mid D) \;\propto\; P(D \mid \boldsymbol{\theta})\,P(\boldsymbol{\theta})$$
The three ingredients are:
- Prior \(P(\boldsymbol{\theta})\) — encodes domain knowledge before seeing data (e.g., growth rates are positive).
- Likelihood \(P(D \mid \boldsymbol{\theta})\) — probability of the observations given a particular parameter setting.
- Posterior \(P(\boldsymbol{\theta} \mid D)\) — the updated belief after incorporating the data.
For an urban growth model with parameters \(r\) (intrinsic growth rate) and \(K\) (carrying capacity), we want to infer \(\boldsymbol{\theta} = (r, K)\) from a time series of observed population counts \(D = \{N_0, N_1, \dots, N_T\}\).
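As a minimal sketch of how such priors are encoded in practice (the ranges and function interface below are illustrative assumptions, not values fixed by the model):

```python
import numpy as np

def log_prior(r, K, sigma):
    """Flat (uniform) priors on a plausible box -- illustrative ranges only."""
    if 0.0 < r < 1.0 and 1e4 < K < 1e7 and 0.0 < sigma < 1e5:
        return 0.0       # log of a constant density; the constant cancels in MCMC
    return -np.inf       # zero prior probability outside the box
```

Returning \(-\infty\) outside the support means any proposal that violates the prior is rejected automatically in the log-space acceptance test.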
2. Constructing the Likelihood
Consider the logistic growth model as the deterministic skeleton:
$$\frac{dN}{dt} = r\,N\!\left(1 - \frac{N}{K}\right)$$
With analytic solution:
$$N(t) = \frac{K}{1 + \left(\frac{K}{N_0} - 1\right)e^{-rt}}$$
We assume each observation is corrupted by Gaussian noise with unknown standard deviation \(\sigma\):
$$N_i^{\text{obs}} = N(t_i; r, K) + \varepsilon_i, \qquad \varepsilon_i \sim \mathcal{N}(0, \sigma^2)$$
The log-likelihood for the \(T\) independent observations \(N_1, \dots, N_T\) (treating \(N_0\) as the known initial condition) becomes:
$$\ln P(D \mid r, K, \sigma) = -\frac{T}{2}\ln(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^{T}\bigl(N_i^{\text{obs}} - N(t_i; r, K)\bigr)^2$$
Working in log-space is essential for numerical stability: likelihood values can be astronomically small, but their logarithms remain manageable floating-point numbers.
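A direct translation of the analytic solution and the log-likelihood above into code (the function names and signatures are our own):

```python
import numpy as np

def logistic(t, r, K, N0):
    """Analytic solution of the logistic ODE dN/dt = r*N*(1 - N/K)."""
    return K / (1.0 + (K / N0 - 1.0) * np.exp(-r * t))

def log_likelihood(params, t, N_obs, N0):
    """Gaussian log-likelihood of observed counts around the logistic curve."""
    r, K, sigma = params
    resid = N_obs - logistic(t, r, K, N0)
    T = len(N_obs)
    return -0.5 * T * np.log(2 * np.pi * sigma**2) \
           - 0.5 * np.sum(resid**2) / sigma**2
```

With zero residuals the sum term vanishes and only the \(-\tfrac{T}{2}\ln(2\pi\sigma^2)\) normalisation remains, which is a handy sanity check.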
3. The Metropolis-Hastings Algorithm
MCMC generates a Markov chain whose stationary distribution is the posterior. The Metropolis-Hastings algorithm is the workhorse. At each step:
MH Algorithm Steps
- Given current state \(\boldsymbol{\theta}\), propose a candidate \(\boldsymbol{\theta}'\) from a proposal distribution \(q(\boldsymbol{\theta}' \mid \boldsymbol{\theta})\).
- Compute the acceptance ratio:
$$\alpha = \min\!\left(1,\;\frac{P(D \mid \boldsymbol{\theta}')\,P(\boldsymbol{\theta}')\,q(\boldsymbol{\theta} \mid \boldsymbol{\theta}')}{P(D \mid \boldsymbol{\theta})\,P(\boldsymbol{\theta})\,q(\boldsymbol{\theta}' \mid \boldsymbol{\theta})}\right)$$
- Draw \(u \sim \text{Uniform}(0,1)\). If \(u < \alpha\), accept: set \(\boldsymbol{\theta} \leftarrow \boldsymbol{\theta}'\). Otherwise reject: keep \(\boldsymbol{\theta}\).
For a symmetric proposal where \(q(\boldsymbol{\theta}' \mid \boldsymbol{\theta}) = q(\boldsymbol{\theta} \mid \boldsymbol{\theta}')\) (e.g., a Gaussian random walk centered on the current state), the proposal terms cancel and the acceptance ratio simplifies to the posterior ratio:
$$\alpha = \min\!\left(1,\;\frac{P(D \mid \boldsymbol{\theta}')\,P(\boldsymbol{\theta}')}{P(D \mid \boldsymbol{\theta})\,P(\boldsymbol{\theta})}\right)$$
In log-space, we compute:
$$\ln\alpha = \min\!\bigl(0,\;\bigl[\ln P(D \mid \boldsymbol{\theta}') + \ln P(\boldsymbol{\theta}')\bigr] - \bigl[\ln P(D \mid \boldsymbol{\theta}) + \ln P(\boldsymbol{\theta})\bigr]\bigr)$$
and accept if \(\ln u < \ln\alpha\). This avoids exponentiation of large negative numbers.
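The accept/reject step, written in log-space for a symmetric Gaussian random-walk proposal (a sketch; the interface is our own choice):

```python
import numpy as np

def mh_step(theta, log_post, step_sizes, rng):
    """One Metropolis update with a symmetric Gaussian random-walk proposal."""
    proposal = theta + step_sizes * rng.normal(size=theta.shape)
    log_alpha = log_post(proposal) - log_post(theta)  # ln(alpha) before min(0, .)
    if np.log(rng.uniform()) < log_alpha:             # accept with prob min(1, alpha)
        return proposal, True
    return theta, False
```

Note that the explicit \(\min(0, \cdot)\) is unnecessary in code: when \(\ln\alpha \ge 0\), the test \(\ln u < \ln\alpha\) always passes because \(\ln u < 0\).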
4. Burn-in, Thinning, and Convergence
The initial samples from the chain are influenced by the starting point and do not represent the posterior. We discard these as the burn-in period. Typical practice: discard the first 25–50% of samples.
Thinning retains every \(k\)-th sample to reduce autocorrelation. If the chain has autocorrelation length \(\tau\), we thin by keeping every \(\tau\)-th sample. The effective sample size is:
$$n_{\text{eff}} = \frac{n}{1 + 2\sum_{k=1}^{\infty}\rho(k)}$$
where \(\rho(k)\) is the lag-\(k\) autocorrelation.
Acceptance rate is a practical diagnostic. For Gaussian random-walk proposals, the optimal acceptance rate is roughly 44% in one dimension, falling to about 23% in high dimensions. Too high an acceptance rate means the proposals are too timid (slow exploration); too low means they are too ambitious (many rejections).
Convergence Checklist
- Trace plots should show “hairy caterpillar” mixing (no trends or stuck periods)
- Multiple chains from different starting points should converge to the same region
- Gelman-Rubin statistic \(\hat{R} < 1.1\)
- Effective sample size \(n_{\text{eff}} > 200\) per parameter
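The effective-sample-size formula can be estimated from a chain by truncating the empirical autocorrelation sum at the first negative lag (one common heuristic; more careful truncation rules exist):

```python
import numpy as np

def effective_sample_size(chain):
    """n_eff = n / (1 + 2*sum_k rho(k)), truncated at the first negative rho."""
    x = np.asarray(chain, dtype=float)
    x = x - x.mean()
    n = len(x)
    # Empirical autocorrelation at lags 0..n-1 via full cross-correlation.
    acf = np.correlate(x, x, mode="full")[n - 1:] / (x.var() * n)
    tau = 1.0
    for k in range(1, n):
        if acf[k] < 0:
            break
        tau += 2.0 * acf[k]
    return n / tau
```

An independent (white-noise) chain yields \(n_{\text{eff}} \approx n\); a strongly autocorrelated chain yields far fewer effective samples.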
5. Full MH Sampler for Logistic Growth
Below we implement a complete Metropolis-Hastings sampler that calibrates a logistic growth model to synthetic city population data. The code generates synthetic observations, runs the MCMC chain, and produces trace plots plus posterior histograms.
Metropolis-Hastings MCMC for Logistic Urban Growth
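A self-contained sketch of such a sampler, assembled from the pieces above. The synthetic "true" values, flat box priors, and hand-tuned step sizes are all illustrative choices; trace plots and posterior histograms can be drawn from the `chain` array with matplotlib.

```python
import numpy as np

rng = np.random.default_rng(42)

# --- 1. Synthetic "observed" city population (illustrative true values) ---
r_true, K_true, sigma_true, N0 = 0.25, 800_000.0, 20_000.0, 50_000.0
t = np.arange(1, 41)                             # 40 yearly observations
N_det = K_true / (1 + (K_true / N0 - 1) * np.exp(-r_true * t))
N_obs = N_det + rng.normal(0.0, sigma_true, size=t.size)

# --- 2. Unnormalised log-posterior (flat priors on an assumed plausible box) ---
def log_posterior(theta):
    r, K, sigma = theta
    if not (0 < r < 1 and N0 < K < 5e6 and 0 < sigma < 2e5):
        return -np.inf                           # outside the prior support
    model = K / (1 + (K / N0 - 1) * np.exp(-r * t))
    resid = N_obs - model
    return -0.5 * t.size * np.log(2 * np.pi * sigma**2) \
           - 0.5 * np.sum(resid**2) / sigma**2

# --- 3. Metropolis-Hastings with a symmetric Gaussian random-walk proposal ---
n_steps = 20_000
step = np.array([0.005, 5_000.0, 1_500.0])       # hand-tuned step sizes
theta = np.array([0.1, 500_000.0, 50_000.0])     # deliberately off-target start
chain = np.empty((n_steps, 3))
accepted = 0
lp = log_posterior(theta)
for i in range(n_steps):
    proposal = theta + step * rng.normal(size=3)
    lp_prop = log_posterior(proposal)
    if np.log(rng.uniform()) < lp_prop - lp:     # log-space acceptance test
        theta, lp = proposal, lp_prop
        accepted += 1
    chain[i] = theta

# --- 4. Discard burn-in and summarise the posterior ---
post = chain[n_steps // 2:]                      # drop the first 50% as burn-in
print(f"acceptance rate: {accepted / n_steps:.2f}")
for name, col in zip(("r", "K", "sigma"), post.T):
    print(f"{name}: mean={col.mean():.4g}, "
          f"95% CI=({np.quantile(col, 0.025):.4g}, {np.quantile(col, 0.975):.4g})")
```

With informative data like this, the posterior means should land near the generating values, and the credible intervals quantify how tightly the data pin each parameter down.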
6. Urban Interpretation
The posterior distribution over \((r, K)\) carries direct planning implications:
- A wide posterior on \(K\) means the city's ultimate size is highly uncertain—infrastructure plans should account for a range of scenarios.
- The joint posterior can reveal correlations: if \(r\) and \(K\) are negatively correlated, fast-growing cities tend to saturate at lower populations (resource-limited growth).
- Bayesian credible intervals provide honest uncertainty bounds for population forecasts, unlike point estimates from least-squares fitting.
- The approach extends naturally to spatial models: calibrate diffusion coefficients, attraction kernels, or zoning impact parameters from observed land-use data.
Key Takeaway
MCMC does not just find the “best” parameters—it maps out the full landscape of plausible parameter combinations, enabling robust decision-making under uncertainty. This is essential for urban planning where stakes are high and data is noisy.