Chapter 12.3: MFG Numerics - Schemes, Policy Iteration, and Convergence

Numerical Methods for Mean Field Games

Solving the MFG system numerically requires careful discretisation that preserves the mathematical structure: the HJB discretisation must be monotone (to ensure convergence to the viscosity solution), and the FP discretisation must be positivity-preserving (densities must remain non-negative). This chapter develops the Achdou-Capuzzo-Dolcetta upwind scheme for HJB, the Scharfetter-Gummel exponential fitting scheme for FP, and the superlinearly convergent policy iteration algorithm.

12.3.1 Achdou-Capuzzo-Dolcetta Upwind Scheme

The HJB equation contains the nonlinear term \(\frac{1}{2}|\nabla u|^2\), which must be discretised carefully to maintain the monotonicity property: the discrete update must be non-decreasing in each neighbouring value. This ensures convergence to the unique viscosity solution.

Upwind Discretisation of the Hamiltonian

For the quadratic Hamiltonian \(H(p) = \frac{1}{2}|p|^2\), the monotone discretisation uses upwind differences:

$$H_h(u_{i-1}, u_i, u_{i+1}) = \frac{1}{2}\left[\left(\frac{u_i - u_{i-1}}{h}\right)^+\right]^2 + \frac{1}{2}\left[\left(\frac{u_{i+1} - u_i}{h}\right)^-\right]^2$$

where \(a^+ = \max(a, 0)\) and \(a^- = \min(a, 0)\). This ensures that information propagates in the correct direction: the value at \(i\) depends on upstream values.
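
The upwind Hamiltonian above is a few lines of NumPy. The sketch below uses periodic boundaries and a sine test function as illustrative choices; the scheme is first-order accurate, so the discrete values track \(\frac{1}{2}|u'|^2\) up to an \(O(h)\) error.

```python
import numpy as np

def hamiltonian_upwind(u, h):
    """Monotone (Godunov/upwind) discretisation of H(p) = 0.5*|p|^2:
    backward differences enter through their positive part, forward
    differences through their negative part. Periodic boundaries are
    an illustrative choice."""
    Dm = (u - np.roll(u, 1)) / h    # backward difference (u_i - u_{i-1})/h
    Dp = (np.roll(u, -1) - u) / h   # forward difference  (u_{i+1} - u_i)/h
    return 0.5 * np.maximum(Dm, 0.0)**2 + 0.5 * np.minimum(Dp, 0.0)**2

# first-order accurate check against the exact 0.5*|u'(x)|^2
x = np.linspace(0.0, 1.0, 200, endpoint=False)
H = hamiltonian_upwind(np.sin(2 * np.pi * x), x[1] - x[0])
exact = 0.5 * (2 * np.pi * np.cos(2 * np.pi * x))**2
```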

The Full Discrete HJB

The time-stepping scheme reads:

$$\frac{u_i^n - u_i^{n+1}}{\Delta t} - \nu \frac{u_{i+1}^{n} - 2u_i^{n} + u_{i-1}^{n}}{h^2} + H_h(u_{i-1}^n, u_i^n, u_{i+1}^n) = F(x_i, \rho_i^n)$$

Note the time direction: we step backward from \(n+1\) to \(n\). In the general form with a transition matrix:

$$u_i^n = \min_{\alpha} \left\{\Delta t \cdot L(x_i, \alpha) + \sum_j P_{ij}(\alpha) \, u_j^{n+1}\right\}$$

where \(P_{ij}(\alpha)\) is the transition probability matrix induced by control \(\alpha\) and diffusion \(\nu\).
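
The transition-matrix form lends itself to a direct implementation. The sketch below builds the jump probabilities (symmetric diffusion plus upwinded drift) for each control on a periodic 1D grid and performs one backward Bellman step; the quadratic running cost \(L = \frac{1}{2}\alpha^2 + F\) and the finite control set are illustrative assumptions.

```python
import numpy as np

def bellman_step(u_next, alphas, h, dt, nu, F):
    """One backward step of the discrete Bellman recursion
        u_i^n = min_alpha { dt*L(x_i, alpha) + sum_j P_ij(alpha) u_j^{n+1} }
    on a periodic 1D grid. P(alpha) jumps to each neighbour with diffusion
    probability nu*dt/h^2 plus an upwinded drift probability; the running
    cost L = 0.5*alpha^2 + F is an illustrative choice.
    """
    d = nu * dt / h**2                        # diffusion jump probability per side
    best = np.full(len(u_next), np.inf)
    for a in alphas:
        p_right = d + dt * max(a, 0.0) / h    # drift to the right if a > 0
        p_left = d + dt * max(-a, 0.0) / h    # drift to the left if a < 0
        p_stay = 1.0 - p_left - p_right
        assert p_stay >= 0.0, "dt too large: transition matrix not monotone"
        Eu = (p_stay * u_next
              + p_right * np.roll(u_next, -1)
              + p_left * np.roll(u_next, 1))
        best = np.minimum(best, dt * (0.5 * a**2 + F) + Eu)
    return best

# sanity check: zero cost and a flat continuation value are reproduced
# exactly, since each row of P(alpha) sums to one and alpha = 0 is optimal
u0 = bellman_step(np.zeros(50), [-1.0, 0.0, 1.0], 0.02, 1e-4, 0.1, np.zeros(50))
u1 = bellman_step(np.ones(50), [-1.0, 0.0, 1.0], 0.02, 1e-4, 0.1, np.zeros(50))
```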

12.3.2 Scharfetter-Gummel Scheme for Fokker-Planck

The Fokker-Planck equation involves a drift-diffusion flux:

$$J_{i+1/2} = -\nu \frac{\rho_{i+1} - \rho_i}{h} + v_{i+1/2} \cdot \bar{\rho}_{i+1/2}$$

where \(v = -\nabla u\) is the drift velocity and \(\bar{\rho}_{i+1/2}\) is an interface value of the density. A naive central difference can produce negative densities when the drift is strong (large Péclet number). The Scharfetter-Gummel (exponential fitting) scheme avoids this by using the exact solution of the local drift-diffusion problem:

$$J_{i+1/2}^{\text{SG}} = \frac{\nu}{h}\left[\rho_i B(-z_{i+1/2}) - \rho_{i+1} B(z_{i+1/2})\right]$$

where \(z_{i+1/2} = v_{i+1/2} h / \nu\) is the local Péclet number and \(B(z) = z / (e^z - 1)\) is the Bernoulli function:

$$B(z) = \frac{z}{e^z - 1} \quad \Rightarrow \quad B(0) = 1, \quad B(z) \to z e^{-z} \text{ as } z \to +\infty$$

The Scharfetter-Gummel scheme automatically interpolates between central differences (small \(|z|\), diffusion-dominated) and upwind differences (large \(|z|\), drift-dominated), guaranteeing positivity of the density at all times.
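
Numerically, \(B(z)\) should not be evaluated as written: for small \(|z|\) the denominator \(e^z - 1\) suffers catastrophic cancellation. A small sketch of a stable evaluation using `np.expm1` (the cutoff \(10^{-12}\) is an illustrative choice):

```python
import numpy as np

def bernoulli(z):
    """Bernoulli function B(z) = z/(exp(z)-1), evaluated stably.

    np.expm1 keeps full precision near z = 0; the removable singularity
    at z = 0 itself is set to B(0) = 1 explicitly.
    """
    z = np.asarray(z, dtype=float)
    out = np.ones_like(z)
    nz = np.abs(z) > 1e-12
    out[nz] = z[nz] / np.expm1(z[nz])
    return out

# limits B(0) = 1, B(z) ~ z*exp(-z) for large z, and the identity
# B(-z) = B(z) + z used by the Scharfetter-Gummel flux
vals = bernoulli(np.array([0.0, 1e-14, 1.0, -1.0, 40.0]))
```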

Discrete FP Update

$$\rho_i^{n+1} = \rho_i^n + \frac{\Delta t}{h}\left(J_{i-1/2}^{\text{SG}} - J_{i+1/2}^{\text{SG}}\right)$$

In matrix form: \(\rho^{n+1} = P(\alpha^*)^T \rho^n\), where the transpose of the transition matrix for the HJB naturally gives the FP operator. This duality is the discrete analog of the continuous adjoint relationship between HJB and FP.
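
A minimal sketch of the Scharfetter-Gummel flux and the conservative update on a periodic grid (function names and the constant-drift demo are illustrative): for \(z \to 0\) the flux reduces to the diffusive flux, and for large \(|z|\) to the upwind flux \(v\,\rho_{\text{upwind}}\).

```python
import numpy as np

def sg_flux(rho, v_face, h, nu):
    """Scharfetter-Gummel flux J_{i+1/2} = (nu/h)*[B(-z)*rho_i - B(z)*rho_{i+1}]
    with face Peclet number z = v*h/nu, on a periodic 1D grid."""
    def B(z):
        out = np.ones_like(z)
        nz = np.abs(z) > 1e-12
        out[nz] = z[nz] / np.expm1(z[nz])   # stable evaluation near z = 0
        return out
    z = v_face * h / nu
    return (nu / h) * (B(-z) * rho - B(z) * np.roll(rho, -1))

def fp_step(rho, v_face, h, dt, nu):
    """Conservative explicit update rho_i += (dt/h)*(J_{i-1/2} - J_{i+1/2})."""
    J = sg_flux(rho, v_face, h, nu)
    return rho + (dt / h) * (np.roll(J, 1) - J)

# demo: constant rightward drift; mass is conserved exactly (telescoping
# fluxes) and positivity is preserved for this small time step
N = 64
h = 1.0 / N
x = np.arange(N) * h
rho = 1.0 + 0.5 * np.cos(2 * np.pi * x)   # smooth positive density, unit mass
for _ in range(100):
    rho = fp_step(rho, np.full(N, 2.0), h, 1e-4, 0.05)
```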

12.3.3 Policy Iteration

Policy iteration is a powerful algorithm for solving the HJB equation that converges superlinearly (Newton-like), compared to the linear convergence of value iteration. The algorithm alternates between two steps:

Policy Iteration Algorithm

  1. Policy evaluation: Given a policy \(\alpha^{(k)}\), solve the linear system for \(u^{(k)}\):

    $$-\nu \Delta u^{(k)} - \alpha^{(k)} \cdot \nabla u^{(k)} = \frac{1}{2}|\alpha^{(k)}|^2 + F(x, \rho)$$

  2. Policy improvement: Update the policy:

    $$\alpha^{(k+1)} = -\nabla u^{(k)}$$

The key advantage: Step 1 requires solving a linear elliptic PDE (since the policy is fixed), which is much cheaper than the nonlinear HJB. Step 2 is a pointwise operation. The convergence is superlinear because policy iteration is equivalent to Newton's method applied to the HJB equation.
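
The two steps can be sketched in 1D. A discount term \(\lambda u\) is added here (an illustrative regularisation, not part of the system above) so that the periodic policy-evaluation system is invertible; dense matrices keep the sketch short.

```python
import numpy as np

def policy_iteration(F, h, nu, lam=1.0, tol=1e-12, max_iter=50):
    """Policy iteration for a discounted 1D HJB on a periodic grid:
        lam*u - nu*u'' + 0.5*(u')^2 = F.
    Evaluation solves the linear system
        lam*u - nu*u'' - alpha*u' = 0.5*alpha^2 + F
    with upwind differencing of the drift (keeps the matrix an M-matrix);
    improvement sets alpha = -u' via a central difference.
    """
    N = len(F)
    I = np.eye(N)
    L2 = (np.roll(I, 1, axis=1) - 2*I + np.roll(I, -1, axis=1)) / h**2
    Dm = (I - np.roll(I, 1, axis=1)) / h    # backward difference
    Dp = (np.roll(I, -1, axis=1) - I) / h   # forward difference
    u = np.zeros(N)
    alpha = np.zeros(N)
    for k in range(max_iter):
        # upwind the drift term alpha*u': forward where alpha > 0, backward where alpha < 0
        drift = np.maximum(alpha, 0)[:, None] * Dp + np.minimum(alpha, 0)[:, None] * Dm
        A = lam*I - nu*L2 - drift
        u_new = np.linalg.solve(A, 0.5 * alpha**2 + F)
        alpha = -(np.roll(u_new, -1) - np.roll(u_new, 1)) / (2*h)
        done = np.max(np.abs(u_new - u)) < tol
        u = u_new
        if done:
            break
    return u, alpha, k + 1

# sanity demo: constant cost F = 2 has the constant solution u = F/lam, alpha = 0
u, alpha, iters = policy_iteration(np.full(32, 2.0), h=1.0/32, nu=0.1)
```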

Convergence Rate

Policy iteration typically converges in 5–10 iterations regardless of the problem size, compared to \(O(1/\epsilon)\) iterations for value iteration. The cost per iteration is dominated by the linear solve, which scales as \(O(N)\) for tridiagonal systems (1D) or \(O(N \log N)\) for sparse systems (2D+).

12.3.4 The CCD Perspective: Equilibrium as Fixed Point

The discrete MFG system reveals a beautiful circular structure. At equilibrium \((\rho^*, u^*)\):

  • HJB: The density \(\rho^*\) determines the value function \(u^*\) through the Bellman equation. Density determines costs, which determine optimal behaviour.
  • Optimal control: The value function \(u^*\) determines the optimal policy \(\alpha^* = -\nabla u^*\).
  • FP: The optimal controls \(\alpha^*\) regenerate the density \(\rho^*\) through the Fokker-Planck equation. Behaviour determines density.

The full loop is:

$$\rho^* \xrightarrow{\text{HJB}} u^* \xrightarrow{\nabla} \alpha^* \xrightarrow{\text{FP}} \rho^*$$

This is the self-consistency condition that defines the MFG Nash equilibrium. Numerically, all three algorithms (fixed-point iteration, policy iteration, and Newton's method) are different strategies for finding this fixed point.

12.3.5 Full Numerical MFG Solver

We implement a complete MFG solver combining policy iteration for the HJB, the Scharfetter-Gummel scheme for the FP, and an outer fixed-point loop for self-consistency.

Full MFG Solver: Policy Iteration + Scharfetter-Gummel

(Interactive example: script.py, 233 lines.)
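
The original script is not reproduced here. The following condensed sketch shows the structure under illustrative assumptions: an assumed coupling \(F(x,\rho) = \rho + 0.2\cos(2\pi x)\), explicit time stepping in place of the implicit scheme, and assumed grid and damping parameters. It runs a backward HJB sweep with the upwind Hamiltonian, a forward FP sweep with Scharfetter-Gummel fluxes, and a damped outer fixed-point loop on the density path.

```python
import numpy as np

def solve_mfg(N=32, Nt=400, T=0.5, nu=0.1, theta=0.5, outer=30, tol=1e-6):
    """Condensed MFG solver sketch (illustrative, not the original script):
    explicit backward HJB + forward Scharfetter-Gummel FP + damped outer
    fixed point. The time step satisfies the explicit CFL conditions."""
    h, dt = 1.0 / N, T / Nt
    x = np.arange(N) * h
    pot = 0.2 * np.cos(2 * np.pi * x)       # illustrative potential term

    def B(z):                                # stable Bernoulli function
        out = np.ones_like(z)
        nz = np.abs(z) > 1e-12
        out[nz] = z[nz] / np.expm1(z[nz])
        return out

    def hjb_backward(R):                     # explicit sweep from t = T to 0
        u = np.zeros(N)                      # terminal cost g = 0 (assumed)
        U = np.zeros((Nt + 1, N)); U[Nt] = u
        for n in range(Nt - 1, -1, -1):
            Dm = (u - np.roll(u, 1)) / h
            Dp = (np.roll(u, -1) - u) / h
            H = 0.5 * np.maximum(Dm, 0)**2 + 0.5 * np.minimum(Dp, 0)**2
            lap = (np.roll(u, -1) - 2*u + np.roll(u, 1)) / h**2
            u = u + dt * (nu * lap - H + R[n] + pot)   # F(x, rho) = rho + pot
            U[n] = u
        return U

    def fp_forward(U):                       # SG sweep from t = 0 to T
        rho = 1.0 + 0.5 * np.cos(2 * np.pi * x)        # unit-mass initial density
        R = np.zeros((Nt + 1, N)); R[0] = rho
        for n in range(Nt):
            v = -(np.roll(U[n], -1) - np.roll(U[n], 1)) / (2*h)  # alpha = -grad u
            vf = 0.5 * (v + np.roll(v, -1))                      # value at face i+1/2
            z = vf * h / nu
            J = (nu / h) * (B(-z) * rho - B(z) * np.roll(rho, -1))
            rho = rho + (dt / h) * (np.roll(J, 1) - J)
            R[n + 1] = rho
        return R

    R = np.ones((Nt + 1, N))                 # initial guess: uniform density path
    res = np.inf
    for _ in range(outer):
        U = hjb_backward(R)
        R_new = fp_forward(U)
        res = np.max(np.abs(R_new - R))      # fixed-point residual
        R = (1 - theta) * R + theta * R_new  # damped update
        if res < tol:
            break
    return x, R, U, res

x, R, U, res = solve_mfg()
```

The damped update keeps the density path a convex combination of positive, unit-mass paths, so mass conservation and positivity survive the outer loop.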

12.3.6 Policy Iteration Convergence Analysis

We demonstrate the superlinear convergence of policy iteration compared to value iteration. The key diagnostic is the convergence rate: policy iteration achieves machine precision in 5–10 steps, while value iteration requires hundreds.

Policy Iteration vs Value Iteration: Convergence Comparison

(Interactive example: script.py, 142 lines.)
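
The original comparison script is likewise not reproduced. The sketch below sets up a small discounted control problem in the transition-matrix form of Section 12.3.1 and counts iterations for both methods; the grid, discount \(\lambda\), and control set are illustrative assumptions.

```python
import numpy as np

def build_P(alpha, N, h, dt, nu):
    """Row-stochastic transition matrix on a periodic grid: symmetric
    diffusion jumps plus an upwinded drift jump for control alpha."""
    d = nu * dt / h**2
    pr = d + dt * max(alpha, 0.0) / h
    pl = d + dt * max(-alpha, 0.0) / h
    P = np.zeros((N, N))
    idx = np.arange(N)
    P[idx, idx] = 1.0 - pl - pr
    P[idx, (idx + 1) % N] = pr
    P[idx, (idx - 1) % N] = pl
    return P

def compare(N=25, h=0.04, dt=5e-3, nu=0.1, lam=10.0):
    x = np.arange(N) * h
    F = np.cos(2 * np.pi * x)                     # illustrative state cost
    controls = np.linspace(-2.0, 2.0, 21)         # illustrative control set
    gamma = 1.0 / (1.0 + lam * dt)                # per-step discount factor
    Ps = [build_P(a, N, h, dt, nu) for a in controls]
    Cs = [dt * (0.5 * a**2 + F) for a in controls]

    def bellman(u):                               # one Bellman operator sweep
        vals = np.array([c + gamma * (P @ u) for P, c in zip(Ps, Cs)])
        return vals.min(axis=0), vals.argmin(axis=0)

    # value iteration: linear convergence with contraction factor gamma
    u_vi = np.zeros(N)
    n_vi = 0
    for n_vi in range(1, 3001):
        u_new, _ = bellman(u_vi)
        delta = np.max(np.abs(u_new - u_vi))
        u_vi = u_new
        if delta < 1e-12:
            break

    # policy iteration: exact evaluation (linear solve) + greedy improvement
    u_pi = np.zeros(N)
    pol = np.zeros(N, dtype=int)
    n_pi = 0
    for n_pi in range(1, 51):
        P = np.array([Ps[a][i] for i, a in enumerate(pol)])
        c = np.array([Cs[a][i] for i, a in enumerate(pol)])
        u_new = np.linalg.solve(np.eye(N) - gamma * P, c)
        delta = np.max(np.abs(u_new - u_pi))
        u_pi = u_new
        _, pol = bellman(u_pi)
        if delta < 1e-12:
            break
    return n_vi, n_pi, np.max(np.abs(u_vi - u_pi))

n_vi, n_pi, gap = compare()
```

Both methods solve the same finite Markov decision problem, so they agree to the stopping tolerance; only the iteration counts differ dramatically.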

Key Takeaways

  • The Achdou-Capuzzo-Dolcetta upwind scheme ensures monotonicity of the HJB discretisation, guaranteeing convergence to the viscosity solution.
  • The Scharfetter-Gummel exponential fitting scheme ensures positivity of the FP density, using the Bernoulli function \(B(z) = z/(e^z - 1)\).
  • Policy iteration converges superlinearly (Newton-like), typically in 5–10 steps, far outperforming value iteration.
  • The discrete MFG equilibrium has a circular structure: \(\rho^* \to u^* \to \alpha^* \to \rho^*\).
  • The transition-matrix duality, \(P(\alpha)\) for the HJB and \(P(\alpha^*)^T\) for the FP, encodes the adjoint relationship between optimality and consistency.