Part VI: Computational & Systems Biophysics | Chapter 4

Systems Biophysics

Gene regulatory networks, chemotaxis, Turing pattern formation, population dynamics, and noise in gene circuits

From Molecules to Living Systems

Systems biophysics applies physical and mathematical principles to understand the emergent behavior of biological systems — from gene regulatory circuits to developing organisms to ecosystems. While molecular biophysics focuses on individual components, systems biophysics asks how these components interact to produce collective behavior: switching, oscillations, pattern formation, adaptation, and robustness.

The quantitative framework draws on nonlinear dynamics, stochastic processes, reaction-diffusion theory, and information theory. A recurring theme is that simple interaction motifs — feedback loops, ultrasensitive switches, noise filters — generate the rich repertoire of cellular and organismal behavior.

1. Gene Regulatory Networks

Gene regulation underlies cellular decision-making: a cell's fate is determined by which genes are on or off. Transcription factors (TFs) bind to DNA regulatory elements to activate or repress gene expression, forming complex networks with feedback loops.

Derivation: The Hill Function for Gene Regulation

Consider a gene whose transcription rate depends on the concentration of an activating transcription factor [TF]. If $n$ TF molecules must bind cooperatively to the promoter to activate transcription, the binding equilibrium is:

$$n \cdot \text{TF} + \text{DNA} \rightleftharpoons \text{TF}_n\text{-DNA}$$

The equilibrium dissociation constant is $K^n = [\text{TF}]^n[\text{DNA}] / [\text{TF}_n\text{-DNA}]$. The fraction of promoters occupied is:

$$f = \frac{[\text{TF}]^n}{K^n + [\text{TF}]^n}$$

The protein production rate, written $[\text{protein}]$ below as the output of the regulated gene, is the sum of basal and activated transcription:

$$\boxed{[\text{protein}] = \alpha_0 + \alpha \frac{[\text{TF}]^n}{K^n + [\text{TF}]^n}}$$

where $\alpha_0$ is the basal rate and $\alpha$ is the maximum activated rate. This is the Hill function. The Hill coefficient $n$ controls the steepness of the response:

$n = 1$: Michaelis-Menten (hyperbolic) response — no cooperativity
$n = 2$: Moderate cooperativity (for comparison, hemoglobin O$_2$ binding has $n \approx 2.8$)
$n \to \infty$: Step function (digital switch)

Ultrasensitivity ($n > 1$): The input concentration range required to go from 10% to 90% activation is:

$$\frac{[\text{TF}]_{90}}{[\text{TF}]_{10}} = 81^{1/n}$$

For $n = 1$, this ratio is 81 (a gradual response); for $n = 4$, it is $81^{1/4} = 3$ (a sharp, switch-like response).
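The 10%-to-90% ratio can be checked numerically by inverting the Hill function. The following is a minimal sketch; the function names (`hill`, `input_at_fraction`, `ratio_90_10`) and the default parameter values are illustrative choices, not part of the derivation above.

```python
import numpy as np

def hill(tf, K=1.0, n=2.0, alpha0=0.0, alpha=1.0):
    """Hill activation: alpha0 + alpha * tf^n / (K^n + tf^n)."""
    tf = np.asarray(tf, dtype=float)
    return alpha0 + alpha * tf**n / (K**n + tf**n)

def input_at_fraction(frac, K=1.0, n=2.0):
    """Input [TF] at which the activated fraction equals frac (inverse Hill)."""
    return K * (frac / (1.0 - frac)) ** (1.0 / n)

def ratio_90_10(n):
    """Fold change in input needed to go from 10% to 90% activation."""
    return input_at_fraction(0.9, n=n) / input_at_fraction(0.1, n=n)

for n in (1, 2, 4):
    print(f"n = {n}: [TF]_90/[TF]_10 = {ratio_90_10(n):.3f}, "
          f"81^(1/n) = {81 ** (1 / n):.3f}")
```

The ratio does not depend on $K$, which only sets where the transition sits on the concentration axis.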

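Signal-to-noise ratio: For a Hill function response, the sensitivity (gain) is $g = d[\text{protein}]/d[\text{TF}]$. At the midpoint ($[\text{TF}] = K$), the gain (which approaches the maximum gain as $n$ grows) is:

$$g_{\max} = \frac{\alpha n}{4K}$$

The midpoint gain grows linearly with $n$, so a cooperative response amplifies small input changes around $K$ by a factor of $n$ compared with the non-cooperative case ($n = 1$). This improves the signal-to-noise ratio relative to noise introduced downstream of the response.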
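The midpoint gain formula can be compared against a finite-difference derivative. This is a small sketch with arbitrary illustrative values for $K$ and $\alpha$; the helper names are hypothetical.

```python
import numpy as np

def hill_activated(tf, K, n, alpha):
    """Activated part of the Hill response (basal rate omitted)."""
    return alpha * tf**n / (K**n + tf**n)

def midpoint_gain_numeric(K, n, alpha, h=1e-6):
    """Central-difference estimate of d(output)/d[TF] at [TF] = K."""
    upper = hill_activated(K + h, K, n, alpha)
    lower = hill_activated(K - h, K, n, alpha)
    return (upper - lower) / (2 * h)

K, alpha = 2.0, 10.0  # illustrative values
for n in (1, 2, 4):
    numeric = midpoint_gain_numeric(K, n, alpha)
    analytic = alpha * n / (4 * K)
    print(f"n = {n}: numeric gain = {numeric:.6f}, alpha*n/(4K) = {analytic:.6f}")
```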

Bistability from Positive Feedback

A gene that activates its own expression (autoactivation) with cooperativity can exhibit bistability — two stable steady states (ON and OFF) for the same parameter values. The steady-state equation for a self-activating gene with degradation rate $\gamma$ is:

$$\frac{d[P]}{dt} = \alpha \frac{[P]^n}{K^n + [P]^n} - \gamma [P] = 0$$

Graphically, the production rate (sigmoidal) and degradation rate (linear) can intersect at one or three points. Three intersections give two stable states (low and high expression) separated by an unstable intermediate. Bistability requires cooperativity ($n > 1$, so $n \geq 2$ for integer $n$) and a sufficiently large ratio $\alpha/(\gamma K)$; for $n = 2$ the threshold is $\alpha/(\gamma K) > 2$.
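The intersection picture can be reproduced numerically by locating the zeros of $d[P]/dt$ and checking the sign of its slope at each one. The sketch below assumes no basal expression (as in the equation above); the function names and the parameter values $\alpha = 3$, $K = 1$, $n = 2$, $\gamma = 1$ are illustrative.

```python
import numpy as np

def net_rate(P, alpha, K, n, gamma):
    """dP/dt for the self-activating gene: Hill production minus linear decay."""
    return alpha * P**n / (K**n + P**n) - gamma * P

def steady_states(alpha, K, n, gamma, n_grid=20001):
    """Return (P*, is_stable) pairs found by sign changes of dP/dt plus bisection."""
    args = (alpha, K, n, gamma)
    P_max = 1.5 * alpha / gamma          # production <= alpha, so P* <= alpha/gamma
    grid = np.linspace(0.0, P_max, n_grid)
    vals = net_rate(grid, *args)
    roots = [0.0]                        # P = 0 is always a steady state here
    for i in range(1, n_grid - 1):
        if vals[i] * vals[i + 1] < 0:    # sign change brackets a root
            lo, hi = grid[i], grid[i + 1]
            for _ in range(60):          # bisection refinement
                mid = 0.5 * (lo + hi)
                if net_rate(lo, *args) * net_rate(mid, *args) <= 0:
                    hi = mid
                else:
                    lo = mid
            roots.append(0.5 * (lo + hi))
    h = 1e-6                             # forward difference: negative slope => stable
    return [(P, bool(net_rate(P + h, *args) < net_rate(P, *args))) for P in roots]

for alpha in (1.5, 3.0):
    print(f"alpha/(gamma*K) = {alpha:.1f}")
    for P_star, stable in steady_states(alpha, K=1.0, n=2, gamma=1.0):
        print(f"  P* = {P_star:.4f}  ({'stable' if stable else 'unstable'})")
```

With these values the low ratio leaves only the OFF state, while the high ratio yields OFF, an unstable threshold, and ON.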

2. Bacterial Chemotaxis

Chemotaxis — directed cell movement along chemical gradients — is one of the best-understood signal transduction systems. E. coli navigates by alternating between "runs" (straight swimming) and "tumbles" (random reorientation), biasing the run length when moving up an attractant gradient.

Derivation: Keller-Segel Equations for Chemotaxis

At the population level, the density of cells $\rho(\mathbf{x}, t)$ in a chemoattractant field $c(\mathbf{x}, t)$ evolves according to the Keller-Segel equations:

$$\boxed{\frac{\partial\rho}{\partial t} = D\nabla^2\rho - \chi\nabla\cdot(\rho\nabla c)}$$

The first term is diffusion (random motility with coefficient $D$). The second term is chemotactic drift: cells move up the gradient of $c$ with chemotactic sensitivity $\chi$.

Deriving $\chi$ from receptor-ligand binding: The chemotactic response depends on the fraction of receptors bound to ligand. For a receptor with dissociation constant $K_d$:

$$f(c) = \frac{c}{K_d + c}$$

The chemotactic sensitivity is proportional to the derivative of receptor occupancy:

$$\chi = \chi_0 \frac{df}{dc} = \chi_0 \frac{K_d}{(K_d + c)^2}$$

This predicts maximum sensitivity at low ligand concentrations ($c \ll K_d$) and decreased sensitivity at saturation ($c \gg K_d$).

The chemoattractant may itself be produced, consumed, or diffuse:

$$\frac{\partial c}{\partial t} = D_c\nabla^2 c + h(c, \rho)$$

where $h(c, \rho)$ represents production/consumption (written $h$ to avoid confusion with the receptor occupancy $f(c)$). A famous result is that the Keller-Segel system can exhibit chemotactic collapse: in 2D, if chemotactic attraction is strong enough relative to diffusion (the threshold involves $\chi/D$ together with the total cell mass), the density blows up in finite time into singular concentrations, an idealized model of cells aggregating into dense clusters.
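The density equation can be explored with a very small 1D solver. The sketch below makes several simplifying assumptions that are not in the text: the attractant is a fixed linear ramp $c(x) = x$ (so the attractant equation is not solved), $\chi$ is constant, and the walls are no-flux. Under those assumptions the steady state should approach $\rho \propto e^{\chi c/D}$; this is not a simulation of chemotactic collapse. The function name and parameter values are illustrative.

```python
import numpy as np

def keller_segel_1d(D=1.0, chi=2.0, length=1.0, n_grid=50, t_end=1.0):
    """Explicit finite-volume solve of d(rho)/dt = -dJ/dx with
    J = -D d(rho)/dx + chi * rho * dc/dx and no-flux walls.
    Assumption: the attractant is a static linear ramp c(x) = x."""
    dx = length / n_grid
    x = (np.arange(n_grid) + 0.5) * dx          # grid-cell centers
    c = x.copy()                                # static attractant profile (assumed)
    rho = np.ones(n_grid)                       # uniform initial cell density
    dt = 0.2 * dx**2 / D                        # inside the explicit limit D*dt/dx^2 <= 0.5
    dcdx = np.diff(c) / dx                      # attractant gradient at interior faces
    for _ in range(int(round(t_end / dt))):
        rho_face = 0.5 * (rho[:-1] + rho[1:])   # density interpolated to faces
        flux = -D * np.diff(rho) / dx + chi * rho_face * dcdx
        flux = np.concatenate(([0.0], flux, [0.0]))  # zero flux through both walls
        rho = rho - dt * np.diff(flux) / dx     # conservative update
    return x, rho

x, rho = keller_segel_1d()
dx = x[1] - x[0]
print(f"total mass           = {rho.sum() * dx:.6f}")
print(f"rho(right)/rho(left) = {rho[-1] / rho[0]:.3f}")
print(f"exp(chi*(c_R-c_L)/D) = {np.exp(2.0 * (x[-1] - x[0]) / 1.0):.3f}")
```

Writing the update in flux form keeps the total cell number conserved to round-off, which is a useful sanity check before coupling in the attractant dynamics.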

Derivation: E. coli Run-and-Tumble Drift Velocity

An E. coli cell swims at speed $v \approx 20$ $\mu$m/s during runs (average duration $\tau_{\text{run}} \approx 1$ s) and randomly reorients during tumbles (duration $\tau_{\text{tumble}} \approx 0.1$ s). The tumble bias (fraction of time tumbling) is modulated by the chemotaxis signaling pathway.

In a gradient, runs up the gradient are extended (lower tumble rate) while runs down the gradient are shortened. Let $\theta$ be the angle between the swimming direction and the gradient. For small biases, the mean run duration is modulated by a factor $(1 + \beta\cos\theta)$:

$$\tau_{\text{run}}(\theta) = \tau_0(1 + \beta\cos\theta)$$

The drift velocity along the gradient direction, averaged over all orientations (in 3D):

$$v_{\text{drift}} = \frac{v\beta}{3}$$

where the factor 1/3 comes from averaging $\cos^2\theta$ over a sphere. Typically $\beta \sim 0.1\text{--}0.3$, which with $v \approx 20$ $\mu$m/s gives drift velocities of $\sim 0.7\text{--}2$ $\mu$m/s, much less than the swimming speed but sufficient for efficient navigation over cell-length scales.
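The orientation average behind $v_{\text{drift}} = v\beta/3$ can be checked with a quick Monte Carlo estimate. This sketch assumes isotropic run directions in 3D, uses the mean run duration $\tau_0(1 + \beta\cos\theta)$ for each run, and neglects the time spent tumbling; the function name and default values are illustrative.

```python
import numpy as np

def drift_velocity_mc(v=20.0, beta=0.2, tau0=1.0, n_runs=200_000, seed=0):
    """Monte Carlo estimate of the drift velocity along the gradient (um/s)."""
    rng = np.random.default_rng(seed)
    # Isotropic directions in 3D: cos(theta) is uniform on [-1, 1]
    cos_theta = rng.uniform(-1.0, 1.0, n_runs)
    tau = tau0 * (1.0 + beta * cos_theta)       # run duration biased by direction
    displacement = v * cos_theta * tau          # net motion along the gradient per run
    return displacement.sum() / tau.sum()       # total displacement / total run time

estimate = drift_velocity_mc()
print(f"Monte Carlo drift:  {estimate:.3f} um/s")
print(f"Analytic v*beta/3:  {20.0 * 0.2 / 3:.3f} um/s")
```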

3. Turing Pattern Formation

Alan Turing (1952) showed that two interacting chemicals that diffuse at different rates can spontaneously generate stable spatial patterns from an initially uniform state. This diffusion-driven instability is a fundamental mechanism for biological pattern formation.

Derivation: Turing Instability Conditions

Consider a two-component reaction-diffusion system with concentrations $u(\mathbf{x},t)$ (activator) and $v(\mathbf{x},t)$ (inhibitor):

$$\frac{\partial u}{\partial t} = D_u\nabla^2 u + f(u,v)$$

$$\frac{\partial v}{\partial t} = D_v\nabla^2 v + g(u,v)$$

Let $(u_0, v_0)$ be a spatially uniform steady state: $f(u_0, v_0) = 0$, $g(u_0, v_0) = 0$. Linearize about this state with perturbations $\tilde{u}, \tilde{v} \propto e^{\sigma t + i\mathbf{k}\cdot\mathbf{x}}$:

$$\begin{pmatrix} \sigma\tilde{u} \\ \sigma\tilde{v} \end{pmatrix} = \begin{pmatrix} f_u - D_u k^2 & f_v \\ g_u & g_v - D_v k^2 \end{pmatrix} \begin{pmatrix} \tilde{u} \\ \tilde{v} \end{pmatrix}$$

where $f_u = \partial f/\partial u|_0$, etc., are elements of the Jacobian, and $k = |\mathbf{k}|$ is the wavenumber. The growth rate $\sigma(k)$ satisfies the characteristic equation:

$$\sigma^2 - \sigma[\text{tr}(J_k)] + \det(J_k) = 0$$

where $J_k$ is the matrix above. For Turing instability, we need the uniform state to be stable without diffusion but unstable with diffusion:

Condition 1 (stable without diffusion, $k=0$):

$$f_u + g_v < 0 \quad \text{and} \quad f_u g_v - f_v g_u > 0$$

Condition 2 (unstable with diffusion for some $k > 0$): We need $\det(J_k) < 0$ for some $k$:

$$D_v f_u + D_u g_v > 2\sqrt{D_u D_v(f_u g_v - f_v g_u)}$$
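The step from $\det(J_k) < 0$ to this inequality can be written out explicitly. A short sketch, using only the matrix $J_k$ defined above and writing $q = k^2$ (a shorthand introduced here):

$$\det(J_k) = D_u D_v q^2 - (D_v f_u + D_u g_v)\,q + (f_u g_v - f_v g_u)$$

This is an upward-opening parabola in $q$ with its minimum at

$$q_{\min} = \frac{D_v f_u + D_u g_v}{2 D_u D_v}, \qquad \det(J_k)\big|_{q_{\min}} = (f_u g_v - f_v g_u) - \frac{(D_v f_u + D_u g_v)^2}{4 D_u D_v}$$

A physical mode needs $q_{\min} > 0$, and the minimum is negative exactly when $(D_v f_u + D_u g_v)^2 > 4 D_u D_v (f_u g_v - f_v g_u)$. Taking the positive square root gives the inequality above.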

This requires $D_v f_u + D_u g_v > 0$. Since the trace condition gives $f_u + g_v < 0$, the two diagonal terms must have opposite signs: $f_u > 0$ (activator self-activates) and $g_v < 0$ (inhibitor self-inhibits), with $D_v > D_u$ and in practice usually $D_v \gg D_u$ (the inhibitor must diffuse much faster than the activator). The canonical requirement is:

$$\boxed{d = \frac{D_v}{D_u} \gg 1 \quad \text{(long-range inhibition, short-range activation)}}$$

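Critical wavenumber: The critical wavenumber $k_c$ (the first mode to become unstable, and close to the fastest-growing mode near onset) is found by minimizing $\det(J_k)$ with respect to $k^2$ and evaluating at the onset of instability, where Condition 2 holds with equality: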

$$k_c^2 = \sqrt{\frac{f_u g_v - f_v g_u}{D_u D_v}}$$

The resulting pattern wavelength is:

$$\boxed{\lambda_c = \frac{2\pi}{k_c} = 2\pi\left(\frac{D_u D_v}{f_u g_v - f_v g_u}\right)^{1/4}}$$

Turing Pattern Formation: Linear Stability and 1D Simulation

Python


import numpy as np

# Turing Pattern Simulation on a 1D Grid
np.random.seed(42)

print("=== Turing Pattern Formation (1D Reaction-Diffusion) ===")
print()

# Schnakenberg model: a classic Turing system
# du/dt = Du * d2u/dx2 + a - u + u^2*v
# dv/dt = Dv * d2v/dx2 + b - u^2*v
# Parameters chosen for Turing instability

a = 0.1
b = 0.9
Du = 0.05
Dv = 1.0  # Dv >> Du (required for Turing instability)

# Steady state
u0 = a + b
v0 = b / (a + b)**2
print(f"Schnakenberg model: du/dt = Du*d2u/dx2 + a - u + u^2*v")
print(f"                    dv/dt = Dv*d2v/dx2 + b - u^2*v")
print(f"Parameters: a={a}, b={b}, Du={Du}, Dv={Dv}")
print(f"Diffusion ratio d = Dv/Du = {Dv/Du:.0f}")
print(f"Homogeneous steady state: u0={u0:.4f}, v0={v0:.4f}")
print()

# Jacobian at steady state
fu = -1 + 2*u0*v0
fv = u0**2
gu = -2*u0*v0
gv = -u0**2

print("Jacobian at steady state:")
print(f"  fu = {fu:.4f}, fv = {fv:.4f}")
print(f"  gu = {gu:.4f}, gv = {gv:.4f}")
print(f"  trace = {fu+gv:.4f} (must be < 0 for stability)")
print(f"  det = {fu*gv - fv*gu:.4f} (must be > 0 for stability)")

# Turing instability check
lhs = Dv*fu + Du*gv
rhs = 2*np.sqrt(Du*Dv*(fu*gv - fv*gu))
print(f"\nTuring condition: Dv*fu + Du*gv > 2*sqrt(Du*Dv*det(J))")
print(f"  LHS = {lhs:.4f}, RHS = {rhs:.4f}")
print(f"  Turing instability: {'YES' if lhs > rhs else 'NO'}")

# Critical wavenumber
kc2 = np.sqrt((fu*gv - fv*gu) / (Du*Dv))
kc = np.sqrt(kc2)
lambda_c = 2*np.pi/kc
print(f"\nCritical wavenumber: kc = {kc:.4f}")
print(f"Pattern wavelength: lambda_c = {lambda_c:.4f}")

# Dispersion relation: sigma(k)
print(f"\nDispersion relation sigma(k):")
print(f"{'k':>8} {'sigma_+':>12} {'sigma_-':>12} {'Unstable?':>10}")
k_vals = np.linspace(0, 5, 25)
for k in k_vals:
    k2 = k**2
    a11 = fu - Du*k2
    a12 = fv
    a21 = gu
    a22 = gv - Dv*k2
    tr = a11 + a22
    det = a11*a22 - a12*a21
    disc = tr**2 - 4*det
    if disc >= 0:
        sig_p = (tr + np.sqrt(disc))/2
        sig_m = (tr - np.sqrt(disc))/2
    else:
        sig_p = tr/2
        sig_m = tr/2
    unstable = "YES" if sig_p > 0 else "no"
    if k == 0 or abs(k - kc) < 0.3 or unstable == "YES" or k > 4:
        print(f"{k:>8.3f} {sig_p:>12.6f} {sig_m:>12.6f} {unstable:>10}")

# Numerical simulation of Turing pattern
print(f"\n=== Numerical Simulation ===")
L = 20.0  # domain length
N = 200    # grid points
dx = L / N
dt = 0.001
n_steps = 50000
print_interval = 10000

x = np.linspace(0, L, N, endpoint=False)
u = u0 + 0.01 * np.random.randn(N)
v = v0 + 0.01 * np.random.randn(N)

print(f"Domain: [0, {L}], Grid: {N} points, dx={dx:.4f}")
print(f"dt={dt}, Total steps: {n_steps}")
print()

# Diffusion operator (periodic boundaries)
def laplacian(field, dx):
    return (np.roll(field, 1) + np.roll(field, -1) - 2*field) / dx**2

for step in range(n_steps + 1):
    if step % print_interval == 0:
        u_min, u_max = np.min(u), np.max(u)
        v_min, v_max = np.min(v), np.max(v)
        u_std = np.std(u)
        print(f"Step {step:>6}: u=[{u_min:.4f}, {u_max:.4f}] std={u_std:.4f}  v=[{v_min:.4f}, {v_max:.4f}]")

    if step == n_steps:
        break

    # Reaction terms
    ru = a - u + u**2 * v
    rv = b - u**2 * v

    # Diffusion terms
    lap_u = laplacian(u, dx)
    lap_v = laplacian(v, dx)

    # Forward Euler
    u = u + dt * (Du * lap_u + ru)
    v = v + dt * (Dv * lap_v + rv)

# Print final spatial pattern (sampled)
print(f"\nFinal pattern (sampled every 10 grid points):")
print(f"{'x':>8} {'u':>10} {'v':>10}")
for i in range(0, N, 10):
    print(f"{x[i]:>8.2f} {u[i]:>10.4f} {v[i]:>10.4f}")

# Count peaks (pattern features)
peaks = 0
for i in range(1, N-1):
    if u[i] > u[i-1] and u[i] > u[i+1]:
        peaks += 1
observed_wavelength = L / max(peaks, 1)
print(f"\nNumber of peaks: {peaks}")
print(f"Observed wavelength: {observed_wavelength:.2f}")
print(f"Predicted wavelength: {lambda_c:.2f}")
print(f"Agreement: {100*min(observed_wavelength, lambda_c)/max(observed_wavelength, lambda_c):.1f}%")


4. Population Dynamics

Population dynamics describes how the abundances of interacting species change over time. The same mathematical framework applies to ecological populations, competing cell lineages, molecular species in a cell, or viral dynamics in an infection.

Derivation: Lotka-Volterra Predator-Prey System

Let $x(t)$ be the prey population and $y(t)$ the predator population. The Lotka-Volterra equations are:

$$\frac{dx}{dt} = \alpha x - \beta xy \quad \text{(prey: growth minus predation)}$$

$$\frac{dy}{dt} = \delta xy - \gamma y \quad \text{(predator: conversion minus death)}$$

Fixed points: (1) Trivial: $(x^*, y^*) = (0, 0)$ (unstable saddle); (2) Coexistence: $(x^*, y^*) = (\gamma/\delta, \alpha/\beta)$ (center — neutrally stable).

Conserved quantity: The Lotka-Volterra system has a conserved integral of motion. Divide the two equations:

$$\frac{dy}{dx} = \frac{\delta xy - \gamma y}{\alpha x - \beta xy} = \frac{y(\delta x - \gamma)}{x(\alpha - \beta y)}$$

Separating variables: $\frac{\alpha - \beta y}{y} dy = \frac{\delta x - \gamma}{x} dx$. Integrating both sides:

$$\alpha\ln y - \beta y = \delta x - \gamma\ln x + C$$

Rearranging:

$$\boxed{H(x, y) = \delta x - \gamma\ln x + \beta y - \alpha\ln y = \text{const}}$$

This conserved quantity means the orbits in $(x, y)$ phase space are closed curves around the coexistence fixed point. The populations oscillate periodically with the period depending on initial conditions. The system has no limit cycle; all solutions are periodic orbits.
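That $H$ is constant along trajectories can also be confirmed by direct differentiation, a quick check using only the two rate equations above:

$$\frac{dH}{dt} = \left(\delta - \frac{\gamma}{x}\right)\frac{dx}{dt} + \left(\beta - \frac{\alpha}{y}\right)\frac{dy}{dt} = (\delta x - \gamma)(\alpha - \beta y) + (\beta y - \alpha)(\delta x - \gamma) = 0$$

The two products are equal and opposite, so $H$ does not change in time for any initial condition with $x, y > 0$.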

Logistic growth and competitive exclusion: The logistic equation $dN/dt = rN(1 - N/K)$ incorporates a carrying capacity $K$. For two competing species:

$$\frac{dN_1}{dt} = r_1 N_1 \left(1 - \frac{N_1 + \alpha_{12}N_2}{K_1}\right)$$

$$\frac{dN_2}{dt} = r_2 N_2 \left(1 - \frac{N_2 + \alpha_{21}N_1}{K_2}\right)$$

The competitive exclusion principle (Gause): stable coexistence requires $\alpha_{12} < K_1/K_2$ AND $\alpha_{21} < K_2/K_1$ — each species must inhibit itself more than it inhibits the other. Otherwise, one species drives the other to extinction.
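The coexistence conditions can be tested by integrating the two competition equations directly. The sketch below uses a hand-written RK4 integrator and illustrative parameter values ($r_1 = 0.5$, $r_2 = 0.3$, $K_1 = 100$, $K_2 = 80$, so $K_1/K_2 = 1.25$ and $K_2/K_1 = 0.8$); the function names and initial condition are assumptions made for the example.

```python
import numpy as np

def competition_rhs(N, r1, r2, K1, K2, a12, a21):
    """Right-hand side of the two-species Lotka-Volterra competition model."""
    N1, N2 = N
    return np.array([
        r1 * N1 * (1 - (N1 + a12 * N2) / K1),
        r2 * N2 * (1 - (N2 + a21 * N1) / K2),
    ])

def simulate_competition(a12, a21, N0=(10.0, 10.0), r1=0.5, r2=0.3,
                         K1=100.0, K2=80.0, dt=0.1, t_end=1000.0):
    """Integrate with fixed-step RK4 and return the final (N1, N2)."""
    N = np.array(N0, dtype=float)
    args = (r1, r2, K1, K2, a12, a21)
    for _ in range(int(t_end / dt)):
        k1 = competition_rhs(N, *args)
        k2 = competition_rhs(N + 0.5 * dt * k1, *args)
        k3 = competition_rhs(N + 0.5 * dt * k2, *args)
        k4 = competition_rhs(N + dt * k3, *args)
        N = N + (dt / 6) * (k1 + 2 * k2 + 2 * k3 + k4)
    return N

cases = [
    ("both conditions hold", 0.6, 0.5),
    ("only a12 < K1/K2 holds", 0.5, 1.5),
    ("only a21 < K2/K1 holds", 1.5, 0.5),
]
for label, a12, a21 in cases:
    N1, N2 = simulate_competition(a12, a21)
    print(f"{label}: a12={a12}, a21={a21} -> N1={N1:.2f}, N2={N2:.2f}")
```

When only the first condition holds, species 1 can invade a resident population of species 2 but not the reverse, so species 1 excludes species 2; the roles swap when only the second condition holds.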

Lotka-Volterra Dynamics and Competitive Exclusion

Python


import numpy as np

# Lotka-Volterra Predator-Prey Dynamics
np.random.seed(42)

print("=== Lotka-Volterra Predator-Prey System ===")
print()

# Parameters
alpha = 1.0   # prey growth rate
beta = 0.5    # predation rate
delta = 0.2   # predator conversion efficiency
gamma = 0.3   # predator death rate

print(f"Parameters: alpha={alpha}, beta={beta}, delta={delta}, gamma={gamma}")
print(f"Fixed point: x* = gamma/delta = {gamma/delta:.2f}, y* = alpha/beta = {alpha/beta:.2f}")
print()

# RK4 integration
def lotka_volterra(state, params):
    x, y = state
    a, b, d, g = params
    dxdt = a*x - b*x*y
    dydt = d*x*y - g*y
    return np.array([dxdt, dydt])

def rk4_step(f, state, params, dt):
    k1 = f(state, params)
    k2 = f(state + 0.5*dt*k1, params)
    k3 = f(state + 0.5*dt*k2, params)
    k4 = f(state + dt*k3, params)
    return state + (dt/6)*(k1 + 2*k2 + 2*k3 + k4)

params = (alpha, beta, delta, gamma)

# Simulate multiple trajectories
dt = 0.01
T = 50
n_steps = int(T/dt)

initial_conditions = [
    (2.0, 1.0),
    (3.0, 0.5),
    (1.0, 2.0),
]

for x0, y0 in initial_conditions:
    state = np.array([x0, y0])
    H0 = delta*x0 - gamma*np.log(x0) + beta*y0 - alpha*np.log(y0)
    print(f"Initial condition: x0={x0}, y0={y0}, H={H0:.4f}")

    # Track time series
    times = []
    xs = []
    ys = []
    Hs = []

    for i in range(n_steps + 1):
        t = i * dt
        if i % (n_steps // 10) == 0:
            x, y = state
            H = delta*x - gamma*np.log(x) + beta*y - alpha*np.log(y)
            times.append(t)
            xs.append(x)
            ys.append(y)
            Hs.append(H)

        if i < n_steps:
            state = rk4_step(lotka_volterra, state, params, dt)

    print(f"{'Time':>8} {'Prey x':>10} {'Pred y':>10} {'H(x,y)':>10}")
    for t, x, y, H in zip(times, xs, ys, Hs):
        print(f"{t:>8.1f} {x:>10.4f} {y:>10.4f} {H:>10.6f}")

    H_drift = abs(Hs[-1] - Hs[0])
    print(f"  Conserved quantity drift: |H_final - H_initial| = {H_drift:.2e}")
    print()

# Phase portrait analysis
print("=== Phase Portrait Analysis ===")
print("Arrows showing direction of flow at grid points:")
print(f"{'x':>6} {'y':>6} {'dx/dt':>10} {'dy/dt':>10} {'Direction':>12}")
for x in [0.5, 1.0, 1.5, 2.0, 3.0]:
    for y in [0.5, 1.0, 1.5, 2.0, 3.0]:
        dxdt = alpha*x - beta*x*y
        dydt = delta*x*y - gamma*y
        if abs(dxdt) < 0.01 and abs(dydt) < 0.01:
            direction = "FIXED PT"
        elif abs(dxdt) > abs(dydt):
            direction = "right" if dxdt > 0 else "left"
        else:
            direction = "up" if dydt > 0 else "down"
        print(f"{x:>6.1f} {y:>6.1f} {dxdt:>10.4f} {dydt:>10.4f} {direction:>12}")

# Oscillation period estimation
print()
print("=== Oscillation Period ===")
x0, y0 = 2.0, 1.0
state = np.array([x0, y0])
dt_fine = 0.001
x_prev = x0

# Find period by tracking zero crossings of (x - x*)
x_star = gamma/delta
crossings = []
for i in range(100000):
    t = i * dt_fine
    state = rk4_step(lotka_volterra, state, params, dt_fine)
    x_curr = state[0]
    # Detect upward crossing through x*
    if x_prev < x_star and x_curr >= x_star:
        crossings.append(t)
    x_prev = x_curr

if len(crossings) >= 2:
    periods = [crossings[i+1] - crossings[i] for i in range(len(crossings)-1)]
    avg_period = np.mean(periods)
    print(f"Detected {len(crossings)} upward crossings of x = x* = {x_star:.2f}")
    print(f"Average period: {avg_period:.4f}")
    print(f"Linearized period (approx): {2*np.pi/np.sqrt(alpha*gamma):.4f}")
else:
    print("Insufficient crossings detected")

# Competitive exclusion
print()
print("=== Competitive Exclusion Principle ===")
print("Two species with logistic growth and competition:")
print()

r1, r2 = 0.5, 0.3
K1, K2 = 100, 80
scenarios = [
    ("Coexistence", 0.6, 0.5),
    ("Species 1 wins", 0.5, 1.5),
    ("Species 2 wins", 1.5, 0.5),
    ("Bistable (exclusion)", 1.5, 1.5),
]

for name, a12, a21 in scenarios:
    # cond1: species 1 can invade a resident population of species 2 (N2 = K2)
    # cond2: species 2 can invade a resident population of species 1 (N1 = K1)
    cond1 = a12 < K1/K2
    cond2 = a21 < K2/K1
    print(f"{name}: alpha12={a12}, alpha21={a21}")
    print(f"  Condition 1 (a12 < K1/K2 = {K1/K2:.2f}): {'YES' if cond1 else 'NO'}")
    print(f"  Condition 2 (a21 < K2/K1 = {K2/K1:.2f}): {'YES' if cond2 else 'NO'}")
    if cond1 and cond2:
        print(f"  -> Stable coexistence")
    elif not cond1 and not cond2:
        print(f"  -> Bistable: winner depends on initial conditions")
    elif cond1:
        print(f"  -> Species 1 wins (species 2 excluded)")
    else:
        print(f"  -> Species 2 wins (species 1 excluded)")
    print()


5. Noise in Gene Circuits

Gene expression is inherently stochastic: individual biochemical reactions (transcription, translation, degradation) occur as random events. This molecular noise generates cell-to-cell variability even in genetically identical populations under identical conditions.

Derivation: Intrinsic vs Extrinsic Noise

The total variance in protein expression across a population has two contributions:

$$\sigma_{\text{total}}^2 = \sigma_{\text{int}}^2 + \sigma_{\text{ext}}^2$$

Intrinsic noise ($\sigma_{\text{int}}^2$): arises from the stochastic nature of individual biochemical reactions (random binding of polymerase, stochastic mRNA production and degradation). It is specific to each gene copy.

Extrinsic noise ($\sigma_{\text{ext}}^2$): arises from fluctuations in shared cellular components (ribosomes, RNA polymerase, cell volume, growth rate). It affects all genes in a cell equally.

Two-reporter decomposition (Elowitz et al., 2002): Express two identical but distinguishable reporters (e.g., CFP and YFP) from identical promoters in the same cell. Let $x_1$ and $x_2$ be their expression levels:

$$\sigma_{\text{int}}^2 = \frac{1}{2}\langle(x_1 - x_2)^2\rangle$$

$$\sigma_{\text{ext}}^2 = \langle x_1 x_2\rangle - \langle x_1\rangle\langle x_2\rangle$$

The logic: when both reporters are high or both are low (correlated), the fluctuation is extrinsic (shared cause). When the two reporters differ within the same cell (one high, the other low), the difference is intrinsic (independent stochastic events).
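Why these two estimators isolate the two noise sources can be seen in a minimal additive model (an assumption made here for illustration, not taken from the original experiment's analysis). Write $x_i = \mu + e + \epsilon_i$, where $e$ is a shared extrinsic fluctuation with variance $\sigma_{\text{ext}}^2$, and $\epsilon_1, \epsilon_2$ are intrinsic fluctuations each with variance $\sigma_{\text{int}}^2$; all three have zero mean and are mutually independent. Then:

$$\frac{1}{2}\langle(x_1 - x_2)^2\rangle = \frac{1}{2}\langle(\epsilon_1 - \epsilon_2)^2\rangle = \frac{1}{2}\left(\sigma_{\text{int}}^2 + \sigma_{\text{int}}^2\right) = \sigma_{\text{int}}^2$$

$$\langle x_1 x_2\rangle - \langle x_1\rangle\langle x_2\rangle = \langle(e + \epsilon_1)(e + \epsilon_2)\rangle = \langle e^2\rangle = \sigma_{\text{ext}}^2$$

The shared term cancels in the difference, and the independent terms drop out of the covariance.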

The noise strength is conventionally measured by the squared coefficient of variation:

$$\eta^2 = \frac{\sigma^2}{\langle x\rangle^2} = \eta_{\text{int}}^2 + \eta_{\text{ext}}^2$$

Derivation: Fano Factor and Transcriptional Bursting

For a simple birth-death process (constitutive production at rate $k$, degradation at rate $\gamma$), the steady-state distribution is Poisson with mean $\langle n\rangle = k/\gamma$ and:

$$\text{Fano factor} = F = \frac{\sigma^2}{\langle n\rangle} = 1 \quad \text{(Poisson)}$$

In reality, gene expression occurs in bursts: the promoter switches stochastically between ON and OFF states, and during each ON period, multiple mRNAs are produced. The three-stage model:

$$\text{Gene}_{\text{OFF}} \underset{k_{\text{off}}}{\overset{k_{\text{on}}}{\rightleftharpoons}} \text{Gene}_{\text{ON}} \xrightarrow{k_m} \text{mRNA} \xrightarrow{k_p} \text{Protein}$$

In the bursty limit, where ON periods are brief and infrequent ($k_{\text{off}} \gg k_{\text{on}}$ and $k_{\text{off}} \gg \gamma_m$), the mRNA is produced in geometrically distributed bursts of mean size:

$$b = \frac{k_m}{k_{\text{off}}}$$

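Each mRNA is in turn translated into a burst of proteins. The protein burst size is $b_p = k_p / \gamma_m$ (mean number of proteins per mRNA). For constitutive transcription and proteins that are long-lived compared to mRNA ($\gamma_p \ll \gamma_m$), the resulting Fano factor for protein number is: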

$$\boxed{F = 1 + b_p = 1 + \frac{k_p}{\gamma_m}}$$

More generally, when both transcriptional and translational bursting contribute:

$$F = 1 + b \quad \text{where } b = b_{\text{transcription}} \times b_{\text{translation}}$$

Experimental measurements in E. coli and yeast show $F \approx 2\text{--}10$ (super-Poissonian), consistent with burst sizes of 1–9 proteins. In mammalian cells, $F$ can be much larger due to infrequent, large transcriptional bursts.

Stochastic Gene Expression and Noise Decomposition

Python


import numpy as np

# Gene Circuit Noise Simulation using Gillespie Algorithm
np.random.seed(42)

print("=== Stochastic Gene Expression: Gillespie Algorithm ===")
print()

# Simple birth-death process first
def gillespie_birth_death(k_prod, gamma, n_init, t_max, max_events=100000):
    """Simulate constitutive gene expression"""
    t = 0
    n = n_init
    times = [t]
    counts = [n]

    events = 0
    while t < t_max and events < max_events:
        rate_prod = k_prod
        rate_deg = gamma * n
        rate_total = rate_prod + rate_deg

        if rate_total == 0:
            break

        # Time to next event (exponential)
        dt = -np.log(np.random.random()) / rate_total
        t += dt

        # Which event?
        if np.random.random() < rate_prod / rate_total:
            n += 1  # production
        else:
            n -= 1  # degradation

        times.append(t)
        counts.append(n)
        events += 1

    return np.array(times), np.array(counts)

# 1. Simple constitutive expression (should be Poisson)
print("1. Constitutive Expression (Birth-Death Process)")
k_prod = 10.0  # production rate
gamma = 0.1    # degradation rate
mean_expected = k_prod / gamma

print(f"   k_prod = {k_prod}, gamma = {gamma}")
print(f"   Expected mean = k/gamma = {mean_expected:.0f}")
print(f"   Expected Fano factor = 1 (Poisson)")
print()

# Run many cells
n_cells = 500
final_counts = []
for _ in range(n_cells):
    times, counts = gillespie_birth_death(k_prod, gamma, int(mean_expected), 200)
    # Take one snapshot per cell near t = 150 (the state right after the first
    # event at t >= 150), well past the relaxation time 1/gamma = 10
    late_idx = np.searchsorted(times, 150)
    if late_idx < len(counts):
        final_counts.append(counts[late_idx])

final_counts = np.array(final_counts)
mean_obs = np.mean(final_counts)
var_obs = np.var(final_counts)
fano_obs = var_obs / mean_obs if mean_obs > 0 else 0

print(f"   Simulated ({n_cells} cells):")
print(f"   Mean = {mean_obs:.2f} (expected {mean_expected:.0f})")
print(f"   Variance = {var_obs:.2f} (expected {mean_expected:.0f})")
print(f"   Fano factor = {fano_obs:.3f} (expected 1.000)")

# Distribution
bins_edges = np.arange(int(mean_obs - 4*np.sqrt(var_obs)), int(mean_obs + 4*np.sqrt(var_obs)) + 2)
hist, _ = np.histogram(final_counts, bins=bins_edges)
print(f"\n   Distribution (histogram):")
print(f"   {'Count':>8} {'Frequency':>10}")
for i, h in enumerate(hist):
    if h > 0:
        count_val = (bins_edges[i] + bins_edges[i+1]) / 2
        bar = '*' * int(h * 40 / max(hist))
        print(f"   {count_val:>8.0f} {h:>10} {bar}")

# 2. Bursty expression
print()
print("2. Bursty Transcription (ON/OFF promoter)")

def gillespie_bursty(k_on, k_off, k_transcribe, gamma_m, k_translate, gamma_p, t_max, max_events=200000):
    """Three-stage model: promoter switching + mRNA + protein"""
    t = 0
    gene_on = False  # start OFF
    mRNA = 0
    protein = 0
    times = [t]
    proteins = [protein]

    events = 0
    while t < t_max and events < max_events:
        # Rates
        r_on = k_on if not gene_on else 0
        r_off = k_off if gene_on else 0
        r_txn = k_transcribe if gene_on else 0
        r_mrna_deg = gamma_m * mRNA
        r_tln = k_translate * mRNA
        r_prot_deg = gamma_p * protein

        rate_total = r_on + r_off + r_txn + r_mrna_deg + r_tln + r_prot_deg
        if rate_total == 0:
            break

        # Time to next event (exponential)
        dt = -np.log(np.random.random()) / rate_total
        t += dt

        # Pick the reaction by walking the cumulative rates
        rand = np.random.random() * rate_total
        if rand < r_on:
            gene_on = True                      # promoter OFF -> ON
        elif rand < r_on + r_off:
            gene_on = False                     # promoter ON -> OFF
        elif rand < r_on + r_off + r_txn:
            mRNA += 1                           # transcription
        elif rand < r_on + r_off + r_txn + r_mrna_deg:
            mRNA -= 1                           # mRNA degradation
        elif rand < r_on + r_off + r_txn + r_mrna_deg + r_tln:
            protein += 1                        # translation
        else:
            protein -= 1                        # protein degradation

        times.append(t)
        proteins.append(protein)
        events += 1

    return np.array(times), np.array(proteins)

# Parameters for bursty expression
k_on = 0.05     # slow switching -> bursty
k_off = 0.5
k_txn = 5.0
gamma_m = 0.5
k_tln = 2.0
gamma_p = 0.01

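burst_size = k_txn / k_off * k_tln / gamma_m  # mean proteins made per promoter ON period
frac_on = k_on / (k_on + k_off)
mean_protein = frac_on * k_txn / gamma_m * k_tln / gamma_p
# F = 1 + k_tln/gamma_m accounts for translational bursting only; promoter
# ON/OFF switching adds further variance, so the simulated value should exceed it.
expected_fano = 1 + k_tln / gamma_m

print(f"   k_on={k_on}, k_off={k_off}, k_txn={k_txn}")
print(f"   gamma_m={gamma_m}, k_tln={k_tln}, gamma_p={gamma_p}")
print(f"   Fraction ON = {frac_on:.3f}")
print(f"   Expected mean protein = {mean_protein:.1f}")
print(f"   Mean proteins per ON period = {burst_size:.1f}")
print(f"   Translational burst size b = k_tln/gamma_m = {k_tln/gamma_m:.1f}")
print(f"   Fano factor from translational bursting alone: F = 1 + b = {expected_fano:.1f}")
print()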
print()

# Run multiple cells
n_cells_bursty = 300
final_proteins = []
for cell in range(n_cells_bursty):
    times, prots = gillespie_bursty(k_on, k_off, k_txn, gamma_m, k_tln, gamma_p, 500, 200000)
    # Sample at late time
    late_idx = np.searchsorted(times, 400)
    if late_idx < len(prots):
        final_proteins.append(prots[late_idx])

final_proteins = np.array(final_proteins)
mean_p = np.mean(final_proteins)
var_p = np.var(final_proteins)
fano_p = var_p / mean_p if mean_p > 0 else 0

print(f"   Simulated ({len(final_proteins)} cells):")
print(f"   Mean protein = {mean_p:.1f}")
print(f"   Variance = {var_p:.1f}")
print(f"   Fano factor = {fano_p:.2f} (translational bursting alone gives {expected_fano:.1f}; promoter switching adds more)")
print(f"   CV^2 = {var_p/mean_p**2:.4f}")

# 3. Two-reporter noise decomposition
print()
print("3. Two-Reporter Noise Decomposition")
print("   (Simulating two independent copies of the same gene in each cell)")

n_cells_2rep = 200
reporter1 = []
reporter2 = []

for _ in range(n_cells_2rep):
    # Shared extrinsic factor (e.g., cell volume variation)
    extrinsic_factor = np.exp(0.2 * np.random.randn())  # log-normal

    # Two independent reporters with shared extrinsic noise
    k_prod_cell = k_prod * extrinsic_factor
    gamma_cell = gamma

    # Reporter 1
    t1, c1 = gillespie_birth_death(k_prod_cell, gamma_cell, int(mean_expected), 200)
    idx1 = np.searchsorted(t1, 150)
    r1 = c1[min(idx1, len(c1)-1)]

    # Reporter 2 (independent intrinsic, same extrinsic)
    t2, c2 = gillespie_birth_death(k_prod_cell, gamma_cell, int(mean_expected), 200)
    idx2 = np.searchsorted(t2, 150)
    r2 = c2[min(idx2, len(c2)-1)]

    reporter1.append(r1)
    reporter2.append(r2)

r1 = np.array(reporter1, dtype=float)
r2 = np.array(reporter2, dtype=float)

# Noise decomposition
eta_total_sq = np.var(r1) / np.mean(r1)**2
sigma_int_sq = 0.5 * np.mean((r1 - r2)**2)
sigma_ext_sq = np.mean(r1 * r2) - np.mean(r1) * np.mean(r2)
sigma_total_sq = np.var(r1)

eta_int_sq = sigma_int_sq / np.mean(r1)**2
eta_ext_sq = sigma_ext_sq / np.mean(r1)**2

print(f"   Noise decomposition:")
print(f"   eta^2_total   = {eta_total_sq:.4f}")
print(f"   eta^2_intrinsic = {eta_int_sq:.4f} ({100*eta_int_sq/max(eta_total_sq,1e-10):.1f}%)")
print(f"   eta^2_extrinsic = {eta_ext_sq:.4f} ({100*eta_ext_sq/max(eta_total_sq,1e-10):.1f}%)")
print(f"   Sum (int+ext)   = {eta_int_sq + eta_ext_sq:.4f}")
print(f"   Correlation(r1,r2) = {np.corrcoef(r1,r2)[0,1]:.3f}")
print(f"   (High correlation => extrinsic noise dominates)")


6. Applications

Synthetic Biology

Synthetic biology designs and constructs novel genetic circuits with desired behaviors: toggle switches (bistable circuits for cellular memory), repressilators (oscillatory circuits), logic gates, and pulse generators. The design principles from systems biophysics — feedback topology, cooperativity, noise filtering — guide the engineering of these circuits. Applications include biosensors, engineered metabolic pathways, and therapeutic circuits (e.g., CAR-T cells with synthetic logic).

Developmental Biology

Turing-type mechanisms underlie many developmental patterns: digit formation in limbs (BMP-WNT interactions), skin pigmentation in zebrafish and mammals, hair follicle spacing, and lung branching morphogenesis. Reaction-diffusion models, refined with experimentally measured parameters, now make quantitative predictions that are tested with genetic perturbations.

Epidemiology and Ecology

SIR (Susceptible-Infected-Recovered) models are Lotka-Volterra-type systems applied to disease transmission. The basic reproduction number $R_0$ determines whether an epidemic occurs. Population dynamics models (predator-prey, competition, mutualism) inform conservation biology, fisheries management, and understanding biodiversity. Stochastic models account for demographic noise in small populations, critical for extinction risk assessment.

Chapter Summary

• Hill function $\alpha[TF]^n/(K^n + [TF]^n)$ models cooperative gene regulation; $n > 1$ gives ultrasensitivity.
• Keller-Segel equations describe chemotaxis with diffusion and directed migration; sensitivity $\chi \propto K_d/(K_d+c)^2$.
• Turing instability requires short-range activation and long-range inhibition ($D_v/D_u \gg 1$); pattern wavelength $\lambda_c = 2\pi/k_c$.
• Lotka-Volterra predator-prey has a conserved quantity $H = \delta x - \gamma\ln x + \beta y - \alpha\ln y$; stable coexistence of competitors requires each species to limit itself more than it limits the other, otherwise competitive exclusion occurs.
• Gene expression noise decomposes into intrinsic and extrinsic components; the Fano factor $F = 1 + b$ quantifies super-Poissonian bursting.
