Part VII: Medical Biophysics & Metabolism | Chapter 3

Entropy, Information & Aging

Shannon information in biology, thermodynamics of aging, free radical damage, telomere dynamics, and the maximum entropy production principle

The Arrow of Time in Biology: Entropy, Information, and the Aging Process

Aging is perhaps the most universal and least understood biological phenomenon. From a thermodynamic perspective, aging represents the gradual loss of a biological system's ability to maintain its far-from-equilibrium state. Schrödinger recognized in 1944 that living systems maintain order by feeding on "negative entropy" from their environment. Shannon's information theory (1948) provided the mathematical framework to quantify biological information, from the genome to gene regulatory networks.

This chapter connects three deep ideas: (1) information theory as applied to biological systems, (2) the thermodynamics of aging as entropy accumulation, and (3) specific molecular mechanisms (ROS damage, telomere shortening) interpreted through the lens of thermodynamic and information-theoretic principles. We explore how the maximum entropy production principle may govern biological organization itself.

1. Information Theory in Biology

Claude Shannon's 1948 paper "A Mathematical Theory of Communication" introduced the concept of information entropy, which quantifies uncertainty and information content. This framework has profound applications in molecular biology.

Derivation: Shannon Entropy and the Information Content of DNA

Shannon entropy for a discrete random variable with $n$ possible outcomes, each with probability $p_i$, is:

$$H = -\sum_{i=1}^{n} p_i \log_2(p_i) \quad \text{(bits)}$$

For DNA, there are 4 possible bases at each position (A, T, G, C). If all bases are equally probable ($p_i = 1/4$):

$$H_{\max} = -4 \times \frac{1}{4}\log_2\frac{1}{4} = -\log_2\frac{1}{4} = \log_2 4 = 2 \text{ bits per base pair}$$

The human genome contains approximately $3.2 \times 10^9$ base pairs, so the maximum information content is:

$$I_{\max} = 3.2 \times 10^9 \times 2 = 6.4 \times 10^9 \text{ bits} \approx 6.4 \text{ Gbits} \approx 800 \text{ MB}$$

However, the actual information content is less because: (1) base frequencies are not uniform (GC content varies), (2) there are extensive repetitive sequences (~50% of the genome), and (3) only ~1.5% codes for proteins. The effective information content is estimated at roughly 50–100 MB.
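The entropy calculation above can be checked numerically. The following is a minimal sketch: the skewed base composition is a hypothetical illustration of non-uniform base usage, not a measured value, and a single-base entropy of this kind ignores the repeat structure mentioned in point (2).

```python
import math

def shannon_entropy_bits(probabilities):
    """Shannon entropy H = -sum(p_i * log2(p_i)) in bits; p = 0 terms contribute 0."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

# Uniform base usage reproduces the 2 bits per base pair upper bound.
uniform = [0.25, 0.25, 0.25, 0.25]
h_max = shannon_entropy_bits(uniform)

# ASSUMPTION: illustrative skewed composition (A, T, G, C) with ~41% GC content.
skewed = [0.295, 0.295, 0.205, 0.205]
h_skewed = shannon_entropy_bits(skewed)

GENOME_BP = 3.2e9                     # base pairs, as quoted in the text
max_bits = GENOME_BP * h_max          # upper bound on information content
max_megabytes = max_bits / 8 / 1e6    # 8 bits per byte, 10^6 bytes per MB

print(f"H_max = {h_max:.3f} bits/bp, skewed composition = {h_skewed:.3f} bits/bp")
print(f"Upper bound: {max_bits:.2e} bits = {max_megabytes:.0f} MB")
```

Non-uniform base frequencies alone lower the per-base entropy only slightly, which suggests that most of the reduction toward the 50–100 MB estimate comes from redundancy (repeats and long-range correlations) rather than from base composition.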

Mutual information between gene expression and environment quantifies how much information about the environment is encoded in gene expression patterns:

$$I(X;Y) = \sum_{x,y} p(x,y) \log_2 \frac{p(x,y)}{p(x)p(y)}$$

where $X$ represents gene expression states and $Y$ represents environmental conditions. This equals $I(X;Y) = H(X) + H(Y) - H(X,Y)$. For a perfectly adapted organism, the mutual information between its internal state and the environment is maximized — the organism is a model of its environment, as originally proposed by Conant and Ashby (1970).
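As a small sketch of the mutual information formula, the code below evaluates $I(X;Y)$ for two hypothetical 2×2 joint distributions (a gene that is either off or on, and an environment with a nutrient either absent or present). The probability tables are invented for illustration only.

```python
import math

def entropy_bits(probs):
    """Shannon entropy in bits of a list of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def mutual_information_bits(joint):
    """I(X;Y) in bits from a joint probability table joint[x][y]."""
    px = [sum(row) for row in joint]          # marginal over gene states
    py = [sum(col) for col in zip(*joint)]    # marginal over environments
    info = 0.0
    for i, row in enumerate(joint):
        for j, pxy in enumerate(row):
            if pxy > 0:
                info += pxy * math.log2(pxy / (px[i] * py[j]))
    return info

# ASSUMPTION: hypothetical tables; rows = gene state (off, on),
# columns = environment (nutrient absent, nutrient present).
joint_tracking = [[0.45, 0.05],
                  [0.05, 0.45]]      # expression mostly follows the environment
joint_independent = [[0.25, 0.25],
                     [0.25, 0.25]]   # expression ignores the environment

i_tracking = mutual_information_bits(joint_tracking)
i_independent = mutual_information_bits(joint_independent)

# Cross-check with I(X;Y) = H(X) + H(Y) - H(X,Y).
flat = [p for row in joint_tracking for p in row]
i_from_entropies = (entropy_bits([sum(r) for r in joint_tracking])
                    + entropy_bits([sum(c) for c in zip(*joint_tracking)])
                    - entropy_bits(flat))

print(f"tracking: {i_tracking:.3f} bits, independent: {i_independent:.3f} bits")
```

A regulatory system whose expression state follows the environment carries a positive number of bits about it, while an expression pattern that is statistically independent of the environment carries none.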

2. The Thermodynamics of Aging

Aging can be understood as the progressive departure from the optimal steady state of a dissipative structure. The thermodynamic framework connects entropy production, metabolic rate, and the degradation of biological order.

Derivation: Entropy Production Rate and Aging

According to non-equilibrium thermodynamics, the entropy production rate of a living system can be written as a sum over irreversible processes $k$ of the product of each flux $J_k$ and its conjugate thermodynamic force $X_k$:

$$\sigma = \frac{dS_i}{dt} = \sum_k J_k X_k \geq 0$$

For metabolic processes, the dominant contribution comes from the dissipation of chemical free energy:

$$\sigma \approx \frac{1}{T}\sum_r \dot{\xi}_r (-\Delta G_r)$$

where $\dot{\xi}_r$ is the rate of reaction $r$ and $\Delta G_r$ is its free energy change. Prigogine's minimum entropy production theorem states that for a system near equilibrium with fixed boundary conditions, the steady state minimizes the entropy production rate:

$$\frac{d\sigma}{dt} \leq 0 \quad \text{(near equilibrium, linear regime)}$$
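To get a feel for the magnitude of the chemical dissipation term $\sigma \approx \frac{1}{T}\sum_r \dot{\xi}_r(-\Delta G_r)$, the sketch below evaluates it for a deliberately crude case. The single lumped reaction, its rate, and the assumption that all of the free energy is dissipated are illustrative choices, not values from this chapter.

```python
def entropy_production_rate(reactions, temperature_k):
    """Return sigma = (1/T) * sum_r rate_r * (-dG_r) in W/K.

    reactions: iterable of (rate in mol/s, delta_g in J/mol) pairs.
    """
    dissipated_power_w = sum(rate * (-delta_g) for rate, delta_g in reactions)
    return dissipated_power_w / temperature_k

# ASSUMPTION: whole-body metabolism lumped into one reaction (complete
# oxidation of glucose, dG of about -2870 kJ/mol) at an illustrative rate
# chosen so that roughly 100 W is dissipated.
reactions = [(3.5e-5, -2.87e6)]
T_BODY_K = 310.0  # about 37 degrees C

sigma = entropy_production_rate(reactions, T_BODY_K)
print(f"sigma = {sigma:.3f} W/K")
```

Under these assumptions the entropy production rate is a few tenths of a watt per kelvin, and it scales linearly with each reaction rate.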

In the context of aging, we can model the organism's state as characterized by a set of order parameters $\{q_i\}$ that describe its organization. The free energy of maintaining this organization is:

$$G_{\text{maint}} = \sum_i G_i(q_i)$$

Aging corresponds to the gradual increase in the cost of maintaining organization, as repair mechanisms become less efficient. The entropy production rate increases with age as the system departs from its optimal steady state:

$$\sigma(t) = \sigma_0 + \alpha t + \beta t^2$$

where $\sigma_0$ is the basal entropy production, $\alpha$ represents linear degradation, and $\beta$ represents accelerating deterioration. Disease states correspond to sudden increases in $\sigma$ — departure from the minimum entropy production steady state.

The total entropy produced over a lifetime is $S_{\text{total}} = \int_0^{t_{\text{life}}} \sigma(t) \, dt$. Interestingly, the lifetime entropy production per unit mass is approximately constant across mammalian species, on the order of $10^4$ kJ/(kg·K), consistent with a universal "entropy budget" for life.
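For the quadratic model $\sigma(t) = \sigma_0 + \alpha t + \beta t^2$, the lifetime integral has the closed form $S_{\text{total}} = \sigma_0 t_{\text{life}} + \alpha t_{\text{life}}^2/2 + \beta t_{\text{life}}^3/3$. The sketch below compares this with a numerical integration; the parameter values are hypothetical and in arbitrary units.

```python
def sigma(t, sigma0, alpha, beta):
    """Entropy production rate sigma(t) = sigma0 + alpha*t + beta*t^2."""
    return sigma0 + alpha * t + beta * t**2

def lifetime_entropy_closed_form(t_life, sigma0, alpha, beta):
    """Exact integral of sigma(t) from 0 to t_life."""
    return sigma0 * t_life + alpha * t_life**2 / 2 + beta * t_life**3 / 3

def lifetime_entropy_trapezoid(t_life, sigma0, alpha, beta, steps=10_000):
    """Trapezoid-rule approximation of the same integral."""
    dt = t_life / steps
    total = 0.0
    for k in range(steps):
        left = sigma(k * dt, sigma0, alpha, beta)
        right = sigma((k + 1) * dt, sigma0, alpha, beta)
        total += 0.5 * (left + right) * dt
    return total

# ASSUMPTION: illustrative parameters (arbitrary units, time in years).
params = dict(sigma0=1.0, alpha=0.005, beta=0.0001)
T_LIFE = 80.0

s_exact = lifetime_entropy_closed_form(T_LIFE, **params)
s_numeric = lifetime_entropy_trapezoid(T_LIFE, **params)
print(f"closed form: {s_exact:.4f}, trapezoid: {s_numeric:.4f}")
```

With these values the $\beta t^2$ term contributes about as much to the lifetime total as the linear term, even though $\beta$ is numerically small, because it is weighted by $t_{\text{life}}^3$.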

3. Free Radical Theory and Redox Biology

Denham Harman proposed in 1956 that aging is caused by the accumulation of oxidative damage from reactive oxygen species (ROS). While the picture has become more nuanced, the thermodynamics of redox biology remain central to understanding aging.

Derivation: Mitochondrial ROS Production and Damage Accumulation

The Nernst equation for a mitochondrial redox couple determines the tendency for electron leak:

$$E = E^{\circ\prime} - \frac{RT}{nF}\ln\frac{[\text{Red}]}{[\text{Ox}]}$$

When the electron transport chain is highly reduced (high NADH/NAD$^+$ ratio), single electrons can "leak" to molecular oxygen:

$$\text{O}_2 + e^- \to \text{O}_2^{\bullet-} \quad (E^{\circ\prime} = -0.33 \text{ V})$$
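The Nernst equation above can be evaluated directly. The sketch below uses the NAD$^+$/NADH couple ($E^{\circ\prime} \approx -0.32$ V, $n = 2$) as an example; the three concentration ratios are hypothetical and chosen only to show the trend.

```python
import math

R_GAS = 8.314       # gas constant, J/(mol K)
FARADAY = 96485.0   # Faraday constant, C/mol

def nernst_potential(e0_prime_v, n_electrons, ratio_red_to_ox, temperature_k=310.0):
    """E = E0' - (RT / nF) * ln([Red]/[Ox]), in volts."""
    thermal_term = (R_GAS * temperature_k) / (n_electrons * FARADAY)
    return e0_prime_v - thermal_term * math.log(ratio_red_to_ox)

E0_NADH = -0.32  # V, standard value for the NAD+/NADH couple at pH 7

# ASSUMPTION: illustrative [NADH]/[NAD+] ratios, not measured values.
for ratio in (0.1, 1.0, 10.0):
    e = nernst_potential(E0_NADH, 2, ratio)
    print(f"[Red]/[Ox] = {ratio:>4}: E = {e:+.4f} V")
```

A more reduced pool (larger [Red]/[Ox]) shifts the potential to more negative values, that is, toward a stronger tendency to donate electrons.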

The rate of superoxide production depends on the redox state of the electron carriers. For a simplified model where electron leak rate is proportional to the fraction of reduced carriers:

$$R_{\text{ROS}} = k_{\text{leak}} \cdot f_{\text{red}} \cdot [\text{O}_2]$$

where $f_{\text{red}}$ is the fraction of reduced ubiquinone. Under normal conditions, about 0.1–2% of electrons leak to form superoxide. The steady-state ROS concentration is determined by the balance of production and scavenging:

$$\frac{d[\text{ROS}]}{dt} = R_{\text{ROS}} - k_{\text{SOD}}[\text{SOD}][\text{O}_2^{\bullet-}] - k_{\text{GPx}}[\text{GPx}][\text{H}_2\text{O}_2] - k_{\text{cat}}[\text{Cat}][\text{H}_2\text{O}_2]$$

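Lumping the scavenging terms into a single effective first-order sink, $k_{\text{scav}}[\text{Antioxidants}][\text{ROS}]$, and setting $d[\text{ROS}]/dt = 0$ gives the steady-state ROS concentration: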

$$[\text{ROS}]_{ss} = \frac{R_{\text{ROS}}}{k_{\text{scav}}[\text{Antioxidants}]}$$

The damage accumulation model describes oxidative damage as the integral of net damage rate over time:

$$D(t) = D_0 + \int_0^t (k_{\text{damage}} \cdot [\text{ROS}]_{ss} - k_{\text{repair}}) \, d\tau$$

If repair capacity declines with accumulated damage ($k_{\text{repair}}(D) = k_{r,0}(1 - D/D_{\max})$):

$$\frac{dD}{dt} = k_{\text{damage}} \cdot [\text{ROS}]_{ss} - k_{r,0}\left(1 - \frac{D}{D_{\max}}\right)$$

Because repair capacity falls as damage rises, this equation contains positive feedback: the right-hand side grows linearly with $D$, so damage accumulates slowly when young (repair nearly keeps up) and then accelerates roughly exponentially, with rate constant $k_{r,0}/D_{\max}$, once repair capacity is overwhelmed. Death occurs when $D \to D_{\max}$. The mitochondrial "vicious cycle" hypothesis adds that damaged mitochondria produce more ROS, creating further positive feedback: $R_{\text{ROS}}(D) = R_0(1 + \gamma D)$.
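The damage equation can be explored numerically. The sketch below integrates it with a simple forward-Euler step, optionally including the vicious-cycle term (assuming antioxidant capacity stays fixed, so that $[\text{ROS}]_{ss}$ scales with $R_{\text{ROS}}$). All parameter values are hypothetical, dimensionless, and chosen so that the net damage rate starts out slightly positive; time is read as years.

```python
def ros_steady_state(k_leak, f_red, o2, k_scav, antioxidants):
    """[ROS]_ss = R_ROS / (k_scav * [Antioxidants]), with R_ROS = k_leak * f_red * [O2]."""
    r_ros = k_leak * f_red * o2
    return r_ros / (k_scav * antioxidants)

def simulate_damage(ros_ss, k_damage, k_r0, d_max,
                    gamma=0.0, d0=0.0, dt=0.01, t_end=200.0):
    """Forward-Euler integration of
    dD/dt = k_damage * ROS_ss * (1 + gamma * D) - k_r0 * (1 - D / D_max).
    Integration stops when D reaches D_max. Returns (times, damage) lists.
    """
    times, damage = [0.0], [d0]
    t, d = 0.0, d0
    while t < t_end and d < d_max:
        ros = ros_ss * (1.0 + gamma * d)                  # vicious-cycle feedback
        rate = k_damage * ros - k_r0 * (1.0 - d / d_max)  # damage minus repair
        d = min(d + rate * dt, d_max)
        t += dt
        times.append(t)
        damage.append(d)
    return times, damage

# ASSUMPTION: illustrative parameter values only.
ROS_SS = ros_steady_state(k_leak=1.0, f_red=0.5, o2=1.0, k_scav=1.0, antioxidants=1.0)
times, damage = simulate_damage(ROS_SS, k_damage=0.102, k_r0=0.05, d_max=1.0)
times_vc, damage_vc = simulate_damage(ROS_SS, k_damage=0.102, k_r0=0.05, d_max=1.0, gamma=0.5)

print(f"D reaches D_max at t = {times[-1]:.1f} (no feedback)")
print(f"D reaches D_max at t = {times_vc[-1]:.1f} (vicious cycle, gamma = 0.5)")
```

In this toy parameterization damage stays small for decades and then rises steeply, and switching on the vicious-cycle feedback brings the time at which $D$ reaches $D_{\max}$ forward.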

4. Telomere Dynamics

Leonard Hayflick discovered in 1961 that normal human cells can only divide a limited number of times (the Hayflick limit, ~50–70 divisions). This is now understood to result from telomere shortening.

Derivation: Hayflick Limit from Telomere Shortening

Telomeres are repetitive DNA sequences (TTAGGG)$_n$ at chromosome ends that shorten with each cell division due to the end-replication problem. After $n$ divisions:

$$L(n) = L_0 - n \cdot \Delta L$$

where $L_0$ is the initial telomere length (typically 10–15 kbp in humans), and $\Delta L \approx 50$–200 bp per division (varies by cell type). Cells enter senescence when telomeres reach a critical length $L_{\text{crit}} \approx 4$–6 kbp, triggering the DNA damage response.

The Hayflick limit is therefore:

$$n_{\max} = \frac{L_0 - L_{\text{crit}}}{\Delta L}$$

For typical values $L_0 = 12$ kbp, $L_{\text{crit}} = 5$ kbp, $\Delta L = 100$ bp:

$$n_{\max} = \frac{12000 - 5000}{100} = 70 \text{ divisions}$$
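The same arithmetic can be wrapped in a short function to see how sensitive the estimate is to the parameter ranges quoted above. This is a minimal sketch; the function names are ours, and the combinations of extreme values are only meant to bracket the estimate.

```python
def telomere_length(n_divisions, l0_bp, delta_l_bp):
    """L(n) = L0 - n * dL, in base pairs."""
    return l0_bp - n_divisions * delta_l_bp

def hayflick_limit(l0_bp, l_crit_bp, delta_l_bp):
    """Whole number of divisions before the telomere reaches L_crit."""
    return (l0_bp - l_crit_bp) // delta_l_bp

# Typical values from the text.
n_typical = hayflick_limit(12_000, 5_000, 100)

# Extremes of the quoted ranges (L0 10-15 kbp, L_crit 4-6 kbp, dL 50-200 bp).
n_low = hayflick_limit(10_000, 6_000, 200)
n_high = hayflick_limit(15_000, 4_000, 50)

print(f"typical: {n_typical} divisions, range: {n_low} to {n_high} divisions")
```

The typical values reproduce 70 divisions, while the extremes of the quoted ranges span 20 to 220 divisions, so the predicted limit depends strongly on the per-division loss $\Delta L$.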

Telomerase kinetics: In stem cells and cancer cells, telomerase extends telomeres. The dynamics become:

$$\frac{dL}{dt} = k_{\text{tel}} - \frac{\Delta L}{\tau_{\text{div}}}$$

where $k_{\text{tel}}$ is the telomerase elongation rate and $\tau_{\text{div}}$ is the cell division time. The immortalization condition is:

$$k_{\text{tel}} \geq \frac{\Delta L}{\tau_{\text{div}}}$$

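When the condition holds with equality, telomere length stays constant; when the inequality is strict, the constant-rate model above has no finite steady state and telomeres lengthen linearly:

$$L(t) = L(0) + \left(k_{\text{tel}} - \frac{\Delta L}{\tau_{\text{div}}}\right) t$$

A finite steady-state length $L_{ss}$ therefore requires an elongation rate that decreases with telomere length, which this model does not include.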

More precisely, if we model telomerase as adding $\delta$ bp per division when active (with probability $p_{\text{tel}}$): $\langle\Delta L_{\text{net}}\rangle = p_{\text{tel}} \cdot \delta - \Delta L$. The Hayflick limit is abolished when $p_{\text{tel}} \cdot \delta > \Delta L$. In cancer, telomerase reactivation (85% of cancers) or the ALT pathway (15%) achieves this, enabling unlimited replication.
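The probabilistic telomerase model can be sketched as a simple per-division simulation. The values of $p_{\text{tel}}$ and $\delta$ below are hypothetical, and the function names are ours; the fixed random seed only makes the run reproducible.

```python
import random

def expected_net_change(p_tel, delta_bp, loss_bp):
    """<dL_net> = p_tel * delta - dL, in bp per division (positive = net elongation)."""
    return p_tel * delta_bp - loss_bp

def simulate_lineage(l0_bp, l_crit_bp, loss_bp, p_tel, delta_bp,
                     max_divisions=1000, seed=0):
    """Follow one cell lineage; return the division at which L <= L_crit,
    or None if the lineage is still above L_crit after max_divisions."""
    rng = random.Random(seed)
    length = l0_bp
    for n in range(1, max_divisions + 1):
        length -= loss_bp                 # end-replication loss
        if rng.random() < p_tel:          # telomerase acts in this division
            length += delta_bp
        if length <= l_crit_bp:
            return n
    return None

# ASSUMPTION: illustrative telomerase parameters (delta = 300 bp per event).
no_telomerase = simulate_lineage(12_000, 5_000, 100, p_tel=0.0, delta_bp=300)
weak_telomerase = simulate_lineage(12_000, 5_000, 100, p_tel=0.2, delta_bp=300)
strong_telomerase = simulate_lineage(12_000, 5_000, 100, p_tel=0.5, delta_bp=300)

print("net change per division:",
      expected_net_change(0.2, 300, 100), "bp (weak),",
      expected_net_change(0.5, 300, 100), "bp (strong)")
print("senescence at division:", no_telomerase, weak_telomerase, strong_telomerase)
```

Without telomerase the lineage senesces at the deterministic limit; with $p_{\text{tel}} \cdot \delta < \Delta L$ senescence is delayed but still occurs; with $p_{\text{tel}} \cdot \delta > \Delta L$ the simulated lineage does not reach $L_{\text{crit}}$ within the simulated window.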

5. Maximum Entropy Production Principle

The maximum entropy production principle (MEPP) proposes that non-equilibrium systems tend to organize in ways that maximize entropy production, subject to constraints. This provides a thermodynamic rationale for biological complexity.

Derivation: MEPP and Biological Organization

Consider a system receiving energy flux $\Phi_{\text{in}}$ from the environment. The entropy production rate depends on the internal organization:

$$\sigma = \frac{\Phi_{\text{in}}}{T_{\text{hot}}} - \frac{\Phi_{\text{out}}}{T_{\text{cold}}} + \frac{dS_{\text{internal}}}{dt}$$

At steady state ($\Phi_{\text{in}} = \Phi_{\text{out}} = \Phi$):

$$\sigma = \Phi\left(\frac{1}{T_{\text{cold}}} - \frac{1}{T_{\text{hot}}}\right)$$

Maximizing $\sigma$ means maximizing the energy flux $\Phi$ through the system. A more complex internal structure (with more degradation pathways) can process more energy flux, thus producing more entropy. This connects to Jaynes' maximum entropy inference: the most probable macroscopic state is the one consistent with constraints that maximizes entropy.
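The steady-state expression $\sigma = \Phi\,(1/T_{\text{cold}} - 1/T_{\text{hot}})$ is simple enough to evaluate directly. In the sketch below the flux and the two temperatures are hypothetical illustration values (a hot source near the effective temperature of sunlight and a cold sink near ambient temperature).

```python
def steady_state_entropy_production(flux_w, t_hot_k, t_cold_k):
    """sigma = Phi * (1/T_cold - 1/T_hot), in W/K."""
    return flux_w * (1.0 / t_cold_k - 1.0 / t_hot_k)

# ASSUMPTION: illustrative numbers, not values from the text.
T_HOT_K = 5800.0   # roughly the effective temperature of solar radiation
T_COLD_K = 300.0   # roughly ambient temperature

for flux in (50.0, 100.0, 200.0):
    sigma = steady_state_entropy_production(flux, T_HOT_K, T_COLD_K)
    print(f"Phi = {flux:6.1f} W  ->  sigma = {sigma:.4f} W/K")
```

For fixed reservoir temperatures the entropy production is proportional to the throughput $\Phi$, which is why, in this picture, any structure that increases the energy flux it can process also increases $\sigma$.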

For a biological system with internal degrees of freedom $\{q_i\}$ subject to constraints $\{f_k(q_i) = c_k\}$, the MEPP predicts that the system evolves to maximize:

$$\mathcal{L} = \sigma(\{q_i\}) - \sum_k \lambda_k (f_k(\{q_i\}) - c_k)$$

where $\lambda_k$ are Lagrange multipliers for the constraints. This leads to the Physiological Fitness Landscape: the fitness of an organism can be mapped as a function in the space of physiological parameters, with the optimum corresponding to the maximum entropy production state consistent with genetic and environmental constraints.

The MEPP has been applied to explain: (1) the evolution of metabolic networks that maximize energy dissipation, (2) the emergence of multicellularity as a strategy for increased entropy production, (3) the scaling of metabolic rate with body size (Kleiber's law as an optimization result), and (4) ecosystem structure and succession. While controversial, the MEPP provides a unifying thermodynamic framework for understanding biological organization and its inevitable degradation in aging.

Derivation: Information Loss Through Mutation Accumulation

Somatic mutations accumulate with age at a rate of approximately $\mu \approx 40$ mutations per cell division per genome. The total mutation burden after $n$ divisions is:

$$M(n) = \mu \cdot n$$

The information loss per mutation depends on the context. For a coding sequence, each point mutation changes one codon, potentially altering the amino acid. The information change per mutation in a random sequence position is:

$$\Delta I = -\log_2\left(\frac{1}{3}\right) \approx 1.58 \text{ bits (choosing one of 3 alternative bases)}$$

However, most of the genome is non-coding, and many coding mutations are synonymous (silent) or occur in non-essential genes. The effective information loss rate depends on the fraction of the genome under selective constraint. For the ~5% of the human genome under purifying selection:

$$\dot{I}_{\text{loss}} = 0.05 \times \mu \times 1.58 \text{ bits per division}$$

Over a lifetime of approximately $10^{16}$ total cell divisions across all tissues, with most cells dividing $\sim$40 times, individual cells accumulate $\sim$1600 somatic mutations. Most are passengers; the probability of at least one driver mutation in a specific oncogene is:

$$P(\text{driver}) = 1 - (1 - p_{\text{driver}})^{M(n)}$$

where $p_{\text{driver}} \sim 10^{-7}$ per mutation. This creates a direct link between aging (mutation accumulation) and cancer risk, which increases approximately as the 5th–6th power of age for most cancers. In the multi-hit (Armitage–Doll) model, incidence scales as $t^{k-1}$ for $k$ rate-limiting driver events, so this age dependence corresponds to roughly 6–7 such events.
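The numbers in this derivation can be reproduced in a few lines. The sketch below uses the values quoted in the text ($\mu = 40$, 5% of the genome under constraint, $p_{\text{driver}} = 10^{-7}$); the function names are ours.

```python
import math

MU = 40                            # mutations per division per genome (from the text)
BITS_PER_MUTATION = math.log2(3)   # one of 3 alternative bases, about 1.58 bits
CONSTRAINED_FRACTION = 0.05        # fraction of genome under purifying selection

def mutation_burden(n_divisions):
    """M(n) = mu * n."""
    return MU * n_divisions

def info_loss_per_division_bits():
    """Effective information loss rate: 0.05 * mu * log2(3) bits per division."""
    return CONSTRAINED_FRACTION * MU * BITS_PER_MUTATION

def driver_probability(n_mutations, p_driver=1e-7):
    """P(driver) = 1 - (1 - p_driver)^M, computed with log1p/expm1
    to avoid round-off when p_driver is tiny."""
    return -math.expm1(n_mutations * math.log1p(-p_driver))

m_cell = mutation_burden(40)       # about 40 divisions per cell
print(f"M = {m_cell} mutations per cell")
print(f"information loss = {info_loss_per_division_bits():.2f} bits per division")
print(f"P(driver in one oncogene) = {driver_probability(m_cell):.2e} per cell")
```

Because $p_{\text{driver}} \cdot M \ll 1$, the driver probability is very close to the linear approximation $p_{\text{driver}} \cdot M(n)$, so in this regime the per-cell risk for a single oncogene grows almost linearly with the number of divisions.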

Gompertz Law of Mortality and Thermodynamic Interpretation

Benjamin Gompertz (1825) observed that the human mortality rate increases exponentially with age after maturity:

$$h(t) = h_0 \cdot e^{\gamma t}$$

where $h(t)$ is the hazard function (instantaneous mortality rate), $h_0$ is the baseline mortality, and $\gamma \approx 0.085$ per year for humans (mortality doubles every $\ln 2 / 0.085 \approx 8.2$ years). The survival function is:

$$S(t) = \exp\left(-\frac{h_0}{\gamma}(e^{\gamma t} - 1)\right)$$

The thermodynamic interpretation connects $\gamma$ to the rate of entropy accumulation: as damage increases exponentially, the system's ability to maintain homeostasis degrades exponentially, leading to exponentially increasing failure probability. The Gompertz constant $\gamma$ correlates with mass-specific metabolic rate across species, supporting the connection between metabolic entropy production and aging rate.
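The Gompertz hazard and survival functions are easy to evaluate. In the sketch below $\gamma = 0.085$ per year is taken from the text, while the baseline $h_0 = 10^{-4}$ per year is a hypothetical value chosen for illustration, with $t$ measured in years from maturity. The median survival time follows from setting $S(t) = 1/2$, which gives $t_{1/2} = \frac{1}{\gamma}\ln\left(1 + \frac{\gamma \ln 2}{h_0}\right)$.

```python
import math

def gompertz_hazard(t, h0, gamma):
    """h(t) = h0 * exp(gamma * t)."""
    return h0 * math.exp(gamma * t)

def gompertz_survival(t, h0, gamma):
    """S(t) = exp(-(h0 / gamma) * (exp(gamma * t) - 1))."""
    return math.exp(-(h0 / gamma) * math.expm1(gamma * t))

def mortality_doubling_time(gamma):
    """Time for the hazard to double: ln(2) / gamma."""
    return math.log(2) / gamma

def median_survival_time(h0, gamma):
    """Solve S(t) = 1/2 for t."""
    return math.log1p(gamma * math.log(2) / h0) / gamma

GAMMA = 0.085   # per year, from the text
H0 = 1e-4       # per year; ASSUMPTION, illustrative baseline mortality

print(f"doubling time = {mortality_doubling_time(GAMMA):.2f} years")
print(f"median survival = {median_survival_time(H0, GAMMA):.1f} years after maturity")
for age in (20, 40, 60, 80):
    print(f"t = {age}: h = {gompertz_hazard(age, H0, GAMMA):.2e}/yr, "
          f"S = {gompertz_survival(age, H0, GAMMA):.3f}")
```

Because $h_0$ enters the median only through a logarithm, even large changes in baseline mortality shift the median lifespan by a modest number of years, whereas changes in $\gamma$ rescale it almost proportionally.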

6. Applications

Caloric Restriction and Lifespan Extension

Caloric restriction (CR) by 20–40% extends lifespan in virtually every model organism studied, from yeast to primates. The thermodynamic interpretation: CR reduces metabolic rate and thus entropy production rate $\sigma$. With reduced electron transport flux, the fraction of electrons leaking to form ROS decreases. Additionally, CR activates sirtuins and AMPK pathways that enhance repair mechanisms, effectively reducing $k_{\text{damage}}$ and increasing $k_{\text{repair}}$ in the damage accumulation equation.

Cancer as an Entropy Phenomenon

From an information-theoretic perspective, cancer represents a loss of information in the genome. Each mutation increases the Shannon entropy of the genome: the original precisely specified sequence becomes progressively randomized. The Hallmarks of Cancer can be reinterpreted as progressive loss of regulatory information. The minimum number of "driver" mutations needed ($\sim$3–7) corresponds to the minimum information loss required to escape normal growth control.

Biomarkers of Biological Age

Biological age can differ from chronological age. The epigenetic clock (Horvath, 2013) uses DNA methylation patterns at ~350 CpG sites to predict biological age with remarkable accuracy ($r > 0.96$). From an information perspective, the epigenetic clock measures the loss of epigenetic information: young cells have precise methylation patterns (low entropy), while aged cells show stochastic drift (increased entropy). Telomere length, mitochondrial DNA copy number, and inflammatory markers provide complementary measures of the thermodynamic "age" of the system.
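The entropy interpretation of epigenetic drift can be illustrated with the binary entropy of per-site methylation fractions. This is not how the epigenetic clock itself is computed; it is only a sketch of the "precise pattern versus stochastic drift" idea, and the methylation fractions below are invented for illustration.

```python
import math

def binary_entropy_bits(p):
    """H(p) = -p*log2(p) - (1-p)*log2(1-p); zero for fully (un)methylated sites."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

def mean_site_entropy(methylation_fractions):
    """Average per-site entropy over a set of CpG sites."""
    total = sum(binary_entropy_bits(p) for p in methylation_fractions)
    return total / len(methylation_fractions)

# ASSUMPTION: hypothetical fractions of cells methylated at four CpG sites.
young = [0.02, 0.98, 0.05, 0.95]   # sites close to fully off or fully on
aged = [0.15, 0.80, 0.25, 0.70]    # same sites after drift toward intermediate values

h_young = mean_site_entropy(young)
h_aged = mean_site_entropy(aged)
print(f"mean entropy: young = {h_young:.3f} bits/site, aged = {h_aged:.3f} bits/site")
```

Sites that drift from near 0 or 1 toward intermediate methylation fractions gain entropy, with a maximum of 1 bit per site at a fraction of 0.5.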

Historical Context

  • Schrödinger (1944): "What is Life?" introduced negentropy and predicted that genetic information is stored in an "aperiodic crystal" (the double-helix structure of DNA was determined 9 years later).
  • Shannon (1948): "A Mathematical Theory of Communication" provided the mathematical framework for quantifying biological information.
  • Harman (1956): Proposed the free radical theory of aging, one of the most influential theories in gerontology.
  • Hayflick (1961): Discovered the replicative limit of normal cells, later explained by telomere biology (Blackburn, Greider, Szostak: 2009 Nobel Prize).
  • Prigogine (1977 Nobel): Dissipative structures theory provided the thermodynamic framework for understanding living systems as far-from-equilibrium structures.

7. Computational Exploration

Entropy, Information & Aging: Telomere Dynamics, ROS Damage, and Epigenetic Drift


Chapter Summary

  • Shannon entropy $H = -\sum p_i \log_2 p_i$ quantifies biological information: DNA carries at most 2 bits/bp, an upper bound of ~6.4 Gbits for the human genome.
  • Aging thermodynamics: entropy production rate increases with age as repair mechanisms decline; Prigogine's minimum entropy production theorem connects steady-state metabolism to health.
  • ROS damage accumulates slowly at first and then accelerates as repair capacity declines, reinforced by a mitochondrial vicious cycle ($R_{\text{ROS}} \propto 1 + \gamma D$); antioxidant capacity and repair determine the damage trajectory.
  • Telomere dynamics: $L(n) = L_0 - n\Delta L$ predicts the Hayflick limit; telomerase activity above $\Delta L/\tau_{\text{div}}$ confers immortalization.
  • The maximum entropy production principle suggests biological organization arises from constrained entropy maximization, providing a thermodynamic fitness landscape.
  • Caloric restriction extends lifespan by reducing ROS production and enhancing repair, consistent with thermodynamic predictions.