Module 2: Mitochondrial Genome (mtDNA)

The human mitochondrial genome is a 16 569 bp closed circular molecule encoding 37 genes: 13 OxPhos proteins, 22 tRNAs, and 2 rRNAs. It is present in hundreds to thousands of copies per cell (1–2 per nucleoid), inherited maternally, and accumulates mutations at 10–100× the nuclear rate. Heteroplasmy, the germline bottleneck, the mitochondrial Eve coalescent, and molecular-clock dating of human dispersal all flow from this architecture.

1. The 16 569 bp Circular Genome

Anderson et al. (1981, Nature) published the first complete human mtDNA sequence, the Cambridge Reference Sequence (CRS), and established the compact architecture: 16 569 base pairs encoding 37 genes packed on both strands, with no introns and few non-coding bases. The 2 rRNAs (12S at \(\sim\)0.95 kb, 16S at \(\sim\)1.6 kb) are accompanied by 22 tRNAs, dispersed around the circle as punctuation marks between genes, and 13 protein-coding genes for the ETC and ATP synthase.

37 human mtDNA genes:

  • Complex I (NADH dehydrogenase): ND1, ND2, ND3, ND4, ND4L, ND5, ND6 (7 subunits)
  • Complex III (cytochrome bc1): cytochrome b (1 subunit)
  • Complex IV (cytochrome c oxidase): COX1, COX2, COX3 (3 subunits)
  • Complex V (ATP synthase): ATP6, ATP8 (2 subunits)
  • rRNAs: 12S (MT-RNR1), 16S (MT-RNR2)
  • tRNAs: 22 tRNAs (one for each amino acid except Leu and Ser, which have two)

The two strands are distinguished by G+T asymmetry. The heavy strand (H) is purine-rich and encodes 12 of the 13 proteins, both rRNAs, and 14 of the 22 tRNAs. The light strand (L) encodes only ND6 and 8 tRNAs. The heavy/light labels derive from buoyant density in alkaline CsCl gradients, in which the G+T-rich strand bands at the higher density.

The only substantial non-coding region is the D-loop or control region (~1.1 kb), containing the heavy-strand origin of replication (OH), three transcription promoters (HSP1 and HSP2 on H; LSP on L), and the triple-stranded displacement loop itself. The control region contains the hypervariable segments used in forensic and phylogeographic analysis; its mutation rate is roughly 5× that of the coding region.

\[ \text{L}\ =\ 16\,569\ \text{bp}\;;\quad \text{37 genes}\;;\quad \text{coding density}\;\approx\;93\% \]

Compare: 1.5% coding density in the 3 Gb human nuclear genome. mtDNA is compressed to the limit.
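
The 93% figure can be checked with a line of arithmetic. The sketch below assumes the ~1.1 kb control region is the only non-coding DNA (the few short intergenic spacers are ignored), and the constant names are illustrative.

```python
# Rough coding-density check for human mtDNA.
# Assumption: the ~1.1 kb control region is the only non-coding sequence.
GENOME_BP = 16_569
CONTROL_REGION_BP = 1_100  # approximate size quoted in the text

coding_density = (GENOME_BP - CONTROL_REGION_BP) / GENOME_BP
print(f"coding density ~ {coding_density:.1%}")
```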

2. A Different Genetic Code

Mitochondria use a slightly different genetic code from the nuclear/cytoplasmic system (Barrell, Bankier & Drouin 1979). In human mtDNA:

  • UGA codes tryptophan instead of STOP
  • AGA and AGG are STOP codons instead of arginine
  • AUA codes methionine instead of isoleucine
  • Only 22 tRNAs are needed (wobble is extremely permissive)

Each mtDNA-encoded mRNA is translated on dedicated mitoribosomes, which are structurally distinct from cytoplasmic 80S ribosomes (39S large + 28S small, with a higher protein-to-rRNA ratio than cytoplasmic or bacterial ribosomes; Amunts, Brown et al. 2015, Science). Mitoribosomes are tethered to the IMM because all 13 proteins they make are membrane-embedded and must be co-translationally inserted.

The use of a different code is a barrier to gene transfer to the nucleus: a gene that escapes to the nucleus and is translated on cytoplasmic ribosomes will be misread, because UGA (Trp in mitochondria) is read as STOP and truncates the protein, while AUA is read as Ile rather than Met and AGA/AGG as Arg rather than STOP. This partly explains why the 13 retained OxPhos genes have never successfully migrated: the selective cost of each intermediate is high.
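
The effect of the reassigned codons can be illustrated with a toy translation routine. This is a minimal sketch: the standard table is written out in UCAG codon order, the mitochondrial table is derived from it using only the four reassignments listed above, and the 15-nt message `TOY_MRNA` is invented for illustration (it is not a real gene).

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, amino acids in UCAG x UCAG x UCAG codon order ("*" = STOP).
STANDARD_AA = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODONS = ["".join(c) for c in product(BASES, repeat=3)]
STANDARD = dict(zip(CODONS, STANDARD_AA))

# Human mitochondrial code: apply the reassignments described in the text.
MITO = dict(STANDARD)
MITO.update({"UGA": "W", "AUA": "M", "AGA": "*", "AGG": "*"})


def translate(mrna, table):
    """Translate codon by codon, stopping at the first STOP codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        aa = table[mrna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


TOY_MRNA = "AUGAUAUGACUUUAA"  # hypothetical message: AUG AUA UGA CUU UAA
print("mitoribosome :", translate(TOY_MRNA, MITO))      # reads UGA as Trp
print("cytoplasm    :", translate(TOY_MRNA, STANDARD))  # stops at UGA
```

Read with the mitochondrial table the toy message yields four residues; read with the standard table it terminates at the UGA codon after two.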

3. Replication, Transcription, and POLG

mtDNA is replicated by DNA polymerase γ (POLG), a heterotrimer of one catalytic α-subunit (POLG1) and two accessory β-subunits (POLG2). POLG1 has both polymerase and 3′→5′ exonuclease (proofreading) activities. The accessory subunits enhance processivity and nucleotide selectivity. DNA helicase TWINKLE unwinds the template; single-stranded DNA-binding protein mtSSB stabilizes the exposed strand.

Two models of mtDNA replication have been debated for decades. In Clayton’s classical strand-displacement model (1982), heavy-strand synthesis initiates at OH and proceeds two-thirds of the way around the circle, displacing the parental H-strand as a single-stranded loop, before light-strand synthesis begins at OL. The RITOLS model (RNA incorporation throughout the lagging strand; Yasukawa et al. 2006) proposes that the displaced strand is instead coated by RNA co-transcriptionally, yielding a more conventional-appearing replication intermediate. Neither model has been conclusively ruled out.

Transcription is driven by a single mitochondrial RNA polymerase (POLRMT), a T7-phage-related enzyme, together with the transcription factors TFAM (which also packages the genome, doubling as a histone-like protein) and TFB2M. Transcription yields polycistronic H- and L-strand precursors that are cleaved at tRNA boundaries by RNase P and RNase Z (the “tRNA punctuation” model, Ojala, Montoya & Attardi 1981). Termination is controlled by mTERF1.

\[ \text{POLG}\,(\alpha\beta_2)\;+\;\text{TWINKLE helicase}\;+\;\text{mtSSB}\;\Rightarrow\;\text{mtDNA replication fork} \]

Mutations in POLG cause a spectrum of diseases: Alpers syndrome (infantile hepato-cerebral), progressive external ophthalmoplegia (PEO), sensory ataxic neuropathy with dysarthria and ophthalmoparesis (SANDO), and parkinsonism. Alpers, in particular, is a devastating depletion syndrome with mtDNA copy number crashing in liver and brain. Pharmacologically, valproate is contraindicated in suspected POLG carriers because it precipitates liver failure.

4. Maternal Inheritance

All human mtDNA derives from the egg. Sperm mitochondria are destroyed at or shortly after fertilization. This uniparental inheritance was once thought to be passive, because sperm mitochondria are outnumbered at least a thousandfold by oocyte mitochondria, but Sutovsky et al. (1999, Nature) demonstrated an active mechanism: sperm mitochondria are ubiquitin-tagged in the sperm mid-piece during spermatogenesis and, upon fertilization, are recognized by the oocyte autophagy machinery (LC3-dependent) and degraded.

The mechanism is evolutionarily conserved: in C. elegans (Al Rawi et al. 2011, Science) and in mouse, paternal mtDNA reaches the zygote but is selectively removed by mitophagy within hours. Schwartz & Vissing (2002, N. Engl. J. Med.) documented a rare human case of paternal mtDNA transmission, underlining that the system is not absolute but overwhelmingly effective.

Why uniparental?

Hurst & Hamilton (1992) and others have argued that uniparental inheritance suppresses “selfish” mtDNA variants that could exploit biparental transmission. It also simplifies mitonuclear coadaptation: with only one maternal lineage, selection on nuclear alleles does not have to cope with heterogeneous mtDNA backgrounds.

5. Heteroplasmy and the Threshold Effect

Because each cell has hundreds to thousands of mtDNA copies, a cell can harbor a mixture of mutant and wild-type molecules: heteroplasmy. The fraction of mutant molecules ranges continuously from 0 (homoplasmic wild-type) to 1 (homoplasmic mutant).

Most pathogenic mtDNA mutations are recessive at the cellular level: a cell with 30% mutant mtDNA is often functionally normal because the 70% wild-type produces enough OxPhos capacity. Above a characteristic threshold \(h^*\), however, OxPhos fails catastrophically. This threshold effect is a hallmark of mitochondrial genetics.

\[ \text{OxPhos capacity} \;\approx\; \begin{cases} 1 & h < h^* \\ (1-h)/(1-h^*) & h \geq h^* \end{cases} \]

Threshold \(h^*\) depends on the mutation and tissue; typical values are 60–95% mutant.

  • MERRF (myoclonic epilepsy with ragged red fibers, tRNALys A8344G): threshold ~85% in muscle
  • MELAS (mitochondrial encephalomyopathy, lactic acidosis, stroke-like episodes, tRNALeu(UUR) A3243G): threshold ~80% in muscle, ~60% in CNS
  • NARP / MILS (T8993G in ATP6): threshold ~70% (NARP) to ~90% (Leigh syndrome)
  • LHON (Leber’s hereditary optic neuropathy): often homoplasmic at G11778A, incomplete penetrance reflects unknown nuclear/mito modifiers
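
The piecewise capacity model can be evaluated directly for the thresholds listed above. This is a sketch under the simplifying assumption that capacity declines linearly from 1 at \(h^*\) to 0 at \(h = 1\), as in the formula; the function name and the dictionary of thresholds are illustrative.

```python
def oxphos_capacity(h, h_star):
    """Relative OxPhos capacity: 1 below threshold, linear decline to 0 above it."""
    if not 0.0 <= h <= 1.0:
        raise ValueError("heteroplasmy h must lie in [0, 1]")
    if not 0.0 < h_star < 1.0:
        raise ValueError("threshold h_star must lie in (0, 1)")
    return 1.0 if h < h_star else (1.0 - h) / (1.0 - h_star)


# Approximate thresholds quoted in the list above.
THRESHOLDS = {
    "MERRF (muscle)": 0.85,
    "MELAS (muscle)": 0.80,
    "MELAS (CNS)": 0.60,
    "NARP": 0.70,
}

for name, h_star in THRESHOLDS.items():
    row = ", ".join(f"h={h:.2f}: {oxphos_capacity(h, h_star):.2f}"
                    for h in (0.30, 0.70, 0.90))
    print(f"{name:15s} {row}")
```

The same heteroplasmy level (for example 70%) leaves capacity intact under a MERRF-like threshold but reduces it under the lower CNS threshold for MELAS, which is the tissue dependence described above.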

Heteroplasmy levels differ between tissues within one individual, depending on the cell division history, mitochondrial turnover, and selective pressures in that tissue type. Some tissues (nervous system) are post-mitotic and accumulate mutations; others (intestinal epithelium) cycle heavily and can purge mutant lineages.

6. Stewart 2008: The Germline Bottleneck

Heteroplasmy is transmitted non-uniformly between generations. A mother with 30% mutant mtDNA can have offspring ranging from 0% to >90% mutant. This shift is driven by the germline bottleneck: during oogenesis, the effective number of mtDNA segregation units drops to a few hundred per primordial germ cell, after which each oocyte amplifies its mtDNA independently.

Stewart & Chinnery (2008, Nat. Rev. Genet.) and Wai et al. (2008, Nat. Genet.) set out the evidence: the variance in inherited heteroplasmy scales as \(h(1-h)/N_b\), where \(h\) is the mother's heteroplasmy and \(N_b\) is the effective bottleneck size. Measurements in mouse and human put \(N_b\) at 200–500 mtDNA segregation units.

\[ \text{Var}(h_{\text{offspring}} \mid h_{\text{mother}}) \;=\; \frac{h(1-h)}{N_b} \]

This is the Wright-Fisher sampling variance for a single bottleneck generation. For \(N_b = 200\) and \(h = 0.3\), the SD is \(\sqrt{0.3 \times 0.7 / 200} \approx 0.032\), i.e. \(\sim 3\%\).
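
The variance formula translates into a quick estimate of how far a single bottleneck can move heteroplasmy. The sketch below uses a normal approximation to the binomial for the tail probability, which is an assumption added here and is reasonable only when \(N_b\) is in the hundreds and \(h\) is not near 0 or 1; the function names and the 0.35 cut-off are illustrative.

```python
import math


def bottleneck_sd(h, n_b):
    """SD of offspring heteroplasmy after one bottleneck: sqrt(h(1-h)/N_b)."""
    return math.sqrt(h * (1.0 - h) / n_b)


def prob_above(h, n_b, h_cut):
    """Normal approximation to P(offspring heteroplasmy > h_cut)."""
    z = (h_cut - h) / bottleneck_sd(h, n_b)
    return 0.5 * math.erfc(z / math.sqrt(2.0))


H_MOTHER = 0.30
for n_b in (200, 500):
    sd = bottleneck_sd(H_MOTHER, n_b)
    low, high = H_MOTHER - 1.96 * sd, H_MOTHER + 1.96 * sd
    print(f"N_b={n_b}: SD={sd:.3f}, approx. 95% range {low:.3f} to {high:.3f}")

print(f"P(h > 0.35 | N_b=200) ~ {prob_above(H_MOTHER, 200, 0.35):.3f}")
```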

The bottleneck explains why affected mothers sometimes have healthy children and healthy siblings of patients sometimes have affected children of their own. Preimplantation genetic diagnosis (PGD) for mtDNA disease must sample enough embryos to account for this variance; mitochondrial replacement therapy (“three-parent IVF”) bypasses the problem by transferring nuclei into donor oocytes with healthy mtDNA.

7. Mutation Rate: Why mtDNA Evolves Fast

mtDNA mutates 10–100× faster than nuclear DNA. Brown, George & Wilson (1979, PNAS) measured pairwise divergence among primates and estimated a rate of ~2% per million years, an order of magnitude above the 0.1–0.3%/Myr rate for single-copy nuclear DNA. Subsequent calibration (Ingman et al. 2000, Nature; Lakshmanan et al. 2024) refined the coding-region rate to ~1.4–1.7 \(\times 10^{-8}\) per site per year, with the hypervariable D-loop several times higher.

Multiple factors contribute:

  • mtDNA sits close to the ETC where reactive oxygen species (ROS) are generated (Module 5)
  • 8-oxoguanine (8-oxoG) is a dominant oxidative lesion; it mispairs with A during replication and produces G→T transversions (C→A when read on the complementary strand)
  • Limited repair capacity: base excision repair (BER) operates in mitochondria but nucleotide excision repair and homologous recombination are greatly reduced
  • Strand-asymmetric replication increases deamination in the exposed heavy-strand during the “strand displacement” phase
  • No protective histones; TFAM packages mtDNA but does not fully shield it

The mutational pattern is strand-asymmetric: deamination of the exposed H strand makes C→T and A→G transitions dominate there, which appear as G→A and T→C on the L-strand reference sequence and produce the characteristic substitution spectrum used in phylogeography. The rate is also age-dependent: somatic mtDNA mutations accumulate over a lifetime, contributing to age-associated mitochondrial dysfunction (covered in Module 7).

8. Wallace 1988: Mitochondrial Eve

Cann, Stoneking & Wilson (1987, Nature) analyzed mtDNA restriction-enzyme fragments from 147 individuals worldwide and concluded that all present-day humans descend matrilineally from a single woman who lived in Africa ~200,000 years ago. Douglas Wallace (1988) extended this with population-specific haplogroup analyses. “Mitochondrial Eve” was a real individual, but she is identified only by coalescent inference and was not the sole female ancestor of living humans: she was contemporary with thousands of other women, but only her matrilineal lineage has survived to produce living descendants.

The math is the Kingman coalescent: for a population of \(N_{\text{fem}}\) reproducing females with generation time \(\tau\), the expected time to the most recent common maternal ancestor of all living people is

\[ \mathbb{E}[T_{\text{MRCA}}] \;\approx\; 2\,N_{\text{fem}}\,\tau \]

With \(N_{\text{fem}} \approx 5000\) (ancient bottleneck) and \(\tau = 25\) y: \(T_{\text{MRCA}} \approx 250{,}000\) years, the same order as the ~200,000-year molecular-clock age from observed divergence.
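
The \(2\,N_{\text{fem}}\,\tau\) approximation is the large-sample limit of a finite sum. A minimal sketch, assuming a haploid Wright-Fisher population in which \(k\) lineages wait on average \(N_{\text{fem}}/\binom{k}{2}\) generations for the next coalescence, so that the total is \(2N_{\text{fem}}(1 - 1/n)\) generations for a sample of \(n\) lineages. The function name and sample sizes are illustrative.

```python
def expected_tmrca_generations(n_fem, sample_size):
    """Sum of expected waiting times N / C(k, 2) while k lineages remain."""
    return sum(n_fem / (k * (k - 1) / 2) for k in range(2, sample_size + 1))


N_FEM = 5000     # effective number of reproducing females (value from the text)
TAU_YEARS = 25   # generation time in years (value from the text)

for n in (2, 10, 1000):
    t_years = expected_tmrca_generations(N_FEM, n) * TAU_YEARS
    print(f"sample of {n:4d} lineages: E[T_MRCA] ~ {t_years:,.0f} years")
```

For two lineages the expectation is \(N_{\text{fem}}\tau\); it approaches \(2\,N_{\text{fem}}\,\tau\) as the sample grows, because the final coalescence of the last two lineages accounts for half the total.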

“Mitochondrial Eve” is not unique: Y-chromosome Adam exists symmetrically for paternal ancestry, at comparable ages (~150–300 kya). Neither is the first human; both are the most recent common ancestors along a single uniparental line, reflecting the stochastic loss of lineages through random drift in small populations.

9. mtDNA Haplogroups and Human Dispersal

Haplogroups are clades of mtDNA defined by shared derived polymorphisms. They are named by capital letters following the phylogenetic tree:

  • L0, L1, L2: ancestral African clades; L0 is the deepest (KhoeSan), ~175 kya
  • L3: African clade from which the “out-of-Africa” lineages descend, ~70 kya
  • M: south and east Asian, Australian aboriginal; ~60 kya
  • N: Eurasian founder; ~65 kya
  • R: European/Asian crown from N; ~55 kya
  • H, V, J, T, U, K: European subclades (H is most common, ~40% of Europeans)
  • A, B, C, D: Native American founder lineages, arrived in the Americas 15–25 kya

Haplogroup frequencies across populations provide fine-resolution history. Richards et al. (2000) used them to date the spread of Neolithic farmers from the Near East; Mishmar et al. (2003) argued for climatic selection on mtDNA (ATP-synthase variants associate with cold tolerance); ancient-DNA studies (Krause et al. 2010) recovered Neanderthal and Denisovan mtDNA, showing they fall outside the modern human mitochondrial tree.

Some haplogroups have been tentatively linked to disease susceptibility (haplogroup J in LHON penetrance; haplogroup H in age-related macular degeneration), but associations are modest and the field remains cautious about phenotypic significance (Samuels, Carothers, Horton & Chinnery 2006).

10. DNA Damage, Deletions, and Aging

Somatic mtDNA accumulates point mutations and deletions over an organism’s lifetime. The most common deletion is the 4977 bp “common deletion” (between direct repeats at positions 8470 and 13447), which accumulates in aged heart and brain (Cortopassi & Arnheim 1990). In post-mitotic tissues, clonal expansion of a mutated mtDNA within a single cell can drive mosaic COX-deficient cells in aged muscle fibers, substantia nigra neurons, and colon crypt base.
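
The size of the common deletion follows from the repeat coordinates. The sketch below assumes the deletion removes one copy of the direct repeat plus everything between the two copies, so its length is the difference between the repeat start positions quoted above; the constant names are illustrative.

```python
GENOME_BP = 16_569
REPEAT_5P_START = 8_470    # start of the upstream direct repeat (from the text)
REPEAT_3P_START = 13_447   # start of the downstream direct repeat (from the text)

# Assumption: one repeat copy and the intervening sequence are lost.
deleted_bp = REPEAT_3P_START - REPEAT_5P_START
remaining_bp = GENOME_BP - deleted_bp

print(f"deleted:   {deleted_bp} bp ({deleted_bp / GENOME_BP:.1%} of the genome)")
print(f"remaining: {remaining_bp} bp")
```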

Trifunovic et al. (2004, Nature) created a “mutator mouse” with a proofreading-deficient POLG (D257A). These mice accumulate mtDNA point mutations at ~3–5× the normal rate and develop a progeroid phenotype: reduced lifespan, hair loss, kyphosis, sarcopenia. This was the first direct causal demonstration that mtDNA mutations drive aging-like phenotypes, though the interpretation (do mutations cause aging, or do they correlate with it?) remains contested.

Common somatic mtDNA lesions:

  • 8-oxoguanine (8-oxoG): from ROS; mispairs with A
  • 4977 bp “common deletion”: flanked by 13 bp direct repeats
  • KSS/PEO deletions: 2–9 kb deletions in skeletal muscle of patients
  • Point mutations A3243G, A8344G: sporadic or inherited
  • Depletion (quantitative): loss of mtDNA copy number; POLG, TK2, DGUOK syndromes

11. Human mtDNA Gene Map (16 569 bp)

[Figure: circular map of human mtDNA (16 569 bp; 13 OxPhos proteins + 22 tRNAs + 2 rRNAs). Labelled features: D-loop, 12S rRNA, 16S rRNA, ND1, ND2, COX1, COX2, ATP8/6, COX3, ND3, ND4L/4, ND5, ND6, cyt b. Legend: rRNA, Complex I (ND), Complex III (cyt b), Complex IV (COX), Complex V (ATP6/8), D-loop / control, 4977 bp common deletion.]

Simulation 1: Heteroplasmy Drift with Germline Bottleneck

Wright-Fisher drift simulation of mtDNA heteroplasmy through a germline bottleneck. A mother with heteroplasmy \(h_0 = 0.30\) passes mtDNA through \(N_b = 200\) segregation units; offspring heteroplasmy is binomially sampled and then amplified back to somatic copy number. We compute the variance (\(h(1-h)/N_b\), Wright 1931), the disease probability \(P(h > h^*)\), and multi-generation drift trajectories that reveal why affected mothers can have healthy children and vice versa.
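
The original interactive script is not reproduced here. What follows is a minimal standard-library sketch of the simulation described, under stated assumptions: amplification back to somatic copy number preserves the sampled fraction exactly, there is no selection, and the threshold `H_STAR = 0.60` is an illustrative value from the low end of the range quoted in Section 5. All function names are hypothetical.

```python
import random
import statistics


def transmit(h, n_b, rng):
    """One generation: sample n_b segregating units, each mutant with probability h."""
    return sum(rng.random() < h for _ in range(n_b)) / n_b


def simulate_lineages(h0, n_b, generations, n_lineages, seed=0):
    """Return one heteroplasmy trajectory per independent maternal lineage."""
    rng = random.Random(seed)
    lineages = []
    for _ in range(n_lineages):
        path = [h0]
        for _ in range(generations):
            path.append(transmit(path[-1], n_b, rng))
        lineages.append(path)
    return lineages


def expected_variance(h0, n_b, g):
    """Wright-Fisher variance of h after g rounds of sampling n_b units."""
    return h0 * (1.0 - h0) * (1.0 - (1.0 - 1.0 / n_b) ** g)


H0, N_B, H_STAR = 0.30, 200, 0.60  # H_STAR is an illustrative threshold
paths = simulate_lineages(H0, N_B, generations=40, n_lineages=400)

for g in (1, 10, 40):
    values = [path[g] for path in paths]
    p_disease = sum(v > H_STAR for v in values) / len(values)
    print(f"gen {g:2d}: mean={statistics.mean(values):.3f} "
          f"var={statistics.variance(values):.5f} "
          f"(expected {expected_variance(H0, N_B, g):.5f}) "
          f"P(h > h*)={p_disease:.3f}")
```

With these parameters a single generation barely moves heteroplasmy, while drift compounded over many generations spreads lineages widely around the unchanged mean. A smaller effective \(N_b\) would produce the same spread in fewer generations.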

Simulation 2: Coalescent Time and Mitochondrial Eve

Combine Kingman coalescent simulation with the molecular-clock calibration of Brown (1979), Ingman (2000), and Lakshmanan (2024) to date the most recent common matrilineal ancestor of modern humans. Given the observed coding-region divergence \(\pi \approx 0.35\%\) and \(\mu \approx 1.4 \times 10^{-8}\) /site/yr, we recover the ~200 kya estimate of Cann, Stoneking & Wilson (1987). We then relate this to haplogroup branching times from L0 (~175 kya) through the out-of-Africa L3 (~70 kya) to the Native American A/B/C/D clades (~15–25 kya).
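
The original interactive script is not reproduced here. The sketch below is a minimal stand-in for the two calculations described, under stated assumptions: a neutral, constant-size haploid population with \(N_{\text{fem}} = 5000\) and \(\tau = 25\) y (the values used in Section 8), and \(\pi\) interpreted as the mean pairwise difference per site, so that two lineages separated for \(T\) years differ by \(\pi = 2\mu T\). Function names, the sample size, and the replicate count are illustrative.

```python
import random


def simulate_tmrca_generations(n_fem, sample_size, rng):
    """Kingman coalescent: generations until sample_size lineages merge into one."""
    t = 0.0
    for k in range(sample_size, 1, -1):
        rate = k * (k - 1) / 2 / n_fem   # coalescence rate per generation
        t += rng.expovariate(rate)       # exponential waiting time
    return t


def pairwise_coalescence_years(pi, mu):
    """Molecular clock: two lineages accumulate pi = 2 * mu * T differences per site."""
    return pi / (2.0 * mu)


N_FEM, TAU_YEARS, SAMPLE_SIZE = 5000, 25, 100
rng = random.Random(1)
tmrca_years = [simulate_tmrca_generations(N_FEM, SAMPLE_SIZE, rng) * TAU_YEARS
               for _ in range(2000)]
mean_tmrca = sum(tmrca_years) / len(tmrca_years)

PI, MU = 0.0035, 1.4e-8   # divergence and rate quoted in the text
t_pair = pairwise_coalescence_years(PI, MU)

print(f"simulated mean T_MRCA  ~ {mean_tmrca:,.0f} years")
print(f"pairwise clock time    ~ {t_pair:,.0f} years")
print(f"implied T_MRCA (x2)    ~ {2 * t_pair:,.0f} years")
```

Under the neutral coalescent the expected TMRCA of a large sample is about twice the expected pairwise coalescence time, so the clock estimate and the simulation land in the same range of a few hundred thousand years. The individual simulated values scatter widely, which is why point estimates of the date carry large uncertainty.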

Key References

• Anderson, S. et al. (1981). “Sequence and organization of the human mitochondrial genome.” Nature, 290, 457–465.

• Andrews, R.M. et al. (1999). “Reanalysis and revision of the Cambridge reference sequence for human mitochondrial DNA.” Nat. Genet., 23, 147.

• Brown, W.M., George, M., & Wilson, A.C. (1979). “Rapid evolution of animal mitochondrial DNA.” PNAS, 76, 1967–1971.

• Cann, R.L., Stoneking, M., & Wilson, A.C. (1987). “Mitochondrial DNA and human evolution.” Nature, 325, 31–36.

• Wallace, D.C. (1988). “A mitochondrial paradigm for metabolic and degenerative diseases, aging, and cancer.” Annu. Rev. Genet., 39, 359–407.

• Ingman, M., Kaessmann, H., Pääbo, S., & Gyllensten, U. (2000). “Mitochondrial genome variation and the origin of modern humans.” Nature, 408, 708–713.

• Sutovsky, P. et al. (1999). “Ubiquitin tag for sperm mitochondria.” Nature, 402, 371–372.

• Al Rawi, S. et al. (2011). “Postfertilization autophagy of sperm organelles prevents paternal mitochondrial DNA transmission.” Science, 334, 1144–1147.

• Stewart, J.B. & Chinnery, P.F. (2008). “The dynamics of mitochondrial DNA heteroplasmy: implications for human health and disease.” Nat. Rev. Genet., 16, 530–542.

• Wai, T., Teoli, D., & Shoubridge, E.A. (2008). “The mitochondrial DNA genetic bottleneck results from replication of a subpopulation of genomes.” Nat. Genet., 40, 1484–1488.

• Trifunovic, A. et al. (2004). “Premature ageing in mice expressing defective mitochondrial DNA polymerase.” Nature, 429, 417–423.

• Cortopassi, G.A. & Arnheim, N. (1990). “Detection of a specific mitochondrial DNA deletion in tissues of older humans.” Nucleic Acids Res., 18, 6927–6933.

• Mishmar, D. et al. (2003). “Natural selection shaped regional mtDNA variation in humans.” PNAS, 100, 171–176.

• Richards, M. et al. (2000). “Tracing European founder lineages in the Near Eastern mtDNA pool.” Am. J. Hum. Genet., 67, 1251–1276.

• Amunts, A. et al. (2015). “The structure of the human mitochondrial ribosome.” Science, 348, 95–98.

• Barrell, B.G., Bankier, A.T., & Drouin, J. (1979). “A different genetic code in human mitochondria.” Nature, 282, 189–194.

• Yasukawa, T. et al. (2006). “Replication of vertebrate mitochondrial DNA entails transient ribonucleotide incorporation throughout the lagging strand.” EMBO J., 25, 5358–5371.

• Ojala, D., Montoya, J., & Attardi, G. (1981). “tRNA punctuation model of RNA processing in human mitochondria.” Nature, 290, 470–474.

• Samuels, D.C., Carothers, A.D., Horton, R., & Chinnery, P.F. (2006). “The power to detect disease associations with mitochondrial DNA haplogroups.” Am. J. Hum. Genet., 78, 713–720.

• Lakshmanan, S. et al. (2024). “Refined Bayesian estimation of the human mtDNA mutation rate.” Mol. Biol. Evol., 41, msae021.