Part 4: DNA Repair Mechanisms

Why DNA Repair Matters

Every cell in the human body sustains an estimated 10,000–100,000 DNA lesions per day. Without repair, these would rapidly accumulate, leading to mutations, chromosomal aberrations, cell death, and cancer. The genome’s integrity depends on a sophisticated network of repair pathways, each specialised for particular lesion types.

Sources of DNA Damage

Spontaneous Depurination

~5,000 purines lost / cell / day

Hydrolysis of the N-glycosidic bond creates abasic (AP) sites. AP sites block replication and are mutagenic if not repaired.

Spontaneous Deamination

~100–500 / cell / day

Cytosine → uracil (reads as T), 5-methylcytosine → thymine (C→T transition, the most common somatic point mutation).

Oxidative Damage (ROS)

~10,000 oxidative lesions / cell / day

Superoxide, hydroxyl radicals, and H₂O₂ from mitochondrial respiration produce 8-oxo-7,8-dihydroguanine (8-oxoG), thymine glycol, and single-strand breaks.

Replication Errors

~1 error per 10⁴–10⁵ nt (before proofreading)

Polymerase misincorporation → mismatches and small indels. Proofreading reduces error rate to ~10⁻⁷; mismatch repair further to ~10⁻⁹–10⁻¹⁰.

UV Radiation

Variable (sun exposure)

UV-B (280–315 nm) produces cyclobutane pyrimidine dimers (CPDs) and 6-4 photoproducts (6-4PPs). ~50,000 CPDs per sun-exposed cell per hour of midday sun.

Alkylating Agents

Endogenous + exogenous

S-adenosylmethionine (SAM), tobacco nitrosamines, and chemotherapy drugs add methyl/ethyl groups to bases (e.g., O⁶-methylguanine, N⁷-methylguanine).

Double-Strand Breaks (DSBs)

~10–50 / cell / day

Ionizing radiation, replication fork collapse at SSBs or lesions, topoisomerase II failure. The most lethal lesion: a single unrepaired DSB can kill a cell.

Crosslinks

Exogenous (cisplatin, mitomycin C)

Interstrand crosslinks (ICLs) covalently link both strands, blocking replication and transcription. Repaired by the Fanconi anaemia pathway.

Steady-State Lesion Level

Because damage formation and repair are concurrent, the number of lesions in a cell reaches a steady state. If lesions are produced at rate $k_{\text{dam}}$ and repaired by first-order kinetics with rate constant $k_{\text{rep}}$:

$$\frac{d[L]}{dt} = k_{\text{dam}} - k_{\text{rep}} \cdot [L]$$

$$[L]_{\text{ss}} = \frac{k_{\text{dam}}}{k_{\text{rep}}}$$

For oxidative lesions: $k_{\text{dam}} \approx 10{,}000\text{ day}^{-1}$, $k_{\text{rep}} \approx 5{,}000\text{ day}^{-1}$ per lesion → $[L]_{\text{ss}} \approx 2$ lesions at any given moment (rapidly turned over).
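As a quick numerical check of the steady-state expression, the sketch below evaluates $[L]_{\text{ss}}$ for the oxidative-lesion figures quoted above and converts the repair rate constant into a mean lesion lifetime ($1/k_{\text{rep}}$). The function names are illustrative and not part of any library.

```python
def steady_state_lesions(k_dam: float, k_rep: float) -> float:
    """Steady-state lesion count for d[L]/dt = k_dam - k_rep * [L]."""
    if k_rep <= 0:
        raise ValueError("k_rep must be positive for a finite steady state")
    return k_dam / k_rep


def mean_lesion_lifetime_minutes(k_rep_per_day: float) -> float:
    """Mean lifetime of a single lesion (1 / k_rep), converted from days to minutes."""
    return 24 * 60 / k_rep_per_day


k_dam = 10_000.0  # oxidative lesions formed per cell per day (value from the text)
k_rep = 5_000.0   # per-lesion repair rate constant, per day (value from the text)

print("steady-state lesions:", steady_state_lesions(k_dam, k_rep))
print("mean lesion lifetime (min):", mean_lesion_lifetime_minutes(k_rep))
```

Under these numbers a lesion survives for well under a minute on average, which is why the standing lesion count stays small even though thousands are formed per day.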

Direct Reversal of Damage

The simplest repair strategy: a single enzyme directly reverses the chemical modification without excising bases or breaking the backbone. Only a few lesion types can be repaired this way.

Photolyase (Photoreactivation)

Photolyases are flavoproteins that directly reverse UV-induced cyclobutane pyrimidine dimers (CPD photolyase) or 6-4 photoproducts (6-4 photolyase) using energy from near-UV and blue light (300–500 nm).

Mechanism:

Photolyase binds the CPD in the dark (high affinity, $K_d \sim 10^{-9}$ M)
Light antenna chromophore (MTHF or 8-HDF) absorbs a blue-light photon
Energy is transferred to the catalytic cofactor FADH⁻ (fully reduced flavin)
FADH⁻ donates an electron to the CPD, splitting the cyclobutane ring via a radical mechanism
Electron returns to the flavin radical (FADH•), regenerating FADH⁻ and restoring the two monomeric pyrimidines

Note: Humans and other placental mammals have lost photolyase genes. Our cryptochrome proteins (CRY1/CRY2) are photolyase homologs that function as circadian clock components instead.

O⁶-Methylguanine-DNA Methyltransferase (MGMT / AGT)

MGMT repairs O⁶-alkylguanine, a highly mutagenic lesion that causes G:C → A:T transitions by mispairing with thymine during replication.

Mechanism (Suicide Enzyme):

MGMT flips the damaged base out of the helix into its active-site pocket
An active-site cysteine (Cys145 in human MGMT) accepts the methyl group via an S_N2 reaction
The resulting S-methylcysteine cannot be regenerated — the protein is irreversibly inactivated
Methylated MGMT is ubiquitinated and degraded by the proteasome

$$\text{O}^6\text{-meG-DNA} + \text{MGMT-Cys-SH} \xrightarrow{} \text{G-DNA} + \text{MGMT-Cys-S-CH}_3$$

Clinical: MGMT promoter methylation in glioblastoma silences MGMT expression, making tumours sensitive to temozolomide (an alkylating chemotherapy drug). MGMT promoter status is a key biomarker for treatment decisions.

Base Excision Repair (BER)

BER is the primary pathway for removing small, non-helix-distorting base lesions: oxidised bases, deaminated bases, alkylated bases, and single-strand breaks. It processes the majority of the ~10,000–20,000 lesions repaired daily per cell.

Step 1: Damage Recognition — DNA Glycosylases

Each glycosylase recognises a specific subset of damaged bases. The enzyme flips the target base out of the helix into its active-site pocket and cleaves the N-glycosidic bond, releasing the free base and leaving an abasic (AP) site.

OGG1

8-oxoguanine (opposite C)

Bifunctional: has AP lyase activity (β-elimination)

UNG (UDG)

Uracil in DNA

Removes uracil from C deamination or dUTP misincorporation

SMUG1

Uracil, 5-hydroxymethyluracil

Backup for UNG in single-stranded DNA contexts

MBD4

T:G mismatches at CpG sites

Repairs 5-meC → T deamination products

MUTYH (MYH)

Adenine opposite 8-oxoG

Prevents G:C→T:A transversions from unrepaired 8-oxoG

AAG (MPG)

N³-methyladenine, hypoxanthine

Broad specificity for alkylated and deaminated purines

NEIL1/2/3

Oxidised pyrimidines, FapyG

Bifunctional, active on bubble/fork structures (replication)

TDG

T:G and U:G mismatches

Key role in active DNA demethylation (TET pathway: 5-meC → 5-hmC → 5-fC → 5-caC → BER)

Glycosylase Kinetics

DNA glycosylases follow Michaelis-Menten kinetics when scanning DNA for lesions. The catalytic efficiency $k_{\text{cat}}/K_M$ reflects both the speed of lesion recognition (facilitated diffusion / scanning) and the chemical step of bond cleavage:

$$v = \frac{V_{\max} \cdot [S]}{K_M + [S]} = \frac{k_{\text{cat}} \cdot [E]_0 \cdot [S]}{K_M + [S]}$$

For OGG1: $k_{\text{cat}} \approx 0.1 \text{ min}^{-1}$, $K_M \approx 20\text{ nM}$ — slow turnover because the AP lyase step is rate-limiting.

For UNG: $k_{\text{cat}} \approx 100\text{ min}^{-1}$, $K_M \approx 50\text{ nM}$ — one of the fastest glycosylases.
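To make the comparison concrete, the following sketch plugs the OGG1 and UNG constants quoted above into the Michaelis-Menten rate law and computes $k_{\text{cat}}/K_M$ for each. The enzyme concentration `E0_NM` is a hypothetical placeholder chosen only so that rates can be printed; it is not a measured cellular value.

```python
def mm_rate(k_cat: float, e_total: float, s: float, k_m: float) -> float:
    """Michaelis-Menten rate v = k_cat * [E]0 * [S] / (K_M + [S])."""
    return k_cat * e_total * s / (k_m + s)


def catalytic_efficiency(k_cat: float, k_m: float) -> float:
    """Catalytic efficiency k_cat / K_M."""
    return k_cat / k_m


# k_cat in min^-1 and K_M in nM, as quoted in the text.
GLYCOSYLASES = {
    "OGG1": {"k_cat": 0.1, "k_m": 20.0},
    "UNG": {"k_cat": 100.0, "k_m": 50.0},
}
E0_NM = 1.0  # hypothetical enzyme concentration (nM), for illustration only

for name, p in GLYCOSYLASES.items():
    eff = catalytic_efficiency(p["k_cat"], p["k_m"])
    # At [S] = K_M the rate is exactly half of Vmax.
    v_half = mm_rate(p["k_cat"], E0_NM, p["k_m"], p["k_m"])
    print(f"{name}: k_cat/K_M = {eff:.3g} nM^-1 min^-1, v at [S]=K_M = {v_half:.3g} nM/min")
```

With these constants UNG comes out roughly 400-fold more efficient than OGG1 by the $k_{\text{cat}}/K_M$ measure.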

Steps 2–5: Processing the AP Site

AP Endonuclease (APE1): Cleaves the phosphodiester backbone 5′ to the AP site, generating a 3′-OH and a 5′-deoxyribose phosphate (5′-dRP). APE1 processes >95% of AP sites in human cells.
Short-Patch BER (1 nucleotide): DNA Polymerase β (Pol β) fills the single-nucleotide gap and removes the 5′-dRP via its intrinsic dRP lyase activity. DNA Ligase IIIα/XRCC1 seals the nick. This is the dominant BER sub-pathway (~80% of events).
Long-Patch BER (2–10 nucleotides): When the 5′-dRP is oxidised or reduced (refractory to Pol β lyase), Pol δ/ε performs strand-displacement synthesis creating a 5′ flap. FEN1 (flap endonuclease 1) cleaves the flap. DNA Ligase I seals the nick. PCNA coordinates this sub-pathway.
XRCC1 Scaffold: XRCC1 has no enzymatic activity but serves as a critical scaffold protein, physically interacting with Pol β, Ligase IIIα, PNKP, and PARP1 to coordinate the BER machinery at the lesion site.

PARP1 and BER: PARP1 (poly(ADP-ribose) polymerase 1) acts as a first responder — it binds single-strand breaks within seconds and synthesises poly(ADP-ribose) chains (PARylation) that recruit XRCC1 and other BER factors. PARP inhibitors exploit this: in BRCA1/2-deficient cells lacking HR, trapped PARP-DNA complexes convert SSBs into lethal DSBs (synthetic lethality).

Nucleotide Excision Repair (NER)

NER removes bulky, helix-distorting lesions: UV-induced CPDs and 6-4 photoproducts, bulky chemical adducts (benzo[a]pyrene-guanine), and intrastrand crosslinks. It excises a 24–32 nucleotide oligomer containing the lesion, then fills the gap.

Global Genome NER (GG-NER)

Surveys the entire genome for distortions.

Damage sensing: XPC-RAD23B-CEN2 complex detects helical distortion by binding the undamaged strand opposite the lesion. For CPDs (minimal distortion), UV-DDB (DDB1-DDB2/XPE) first recognises the lesion and recruits XPC.
Verification: TFIIH (containing XPB and XPD helicases) is recruited. XPD unwinds DNA 5′→3′ and stalls at the lesion, verifying damage.
Pre-incision complex: XPA verifies damage and positions the complex. RPA coats the undamaged strand.
Dual incision: XPF-ERCC1 cuts 5′ to the lesion; XPG (3′ endonuclease) cuts 3′ to the lesion. This releases a 24–32 nt oligonucleotide.
Gap filling: Pol δ/ε/κ (with PCNA, RFC) fills the gap. DNA Ligase I or III seals the nick.

Transcription-Coupled NER (TC-NER)

Prioritises repair of the transcribed strand of active genes.

Damage sensing: RNA Polymerase II stalls at a lesion on the template strand. The stalled RNAPII serves as the damage sensor (no XPC needed).
CSB recruitment: Cockayne syndrome protein B (CSB/ERCC6) binds stalled RNAPII and recruits CSA (ERCC8), which forms an E3 ubiquitin ligase complex with DDB1-CUL4.
RNAPII displacement: CSB remodels the RNAPII-DNA complex, possibly backtracking RNAPII to expose the lesion for repair.
Convergence with GG-NER: TFIIH, XPA, RPA, XPG, and XPF-ERCC1 are recruited, and the remainder of the pathway proceeds identically to GG-NER.

TC-NER repairs transcribed strand lesions 5–10x faster than GG-NER repairs the same lesions genome-wide.

Xeroderma Pigmentosum (XP)

Autosomal recessive disorder caused by mutations in any of 8 genes (XPA through XPG, plus XPV/Pol η). Patients have a roughly 1,000-fold increased risk of skin cancer, extreme UV sensitivity, and (in some complementation groups) neurodegeneration.

XPA: Damage verification
XPB: TFIIH 3′→5′ helicase
XPC: GG-NER damage sensor
XPD: TFIIH 5′→3′ helicase
XPE: DDB2 (UV-DDB)
XPF: 5′ endonuclease
XPG: 3′ endonuclease
XPV: Pol η (TLS)

Mismatch Repair (MMR)

MMR corrects base-base mismatches and small insertion/deletion loops (IDLs) that escape polymerase proofreading. It improves replication fidelity 100–1,000-fold, reducing the error rate from ~10⁻⁷ (post-proofreading) to ~10⁻⁹–10⁻¹⁰.

Eukaryotic MMR Pathway

Mismatch Recognition: MutSα (MSH2-MSH6 heterodimer) recognises single base mismatches and 1–2 nt IDLs. MutSβ (MSH2-MSH3) recognises larger IDLs (2–13 nt loops). MutSα binds the mismatch and undergoes an ATP-dependent conformational change to a sliding clamp form.
MutLα Recruitment: MutLα (MLH1-PMS2 heterodimer) is recruited by the MutSα sliding clamp. MutLα has latent endonuclease activity activated by PCNA, RFC, and the MutSα sliding clamp.
Strand Discrimination: In eukaryotes, the newly synthesised strand is identified by its association with PCNA (loaded asymmetrically by RFC at the 3′ end of Okazaki fragments). PCNA activates the MutLα endonuclease to nick the daughter strand specifically. This replaces the bacterial MutH/Dam methylation system.
Excision: EXO1 (5′→3′ exonuclease) degrades the error-containing strand from the MutLα nick through and past the mismatch. RPA protects the template strand. An EXO1-independent pathway may use strand displacement by Pol δ.
Resynthesis and Ligation: Pol δ fills the gap using the intact template strand. DNA Ligase I seals the final nick.

MMR and Mutation Rate

$$\mu_{\text{total}} = \mu_{\text{pol}} \times (1 - f_{\text{proof}}) \times (1 - f_{\text{MMR}})$$

Where $\mu_{\text{pol}} \approx 10^{-4}$–$10^{-5}$ (polymerase base error rate), $f_{\text{proof}} \approx 0.99$ (proofreading removes 99%), $f_{\text{MMR}} \approx 0.99$–$0.999$ (MMR removes 99–99.9%).

Final: $\mu_{\text{total}} \approx (10^{-4}\text{ to }10^{-5}) \times 10^{-2} \times (10^{-2}\text{ to }10^{-3}) = 10^{-8}$ to $10^{-10}$ errors per bp per replication.
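The multiplication can be checked directly. This minimal sketch evaluates the product for a few parameter combinations; the first two use $\mu_{\text{pol}} = 10^{-4}$, and the third uses the lower polymerase error rate of $10^{-5}$ given in the Replication Errors entry. The function name is illustrative.

```python
def total_mutation_rate(mu_pol: float, f_proof: float, f_mmr: float) -> float:
    """Per-bp error rate left after proofreading and MMR each remove their share."""
    return mu_pol * (1.0 - f_proof) * (1.0 - f_mmr)


# (mu_pol, f_proof, f_MMR) combinations spanning the ranges quoted in the text.
scenarios = [
    (1e-4, 0.99, 0.99),
    (1e-4, 0.99, 0.999),
    (1e-5, 0.99, 0.999),
]

for mu_pol, f_proof, f_mmr in scenarios:
    mu = total_mutation_rate(mu_pol, f_proof, f_mmr)
    print(f"mu_pol={mu_pol:.0e}, f_proof={f_proof}, f_MMR={f_mmr} -> mu_total ~ {mu:.1e}")
```

The three cases give roughly $10^{-8}$, $10^{-9}$, and $10^{-10}$ errors per bp, showing which combination of inputs is needed to reach each end of the quoted range.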

Lynch Syndrome (HNPCC)

Hereditary nonpolyposis colorectal cancer (HNPCC) is caused by germline mutations in MMR genes, predominantly MLH1 (40%), MSH2 (34%), MSH6 (18%), or PMS2 (8%). Patients carry one mutant allele; somatic loss of the remaining allele triggers a “mutator phenotype” with 100–1,000-fold increased mutation rate.

Microsatellite Instability (MSI): A hallmark of MMR deficiency. Repetitive microsatellite sequences (e.g., (CA)n) expand or contract due to polymerase slippage that MMR normally corrects. MSI-high tumours have distinct biology: better prognosis, poor response to 5-FU, but excellent response to immune checkpoint inhibitors (PD-1/PD-L1 blockade) due to high neoantigen load.

Double-Strand Break Repair

DSBs are the most dangerous DNA lesions: a single unrepaired DSB can trigger apoptosis, and misrepaired DSBs drive chromosomal translocations and cancer. Two major pathways repair DSBs, with the choice governed by cell cycle phase.

Non-Homologous End Joining (NHEJ)

NHEJ directly ligates broken ends without a homologous template. It operates throughout the cell cycle but is the dominant DSB repair pathway in G1 (and in post-mitotic cells). NHEJ is fast (30 min–1 hr) but intrinsically error-prone, often introducing small insertions or deletions at the junction.

End Recognition — Ku70/Ku80: The Ku heterodimer rapidly binds broken DNA ends (within seconds) as a ring-shaped complex that threads onto the terminus. Ku protects ends from nucleolytic degradation and serves as a platform for recruiting downstream factors.
DNA-PKcs Recruitment: Ku recruits DNA-dependent protein kinase catalytic subunit (DNA-PKcs, 469 kDa), forming the DNA-PK holoenzyme. DNA-PKcs molecules on the two ends form a synaptic complex, tethering the broken ends. DNA-PKcs autophosphorylation triggers conformational changes allowing end processing.
End Processing: Damaged or incompatible ends are processed by: Artemis nuclease (activated by DNA-PKcs phosphorylation) for hairpins and overhangs; Pol μ and Pol λ (X-family polymerases) for gap filling; PNKP for 5′-OH/3′-phosphate ends.
Ligation — XRCC4-Ligase IV-XLF: The XRCC4-DNA Ligase IV complex performs the final ligation. XLF (Cernunnos) stimulates Ligase IV activity, especially for incompatible ends. PAXX stabilises the synaptic complex.

Alternative End Joining (alt-EJ / MMEJ): A backup pathway using microhomology (2–20 bp) for alignment. Dependent on PARP1, Pol θ, and Ligase III. Produces larger deletions than classical NHEJ and drives some chromosome translocations.

Homologous Recombination (HR)

HR uses the sister chromatid as a template for error-free repair. It is restricted to S and G2 phases when a sister chromatid is available. HR is slower than NHEJ (several hours) but preserves sequence fidelity.

End Resection – MRN Complex: The MRN complex (MRE11-RAD50-NBS1) initiates 5′→3′ end resection, together with CtIP. MRE11 endonuclease activity nicks the 5′-terminated strand internal to the end, and its 3′→5′ exonuclease activity then degrades back toward the break. Extended resection by EXO1 or DNA2/BLM generates long 3′ single-stranded DNA (ssDNA) tails (several kb).
RPA Coating: Replication Protein A (RPA) immediately binds the ssDNA tails, removing secondary structures and protecting against nuclease degradation. RPA must be displaced for RAD51 loading.
RAD51 Filament Formation — BRCA2 Mediator: BRCA2 binds RAD51 monomers and loads them onto RPA-coated ssDNA, displacing RPA. This forms the RAD51 nucleoprotein filament (right-handed helical filament, ~6 RAD51 monomers per helical turn). BRCA1 facilitates end resection and recruits PALB2-BRCA2 to the DSB. RAD51 paralogs (RAD51B/C/D, XRCC2/3) stabilise the filament.
Strand Invasion and D-loop: The RAD51 filament searches for homology on the sister chromatid and catalyses strand invasion, forming a displacement loop (D-loop). The invading 3′ end primes DNA synthesis using the sister chromatid as template.
DNA Synthesis and Resolution: Pol δ extends the invading strand. Three sub-pathways then complete repair:
- SDSA (Synthesis-Dependent Strand Annealing): The extended strand is displaced and anneals to the other resected end. Produces only non-crossover products. The predominant HR pathway in mitotic cells.
- dHJ (Double Holliday Junction): Second-end capture forms a double Holliday junction. Resolution by GEN1 or MUS81-EME1 can produce crossovers or non-crossovers. Dissolution by BLM-TopIIIα-RMI1/2 produces exclusively non-crossovers.
- BIR (Break-Induced Replication): When only one end has homology (collapsed forks), a migrating D-loop copies the entire chromosome arm. Error-prone and can cause loss of heterozygosity.

Pathway Choice: NHEJ vs HR

The key decision point is end resection: once 5′ ends are resected, the cell is committed to HR (resected ends cannot be ligated by NHEJ). This is regulated by cell cycle:

G1 Phase → NHEJ

• 53BP1 binds chromatin near DSBs (recognises H4K20me2, H2AK15ub)
• 53BP1 recruits RIF1 and Shieldin complex (SHLD1/2/3-REV7)
• Shieldin blocks end resection by recruiting CST-Polα-primase for fill-in synthesis
• Low CDK activity: CtIP is not phosphorylated, MRN resection is inactive

S/G2 Phase → HR

• CDK phosphorylates CtIP, activating MRN-CtIP resection
• BRCA1-BARD1 ubiquitinates H2A, displacing 53BP1 from chromatin
• BRCA1 recruits PALB2-BRCA2-RAD51 for filament formation
• Sister chromatid is available as homologous template

Competition model: 53BP1 and BRCA1 antagonise each other at DSBs. In BRCA1-deficient cells, 53BP1 loss partially rescues HR by allowing resection — explaining why 53BP1 loss confers PARP inhibitor resistance in BRCA1-mutant tumours.

Translesion Synthesis (TLS)

TLS is a DNA damage tolerance mechanism rather than a true repair pathway. When the replicative polymerase stalls at a lesion, specialised TLS polymerases are recruited to synthesise across the damaged template, allowing replication to continue. TLS is inherently error-prone (except Pol η on CPDs).

PCNA Ubiquitination Switch

Stalled replication forks expose ssDNA coated by RPA, which recruits RAD6-RAD18 (E2-E3 ubiquitin ligase). RAD18 monoubiquitinates PCNA at Lys164.

$$\text{PCNA} \xrightarrow{\text{RAD6-RAD18}} \text{PCNA-Ub} \xrightarrow{\text{UBC13-MMS2-RAD5}} \text{PCNA-polyUb (K63 chain)}$$

Monoubiquitination → TLS polymerase recruitment; K63-linked polyubiquitination → template switching (error-free)

TLS polymerases have ubiquitin-binding domains (UBZ or UBM) and PIP boxes that allow them to bind monoubiquitinated PCNA with enhanced affinity, displacing the stalled replicative polymerase.

Y-Family Polymerases

Pol η (eta) (POLH)

Accurate bypass of CPDs (inserts AA opposite T-T dimer). Loss → XP variant (XPV)

Error rate ~10⁻² on undamaged DNA

Pol ι (iota) (POLI)

Bypasses minor-groove adducts. Uses Hoogsteen base pairing. Inserts G opposite T (not A!)

Lowest fidelity polymerase known

Pol κ (kappa) (POLK)

Bypasses benzo[a]pyrene-guanine adducts and N²-guanine lesions. Extension after insertion by other TLS pols

Error rate ~10⁻³ on undamaged DNA

Rev1 (REV1)

Deoxycytidyl transferase (always inserts C). Scaffold for recruiting other TLS pols via its C-terminal domain

Template-independent dCMP insertion

Two-polymerase model: TLS often requires two polymerases — an “inserter” that incorporates a nucleotide opposite the lesion and an “extender” (often Pol ζ = REV3-REV7, a B-family polymerase) that extends from the distorted primer terminus.

DNA Damage Response (DDR) Signalling

The DDR is a signal transduction cascade that senses DNA damage, amplifies the signal, and coordinates repair with cell cycle arrest, transcriptional responses, and (if damage is irreparable) apoptosis or senescence. The apical kinases ATM and ATR orchestrate this response.

ATM Pathway (DSBs)

MRN complex (sensor) recognises DSBs and recruits ATM
ATM (inactive dimer) monomerises and autophosphorylates (Ser1981)
ATM phosphorylates H2AX at Ser139 (γH2AX) in megabase domains flanking the break
γH2AX recruits MDC1, which amplifies ATM signalling and recruits RNF8/RNF168 ubiquitin ligases
H2A/H2AX ubiquitination recruits 53BP1 and BRCA1-Abraxas complexes
ATM phosphorylates CHK2 (Thr68) → CHK2 activates p53, CDC25A degradation

ATR Pathway (ssDNA/stalled forks)

RPA-coated ssDNA (at stalled forks or resected DSBs) recruits ATRIP-ATR
RAD17-RFC loads the 9-1-1 complex (RAD9-RAD1-HUS1) at ssDNA-dsDNA junctions
TopBP1 (or ETAA1) activates ATR kinase activity
ATR phosphorylates CHK1 (Ser317, Ser345)
CHK1 phosphorylates CDC25A/C → degradation/cytoplasmic sequestration
Loss of CDC25 activity → CDK remains phosphorylated (inactive) → S/G2 arrest

p53: Guardian of the Genome

p53 integrates DDR signals to decide cell fate: repair, arrest, or death.

• Low damage: ATM/CHK2 phosphorylate p53 (Ser15, Ser20), disrupting MDM2 binding → p53 stabilisation → p21 transcription → G1 arrest via CDK2 inhibition
• Moderate damage: Sustained p53 activation induces GADD45 (NER stimulation), p53R2 (ribonucleotide reductase for dNTP supply), and DDB2/XPC (NER factors)
• Severe/irreparable damage: Prolonged/high-level p53 activation induces pro-apoptotic targets: BAX, PUMA, NOXA (mitochondrial apoptosis), FAS, DR5 (death receptor pathway)
• Senescence: Persistent DDR signalling drives irreversible cell cycle exit via p21 → Rb hypophosphorylation and SASP (senescence-associated secretory phenotype)

$$\text{DSB} \xrightarrow{\text{MRN}} \text{ATM}^* \xrightarrow{} \text{CHK2}^* \xrightarrow{} \text{p53 stabilisation} \xrightarrow{} \begin{cases} \text{p21} \to \text{G1 arrest} \\ \text{GADD45} \to \text{repair} \\ \text{BAX/PUMA} \to \text{apoptosis} \end{cases}$$

Clinical Relevance

BRCA1/2 and Hereditary Breast/Ovarian Cancer

BRCA1 and BRCA2 are essential for homologous recombination. Germline heterozygous mutations confer 50–85% lifetime risk of breast cancer and 15–45% risk of ovarian cancer. Loss of the remaining allele in a somatic cell eliminates HR, forcing reliance on error-prone NHEJ and alt-EJ for DSB repair, accelerating genomic instability.

BRCA1 functions: DSB end resection (with CtIP), 53BP1 displacement, PALB2-BRCA2 recruitment, transcription regulation, and ubiquitin ligase activity (BRCA1-BARD1 heterodimer).
BRCA2 functions: RAD51 mediator — binds RAD51 monomers via 8 BRC repeats and loads them onto ssDNA, displacing RPA. Also stabilises stalled replication forks.

PARP Inhibitors and Synthetic Lethality

Synthetic lethality: loss of two pathways is lethal, but loss of either alone is viable. PARP inhibition in HR-deficient (BRCA1/2-mutant) cells is the paradigmatic example.

Mechanisms of PARP inhibitor cytotoxicity:

Catalytic inhibition: Prevents PARylation, slowing BER/SSBR → SSBs persist and convert to DSBs at replication forks
PARP trapping: PARPi (especially talazoparib, olaparib) stabilise PARP1-DNA complexes on chromatin, creating replication-blocking lesions more toxic than unrepaired SSBs
In HR-proficient cells, DSBs from trapped PARP are repaired by HR → survival
In BRCA1/2-deficient cells, DSBs cannot be repaired by HR → genomic catastrophe → cell death

Approved PARPi: Olaparib (Lynparza), niraparib (Zejula), rucaparib (Rubraca), talazoparib (Talzenna). Indications: BRCA-mutant breast, ovarian, pancreatic, prostate cancer; also HRD-positive tumours (irrespective of BRCA status).

Cisplatin and DNA Crosslinks

Cisplatin (cis-[Pt(NH₃)₂Cl₂]) is a cornerstone chemotherapy drug. It forms intrastrand crosslinks (primarily d(GpG) 1,2-intrastrand, ~65% of adducts) and interstrand crosslinks (~5%) that distort the helix and block replication/transcription.

Repair requires NER (for intrastrand crosslinks) and the Fanconi anaemia/HR pathway (for ICLs). Tumours deficient in these pathways (e.g., BRCA-mutant) are hypersensitive to cisplatin. Resistance mechanisms include upregulated NER (ERCC1 overexpression), increased drug efflux, and enhanced TLS bypass.

Defective MMR in Cancer

Beyond Lynch syndrome, ~15% of sporadic colorectal cancers have MMR deficiency (usually MLH1 promoter hypermethylation). MMR-deficient tumours have distinct features:

• MSI-high phenotype with frameshift mutations in coding microsatellites (TGFβRII, BAX, IGFIIR)
• High tumour mutation burden (TMB) → many neoantigens → strong immune infiltration
• Resistance to 5-fluorouracil (MMR normally mediates 5-FU cytotoxicity by recognising 5-FU:G mismatches)
• Exceptional response to anti-PD-1 immunotherapy (pembrolizumab, nivolumab): FDA approved for all MSI-high/dMMR solid tumours regardless of tissue of origin (tumour-agnostic approval)

Xeroderma Pigmentosum

Genes: XPA-G, XPV

Pathway: NER / TLS

1000x skin cancer risk, UV sensitivity, neurodegeneration

Cockayne Syndrome

Genes: CSA (ERCC8), CSB (ERCC6)

Pathway: TC-NER

Growth failure, neurodegeneration, photosensitivity (no cancer predisposition)

Fanconi Anaemia

Genes: 22 FANC genes (FANCA-W)

Pathway: ICL repair / HR

Bone marrow failure, AML, congenital malformations

Ataxia Telangiectasia

Genes: ATM

Pathway: DDR signalling

Cerebellar ataxia, immunodeficiency, lymphoma, radiation sensitivity

Li-Fraumeni Syndrome

Genes: TP53

Pathway: DDR / apoptosis

~100% lifetime cancer risk: sarcomas, breast, brain, adrenocortical

MUTYH-Associated Polyposis

Genes: MUTYH

Pathway: BER

Colorectal polyposis, G:C→T:A transversions from 8-oxoG bypass

Mathematical Framework

Mutation Rate Calculation

The observed mutation rate per base pair per cell division integrates errors from all sources:

$$\mu = \underbrace{\mu_{\text{pol}}}_{\sim 10^{-4}} \times \underbrace{(1-f_{\text{proof}})}_{\sim 10^{-2}} \times \underbrace{(1-f_{\text{MMR}})}_{\sim 10^{-2}\text{--}10^{-3}} + \underbrace{\mu_{\text{damage}}}_{\text{unrepaired lesions}}$$

$\mu_{\text{damage}}$ depends on the ratio of damage rate to repair capacity:

$$\mu_{\text{damage}} = \sum_{i} \frac{k_{\text{dam},i}}{k_{\text{dam},i} + k_{\text{rep},i}} \times p_{\text{mut},i}$$

Where $p_{\text{mut},i}$ is the probability that lesion type $i$, if unrepaired at replication, causes a mutation.
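A short sketch of how this sum might be evaluated is given below. The lesion classes and every numerical value in `EXAMPLE_CLASSES` are hypothetical placeholders chosen only to exercise the formula; they are not measured rates, and the rate constants are assumed to be per site in a common time unit.

```python
from typing import Iterable, NamedTuple


class LesionClass(NamedTuple):
    name: str
    k_dam: float  # per-site damage rate constant
    k_rep: float  # per-site repair rate constant (same time unit as k_dam)
    p_mut: float  # probability of a mutation if the lesion is unrepaired at replication


def mu_damage(classes: Iterable[LesionClass]) -> float:
    """Sum over lesion types of k_dam / (k_dam + k_rep) * p_mut."""
    return sum(c.k_dam / (c.k_dam + c.k_rep) * c.p_mut for c in classes)


# Hypothetical values, for illustration only.
EXAMPLE_CLASSES = [
    LesionClass("lesion type A", k_dam=1e-9, k_rep=1e-2, p_mut=0.5),
    LesionClass("lesion type B", k_dam=5e-10, k_rep=1e-1, p_mut=0.1),
]

print(f"mu_damage ~ {mu_damage(EXAMPLE_CLASSES):.2e} per bp")
```

Because each term scales as $k_{\text{dam}}/k_{\text{rep}}$ when repair is fast, the lesion type with the slowest repair relative to its formation rate tends to dominate the sum.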

Derivation: Mutation Rate Per Generation

Starting from the sources of replication errors and unrepaired DNA damage, we derive the total mutation rate per cell division.

Step 1: Replication error contribution

The per-base-pair mutation rate from replication errors is the product of three sequential escape probabilities (base selection, proofreading, MMR):

$$\mu_{\text{rep}} = \epsilon_{\text{ins}} \times (1 - f_{\text{proof}}) \times (1 - f_{\text{MMR}}) \approx (10^{-4}\text{--}10^{-5}) \times 10^{-2} \times (10^{-2}\text{--}10^{-3}) = 10^{-8}\text{--}10^{-10}$$

Step 2: Damage-induced mutation contribution

For each type of lesion $i$, the fraction that escapes repair before the replication fork arrives depends on the relative rates of damage and repair. The probability a lesion persists at replication is:

$$p_{\text{unrepaired},i} = \frac{k_{\text{dam},i}}{k_{\text{dam},i} + k_{\text{rep},i}}$$

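This comes from the steady-state solution of a two-state model: if each site becomes damaged with rate constant $k_{\text{dam},i}$ and is restored with rate constant $k_{\text{rep},i}$ (both expressed per site), the fraction of time the site spends damaged equals the damage rate divided by the sum of the damage and repair rates.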
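The two-state picture behind this expression can be checked numerically. The sketch below, under the assumption that a single site flips between intact and damaged states with constant per-site rates, integrates $dp/dt = k_{\text{dam}}(1-p) - k_{\text{rep}}\,p$ with a simple forward-Euler scheme and compares the long-time value with $k_{\text{dam}}/(k_{\text{dam}} + k_{\text{rep}})$. The rate values are arbitrary illustrative numbers.

```python
def damaged_fraction_two_state(k_dam: float, k_rep: float) -> float:
    """Closed-form steady-state probability that a site is damaged."""
    return k_dam / (k_dam + k_rep)


def integrate_damaged_fraction(k_dam: float, k_rep: float,
                               t_end: float, dt: float = 1e-3) -> float:
    """Forward-Euler integration of dp/dt = k_dam*(1 - p) - k_rep*p from p(0) = 0."""
    p = 0.0
    for _ in range(int(round(t_end / dt))):
        p += dt * (k_dam * (1.0 - p) - k_rep * p)
    return p


k_dam, k_rep = 0.2, 1.8  # arbitrary illustrative per-site rates (same time unit)
print("closed form :", damaged_fraction_two_state(k_dam, k_rep))
print("numerical   :", integrate_damaged_fraction(k_dam, k_rep, t_end=10.0))
```

The integration relaxes with rate $k_{\text{dam}} + k_{\text{rep}}$, so running for several multiples of $1/(k_{\text{dam}} + k_{\text{rep}})$ is enough for the two values to agree.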

Step 3: Combine with mutagenic potential

Not all unrepaired lesions cause mutations; some block replication (lethal) and some are bypassed accurately. Multiply by the probability $p_{\text{mut},i}$ that the lesion causes a mutation if unrepaired:

$$\mu_{\text{damage}} = \sum_{i} \frac{k_{\text{dam},i}}{k_{\text{dam},i} + k_{\text{rep},i}} \times p_{\text{mut},i}$$

Step 4: Total per-bp mutation rate

The total mutation rate per bp per division is the sum of replication and damage contributions:

$$\mu_{\text{total}} = \mu_{\text{rep}} + \mu_{\text{damage}}$$

Step 5: Scale to whole-genome mutations per generation

Multiply by genome size $G$ to get total mutations per cell division:

$$M = \mu_{\text{total}} \times G$$

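For humans (diploid genome, $G \approx 6.4 \times 10^9$ bp): $M \approx 10^{-10} \times 6.4 \times 10^9 \approx 0.6$ mutations per division at the low end of the per-bp range, rising to $\approx 6$ at $\mu_{\text{total}} \approx 10^{-9}$. For E. coli: $M \approx 10^{-10} \times 4.6 \times 10^6 \approx 5 \times 10^{-4}$ mutations per generation (one mutation every ~2,000 generations).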
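The scaling in Step 5 is a single multiplication, which the sketch below evaluates for the genome sizes given above at per-bp rates of $10^{-10}$ and $10^{-9}$. The constant and function names are illustrative.

```python
def mutations_per_division(mu_total: float, genome_bp: float) -> float:
    """Expected new mutations per cell division, M = mu_total * G."""
    return mu_total * genome_bp


HUMAN_DIPLOID_BP = 6.4e9  # genome size used in the text
ECOLI_BP = 4.6e6          # genome size used in the text

for mu in (1e-10, 1e-9):
    m_human = mutations_per_division(mu, HUMAN_DIPLOID_BP)
    print(f"human,   mu={mu:.0e}: M ~ {m_human:.2f} per division")

m_ecoli = mutations_per_division(1e-10, ECOLI_BP)
print(f"E. coli, mu=1e-10: M ~ {m_ecoli:.1e} per generation, "
      f"about one mutation every {1 / m_ecoli:.0f} generations")
```

The human figure is sensitive to the per-bp rate: a tenfold change in $\mu_{\text{total}}$ moves $M$ from below one to several mutations per division.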

Step 6: Drake's rule

John Drake (1991) observed that the per-genome mutation rate is remarkably constant across DNA-based microbes:

$$\mu \times G \approx 0.003\text{--}0.004 \text{ per genome per generation}$$

This $1/G$ scaling of per-bp rate suggests that the fidelity machinery has evolved to maintain a near-constant genomic mutation rate, balancing the cost of higher fidelity against the benefit of reduced deleterious mutations.
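The $1/G$ scaling can be illustrated by inverting Drake's rule to get the per-bp rate it would imply for a given genome size. In the sketch below, the per-genome constant 0.0035 is simply the midpoint of the range quoted above, and the genome sizes other than E. coli are approximate, commonly cited values supplied here as assumptions.

```python
def drake_per_bp_rate(genome_bp: float, per_genome_rate: float = 0.0035) -> float:
    """Per-bp mutation rate implied by a constant per-genome rate (mu = constant / G)."""
    return per_genome_rate / genome_bp


# Approximate genome sizes in bp; only the E. coli value appears in the text.
GENOMES_BP = {
    "phage lambda": 4.85e4,
    "E. coli": 4.6e6,
    "S. cerevisiae": 1.2e7,
}

for organism, g in GENOMES_BP.items():
    mu = drake_per_bp_rate(g)
    print(f"{organism:14s} G = {g:.2e} bp -> implied mu ~ {mu:.1e} per bp per generation")
```

The implied per-bp rate falls by roughly two orders of magnitude from the phage to the yeast genome, which is the inverse relationship the rule describes.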

Repair Kinetics: Michaelis-Menten for Glycosylases

When a repair enzyme processes lesions at rate $v$:

$$v = \frac{k_{\text{cat}} [E]_0 [L]}{K_M + [L]}$$

At steady state, damage formation = repair:

$$k_{\text{dam}} = \frac{k_{\text{cat}} [E]_0 [L]_{\text{ss}}}{K_M + [L]_{\text{ss}}}$$

Solving for steady-state lesion level:

$$[L]_{\text{ss}} = \frac{k_{\text{dam}} \cdot K_M}{k_{\text{cat}} [E]_0 - k_{\text{dam}}}$$

Valid only when $k_{\text{cat}} [E]_0 > k_{\text{dam}}$ (repair capacity exceeds damage rate). If damage rate exceeds $V_{\max}$, lesions accumulate without bound.

Derivation: Michaelis-Menten Kinetics for DNA Repair Enzymes

We start from the elementary reaction scheme in which an enzyme (E) binds a substrate lesion (L) to form a complex (EL) that yields product (P, the processed DNA plus the released free base).

Step 1: Write the reaction scheme

A repair glycosylase E binds a damaged base (lesion L) in DNA to form an enzyme-substrate complex:

$$E + L \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} EL \xrightarrow{k_{\text{cat}}} E + P$$

where P is the abasic site product (for monofunctional glycosylases) or the nicked DNA (for bifunctional glycosylases with AP lyase activity).

Step 2: Apply the steady-state approximation

Assume $d[\text{EL}]/dt = 0$ (the EL complex concentration changes slowly compared to its formation and breakdown):

$$k_1[E][L] = (k_{-1} + k_{\text{cat}})[\text{EL}]$$

$$[\text{EL}] = \frac{[E][L]}{K_M} \quad \text{where } K_M = \frac{k_{-1} + k_{\text{cat}}}{k_1}$$

Step 3: Use the enzyme conservation equation

Total enzyme is partitioned between free and bound forms: $[E]_0 = [E] + [\text{EL}]$. Substituting:

$$[E] = [E]_0 - [\text{EL}] = [E]_0 - \frac{[E][L]}{K_M}$$

Solving for [E]: $[E] = \frac{[E]_0 \cdot K_M}{K_M + [L]}$

Step 4: Derive the rate equation

The repair rate is $v = k_{\text{cat}}[\text{EL}]$. Substituting the expressions:

$$v = \frac{k_{\text{cat}}[E]_0[L]}{K_M + [L]} = \frac{V_{\max}[L]}{K_M + [L]}$$

where $V_{\max} = k_{\text{cat}}[E]_0$ is the maximum repair rate at enzyme saturation.

Step 5: Solve for steady-state lesion level

At steady state, damage formation rate equals repair rate: $k_{\text{dam}} = v$:

$$k_{\text{dam}} = \frac{k_{\text{cat}}[E]_0[L]_{\text{ss}}}{K_M + [L]_{\text{ss}}}$$

Rearranging: $k_{\text{dam}}(K_M + [L]_{\text{ss}}) = k_{\text{cat}}[E]_0[L]_{\text{ss}}$

$$[L]_{\text{ss}} = \frac{k_{\text{dam}} \cdot K_M}{k_{\text{cat}}[E]_0 - k_{\text{dam}}} = \frac{k_{\text{dam}} \cdot K_M}{V_{\max} - k_{\text{dam}}}$$

Step 6: Biological interpretation

This result has critical implications: (1) A finite steady state exists only when $V_{\max} > k_{\text{dam}}$ (repair capacity exceeds damage rate). (2) As $k_{\text{dam}} \to V_{\max}$, $[L]_{\text{ss}} \to \infty$ — the system cannot cope. (3) Cells with reduced repair enzyme levels (e.g., haploinsufficiency in heterozygous carriers of repair gene mutations) have a higher steady-state lesion burden, explaining cancer predisposition.
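Point (3) can be made tangible with a small calculation. The sketch below evaluates $[L]_{\text{ss}} = k_{\text{dam}} K_M / (V_{\max} - k_{\text{dam}})$ for a full and a halved enzyme level. All numbers are in arbitrary units and are hypothetical, chosen only to show how nonlinear the response is as $V_{\max}$ approaches $k_{\text{dam}}$.

```python
import math


def lesion_steady_state_mm(k_dam: float, k_cat: float,
                           e_total: float, k_m: float) -> float:
    """Steady-state lesion level under Michaelis-Menten repair.

    Returns infinity when damage meets or exceeds Vmax = k_cat * [E]0,
    because no finite steady state exists in that regime.
    """
    v_max = k_cat * e_total
    if k_dam >= v_max:
        return math.inf
    return k_dam * k_m / (v_max - k_dam)


# Hypothetical parameters in arbitrary units.
K_DAM, K_CAT, K_M = 40.0, 1.0, 10.0

for label, e_total in (("wild type", 100.0), ("heterozygous, 50% enzyme", 50.0),
                       ("40% enzyme", 40.0)):
    l_ss = lesion_steady_state_mm(K_DAM, K_CAT, e_total, K_M)
    print(f"{label:26s} [E]0 = {e_total:5.1f} -> [L]ss = {l_ss:.2f}")
```

In this illustration, halving the enzyme raises the steady-state lesion level about sixfold rather than twofold, and a further modest reduction removes the finite steady state altogether.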

Lesion Accumulation Dynamics

$$\frac{d[L]}{dt} = k_{\text{dam}} - \frac{k_{\text{cat}} [E]_0 [L]}{K_M + [L]}$$

When $[L] \ll K_M$: approximately first-order repair, $\frac{d[L]}{dt} \approx k_{\text{dam}} - \frac{k_{\text{cat}} [E]_0}{K_M} [L]$

Half-life to reach steady state: $t_{1/2} = \frac{K_M \ln 2}{k_{\text{cat}} [E]_0}$

Derivation: DNA Damage Kinetics — First-Order Decay of Lesions

Starting from the regime where lesion concentration is low ($[L] \ll K_M$), the Michaelis-Menten repair simplifies to first-order kinetics, allowing an analytical solution.

Step 1: Linearize the M-M rate at low substrate

When $[L] \ll K_M$, the denominator $K_M + [L] \approx K_M$, so:

$$v_{\text{repair}} \approx \frac{V_{\max}}{K_M}[L] = k_{\text{eff}}[L]$$

where $k_{\text{eff}} = k_{\text{cat}}[E]_0/K_M$ is the effective first-order repair rate constant.

Step 2: Write the ODE for lesion dynamics

The rate of change of lesion number includes both constant damage production and first-order repair:

$$\frac{d[L]}{dt} = k_{\text{dam}} - k_{\text{eff}}[L]$$

Step 3: Solve the ODE (repair only, no new damage)

If damage production stops (e.g., UV source removed) and we track only repair of existing lesions starting from $N_0$ lesions:

$$\frac{dN}{dt} = -k_{\text{eff}} \cdot N \implies N(t) = N_0 \cdot e^{-k_{\text{eff}} t}$$

This exponential decay is observed experimentally: CPD removal by NER shows first-order kinetics with $t_{1/2} \approx 4\text{--}12$ hours in human cells.
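
To get a feel for what these half-lives imply, the sketch below converts a half-life into $k_{\text{eff}}$ and evaluates the fraction of lesions remaining after a given time. The function names and the 24-hour horizon are assumptions made for illustration.

```python
import math


def rate_constant_from_half_life(t_half):
    """k_eff = ln 2 / t_half (same time units as t_half)."""
    return math.log(2) / t_half


def fraction_remaining(t, t_half):
    """N(t)/N0 = exp(-k_eff * t) for repair with no new damage."""
    return math.exp(-rate_constant_from_half_life(t_half) * t)


# Half-lives spanning the 4-12 h range quoted above
for t_half in (4.0, 8.0, 12.0):
    frac = fraction_remaining(24.0, t_half)
    print(f"t1/2 = {t_half:4.1f} h: {frac:.3f} of CPDs remain after 24 h")
```

For an 8-hour half-life, 24 hours is three half-lives, so one eighth of the initial CPDs remain.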

Step 4: Solve the full ODE with constant damage

The general solution with initial condition $[L](0) = [L]_0$:

$$[L](t) = \frac{k_{\text{dam}}}{k_{\text{eff}}} + \left([L]_0 - \frac{k_{\text{dam}}}{k_{\text{eff}}}\right) e^{-k_{\text{eff}} t}$$

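As $t \to \infty$: $[L] \to k_{\text{dam}}/k_{\text{eff}} = k_{\text{dam}} K_M / V_{\max}$ (the steady state). This agrees with the exact Michaelis-Menten result $k_{\text{dam}} K_M / (V_{\max} - k_{\text{dam}})$ when $k_{\text{dam}} \ll V_{\max}$, which is the condition under which $[L]$ stays well below $K_M$.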

Step 5: Time constant and half-life

The system relaxes to steady state with time constant $\tau = 1/k_{\text{eff}}$ and half-life:

$$t_{1/2} = \frac{\ln 2}{k_{\text{eff}}} = \frac{K_M \ln 2}{k_{\text{cat}}[E]_0}$$

For 8-oxoG repair by OGG1 in human cells: $t_{1/2} \approx 30$ min, consistent with rapid turnover of oxidative lesions.
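
The closed-form solution from Step 4 and the half-life from Step 5 can be checked against each other. In the sketch below, the function names and parameter values ($k_{\text{dam}} = 10$, $V_{\max} = 600$, $K_M = 50$, in lesions and hours) are assumptions chosen so that $k_{\text{dam}} \ll V_{\max}$ and the first-order approximation is reasonable.

```python
import math


def lesions_linear(t, L0, k_dam, k_eff):
    """Step 4 solution: L(t) = L_ss + (L0 - L_ss) * exp(-k_eff * t)."""
    L_ss = k_dam / k_eff
    return L_ss + (L0 - L_ss) * math.exp(-k_eff * t)


def relaxation_half_life(v_max, Km):
    """Step 5: t_half = Km * ln 2 / (kcat * E0), with v_max = kcat * E0."""
    return Km * math.log(2) / v_max


# Assumed illustrative parameters (lesions, hours)
k_dam, v_max, Km = 10.0, 600.0, 50.0
k_eff = v_max / Km
t_half = relaxation_half_life(v_max, Km)
L_ss = k_dam / k_eff

print(f"k_eff  = {k_eff:.1f} per hour")
print(f"t_half = {t_half * 60:.2f} minutes")
print(f"L(t_half) / L_ss = {lesions_linear(t_half, 0.0, k_dam, k_eff) / L_ss:.3f}")
```

Starting from zero lesions, the system sits at exactly half its steady-state level after one half-life, which is what "half-life to reach steady state" means here.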

Derivation: Photolyase Quantum Yield

Starting from the photophysical process of CPD repair by photolyase, we derive the quantum yield from photon absorption and electron transfer probabilities.

Step 1: Define the photon absorption rate

The rate of photon absorption by the photolyase-CPD complex depends on the light intensity $I$ (photons/cm$^2$/s) and the absorption cross-section $\sigma$ (cm$^2$):

$$k_{\text{abs}} = \sigma \cdot I$$

The antenna chromophore (MTHF or 8-HDF) has $\sigma \approx 10^{-16}$ cm$^2$ at 380 nm (peak absorption).

Step 2: Energy transfer efficiency

The antenna absorbs a photon and transfers energy to the catalytic FADH$^-$ cofactor via Förster resonance energy transfer (FRET). The transfer efficiency depends on the donor-acceptor distance $r$ and the Förster radius $R_0$:

$$\eta_{\text{FRET}} = \frac{R_0^6}{R_0^6 + r^6}$$

In photolyase, $r \approx 17$ Å and $R_0 \approx 25$ Å, which gives $\eta_{\text{FRET}} \approx 0.91$ for these values, at the lower end of the $0.9\text{--}0.97$ range.
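
Because the efficiency depends on the sixth power of distance, it is worth evaluating the formula directly. The sketch below is a minimal illustration; the function name is an assumption, and the distances other than 17 Å are included only to show the sensitivity.

```python
def fret_efficiency(r, R0):
    """Foerster transfer efficiency: R0^6 / (R0^6 + r^6)."""
    return R0**6 / (R0**6 + r**6)


R0 = 25.0  # Foerster radius in Angstroms, from the text
for r in (14.0, 17.0, 25.0, 35.0):  # only r = 17 comes from the text
    print(f"r = {r:4.1f} A: eta_FRET = {fret_efficiency(r, R0):.3f}")
```

At $r = R_0$ the efficiency is 0.5 by definition, and for $r = 17$ Å with $R_0 = 25$ Å the formula gives roughly 0.91.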

Step 3: Electron transfer from FADH$^-$ to CPD

Excited FADH$^{-*}$ donates an electron to the CPD with efficiency $\eta_{\text{ET}}$. The electron injection competes with FADH$^{-*}$ fluorescence and non-radiative decay:

$$\eta_{\text{ET}} = \frac{k_{\text{ET}}}{k_{\text{ET}} + k_{\text{fl}} + k_{\text{nr}}}$$

Ultrafast spectroscopy shows $k_{\text{ET}} \approx 10^{12}$ s$^{-1}$ (picosecond timescale), far faster than fluorescence ($k_{\text{fl}} \sim 10^8$ s$^{-1}$), so $\eta_{\text{ET}} > 0.99$.
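
The competition between electron transfer and the decay channels can be sketched as a branching ratio. The text gives no value for $k_{\text{nr}}$, so the values tried below are purely hypothetical and serve only to show how large non-radiative decay would have to be before $\eta_{\text{ET}}$ drops noticeably.

```python
def et_efficiency(k_et, k_fl, k_nr):
    """Fraction of excited FADH- that injects an electron into the CPD."""
    return k_et / (k_et + k_fl + k_nr)


k_et = 1e12  # s^-1, from the text
k_fl = 1e8   # s^-1, from the text
for k_nr in (0.0, 1e9, 1e10, 1e11):  # hypothetical non-radiative rates (assumed)
    eta = et_efficiency(k_et, k_fl, k_nr)
    print(f"k_nr = {k_nr:8.1e} s^-1: eta_ET = {eta:.4f}")
```

With these rate constants, $\eta_{\text{ET}}$ stays above 0.99 as long as the non-radiative rate is no more than about one percent of $k_{\text{ET}}$.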

Step 4: CPD ring splitting probability

Once the CPD receives an electron, the cyclobutane ring cleaves via a radical anion mechanism with probability $\eta_{\text{split}}$. The competing process is back electron transfer to the FADH$^{\bullet}$ radical before ring cleavage. Experimentally, $\eta_{\text{split}} \approx 0.85\text{--}0.95$.

Step 5: Overall quantum yield

The quantum yield $\Phi$ is the product of all step efficiencies:

$$\Phi = \eta_{\text{FRET}} \times \eta_{\text{ET}} \times \eta_{\text{split}}$$

$$\Phi \approx 0.95 \times 0.99 \times 0.90 \approx 0.85$$

This is remarkably efficient for a photochemical repair process. Measured quantum yields for E. coli CPD photolyase range from 0.7 to 0.9, consistent with this calculation.
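
The product in Step 5 can be evaluated across the ranges quoted in Steps 2 to 4 to see how wide the predicted quantum yield is. This is a minimal sketch; the function name is an assumption, and $\eta_{\text{ET}}$ is held at 0.99 for simplicity.

```python
def quantum_yield(eta_fret, eta_et, eta_split):
    """Overall quantum yield as the product of the three step efficiencies."""
    return eta_fret * eta_et * eta_split


central = quantum_yield(0.95, 0.99, 0.90)  # values used in Step 5
low = quantum_yield(0.90, 0.99, 0.85)      # lower ends of the quoted ranges
high = quantum_yield(0.97, 0.99, 0.95)     # upper ends of the quoted ranges

print(f"Central estimate: {central:.3f}")
print(f"Range:            {low:.3f} to {high:.3f}")
```

The resulting range, roughly 0.76 to 0.91, overlaps the measured 0.7 to 0.9 quoted above.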

Step 6: Repair rate under sunlight

The overall photoreactivation rate is:

$$k_{\text{repair}} = \sigma \cdot I \cdot \Phi \cdot f_{\text{bound}}$$

where $f_{\text{bound}}$ is the fraction of CPDs bound by photolyase ($= [\text{PL-CPD}] / [\text{CPD}]_{\text{total}}$). Under bright sunlight and saturating photolyase, CPDs can be repaired with a half-life of minutes, compared to hours for NER.
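
A rough order-of-magnitude sketch of this rate follows. The cross-section and quantum yield come from the steps above, but the photon flux and the bound fractions are assumed illustrative values that do not appear in the text, so the printed half-lives should be read as a sensitivity check rather than a prediction.

```python
import math


def photoreactivation_rate(sigma_cm2, photon_flux, phi, f_bound):
    """k_repair = sigma * I * Phi * f_bound, in s^-1."""
    return sigma_cm2 * photon_flux * phi * f_bound


sigma = 1e-16       # cm^2, antenna cross-section (Step 1)
phi = 0.85          # overall quantum yield (Step 5)
photon_flux = 1e16  # photons/cm^2/s: ASSUMED illustrative value, not from the text

for f_bound in (0.01, 0.1, 1.0):  # ASSUMED fractions of CPDs bound by photolyase
    k = photoreactivation_rate(sigma, photon_flux, phi, f_bound)
    t_half = math.log(2) / k
    print(f"f_bound = {f_bound:4.2f}: k_repair = {k:.2e} s^-1, t1/2 = {t_half:.1f} s")
```

Under these assumptions the half-life scales inversely with $f_{\text{bound}}$, so the number of photolyase molecules relative to CPDs, rather than the photochemistry, sets how quickly repair proceeds.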

Python: DNA Damage Accumulation vs Repair Simulation

This simulation models the accumulation and repair of DNA lesions using Michaelis-Menten enzyme kinetics. It plots steady-state lesion levels for different repair pathway efficiencies and shows the time-course of lesion dynamics.


#!/usr/bin/env python3
"""
DNA Damage Accumulation vs Repair Simulation
Models lesion dynamics using Michaelis-Menten enzyme kinetics.
"""
import numpy as np
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt

# ============================================================
# 1. TIME-COURSE: LESION ACCUMULATION WITH M-M REPAIR
# ============================================================
def lesion_dynamics(t_max, dt, k_dam, kcat, E0, Km):
    """Simulate lesion accumulation with Michaelis-Menten repair."""
    steps = int(t_max / dt)
    t = np.zeros(steps)
    L = np.zeros(steps)
    L[0] = 0.0  # start with no lesions
    for i in range(1, steps):
        t[i] = t[i-1] + dt
        v_repair = kcat * E0 * L[i-1] / (Km + L[i-1]) if L[i-1] > 0 else 0
        dL = k_dam - v_repair
        L[i] = max(0, L[i-1] + dL * dt)
    return t, L

# Parameters (units: lesions, hours)
k_dam = 500.0     # lesions per hour (~12,000 per day)
Km = 50.0         # lesions (Michaelis constant)
kcat_E0_base = 600.0  # Vmax = kcat * [E0] (lesions/hr at full capacity)

# Simulate for different repair capacities
fig, axes = plt.subplots(2, 2, figsize=(14, 11))
fig.patch.set_facecolor('#0f172a')
for ax in axes.flat:
    ax.set_facecolor('#1e293b')

# Panel A: Time course for different Vmax values
ax = axes[0, 0]
vmax_fractions = [1.0, 0.7, 0.5, 0.3, 0.15]
colors = ['#4ade80', '#facc15', '#fb923c', '#f87171', '#c084fc']
labels = ['100% (WT)', '70%', '50%', '30%', '15% (severe)']

for frac, color, label in zip(vmax_fractions, colors, labels):
    t, L = lesion_dynamics(24, 0.01, k_dam, frac * kcat_E0_base, 1.0, Km)
    ax.plot(t, L, color=color, linewidth=2, label=f'Vmax={label}')

ax.set_xlabel('Time (hours)', color='white', fontsize=11)
ax.set_ylabel('Lesions per cell', color='white', fontsize=11)
ax.set_title('A) Lesion Accumulation Over 24 Hours', color='white', fontsize=13, fontweight='bold')
ax.legend(fontsize=9, facecolor='#1e293b', edgecolor='#475569', labelcolor='white')
ax.tick_params(colors='white')
ax.spines['bottom'].set_color('#475569')
ax.spines['left'].set_color('#475569')
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)
ax.set_ylim(bottom=0)

# Panel B: Steady-state lesion level vs Vmax
ax = axes[0, 1]
vmax_range = np.linspace(k_dam * 1.01, k_dam * 5, 200)
L_ss = k_dam * Km / (vmax_range - k_dam)
ax.plot(vmax_range / k_dam, L_ss, color='#4ade80', linewidth=2.5)
ax.axhline(y=Km, color='#facc15', linestyle='--', alpha=0.5, label=f'Km = {Km}')
ax.axvline(x=1.0, color='#f87171', linestyle='--', alpha=0.5, label='Vmax = damage rate (critical)')
ax.set_xlabel('Vmax / Damage Rate', color='white', fontsize=11)
ax.set_ylabel('Steady-state lesions', color='white', fontsize=11)
ax.set_title('B) Steady-State Lesions vs Repair Capacity', color='white', fontsize=13, fontweight='bold')
ax.legend(fontsize=9, facecolor='#1e293b', edgecolor='#475569', labelcolor='white')
ax.tick_params(colors='white')
ax.spines['bottom'].set_color('#475569')
ax.spines['left'].set_color('#475569')
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)
ax.set_ylim(0, 500)

# Panel C: Multiple damage types with independent repair pathways
ax = axes[1, 0]
damage_types = {
    'Oxidative (8-oxoG)': {'k_dam': 420, 'kcat_E0': 500, 'Km': 40, 'color': '#4ade80'},
    'Depurination (AP sites)': {'k_dam': 210, 'kcat_E0': 800, 'Km': 30, 'color': '#38bdf8'},
    'Deamination (U in DNA)': {'k_dam': 20, 'kcat_E0': 300, 'Km': 20, 'color': '#c084fc'},
    'UV damage (CPDs)': {'k_dam': 100, 'kcat_E0': 150, 'Km': 60, 'color': '#facc15'},
    'Alkylation (O6-meG)': {'k_dam': 10, 'kcat_E0': 50, 'Km': 15, 'color': '#fb923c'},
}

bar_names = []
bar_ss = []
bar_colors = []
for name, params in damage_types.items():
    if params['kcat_E0'] > params['k_dam']:
        ss = params['k_dam'] * params['Km'] / (params['kcat_E0'] - params['k_dam'])
    else:
        ss = float('inf')
    bar_names.append(name.split('(')[0].strip())
    bar_ss.append(min(ss, 500))
    bar_colors.append(params['color'])

bars = ax.barh(bar_names, bar_ss, color=bar_colors, edgecolor='white', linewidth=0.5, height=0.6)
for bar, val in zip(bars, bar_ss):
    ax.text(bar.get_width() + 2, bar.get_y() + bar.get_height()/2,
            f'{val:.1f}', va='center', color='white', fontsize=10)
ax.set_xlabel('Steady-state lesions per cell', color='white', fontsize=11)
ax.set_title('C) Steady-State Lesions by Damage Type', color='white', fontsize=13, fontweight='bold')
ax.tick_params(colors='white')
ax.spines['bottom'].set_color('#475569')
ax.spines['left'].set_color('#475569')
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)

# Panel D: Mutation probability vs repair efficiency
ax = axes[1, 1]
repair_eff = np.linspace(0.0, 0.999, 500)
p_mut_base = 0.01  # probability a lesion causes mutation if unrepaired at replication
k_dam_total = 12000  # lesions per day
genome_size = 6.4e9

# Mutations per division (assuming 1 replication per day)
# Fraction unrepaired = 1 - repair_efficiency
mutations_per_div = k_dam_total * (1 - repair_eff) * p_mut_base
mu_per_bp = mutations_per_div / genome_size

ax.semilogy(repair_eff * 100, mutations_per_div, color='#f87171', linewidth=2.5, label='Mutations per cell division')
ax.axhline(y=1.0, color='#facc15', linestyle='--', alpha=0.6, label='~1 mutation / division (observed)')
ax.axvline(x=99.9, color='#4ade80', linestyle='--', alpha=0.4)
ax.set_xlabel('Overall repair efficiency (%)', color='white', fontsize=11)
ax.set_ylabel('Mutations per cell division', color='white', fontsize=11)
ax.set_title('D) Mutations vs Repair Efficiency', color='white', fontsize=13, fontweight='bold')
ax.legend(fontsize=9, facecolor='#1e293b', edgecolor='#475569', labelcolor='white')
ax.tick_params(colors='white')
ax.spines['bottom'].set_color('#475569')
ax.spines['left'].set_color('#475569')
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)
ax.set_xlim(90, 100)
ax.set_ylim(0.01, 200)

plt.tight_layout(pad=2.0)
plt.savefig('output.png', dpi=150, bbox_inches='tight', facecolor='#0f172a')
plt.close()
print("=== DNA Damage & Repair Simulation Results ===")
print(f"\nDamage rate: {k_dam_total} lesions/cell/day")
print(f"Genome size: {genome_size:.1e} bp")
print(f"\nSteady-state lesion levels by damage type:")
for name, params in damage_types.items():
    if params['kcat_E0'] > params['k_dam']:
        ss = params['k_dam'] * params['Km'] / (params['kcat_E0'] - params['k_dam'])
    else:
        ss = float('inf')
    print(f"  {name}: {ss:.1f} lesions at steady state")
print(f"\nTo maintain ~1 mutation/division from {k_dam_total} daily lesions:")
print(f"  Required repair efficiency: {(1 - 1/(k_dam_total*p_mut_base))*100:.4f}%")
print(f"  Fraction escaping repair: {1/(k_dam_total*p_mut_base):.2e}")
print("\nPlot saved to output.png")


Fortran: Mutation Accumulation Over Cell Generations

This Fortran program models the accumulation of mutations over many cell generations with different DNA repair fidelities. It simulates how deficiencies in specific repair pathways (MMR, BER, NER, HR) raise the per-generation mutation rate, so that mutations accumulate linearly but with a steeper slope, and it reports the generation at which a critical mutation threshold is reached, if that happens within the simulated window.


program mutation_accumulation
  ! ================================================================
  ! Mutation Accumulation Over Cell Generations
  ! Models how DNA repair fidelity affects mutation buildup.
  ! Each generation: new_mutations = damage_rate * (1 - repair_eff) * p_mut
  ! Mutations accumulate additively across generations.
  ! ================================================================
  implicit none

  integer, parameter :: dp = selected_real_kind(15, 307)
  integer, parameter :: n_gen = 500         ! number of cell generations
  integer, parameter :: n_scenarios = 6     ! number of repair scenarios
  real(dp), parameter :: genome_bp = 6.4d9  ! human genome size (bp)
  real(dp), parameter :: damage_per_gen = 12000.0_dp  ! DNA lesions per replication cycle
  real(dp), parameter :: p_mut_if_unrepaired = 0.01_dp ! mutation prob per unrepaired lesion

  ! Repair efficiencies for each scenario
  ! (fraction of lesions successfully repaired before replication)
  character(len=30) :: scenario_names(n_scenarios)
  real(dp) :: repair_eff(n_scenarios)
  real(dp) :: mutations(n_scenarios, 0:n_gen)
  real(dp) :: mu_per_gen(n_scenarios)
  real(dp) :: mu_per_bp_per_gen(n_scenarios)
  integer  :: critical_gen(n_scenarios)  ! generation reaching 1000 mutations
  real(dp), parameter :: critical_threshold = 1000.0_dp

  integer :: i, g
  real(dp) :: new_muts

  ! Define scenarios
  scenario_names(1) = 'Wild type (all repair OK)'
  scenario_names(2) = 'MMR-deficient'
  scenario_names(3) = 'BER-deficient'
  scenario_names(4) = 'NER-deficient (XP)'
  scenario_names(5) = 'HR-deficient (BRCA2-/-)'
  scenario_names(6) = 'Combined MMR+BER defect'

  ! Overall repair efficiencies
  ! WT: polymerase + proofreading + MMR + BER/NER collectively repair ~99.99%
  ! With 12000 lesions/generation and p_mut = 0.01: mut/gen = 120 * (1 - repair_eff)
  repair_eff(1) = 0.99992_dp   ! WT: ~0.01 mut per division in this model
  repair_eff(2) = 0.9992_dp    ! MMR loss: modeled as 10x the WT rate
  repair_eff(3) = 0.9985_dp    ! BER loss: many oxidative lesions persist
  repair_eff(4) = 0.9990_dp    ! NER loss: UV/bulky adducts persist
  repair_eff(5) = 0.9988_dp    ! HR loss: DSBs misrepaired by NHEJ
  repair_eff(6) = 0.9970_dp    ! Combined: synergistic defect

  ! Calculate mutations per generation for each scenario
  do i = 1, n_scenarios
    mu_per_gen(i) = damage_per_gen * (1.0_dp - repair_eff(i)) * p_mut_if_unrepaired
    mu_per_bp_per_gen(i) = mu_per_gen(i) / genome_bp
  end do

  ! Simulate accumulation over generations
  critical_gen = -1
  do i = 1, n_scenarios
    mutations(i, 0) = 0.0_dp
    do g = 1, n_gen
      new_muts = mu_per_gen(i)
      mutations(i, g) = mutations(i, g-1) + new_muts
      if (critical_gen(i) < 0 .and. mutations(i, g) >= critical_threshold) then
        critical_gen(i) = g
      end if
    end do
  end do

! ============================================================
  ! Output results
  ! ============================================================
  write(*,'(A)') '================================================================='
  write(*,'(A)') '  MUTATION ACCUMULATION OVER CELL GENERATIONS'
  write(*,'(A)') '  DNA Repair Fidelity Model'
  write(*,'(A)') '================================================================='
  write(*,'(A,ES10.2)') '  Genome size (bp):          ', genome_bp  ! ES format: 6.4e9 overflows F10.1
  write(*,'(A,F10.1)') '  Lesions per generation:    ', damage_per_gen
  write(*,'(A,ES10.2)') '  P(mut|unrepaired):         ', p_mut_if_unrepaired
  write(*,'(A,I6)')    '  Generations simulated:     ', n_gen
  write(*,'(A)')       ''

write(*,'(A)') '-----------------------------------------------------------------'
  write(*,'(A)') '  Scenario                  | Repair Eff | Mut/Gen | Mut/bp/Gen'
  write(*,'(A)') '-----------------------------------------------------------------'
  do i = 1, n_scenarios
    write(*,'(2X,A30,A,F8.5,A,F8.3,A,ES10.2)') &
      scenario_names(i), ' | ', repair_eff(i), ' | ', mu_per_gen(i), ' | ', mu_per_bp_per_gen(i)
  end do
  write(*,'(A)') '-----------------------------------------------------------------'
  write(*,'(A)') ''

  write(*,'(A)') 'Generations to reach 1000 accumulated mutations:'
  write(*,'(A)') '-----------------------------------------------------------------'
  do i = 1, n_scenarios
    if (critical_gen(i) > 0) then
      write(*,'(2X,A30,A,I6,A)') scenario_names(i), ': ', critical_gen(i), ' generations'
    else
      write(*,'(2X,A30,A,I6,A)') scenario_names(i), ': >', n_gen, ' generations'
    end if
  end do
  write(*,'(A)') ''

  ! Print mutation counts at selected generations
  write(*,'(A)') 'Accumulated mutations at key generations:'
  write(*,'(A)') '-----------------------------------------------------------------'
  write(*,'(A6)', advance='no') 'Gen'
  do i = 1, n_scenarios
    write(*,'(A14)', advance='no') scenario_names(i)(1:13)
  end do
  write(*,*) ''
  write(*,'(A)') '-----------------------------------------------------------------'

  do g = 0, n_gen, 50
    write(*,'(I5,1X)', advance='no') g
    do i = 1, n_scenarios
      write(*,'(F13.1,1X)', advance='no') mutations(i, g)
    end do
    write(*,*) ''
  end do
  write(*,'(A)') '-----------------------------------------------------------------'
  write(*,'(A)') ''

  ! Fold-increase relative to wild type
  write(*,'(A)') 'Fold-increase in mutation rate relative to wild type:'
  write(*,'(A)') '-----------------------------------------------------------------'
  do i = 2, n_scenarios
    write(*,'(2X,A30,A,F10.1,A)') scenario_names(i), ': ', &
      mu_per_gen(i) / mu_per_gen(1), 'x'
  end do
  write(*,'(A)') '-----------------------------------------------------------------'
  write(*,'(A)') ''
  write(*,'(A)') 'Key insight: Even modest reductions in repair efficiency'
  write(*,'(A)') '(e.g., 99.99% -> 99.9%) cause 10x more mutations per generation,'
  write(*,'(A)') 'reaching cancer-driver thresholds in far fewer divisions.'

end program mutation_accumulation
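
Because the model adds the same number of mutations every generation, the generation at which the threshold is crossed also has a closed form, $g = \lceil \text{threshold} / \mu \rceil$. The Python sketch below evaluates it using the parameter values from the program above; the function names are assumptions, and the result may differ from the loop by one generation in edge cases because the loop accumulates floating-point sums.

```python
import math


def mutations_per_generation(damage_per_gen, repair_eff, p_mut):
    """New mutations per generation: damage * (1 - repair_eff) * p_mut."""
    return damage_per_gen * (1.0 - repair_eff) * p_mut


def generations_to_threshold(mu_per_gen, threshold):
    """Smallest integer g with g * mu_per_gen >= threshold (additive model)."""
    return math.ceil(threshold / mu_per_gen)


# Repair efficiencies copied from the Fortran program above
scenarios = {
    "Wild type": 0.99992,
    "MMR-deficient": 0.9992,
    "BER-deficient": 0.9985,
    "NER-deficient (XP)": 0.9990,
    "HR-deficient (BRCA2-/-)": 0.9988,
    "Combined MMR+BER defect": 0.9970,
}

results = {}
for name, eff in scenarios.items():
    mu = mutations_per_generation(12000.0, eff, 0.01)
    results[name] = generations_to_threshold(mu, 1000.0)
    print(f"{name:26s} mu = {mu:.4f}/gen, threshold at generation {results[name]}")
```

With these parameters every scenario crosses 1000 mutations only after more than 500 generations, so the 500-generation simulation would need a longer run or a lower threshold to show a crossing.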


Summary: DNA Repair Pathways at a Glance

Pathway	Lesion Type	Key Proteins	Fidelity	Disease if Defective
Direct Reversal	O⁶-meG, CPDs	MGMT, Photolyase	Error-free	Glioblastoma sensitivity
BER	Oxidised/deaminated bases, SSBs	OGG1, UNG, APE1, Pol β, Lig III	Error-free	MUTYH polyposis
NER (GG)	Bulky adducts, CPDs, 6-4PPs	XPC, TFIIH, XPF-ERCC1, XPG	Error-free	Xeroderma pigmentosum
NER (TC)	Transcription-blocking lesions	CSA, CSB, TFIIH, XPG	Error-free	Cockayne syndrome
MMR	Mismatches, small IDLs	MSH2/6, MLH1/PMS2, EXO1	Error-free	Lynch syndrome
NHEJ	DSBs	Ku70/80, DNA-PKcs, Lig IV	Error-prone	Severe immunodeficiency
HR	DSBs (S/G2 phase)	MRN, BRCA1/2, RAD51	Error-free	BRCA cancers, Fanconi anaemia
TLS	Replication-blocking lesions	Pol η/ι/κ, Rev1, PCNA-Ub	Error-prone	XP variant (Pol η loss)
