10. DNA Replication & Repair
Semiconservative replication, polymerase fidelity, damage recognition, and repair pathways
Semiconservative Replication
In 1958, Meselson and Stahl demonstrated that DNA replication is semiconservative: each daughter duplex contains one parental strand and one newly synthesized strand. They grew E. coli in $^{15}\text{N}$-labeled medium (heavy nitrogen), then shifted to $^{14}\text{N}$ (light) medium and tracked DNA density by CsCl equilibrium density gradient centrifugation.
Meselson-Stahl Results
- Generation 0: All DNA is heavy-heavy ($^{15}\text{N}$-$^{15}\text{N}$), single band at high density.
- Generation 1: All DNA is heavy-light ($^{15}\text{N}$-$^{14}\text{N}$), single band at intermediate density. This ruled out conservative replication.
- Generation 2: Two bands: 50% heavy-light and 50% light-light ($^{14}\text{N}$-$^{14}\text{N}$). This ruled out dispersive replication.
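The band pattern above follows from simple bookkeeping: the two original heavy strands are never destroyed, so exactly two hybrid duplexes persist per starting duplex. The sketch below is a minimal illustration of that reasoning, assuming strictly semiconservative replication and synchronous doubling; the function name `band_fractions` is made up for this example.

```python
from fractions import Fraction

def band_fractions(generation):
    """Return (heavy_heavy, hybrid, light_light) duplex fractions after
    `generation` doublings in light medium, assuming strictly
    semiconservative replication of a fully heavy starting duplex."""
    if generation < 0:
        raise ValueError("generation must be >= 0")
    if generation == 0:
        return (Fraction(1), Fraction(0), Fraction(0))
    # Each of the two parental heavy strands ends up in one hybrid duplex,
    # so 2 of the 2**generation duplexes are heavy-light.
    hybrid = Fraction(2, 2 ** generation)
    return (Fraction(0), hybrid, 1 - hybrid)

for g in range(4):
    hh, hl, ll = band_fractions(g)
    print(f"generation {g}: HH={hh}, HL={hl}, LL={ll}")
```

Generation 1 gives only hybrid DNA and generation 2 gives a 1:1 mix of hybrid and light DNA, matching the observed bands.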
Origin of Replication
Replication initiates at specific sequences called origins of replication. In E. coli, the single origin is oriC (~245 bp), which contains 9-mer repeats (DnaA boxes) that bind the initiator protein and AT-rich 13-mer repeats that facilitate strand separation. Replication proceeds bidirectionally from oriC, with two replication forks moving in opposite directions until they meet at the terminus region (ter sites).
The entire E. coli chromosome (4.6 Mb) is replicated in ~40 minutes.
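The ~40 minute figure can be checked with back-of-the-envelope arithmetic. The sketch below assumes two forks that each move at a constant ~1000 nt/s (the approximate Pol III rate in E. coli) and ignores initiation and termination time; the function name is illustrative.

```python
GENOME_BP = 4.6e6           # E. coli chromosome size
FORK_RATE_NT_PER_S = 1000   # approximate Pol III speed, assumed constant
N_FORKS = 2                 # bidirectional replication from oriC

def replication_time_minutes(genome_bp, fork_rate, n_forks):
    """Time to copy the genome when n_forks share the work equally."""
    seconds = genome_bp / (fork_rate * n_forks)
    return seconds / 60

minutes = replication_time_minutes(GENOME_BP, FORK_RATE_NT_PER_S, N_FORKS)
print(f"Estimated replication time: {minutes:.1f} min")
```

Under these assumptions the estimate is a little under 40 minutes, in line with the value quoted above.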
The Replication Machinery
DNA replication requires a coordinated ensemble of enzymes and proteins. The central enzyme is DNA polymerase III (Pol III) in E. coli, which synthesizes DNA exclusively in the 5'→3' direction, adding nucleotides to the 3'-OH of a pre-existing primer.
Key Replication Proteins (E. coli)
- DnaA: Initiator protein, binds oriC and melts AT-rich region
- DnaB (helicase): Unwinds duplex DNA at the fork, moves 5'→3' on the lagging strand template, powered by ATP hydrolysis
- DnaG (primase): Synthesizes short RNA primers (~10–12 nt) to provide the 3'-OH that DNA polymerase requires
- SSB (single-strand binding protein): Stabilizes single-stranded DNA, prevents reannealing and nuclease degradation
- DNA Pol III holoenzyme: Main replicase, high processivity ($>$500,000 nt without dissociating), speed ~1000 nt/s
- $\beta$-clamp (sliding clamp): Toroidal dimer that encircles DNA, tethers Pol III for processive synthesis
- Clamp loader ($\gamma$ complex): ATP-dependent loading of $\beta$-clamp onto primed DNA
- Topoisomerase (DNA gyrase): Relieves positive supercoiling ahead of the replication fork
- DNA Pol I: Removes RNA primers (5'→3' exonuclease) and fills gaps with DNA
- DNA ligase: Seals nicks by forming phosphodiester bonds (uses NAD$^+$ in prokaryotes, ATP in eukaryotes)
Leading Strand
Synthesized continuously in the 5'→3' direction toward the replication fork. Requires only one RNA primer. The polymerase remains associated with the template.
Lagging Strand
Synthesized discontinuously as Okazaki fragments (~1000–2000 nt in prokaryotes, ~100–200 nt in eukaryotes). Each fragment requires a new RNA primer. The lagging-strand polymerase cycles through a "trombone" loop mechanism.
Each nucleotide addition releases pyrophosphate (PP$_i$), which is rapidly hydrolyzed by pyrophosphatase, making polymerization effectively irreversible.
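To get a feel for how often the lagging-strand machinery must re-prime, the sketch below estimates the number of Okazaki fragments per round of E. coli replication. It assumes that lagging-strand synthesis across both forks totals one genome length (4.6 Mb) and that fragments have a uniform size, which is a simplification.

```python
import math

def okazaki_fragment_count(lagging_strand_nt, fragment_nt):
    """Number of fragments (and RNA primers) needed to cover the
    lagging-strand template, assuming uniform fragment length."""
    if fragment_nt <= 0:
        raise ValueError("fragment_nt must be positive")
    return math.ceil(lagging_strand_nt / fragment_nt)

LAGGING_TOTAL_NT = 4_600_000  # both forks combined (assumption stated above)

for size in (1000, 2000):
    n = okazaki_fragment_count(LAGGING_TOTAL_NT, size)
    print(f"{size} nt fragments -> about {n} fragments per replication")
```

With 1000–2000 nt fragments this comes to a few thousand priming, gap-filling, and ligation events per chromosome copy.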
Proofreading and Fidelity
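DNA replication achieves extraordinary accuracy through three successive layers of error correction: base selection, proofreading, and mismatch repair. Base selection alone gives an error rate of ~10$^{-5}$ per nucleotide, and each subsequent layer improves fidelity by roughly two or more orders of magnitude.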
Three Levels of Replication Fidelity
Base selection: The polymerase active site discriminates correct from incorrect nucleotides based on Watson-Crick geometry. The "induced fit" mechanism checks the shape of the nascent base pair before catalysis. Error rate: ~10$^{-5}$.
Proofreading: DNA Pol III possesses a 3'→5' exonuclease activity (in the $\varepsilon$ subunit). When a mismatched base is incorporated, the distorted primer terminus moves from the polymerase active site to the exonuclease site, where the incorrect nucleotide is excised. The polymerase then resumes synthesis. Cumulative error rate: ~10$^{-7}$.
Mismatch repair: Errors that escape proofreading are corrected after synthesis by the mismatch repair system, which brings the cumulative error rate to ~10$^{-9}$ to 10$^{-10}$.
Example:
For the E. coli genome (4.6 $\times$ 10$^6$ bp), an overall error rate of ~10$^{-10}$ per bp per replication gives ~4.6 $\times$ 10$^{-4}$ mutations per genome per replication, or roughly one mutation per ~2,000 cell divisions. Given a generation time of ~20 min, this translates to roughly one spontaneous mutation per ~700 hours of continuous growth in a single cell lineage.
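The arithmetic behind this example can be laid out explicitly. The sketch below uses approximate cumulative error rates of 10$^{-5}$ (base selection), 10$^{-7}$ (after proofreading), and 10$^{-10}$ (after mismatch repair), and follows a single cell lineage; the variable names are illustrative.

```python
GENOME_BP = 4.6e6       # E. coli genome size
GENERATION_MIN = 20     # approximate generation time

# Approximate cumulative error rates per bp per replication.
cumulative_error = {
    "base selection": 1e-5,
    "+ proofreading": 1e-7,
    "+ mismatch repair": 1e-10,
}

def mutations_per_replication(error_rate, genome_bp=GENOME_BP):
    """Expected number of new mutations each time the genome is copied."""
    return error_rate * genome_bp

for layer, rate in cumulative_error.items():
    m = mutations_per_replication(rate)
    print(f"{layer:>18}: {m:.2e} mutations per replication")

final = mutations_per_replication(cumulative_error["+ mismatch repair"])
divisions_per_mutation = 1 / final
hours_per_mutation = divisions_per_mutation * GENERATION_MIN / 60
print(f"about 1 mutation per {divisions_per_mutation:.0f} divisions "
      f"(~{hours_per_mutation:.0f} h in one lineage)")
```

Without proofreading and mismatch repair the same genome would pick up dozens of mutations per replication; with all three layers the expectation falls well below one per thousand divisions. Rounding conservatively to one mutation per 1000 divisions corresponds to about 330 hours at a 20 minute generation time.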
Eukaryotic Replication
Eukaryotic chromosomes are much larger than prokaryotic genomes and are replicated from multiple origins of replication (hundreds to thousands per chromosome). The human genome uses ~30,000–50,000 origins. Replication is confined to S phase of the cell cycle and is tightly regulated to ensure each origin fires exactly once per cycle.
Eukaryotic DNA Polymerases
- Pol $\alpha$/primase: Initiates replication; synthesizes a short RNA primer followed by ~20 nt of DNA. Low fidelity, no proofreading.
- Pol $\delta$: Primary lagging strand polymerase. Has 3'→5' proofreading. Requires PCNA sliding clamp.
- Pol $\varepsilon$: Primary leading strand polymerase. Has 3'→5' proofreading. Highly processive.
Replication Licensing
The licensing system prevents re-replication by allowing origin firing only once per S phase:
- G1 phase: ORC (Origin Recognition Complex) binds origins. Cdc6 and Cdt1 recruit the MCM2–7 helicase complex, forming the pre-replicative complex (pre-RC); the origin is now "licensed."
- S phase: CDK and DDK phosphorylation activates MCM helicase, firing the origin. Cdc6 is degraded and Cdt1 is inhibited by geminin.
- G2/M: Licensing factors remain inhibited, preventing re-replication until the next G1.
The End-Replication Problem and Telomerase
Because DNA polymerase requires a primer and synthesizes 5'→3', the 5' end of each newly synthesized strand cannot be completed after primer removal. This leads to progressive shortening of chromosome ends (telomeres) with each round of replication, known as the end-replication problem.
Telomerase is a ribonucleoprotein reverse transcriptase that extends telomeric repeats (TTAGGG in humans). It consists of TERT (telomerase reverse transcriptase) and TR/TERC (the RNA template component).
Telomerase is highly active in germ cells, stem cells, and ~85–90% of cancers, but is largely inactive in most somatic cells. Telomere shortening acts as a mitotic clock contributing to replicative senescence (Hayflick limit).
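The "mitotic clock" idea can be illustrated with a toy model. In the sketch below, the telomere loses a fixed number of base pairs per division when telomerase is absent, and telomerase is modeled as appending TTAGGG repeats to a 3' end. The starting length, loss per division, and critical length are hypothetical placeholders chosen only for illustration, not measured values.

```python
TELOMERE_REPEAT = "TTAGGG"  # human telomeric repeat

def extend_3prime_end(strand, n_repeats):
    """Toy telomerase: append n_repeats telomeric repeats to the 3' end
    of a strand written 5'->3'."""
    return strand + TELOMERE_REPEAT * n_repeats

def divisions_until_critical(start_bp, loss_per_division_bp, critical_bp):
    """Count divisions until telomere length reaches the critical length,
    assuming constant loss per division and no telomerase activity."""
    if loss_per_division_bp <= 0:
        raise ValueError("loss_per_division_bp must be positive")
    divisions = 0
    length = start_bp
    while length > critical_bp:
        length -= loss_per_division_bp
        divisions += 1
    return divisions

# Hypothetical numbers, for illustration only.
print(divisions_until_critical(start_bp=10_000,
                               loss_per_division_bp=100,
                               critical_bp=5_000))
print(extend_3prime_end("GGTTAG", 2))
```

The point of the model is qualitative: a fixed loss per division gives a finite division budget unless telomerase adds repeats back.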
DNA Damage
DNA is constantly subjected to chemical modification from both endogenous metabolic processes and exogenous environmental agents. Without repair, damage would rapidly overwhelm genome integrity.
Spontaneous DNA Damage (per human cell per day)
- Depurination: Hydrolytic loss of purine bases, creating apurinic (AP) sites. Rate: ~5,000 events/cell/day. The N-glycosidic bond is cleaved spontaneously under physiological conditions.
- Deamination: Hydrolytic removal of amino groups. Cytosine $\to$ uracil (~100–500/day), adenine $\to$ hypoxanthine, 5-methylcytosine $\to$ thymine (a major source of C→T transition mutations at CpG sites).
- Oxidative damage: Reactive oxygen species (ROS) from aerobic metabolism. Major product: 8-oxoguanine (8-oxoG), which mispairs with adenine causing G:C $\to$ T:A transversions. Estimated ~10,000–100,000 oxidative lesions/cell/day.
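Why an 8-oxoG lesion ends up as a G:C $\to$ T:A transversion takes two rounds of replication to see. The sketch below traces a short sequence through both rounds. It assumes, for simplicity, that 8-oxoG (written here with the made-up symbol `o`) always templates adenine and is never repaired; in cells it can also pair correctly with cytosine.

```python
# Watson-Crick pairing plus the assumed mispair: 8-oxoG ('o') templates A.
PAIRING = {"A": "T", "T": "A", "G": "C", "C": "G", "o": "A"}

def copy_strand(template):
    """Return the strand synthesized on `template`, written 5'->3'."""
    return "".join(PAIRING[base] for base in reversed(template))

original = "ACGT"   # undamaged strand, 5'->3'
damaged = "ACoT"    # the G at index 2 has been oxidized to 8-oxoG

round1 = copy_strand(damaged)  # daughter strand made opposite the lesion
round2 = copy_strand(round1)   # strand made on that daughter in the next round

print("original:", original)
print("round 1 daughter:", round1)
print("round 2 strand  :", round2)
```

Comparing `original` with `round2` shows a T where the G used to be, and the partner strand (`round1`) carries an A where it would have carried a C.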
Induced DNA Damage
- UV radiation: Causes cyclobutane pyrimidine dimers (CPDs) and 6–4 photoproducts between adjacent pyrimidines. CPDs distort the helix and block replication and transcription.
- Alkylating agents: Add alkyl groups to bases (e.g., O$^6$-methylguanine from temozolomide or MNNG). Cause mispairings and cross-links.
- Intercalating agents: Flat aromatic molecules (ethidium bromide, acridine orange) insert between base pairs, causing frameshift mutations during replication.
- Ionizing radiation: X-rays and $\gamma$-rays generate double-strand breaks (DSBs) through free radical intermediates. DSBs are the most cytotoxic lesion.
DNA Repair Mechanisms
Cells possess multiple overlapping repair pathways, each specialized for different types of damage. The undamaged complementary strand typically serves as a template for accurate repair.
Base Excision Repair (BER)
Repairs small base modifications (deamination, oxidation, alkylation). The most common repair pathway.
- DNA glycosylase recognizes and removes the damaged base, cleaving the N-glycosidic bond to create an AP site
- AP endonuclease (APE1) nicks the backbone 5' to the AP site
- DNA polymerase $\beta$ (Pol $\beta$) fills the single-nucleotide gap (short-patch BER) or Pol $\delta$/$\varepsilon$ fills 2–10 nt (long-patch BER)
- DNA ligase III/XRCC1 (short-patch) or ligase I (long-patch) seals the nick
There are ~11 different DNA glycosylases in humans, each specific for particular lesions (e.g., UNG for uracil, OGG1 for 8-oxoG).
Nucleotide Excision Repair (NER)
Repairs bulky, helix-distorting lesions (UV photoproducts, chemical adducts). Removes an oligonucleotide segment containing the damage.
Prokaryotic (UvrABC system)
- UvrA/UvrB scan for damage
- UvrB verifies the lesion and recruits UvrC
- UvrC makes dual incisions: 8 nt 5' and 4–5 nt 3' to lesion
- UvrD (helicase II) removes the 12–13 nt fragment
- Pol I fills the gap; ligase seals
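The incision positions and the 12–13 nt fragment size are related by simple counting. The sketch below is a simplified bookkeeping model, not a description of the enzyme mechanism. It assumes the lesion is a two-nucleotide pyrimidine dimer and that "N nt 5'" or "N nt 3'" refers to the Nth phosphodiester bond counted outward from the lesion. The function and parameter names are hypothetical.

```python
def uvrc_excised_fragment(strand, lesion_start, lesion_len=2,
                          bonds_5prime=8, bonds_3prime=4):
    """Return the oligonucleotide released by dual incision.

    `strand` is written 5'->3' and `lesion_start` is the 0-based index of
    the first damaged nucleotide. Cutting the Nth bond on one side leaves
    N-1 undamaged nucleotides on that side of the lesion in the fragment.
    """
    first = lesion_start - (bonds_5prime - 1)
    last = lesion_start + lesion_len - 1 + (bonds_3prime - 1)
    if first < 0 or last >= len(strand):
        raise ValueError("lesion is too close to the end of the strand")
    return strand[first:last + 1]

# Toy 30-nt strand with a TT dimer at indices 12-13.
strand = "GCATGCATGCAC" + "TT" + "GCATGCATGCATGCAT"

for bonds_3 in (4, 5):
    fragment = uvrc_excised_fragment(strand, 12, bonds_3prime=bonds_3)
    print(bonds_3, fragment, len(fragment))
```

Under this counting convention the two possible 3' incision sites give fragments of 12 and 13 nt, consistent with the sizes listed above.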
Eukaryotic (XP system)
- XPC-HR23B detects damage (global genome NER)
- TFIIH (XPB/XPD helicases) unwinds ~30 bp
- XPG cuts 3' and ERCC1-XPF cuts 5' to damage
- Removes 24–32 nt oligomer
- Pol $\delta$/$\varepsilon$ fills; ligase I seals
Mutations in XP genes cause xeroderma pigmentosum, marked by extreme UV sensitivity and a >1000-fold increased skin cancer risk.
Mismatch Repair (MMR)
Corrects replication errors (mismatches and small insertion/deletion loops) that escape polymerase proofreading.
- E. coli: MutS recognizes the mismatch; MutL mediates signaling; MutH nicks the unmethylated (newly synthesized) strand at the nearest hemimethylated GATC site. The segment is excised and resynthesized.
- Eukaryotes: MSH2/MSH6 (MutS$\alpha$) or MSH2/MSH3 (MutS$\beta$) recognize mismatches. MLH1/PMS2 (MutL$\alpha$) coordinate excision. Strand discrimination uses replication-associated nicks rather than methylation.
Loss of MMR genes (MLH1, MSH2) causes Lynch syndrome (hereditary nonpolyposis colorectal cancer, HNPCC) with microsatellite instability.
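In the E. coli pathway, MutH nicks at the GATC site nearest the mismatch, so locating that site is a simple search problem. The sketch below finds the closest GATC in a toy sequence. It ignores methylation state and strand identity, which the real system uses to choose which strand to nick, and the function name is made up for this example.

```python
def nearest_gatc(seq, mismatch_index):
    """Return the 0-based start index of the GATC site closest to the
    mismatch, or None if the sequence contains no GATC."""
    sites = [i for i in range(len(seq) - 3) if seq[i:i + 4] == "GATC"]
    if not sites:
        return None
    return min(sites, key=lambda i: abs(i - mismatch_index))

# Toy sequence with GATC sites starting at indices 2 and 16.
seq = "AAGATCAAAAAAAAAAGATCAA"
print(nearest_gatc(seq, 13))
print(nearest_gatc(seq, 4))
```

In a real genome the nearest GATC can be hundreds of nucleotides or more from the mismatch, which is why a long stretch of the new strand may be excised and resynthesized.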
Double-Strand Break (DSB) Repair
DSBs are the most dangerous form of DNA damage. Two main pathways:
Homologous Recombination (HR)
- High fidelity (uses sister chromatid as template)
- Active in S/G2 phase
- End resection by MRN complex (Mre11-Rad50-Nbs1)
- Rad51 filament formation (RecA homolog)
- Strand invasion, D-loop, synthesis, resolution
BRCA1/BRCA2 are key HR proteins; their loss causes breast/ovarian cancer susceptibility.
Non-Homologous End Joining (NHEJ)
- Error-prone (may delete or insert bases)
- Active throughout the cell cycle (predominates in G1)
- Ku70/Ku80 bind broken ends
- DNA-PKcs recruited for end processing
- Ligase IV/XRCC4 seal the break
NHEJ is fast but imprecise. It is the primary DSB repair pathway in human cells.
Genetic Recombination
Homologous recombination involves the exchange of genetic information between DNA molecules with extensive sequence similarity. It is essential for meiotic crossover, DSB repair, and rescue of stalled replication forks.
Holliday Model of Recombination
- Strand invasion: A single strand from one duplex invades the homologous duplex, forming a displacement loop (D-loop)
- Holliday junction: A four-way DNA junction connecting the two duplexes at the point of strand exchange
- Branch migration: The junction moves along the DNA, extending the region of heteroduplex
- Resolution: The junction is cleaved by resolvases (RuvC in E. coli). Cutting in two different orientations yields either crossover (flanking markers exchanged) or non-crossover (gene conversion only) products
RecBCD Pathway (E. coli)
The RecBCD enzyme complex processes double-strand breaks to initiate homologous recombination:
- RecBCD binds a blunt DNA end, unwinds and degrades both strands (helicase + nuclease)
- Upon encountering a Chi site (5'-GCTGGTGG-3'), RecBCD activity changes: degradation of the 3'-ending strand stops, and RecA is loaded onto this strand
- The RecA-coated single strand performs strand invasion into the homologous duplex
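Because Chi is an asymmetric 8-nt sequence, it can sit on either strand, and finding candidate sites amounts to searching for the motif and its reverse complement. The sketch below does only that search; it does not model which direction RecBCD approaches from, which also matters for recognition. Function names are illustrative.

```python
CHI = "GCTGGTGG"  # Chi site, written 5'->3'

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def find_all(seq, motif):
    """Return 0-based start indices of all (possibly overlapping) matches."""
    hits, start = [], seq.find(motif)
    while start != -1:
        hits.append(start)
        start = seq.find(motif, start + 1)
    return hits

def find_chi_sites(seq):
    """Return (top_strand_hits, bottom_strand_hits) in top-strand coordinates."""
    return find_all(seq, CHI), find_all(seq, reverse_complement(CHI))

seq = "AAAGCTGGTGGTTTCCACCAGCAA"
print(find_chi_sites(seq))
```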
Site-Specific Recombination
Unlike homologous recombination, site-specific recombination occurs at defined short sequences and does not require extensive homology:
- Integrases: Bacteriophage $\lambda$ integrase catalyzes phage DNA integration into/excision from the host chromosome at att sites
- Cre-lox system: Cre recombinase (from phage P1) recombines DNA at 34-bp loxP sites. Widely used as a tool for conditional gene knockouts in research
- FLP-FRT system: FLP recombinase acts at FRT sites; similar applications to Cre-lox
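The conditional-knockout use of Cre-lox relies on excision of the DNA between two loxP sites in the same orientation. The sketch below models that outcome on a plain string: the "floxed" segment and one loxP copy are removed, and one loxP copy remains. It is a string-level illustration only; it handles just directly repeated sites, not inverted sites (which lead to inversion rather than excision). The function name is hypothetical.

```python
# 34-bp loxP site: two 13-bp inverted repeats flanking an 8-bp spacer.
LOXP = "ATAACTTCGTATA" + "GCATACAT" + "TATACGAAGTTAT"

def cre_excise(seq):
    """Excise the segment between the first two directly repeated loxP
    sites. Returns (remaining_sequence, excised_sequence). The excised
    product is circular in vivo; here it is returned as a linear string."""
    first = seq.find(LOXP)
    if first == -1:
        return seq, ""
    second = seq.find(LOXP, first + len(LOXP))
    if second == -1:
        return seq, ""
    remaining = seq[:first] + seq[second:]   # keeps a single loxP site
    excised = seq[first:second]              # one loxP plus the floxed DNA
    return remaining, excised

floxed = "GGG" + LOXP + "ACGTACGT" + LOXP + "CCC"
remaining, excised = cre_excise(floxed)
print(remaining)
print(excised)
```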
Key Concepts
- Semiconservative replication: Each daughter DNA retains one parental strand (Meselson-Stahl experiment with $^{15}\text{N}/^{14}\text{N}$ density labeling).
- Replication fork: Leading strand is continuous; lagging strand is synthesized as Okazaki fragments. Key players: helicase, primase, DNA Pol III, $\beta$-clamp, SSB, ligase.
- Three tiers of fidelity: Base selection (~10$^{-5}$), 3'→5' proofreading (~10$^{-7}$), mismatch repair (~10$^{-9}$ to 10$^{-10}$).
- Eukaryotic replication: Multiple origins, Pol $\alpha$/$\delta$/$\varepsilon$, PCNA sliding clamp. Licensing (ORC/MCM) ensures once-per-cycle firing.
- Telomerase (TERT + TR) extends TTAGGG repeats, counteracting the end-replication problem. Active in germ cells and most cancers.
- Spontaneous damage: Depurination (~5,000/day), deamination (~100–500/day), oxidative lesions (8-oxoG). Induced: UV dimers, alkylation, DSBs.
- Repair pathways: BER (small base lesions), NER (bulky adducts, XP proteins), MMR (replication mismatches, MutS/L/H), HR and NHEJ (double-strand breaks).
- Homologous recombination: Holliday junction, branch migration, resolution. RecBCD/Chi pathway in prokaryotes. Site-specific recombination: Cre-lox, integrase systems.