14. Transcription
Transcription converts the DNA blueprint into RNA, the first step of gene expression. RNA polymerase II transcribes protein-coding genes into mRNA, which is extensively processed (capped, spliced, polyadenylated) before export to the cytoplasm, where ribosomes translate it. The regulation of transcription is the primary mechanism controlling cell identity and function.
RNA Polymerase II: The mRNA Machine
Eukaryotes have three nuclear RNA polymerases, each responsible for different RNA types. RNA Polymerase II (Pol II), a 12-subunit enzyme, transcribes all protein-coding genes and most snRNAs. Roger Kornberg determined its structure by X-ray crystallography (Nobel Prize 2006).
Eukaryotic RNA Polymerases
- Pol I: rRNA (28S, 18S, 5.8S), in the nucleolus; ~80% of total RNA
- Pol II: mRNA, snRNA, miRNA, in the nucleoplasm; target of $\alpha$-amanitin
- Pol III: tRNA, 5S rRNA, small RNAs, in the nucleoplasm
The CTD: A Regulatory Platform
Pol II's largest subunit (Rpb1) has a unique C-terminal domain (CTD) consisting of tandem repeats of the consensus heptapeptide Tyr-Ser-Pro-Thr-Ser-Pro-Ser (YSPTSPS).
Humans have 52 repeats; yeast have 26. The phosphorylation state of the CTD coordinates the transcription cycle: Ser-5 phosphorylation (by TFIIH/CDK7) marks initiation and recruits capping enzyme; Ser-2 phosphorylation (by P-TEFb/CDK9) marks productive elongation and recruits splicing and polyadenylation factors.
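As a rough illustration of how the repeat count translates into phosphorylation capacity, the sketch below builds an idealized CTD and counts its Ser-2 and Ser-5 positions. It assumes every repeat matches the consensus heptad YSPTSPS, which is a simplification: real CTDs, particularly the distal human repeats, deviate from consensus. The function and variable names are illustrative only.

```python
# Idealized Pol II CTD: every repeat is ASSUMED to match the consensus heptad.
HEPTAD = "YSPTSPS"  # Tyr1-Ser2-Pro3-Thr4-Ser5-Pro6-Ser7

REPEATS = {"human": 52, "yeast": 26}  # repeat counts quoted in the text


def ctd_summary(n_repeats: int) -> dict:
    """Return length and Ser-2 / Ser-5 site counts for an idealized CTD."""
    ctd = HEPTAD * n_repeats
    # 0-based indices of heptad positions 2 and 5 in each repeat.
    ser2_sites = [i * 7 + 1 for i in range(n_repeats)]
    ser5_sites = [i * 7 + 4 for i in range(n_repeats)]
    # Sanity check: all of these positions really are serines.
    assert all(ctd[i] == "S" for i in ser2_sites + ser5_sites)
    return {
        "length_aa": len(ctd),
        "ser2": len(ser2_sites),
        "ser5": len(ser5_sites),
    }


for species, n in REPEATS.items():
    print(species, ctd_summary(n))
```

Under this idealization each repeat contributes one Ser-2 and one Ser-5 site, so the human CTD offers twice as many docking sites for capping, splicing, and polyadenylation factors as the yeast CTD.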
Derivation 1: Transcription Initiation (Assembling the PIC)
Pol II cannot recognize promoters alone; it requires general transcription factors (GTFs) to form the pre-initiation complex (PIC). The assembly follows an ordered pathway: TFIID/TBP $\rightarrow$ TFIIB $\rightarrow$ Pol II $\rightarrow$ TFIIH.
Key Steps
- TBP (TATA-binding protein): a subunit of TFIID that recognizes the TATA box (consensus: TATAAANN, ~25-30 bp upstream of TSS). TBP inserts into the minor groove and sharply bends the DNA (~80 degrees), creating a platform for PIC assembly.
- TFIIB: bridges TBP and Pol II; recognizes the BRE (TFIIB recognition element) and positions Pol II at the start site.
- TFIIH: a 10-subunit complex with two critical enzymatic activities: CDK7 (Ser-5 kinase for CTD phosphorylation) and XPB/XPD (helicases that melt ~11 bp of DNA at the start site using ATP hydrolysis).
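To make the promoter-recognition step concrete, here is a minimal sketch that scans the region upstream of a transcription start site for a TATA-like core. It assumes a simplified six-base core (TATAAA) and a search window of -40 to -20; both choices, and the function name `find_tata`, are illustrative and not a validated promoter predictor.

```python
import re

# Simplified core of the TATAAANN consensus (ASSUMPTION for illustration).
TATA_CORE = re.compile("TATAAA")


def find_tata(seq: str, tss_index: int, window=(-40, -20)):
    """Return positions, relative to the TSS, of TATA cores inside the window.

    seq        -- coding-strand DNA, 5'->3'
    tss_index  -- 0-based index of the +1 nucleotide in seq
    window     -- (upstream, downstream) bounds relative to the TSS
    """
    start = max(0, tss_index + window[0])
    end = max(0, tss_index + window[1])
    hits = []
    for match in TATA_CORE.finditer(seq[start:end].upper()):
        # The base just upstream of the TSS is -1, so index - tss_index works.
        hits.append(start + match.start() - tss_index)
    return hits


# Toy promoter: a TATA core placed 30 bp upstream of the TSS.
toy = "G" * 20 + "TATAAAAG" + "C" * 22 + "A" + "T" * 10
print(find_tata(toy, tss_index=50))
```

Real promoters are better described by position weight matrices than by an exact-match pattern, but the exact match is enough to show how the -25 to -30 placement is measured.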
Energetics of Open Complex Formation
TFIIH's helicase activity uses ATP to separate the DNA strands at the transcription start site, forming the open complex.
The energy from ATP hydrolysis is more than sufficient to melt the DNA. After forming the initial RNA transcript (~10 nt), Pol II clears the promoter and enters productive elongation. Many genes experience promoter-proximal pausing at +20 to +60, regulated by NELF/DSIF; P-TEFb (CDK9) phosphorylates these factors and CTD Ser-2 to release Pol II into productive elongation.
Prokaryotic vs Eukaryotic Transcription
The fundamental chemistry of transcription is conserved across all domains of life, but the regulatory mechanisms differ dramatically between prokaryotes and eukaryotes:
Bacterial RNA Polymerase
The bacterial RNAP holoenzyme ($\alpha_2\beta\beta'\omega\sigma$) is a single enzyme responsible for all RNA synthesis. The $\sigma$ factor determines promoter specificity:
- $\sigma^{70}$: housekeeping genes; recognizes -10 (TATAAT, Pribnow box) and -35 (TTGACA) elements
- $\sigma^{32}$: heat shock genes; activated upon temperature upshift
- $\sigma^{54}$: nitrogen-regulated genes; requires activator (NtrC) and ATP hydrolysis for open complex formation
- $\sigma^{28}$: flagellar genes; regulated by anti-$\sigma$ factor FlgM
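The $\sigma^{70}$ consensus elements can be turned into a simple promoter scan. The sketch below scores every position of a sequence against the -35 (TTGACA) and -10 (TATAAT) hexamers; the 16-18 bp spacer range is a commonly cited value that is not stated in the text above, and the scoring scheme (count of matching bases, maximum 12) is a deliberately crude assumption.

```python
MINUS35 = "TTGACA"  # -35 consensus from the text
MINUS10 = "TATAAT"  # -10 consensus (Pribnow box) from the text


def matches(site: str, consensus: str) -> int:
    """Number of positions at which site agrees with the consensus."""
    return sum(a == b for a, b in zip(site, consensus))


def best_sigma70_site(seq: str, spacers=(16, 17, 18)):
    """Return (score, start_of_minus35, spacer) for the best site, or None.

    The spacer range is an ASSUMPTION (typical sigma-70 promoters).
    """
    seq = seq.upper()
    best = None
    for i in range(len(seq)):
        for spacer in spacers:
            j = i + 6 + spacer  # start of the candidate -10 hexamer
            if j + 6 > len(seq):
                continue
            score = matches(seq[i:i + 6], MINUS35) + matches(seq[j:j + 6], MINUS10)
            if best is None or score > best[0]:
                best = (score, i, spacer)
    return best


# Toy promoter with perfect hexamers separated by a 17-bp spacer.
toy_promoter = "GGCC" + MINUS35 + ("CG" * 9)[:17] + MINUS10 + "GGGG"
print(best_sigma70_site(toy_promoter))
```

A perfect score of 12 is rare in natural promoters; most functional sites match the consensus only partially, which is one way promoter strength is tuned.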
Rifampicin, a key antibiotic for tuberculosis, binds the $\beta$ subunit and blocks transcript extension beyond 2-3 nucleotides. Its selectivity for bacterial RNAP over eukaryotic Pol II (which differs structurally in the rifampicin-binding pocket) enables therapeutic use. Resistance mutations in rpoB are the primary mechanism of rifampicin resistance in Mycobacterium tuberculosis.
Key Differences: Prokaryotic vs Eukaryotic Transcription
- Coupling: in bacteria, transcription and translation are coupled (ribosomes bind mRNA as it is being transcribed); in eukaryotes, they are spatially separated (nucleus vs cytoplasm)
- mRNA processing: bacterial mRNAs are not capped, spliced, or polyadenylated in the eukaryotic manner; most eukaryotic mRNAs undergo all three
- Chromatin: bacteria lack histones (most use HU/IHF for DNA organization); eukaryotes must deal with nucleosomal barriers
- Regulation complexity: bacterial operons typically respond to 1-2 signals; eukaryotic enhancers can integrate dozens of signals
- Promoter recognition: bacterial $\sigma$ factor binds directly; eukaryotic Pol II requires 6 general transcription factors (GTFs)
Derivation 2: Promoter Architecture and Enhancers
Gene expression is controlled by the interplay of core promoter elements (within ~40 bp of TSS), proximal promoter elements (~100-200 bp upstream), and distal regulatory elements (enhancers and silencers, up to 1 Mb away).
Core Promoter Elements
- TATA box (TATAAANN, position -25 to -30): present in ~10-20% of human promoters; associated with tissue-specific, tightly regulated genes
- Initiator (Inr) (YYANWYY, overlaps TSS): can function independently of TATA box
- DPE (downstream promoter element, +28 to +33): works cooperatively with Inr in TATA-less promoters
- CpG islands: GC-rich regions (~60% GC, ~1 kb) found at ~70% of human promoters, especially housekeeping genes. Unmethylated CpG islands recruit CXXC domain proteins and maintain an open chromatin state.
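CpG islands are usually identified computationally from GC content and the observed/expected CpG ratio. The sketch below computes both for a sequence. The thresholds in `looks_like_cpg_island` (GC fraction above 0.5, observed/expected above 0.6, length of at least 200 bp) are the classic criteria often used in practice; they are not taken from the text, and the function names are illustrative.

```python
def cpg_stats(seq: str) -> dict:
    """Return GC fraction and observed/expected CpG ratio for a DNA sequence."""
    seq = seq.upper()
    n = len(seq)
    c_count, g_count = seq.count("C"), seq.count("G")
    cpg_count = seq.count("CG")  # CpG dinucleotides on this strand
    gc_fraction = (c_count + g_count) / n if n else 0.0
    # Expected CpG count if C and G were placed independently.
    expected = c_count * g_count / n if n else 0.0
    obs_exp = cpg_count / expected if expected else 0.0
    return {"length": n, "gc_fraction": gc_fraction, "cpg_obs_exp": obs_exp}


def looks_like_cpg_island(seq: str, min_len=200, min_gc=0.5, min_oe=0.6) -> bool:
    """Apply commonly used thresholds (ASSUMED defaults, not from the text)."""
    stats = cpg_stats(seq)
    return (
        stats["length"] >= min_len
        and stats["gc_fraction"] > min_gc
        and stats["cpg_obs_exp"] > min_oe
    )


print(cpg_stats("CGCGCGCG"))
print(looks_like_cpg_island("CG" * 150))
print(looks_like_cpg_island("AT" * 150))
```

Bulk mammalian DNA has an observed/expected ratio well below 1 because methylated cytosines are prone to deamination, which is why unmethylated islands stand out.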
Enhancers: Long-Range Regulatory Elements
Enhancers are ~200-1000 bp sequences that activate transcription independent of distance (up to 1 Mb) and orientation. They were first discovered by Walter Schaffner in 1981 (SV40 enhancer). Enhancers contain clusters of transcription factor binding sites and communicate with promoters through DNA looping, mediated by:
- Mediator complex: a ~30-subunit complex that bridges enhancer-bound activators to Pol II at the promoter
- Cohesin: ring-shaped complex that holds enhancer-promoter loops together
- CTCF: insulator protein that defines topological domain boundaries (TADs), restricting enhancer-promoter interactions
The quantitative effect of multiple enhancers on transcription rate depends on how their contributions combine, and different models have been proposed.
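Two simple limiting cases are offered here as an illustrative assumption, not as the original formulas: additive combination, where each enhancer contributes its own increment, and multiplicative (synergistic) combination, where fold-effects compound. The sketch compares them for hypothetical fold-activation values.

```python
from math import prod


def additive_rate(basal: float, fold_effects) -> float:
    """Each enhancer adds its own increment over basal: basal * (1 + sum(f - 1))."""
    return basal * (1 + sum(f - 1 for f in fold_effects))


def multiplicative_rate(basal: float, fold_effects) -> float:
    """Fold-effects compound: basal * f1 * f2 * ..."""
    return basal * prod(fold_effects)


# Hypothetical enhancers giving 3-fold and 4-fold activation on their own.
folds = [3.0, 4.0]
print("additive:", additive_rate(1.0, folds))
print("multiplicative:", multiplicative_rate(1.0, folds))
```

With a single enhancer the two models coincide; they diverge only when several enhancers act together, which is why combinations of enhancers are needed to tell them apart experimentally.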
Derivation 3: mRNA Processing (Cap, Splice, Tail)
Eukaryotic pre-mRNA undergoes three essential processing events, all coupled to transcription via the Pol II CTD.
5' Capping
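The 5' cap is added co-transcriptionally when the nascent RNA is ~20-30 nt long. The 7-methylguanosine cap (m$^7$G) is added in three steps:
- RNA 5'-triphosphatase removes the $\gamma$-phosphate from the first nucleotide of the transcript
- Guanylyltransferase transfers GMP from GTP to the resulting diphosphate end, forming the 5'-5' triphosphate linkage
- Guanine-N7 methyltransferase methylates the added guanine at N7, using S-adenosylmethionine as the methyl donor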
The cap protects mRNA from 5' exonuclease degradation, promotes ribosome binding (via eIF4E), and facilitates nuclear export.
Splicing: The Spliceosome
Most eukaryotic genes are split into exons (expressed sequences) and introns (intervening sequences), discovered by Phillip Sharp and Richard Roberts in 1977 (Nobel Prize 1993). Introns are removed by the spliceosome, a massive ribonucleoprotein complex (~3 MDa) containing 5 snRNAs (U1, U2, U4, U5, U6) and >100 proteins.
Intron splice sites are defined by conserved sequences:
- 5' splice site: GU (donor), recognized by U1 snRNA
- Branch point: conserved A (adenine), 18-40 nt upstream of 3' SS, recognized by U2 snRNA
- 3' splice site: AG (acceptor), preceded by a polypyrimidine tract that is recognized by U2AF
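The splicing mechanism involves two transesterification reactions:
- Step 1: the 2'-OH of the branch-point adenosine attacks the phosphate at the 5' splice site, releasing the upstream exon and forming the intron lariat
- Step 2: the free 3'-OH of the upstream exon attacks the phosphate at the 3' splice site, joining the two exons and releasing the lariat intron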
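The net outcome of splicing, exons joined and introns removed, can be sketched in a few lines. The function below models only that end result, not the lariat chemistry, and checks the GU...AG rule on each annotated intron. The toy sequences and the name `splice` are hypothetical.

```python
def splice(pre_mrna: str, introns) -> str:
    """Remove introns given as (start, end) 0-based half-open intervals.

    Raises ValueError if an intron does not begin with GU and end with AG.
    """
    rna = pre_mrna.upper().replace("T", "U")  # accept DNA or RNA input
    mature, cursor = [], 0
    for start, end in sorted(introns):
        intron = rna[start:end]
        if not (intron.startswith("GU") and intron.endswith("AG")):
            raise ValueError(f"intron {start}-{end} violates the GU...AG rule")
        mature.append(rna[cursor:start])  # keep the exon before this intron
        cursor = end
    mature.append(rna[cursor:])  # keep the final exon
    return "".join(mature)


# Toy pre-mRNA: two exons flanking one intron with a branch-point-like A.
EXON1 = "AUGGCC"
INTRON = "GUAAGUACUAACUUUUCCAG"
EXON2 = "GAAUAA"
pre = EXON1 + INTRON + EXON2
intron_coords = [(len(EXON1), len(EXON1) + len(INTRON))]
print(splice(pre, intron_coords))
```

Because the coordinates are supplied rather than predicted, the sketch sidesteps the hard part of the problem: the two-base GU and AG signals are far too short to define splice sites on their own, which is why the branch point, polypyrimidine tract, and regulatory proteins matter.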
3' Polyadenylation
The 3' end of mRNA is defined not by Pol II termination, but by cleavage and polyadenylation. The polyadenylation signal (AAUAAA) is recognized by CPSF (cleavage and polyadenylation specificity factor), and the pre-mRNA is cleaved roughly 10-30 nt downstream of the signal.
Poly(A) polymerase (PAP) adds ~200 adenylate residues. The poly(A) tail protects from 3' exonuclease degradation and is bound by poly(A)-binding protein (PABP), which promotes translation initiation by circularizing the mRNA (via interaction with eIF4G).
Alternative Splicing: Expanding the Proteome
Alternative splicing allows a single gene to encode multiple protein isoforms by varying which exons are included in the mature mRNA. An estimated ~95% of human multi-exon genes undergo alternative splicing, generating a proteome far larger than the ~20,000 genes in the genome.
Types of Alternative Splicing
- Exon skipping (cassette exon): most common type (~40%); an exon is included or excluded entirely
- Alternative 5' splice site: two different 5' splice sites within an exon; changes the 3' boundary of the upstream exon
- Alternative 3' splice site: two different 3' splice sites; changes the 5' boundary of the downstream exon
- Intron retention: an intron is retained in the mature mRNA (common in plants; less so in mammals)
- Mutually exclusive exons: one of two exons is always included, but never both
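The combinatorial potential of exon skipping is easy to quantify: n independent cassette exons can in principle yield 2^n isoforms. The sketch below enumerates them for a hypothetical gene; the exon names are placeholders, and real genes rarely use every combination.

```python
from itertools import product


def cassette_isoforms(first_exon, cassettes, last_exon):
    """Yield exon lists for every include/skip combination of cassette exons."""
    for choice in product([True, False], repeat=len(cassettes)):
        middle = [exon for exon, keep in zip(cassettes, choice) if keep]
        yield [first_exon, *middle, last_exon]


# Hypothetical gene: constitutive exons E1 and E5, cassette exons E2-E4.
isoforms = list(cassette_isoforms("E1", ["E2", "E3", "E4"], "E5"))
print(len(isoforms), "possible isoforms")
for iso in isoforms:
    print("-".join(iso))
```

Mutually exclusive exons would be modeled differently, as a choice of exactly one exon from a set rather than an independent include/skip decision for each.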
Regulation of Splicing
Splicing is regulated by splicing regulatory elements (exonic/intronic splicing enhancers and silencers) bound by SR proteins (which promote exon inclusion) and hnRNP proteins (which promote exon skipping).
The combinatorial action of dozens of splicing regulators, together with RNA secondary structure, transcription speed (kinetic coupling model), and chromatin state, determines tissue-specific and developmental-stage-specific splicing patterns. Approximately 15% of disease-causing point mutations affect splicing. The true fraction is probably higher, because mutations in exonic splicing enhancers are often classified as "synonymous" yet have pathological consequences through aberrant splicing.
Clinical Example: Spinal Muscular Atrophy
SMA is caused by homozygous loss of the SMN1 gene. The nearly identical SMN2 gene differs by a single C $\rightarrow$ T transition in exon 7 that disrupts an exonic splicing enhancer, causing ~90% exon 7 skipping.
Three groundbreaking therapies target this: nusinersen (antisense oligonucleotide redirecting SMN2 splicing), onasemnogene abeparvovec (gene therapy replacing SMN1), and risdiplam (small molecule splicing modifier). These represent the therapeutic potential of understanding splicing mechanisms.
Derivation 4: Chromatin Remodeling and the Histone Code
DNA in the nucleus is packaged into chromatin: 147 bp of DNA wrapped ~1.65 turns around a histone octamer (two copies each of H2A, H2B, H3, and H4) forms the nucleosome. This packaging is inherently repressive: Pol II and GTFs cannot access nucleosomal DNA without chromatin remodeling.
Histone Modifications: Writers, Readers, Erasers
The "histone code" hypothesis (Strahl and Allis, 2000) proposes that combinations of histone tail modifications are "read" by effector proteins to determine transcriptional state:
Activating Marks
- H3K4me3: marks active promoters; read by TAF3 (TFIID subunit) and CHD1 chromodomain
- H3K27ac: marks active enhancers; written by p300/CBP acetyltransferase
- H3K36me3: marks actively transcribed gene bodies; written by SETD2, recruited by elongating Pol II
Repressive Marks
- H3K27me3: Polycomb repressive mark; written by PRC2 (EZH2), read by PRC1 (chromodomain)
- H3K9me3: constitutive heterochromatin; written by SUV39H1, read by HP1 (chromodomain)
Acetylation and Transcription
Histone acetylation neutralizes the positive charge of lysine residues, weakening the histone-DNA interaction.
The change in electrostatic interaction energy can be estimated: each acetylated lysine removes one positive charge, which lowers the DNA-histone binding energy.
With ~8-12 acetylatable lysines per nucleosome, full acetylation significantly destabilizes the nucleosome, facilitating Pol II passage. This explains why histone acetyltransferases (HATs) are transcriptional coactivators and histone deacetylases (HDACs) are corepressors.
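An order-of-magnitude feel for the electrostatic contribution can be obtained from a screened Coulomb (Debye-Hückel) estimate for a single lysine-phosphate charge pair. This is a sketch under strong assumptions that are not from the text: point charges, bulk-water permittivity, 150 mM monovalent salt, and a hypothetical 0.7 nm charge separation. It is not a substitute for a structural calculation.

```python
import math

E_CHARGE = 1.602176634e-19  # elementary charge, C
EPS0 = 8.8541878128e-12     # vacuum permittivity, F/m
K_B = 1.380649e-23          # Boltzmann constant, J/K
N_A = 6.02214076e23         # Avogadro constant, 1/mol


def bjerrum_length_nm(eps_r=78.5, temp_k=298.15):
    """Separation at which two unit charges interact with energy kT."""
    return 1e9 * E_CHARGE**2 / (4 * math.pi * EPS0 * eps_r * K_B * temp_k)


def debye_length_nm(ionic_strength_molar=0.15, eps_r=78.5, temp_k=298.15):
    """Electrostatic screening length for a monovalent salt solution."""
    ions_per_m3 = ionic_strength_molar * 1000 * N_A
    kappa_sq = 2 * ions_per_m3 * E_CHARGE**2 / (EPS0 * eps_r * K_B * temp_k)
    return 1e9 / math.sqrt(kappa_sq)


def pair_energy_kT(r_nm=0.7):
    """Screened attraction of one +1/-1 pair, in units of kT (r_nm is ASSUMED)."""
    return (bjerrum_length_nm() / r_nm) * math.exp(-r_nm / debye_length_nm())


kT_kJ_per_mol = K_B * 298.15 * N_A / 1000
energy = pair_energy_kT()
print(f"Bjerrum length: {bjerrum_length_nm():.2f} nm")
print(f"Debye length:   {debye_length_nm():.2f} nm")
print(f"Per-lysine estimate: {energy:.2f} kT ({energy * kT_kJ_per_mol:.2f} kJ/mol)")
```

Under these assumptions each neutralized lysine is worth a fraction of kT, so no single acetylation releases the DNA; the effect becomes significant only when many lysines are modified together, in line with the cumulative argument in the text.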
Transcription Elongation: Speed, Pausing, and Regulation
After promoter clearance, Pol II enters the elongation phase. Elongation is not a simple continuous process: it involves regulated pausing, backtracking, and arrest that serve critical regulatory and quality control functions.
Promoter-Proximal Pausing
At many genes (~30-40% in humans), Pol II pauses 20-60 bp downstream of the TSS. This promoter-proximal pausing is maintained by NELF (negative elongation factor) and DSIF (DRB sensitivity-inducing factor).
Pausing serves multiple functions: it maintains an "open" promoter (nucleosome-free) for rapid re-initiation, provides a checkpoint for mRNA capping (the cap must be added before pause release), and allows rapid, synchronized induction of signal-responsive genes (the rate-limiting step shifts from PIC assembly to pause release).
Elongation Rate and Co-Transcriptional Processing
Pol II elongation rate in vivo is approximately 1-4 kb/min (~17-67 nt/sec), though it varies significantly along the gene body. For long genes, transcription takes considerable time.
This means the dystrophin gene takes nearly an entire cell cycle to transcribe, so it is unlikely to be fully expressed during rapid cell division. Elongation rate also affects alternative splicing: slower elongation gives upstream splice sites more time to be recognized, favoring exon inclusion (the "kinetic coupling" model). Drugs that affect elongation rate (e.g., camptothecin, which causes Pol II pausing at Top1 cleavage sites) can alter splicing patterns genome-wide.
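The time needed to transcribe a gene is simply its length divided by the elongation rate. The sketch below applies this to dystrophin, assuming a gene length of roughly 2,400 kb (an approximate figure that is not given in the text) and the 1-4 kb/min range quoted above.

```python
def kb_per_min_to_nt_per_s(rate_kb_per_min: float) -> float:
    """Convert an elongation rate from kb/min to nt/s."""
    return rate_kb_per_min * 1000 / 60


def transcription_time_hours(gene_length_kb: float, rate_kb_per_min: float) -> float:
    """Time for one polymerase to traverse the gene, ignoring pauses."""
    return gene_length_kb / rate_kb_per_min / 60


DYSTROPHIN_KB = 2400  # ASSUMED approximate gene length

for rate in (1.0, 2.4, 4.0):
    hours = transcription_time_hours(DYSTROPHIN_KB, rate)
    print(f"{rate} kb/min ({kb_per_min_to_nt_per_s(rate):.0f} nt/s): {hours:.1f} h")
```

Across the quoted rate range the estimate spans roughly 10 to 40 hours, which is comparable to or longer than a typical mammalian cell cycle.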
Transcription Termination
Pol II transcription termination for mRNA genes is coupled to 3' end processing. Two models have been proposed:
- Torpedo model: after cleavage at the poly(A) site, the 5' end of the downstream RNA is unprotected. The exonuclease Rat1/XRN2 degrades this RNA from the 5' end, eventually catching up with and dislodging Pol II, like a torpedo.
- Allosteric model: passage through the poly(A) signal causes conformational changes in the Pol II elongation complex, reducing processivity and leading to spontaneous dissociation.
Current evidence supports a unified model incorporating both mechanisms: allosteric changes slow Pol II after the poly(A) site, while Rat1/XRN2 catches up and provides the final "push" for dissociation. Importantly, Pol II often transcribes thousands of base pairs beyond the poly(A) cleavage site before terminating; this "readthrough transcription" can interfere with downstream genes if termination is defective.
Transcription and RNA Processing in Disease
Mutations affecting transcription and RNA processing cause a remarkable diversity of human diseases, underscoring the complexity and importance of gene expression regulation.
Thalassemias
Mutations in globin gene regulatory elements or splice sites cause quantitative defects in globin chain production. A single G $\rightarrow$ A mutation at the first nucleotide of IVS-1 in the $\beta$-globin gene abolishes the normal 5' splice site, causing $\beta^0$-thalassemia. Cryptic splice site activation produces aberrant mRNA species that are degraded by NMD.
Cockayne Syndrome
Mutations in CSA or CSB proteins impair transcription-coupled NER (TC-NER), the preferential repair of lesions on the transcribed strand of active genes. Patients show UV sensitivity, growth failure, neurodegeneration, and premature aging without increased cancer risk (unlike XP).
TERT Promoter Mutations in Cancer
Somatic mutations in the TERT promoter (C228T, C250T) create de novo ETS transcription factor binding sites, reactivating telomerase expression. They are found in ~80% of glioblastomas, 70% of melanomas, and 60% of hepatocellular carcinomas, and are the most common non-coding mutations in cancer.
Rett Syndrome
Mutations in MECP2 (methyl-CpG binding protein 2) cause this severe neurodevelopmental disorder (X-linked, primarily affects girls). MECP2 reads DNA methylation and recruits NCoR/HDAC complexes for gene silencing. Loss of MECP2 derepresses thousands of genes in neurons, disrupting synaptic function.
Derivation 5: DNA Methylation and Epigenetic Inheritance
In mammals, ~70-80% of CpG dinucleotides are methylated at the 5-position of cytosine. DNA methylation is generally associated with transcriptional silencing and is maintained through cell division by DNMT1 (maintenance methyltransferase).
Maintenance vs De Novo Methylation
- DNMT1 (maintenance): recognizes hemimethylated DNA after replication and methylates the new strand, preserving the methylation pattern through cell division; this is the basis of epigenetic inheritance
- DNMT3A/3B (de novo): establish new methylation patterns during embryonic development; mutations in DNMT3B cause ICF syndrome (immunodeficiency, centromeric instability, facial anomalies)
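The logic of maintenance methylation can be illustrated with a small simulation that follows one lineage through successive divisions. It is a deliberately simplified sketch: each methylated CpG is restored after replication with a per-site probability `fidelity` (a hypothetical parameter standing in for DNMT1 efficiency), a missed site is treated as lost, and de novo methylation is ignored.

```python
import random


def replicate(pattern, fidelity, rng):
    """One round of replication followed by maintenance methylation.

    pattern  -- list of bools, True = CpG methylated
    fidelity -- probability that DNMT1 restores a hemimethylated site
    """
    # Unmethylated sites stay unmethylated: DNMT1 only copies existing marks.
    return [site and (rng.random() < fidelity) for site in pattern]


def propagate(pattern, fidelity, divisions, seed=0):
    """Follow one lineage through several cell divisions."""
    rng = random.Random(seed)  # seeded for reproducibility
    for _ in range(divisions):
        pattern = replicate(pattern, fidelity, rng)
    return pattern


start = [True, False, True, True, False]
print("perfect DNMT1:", propagate(start, fidelity=1.0, divisions=10))
print("no DNMT1:     ", propagate(start, fidelity=0.0, divisions=1))
print("95% fidelity: ", propagate([True] * 20, fidelity=0.95, divisions=10))
```

With perfect fidelity the pattern is inherited indefinitely, while any fidelity below 1 erodes it over time unless de novo methyltransferases replace lost marks.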
Mechanisms of Silencing
DNA methylation silences transcription through two mechanisms:
- Direct: 5mC in the major groove blocks binding of some transcription factors
- Indirect: methyl-CpG binding proteins (MeCP2, MBD1-4) recruit HDACs and chromatin remodelers, establishing repressive chromatin
Active demethylation occurs via the TET enzymes (TET1/2/3), which oxidize 5mC to 5-hydroxymethylcytosine (5hmC), then further to 5-formylcytosine (5fC) and 5-carboxylcytosine (5caC), which are removed by base excision repair: 5mC $\rightarrow$ 5hmC $\rightarrow$ 5fC $\rightarrow$ 5caC $\rightarrow$ C.
Pervasive Transcription and the Non-Coding Transcriptome
The ENCODE project (2012) reported that roughly three-quarters of the human genome is transcribed in at least one cell type, far exceeding the ~2% that encodes proteins. This pervasive transcription produces tens of thousands of non-coding RNA species with diverse regulatory functions.
Enhancer RNAs (eRNAs)
Active enhancers are bidirectionally transcribed, producing short (~0.5-2 kb), unstable, non-polyadenylated enhancer RNAs (eRNAs). eRNAs appear to facilitate enhancer-promoter looping and Mediator recruitment.
eRNA production is now widely used as a marker for identifying active enhancers genome-wide (as opposed to poised enhancers, which have H3K4me1 but not H3K27ac and lack eRNA transcription).
Circular RNAs (circRNAs)
A surprising discovery of the 2010s was that thousands of genes produce circular RNAs: covalently closed loops formed by backsplicing (joining of a downstream 5' splice site to an upstream 3' splice site). circRNAs are resistant to exonuclease degradation and can function as miRNA sponges (e.g., CDR1as/ciRS-7 sequesters miR-7), protein scaffolds, or translational templates via IRES-dependent mechanisms.
Antisense Transcription
Many protein-coding genes have natural antisense transcripts (NATs), RNA molecules transcribed from the opposite strand. NATs can regulate their sense counterparts through RNA:RNA duplex formation (triggering RNAi or blocking translation), transcriptional interference (convergent Pol II complexes collide), or chromatin modification (recruiting PRC2 to silence the sense gene). An estimated 30-40% of human genes have associated NATs.
Mitochondrial Transcription
The mitochondrial genome (mtDNA, 16.6 kb circular DNA in humans) encodes 13 essential subunits of the oxidative phosphorylation system (electron transport chain complexes and ATP synthase), 22 tRNAs, and 2 rRNAs. Mitochondrial transcription differs fundamentally from nuclear transcription:
- Single-subunit RNA polymerase (POLRMT): related to T7 phage RNA polymerase, not to nuclear Pol II
- Polycistronic transcription: both strands are transcribed as long polycistronic precursors that are processed by RNase P and ELAC2 to release individual tRNAs and mRNAs ("tRNA punctuation model")
- No introns in human mtDNA; minimal non-coding sequence; the genetic code differs slightly from the universal code (e.g., UGA = Trp, not Stop)
- TFAM (mitochondrial transcription factor A): both a transcription factor and a major mtDNA packaging protein, coating the entire mtDNA molecule
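The consequence of the variant genetic code is easy to demonstrate: the same RNA translated with the standard and the vertebrate mitochondrial tables gives different proteins. In the sketch below only the UGA = Trp reassignment comes from the text; the other vertebrate mitochondrial differences (AUA = Met, AGA and AGG = Stop) are added from the standard reference table. The example sequence is hypothetical.

```python
from itertools import product

BASES = "UCAG"
# Standard code, codons ordered UUU, UUC, UUA, UUG, UCU, ... GGG.
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Vertebrate mitochondrial code: four codons are reassigned.
VERT_MITO = dict(STANDARD, UGA="W", AUA="M", AGA="*", AGG="*")


def translate(rna: str, table: dict) -> str:
    """Translate from the first codon until a stop codon or the end."""
    rna = rna.upper().replace("T", "U")
    protein = []
    for i in range(0, len(rna) - 2, 3):
        aa = table[rna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


message = "AUGUGAAUAUAA"  # codons: AUG UGA AUA UAA
print("standard:     ", translate(message, STANDARD))
print("mitochondrial:", translate(message, VERT_MITO))
```

This is why mitochondrial genes cannot simply be moved to the nucleus and expressed: a nuclear-encoded copy would terminate at the first UGA unless it were recoded.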
Mitochondrial transcription is regulated primarily at the level of TFAM abundance: more TFAM means more mtDNA copies and more transcription. Mutations in mtDNA or in nuclear-encoded mitochondrial genes cause mitochondrial diseases. mtDNA disorders such as MELAS, MERRF, and Leber hereditary optic neuropathy show maternal inheritance and variable penetrance due to heteroplasmy (a mixture of mutant and wild-type mtDNA).
Applications in Medicine and Research
Splicing Diseases
~15% of disease-causing mutations affect splicing. Spinal muscular atrophy (SMA) results from loss of SMN1; the backup gene SMN2 produces mostly exon 7-skipped mRNA. Nusinersen (Spinraza) is an antisense oligonucleotide that redirects SMN2 splicing to include exon 7, producing functional SMN protein.
Epigenetic Cancer Therapy
DNMT inhibitors (azacitidine, decitabine) reactivate silenced tumor suppressor genes in MDS and AML. HDAC inhibitors (vorinostat, romidepsin) restore normal gene expression in cutaneous T-cell lymphoma. EZH2 inhibitors (tazemetostat) target PRC2 in follicular lymphoma with EZH2 mutations.
$\alpha$-Amanitin Poisoning
$\alpha$-Amanitin from death cap mushrooms (Amanita phalloides) binds Pol II with $K_d \approx 3$ nM, blocking translocation. It causes delayed liver failure (24-48 h) because existing mRNAs must be degraded before transcription arrest becomes lethal. Treatment: supportive care, N-acetylcysteine, liver transplant.
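The quoted $K_d$ can be translated into Pol II occupancy with the standard single-site binding expression, fraction bound = [L] / ([L] + $K_d$). The sketch assumes simple 1:1 equilibrium binding and treats the free toxin concentration as known, which is an idealization of the situation in a poisoned hepatocyte.

```python
def fraction_bound(ligand_nM: float, kd_nM: float = 3.0) -> float:
    """Fraction of Pol II occupied at a given free toxin concentration.

    Assumes 1:1 equilibrium binding; kd_nM defaults to the value in the text.
    """
    return ligand_nM / (ligand_nM + kd_nM)


for conc in (0.3, 3.0, 30.0, 300.0):
    print(f"{conc:6.1f} nM alpha-amanitin -> {fraction_bound(conc):.1%} of Pol II bound")
```

Half of the enzyme is occupied when the free concentration equals $K_d$, and nanomolar concentrations already inhibit most of the Pol II pool, which is consistent with the toxin's potency.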
CDK Inhibitors in Cancer
CDK7 and CDK9 are essential for transcription. THZ1 (CDK7 inhibitor) selectively kills cancer cells addicted to super-enhancer-driven oncogene transcription. Flavopiridol (CDK9 inhibitor) blocks P-TEFb, causing a global transcription shutdown that is selectively toxic to cancer cells dependent on short-lived anti-apoptotic mRNAs (Mcl-1).
Historical Context
François Jacob and Jacques Monod proposed the concept of messenger RNA in 1961, predicting that a transient RNA intermediate carries genetic information from DNA to ribosomes. The discovery of split genes (introns/exons) by Sharp and Roberts in 1977 was one of the most surprising findings in molecular biology, overturning the assumption that genes were contiguous DNA sequences. Roger Kornberg's structural determination of the Pol II transcribing complex (2001) revealed the molecular basis of the transcription mechanism at atomic resolution.
Transcription Factor Mutations in Disease
Mutations in transcription factors cause a wide spectrum of developmental disorders and cancers:
PAX Genes
PAX6 mutations cause aniridia (absent iris); PAX3 mutations cause Waardenburg syndrome (deafness, pigmentation defects). These illustrate how haploinsufficiency of developmental TFs causes congenital malformations.
p53: The Guardian of the Genome
TP53 is mutated in ~50% of all cancers. p53 is activated by DNA damage and activates transcription of p21 (cell cycle arrest), PUMA/BAX (apoptosis), and MDM2 (negative feedback). Li-Fraumeni syndrome (germline TP53 mutations) causes early-onset cancers.
Fusion Transcription Factors
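Chromosomal translocations create oncogenic fusion proteins, many of which are transcription factors: PML-RARA (APL, treated by ATRA) and EWS-FLI1 (Ewing sarcoma) rewire gene expression programs to drive proliferation. BCR-ABL (CML, targeted by imatinib) is the classic example of a translocation-derived oncoprotein, although it is a fusion tyrosine kinase rather than a transcription factor.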
MYC: The Master Oncogene
MYC amplification or translocation occurs in many cancers. MYC binds ~15% of all promoters and amplifies transcription of active genes (rather than activating silent genes). It also promotes RNA Pol I and Pol III transcription, ribosome biogenesis, and metabolic reprogramming.
RNA Modifications: The Epitranscriptome
Beyond DNA and histone modifications, RNA itself carries >170 different chemical modifications. The most abundant internal mRNA modification is N$^6$-methyladenosine (m$^6$A), which affects mRNA splicing, export, stability, and translation. m$^6$A is written by the METTL3/METTL14 complex, erased by FTO and ALKBH5 demethylases, and read by YTHDF1-3 proteins. Dysregulation of m$^6$A is implicated in AML (METTL3), glioblastoma (ALKBH5), and obesity (FTO was the first obesity susceptibility gene identified by GWAS, though its effect is mediated through IRX3/5 regulation rather than m$^6$A directly).
Python Simulations
Promoter Elements and mRNA Processing
Explore the effect of promoter mutations on transcription rate and compare pre-mRNA vs mature mRNA sizes across different genes.
Histone Modifications and Chromatin States
Visualize the epigenetic code: how histone modifications and chromatin acetylation state control gene expression.
Alternative Splicing and CTD Phosphorylation Cycle
Compare alternative splicing frequency across species and visualize the Pol II CTD phosphorylation cycle during the transcription process.
Key Takeaways
- RNA Pol II transcribes mRNA; its CTD coordinates initiation (Ser-5-P), elongation (Ser-2-P), and RNA processing.
- PIC assembly (TFIID/TBP $\rightarrow$ TFIIB $\rightarrow$ Pol II $\rightarrow$ TFIIH) at the core promoter initiates transcription; TFIIH melts DNA using ATP.
- Enhancers activate transcription over long distances via DNA looping, Mediator, and cohesin; CTCF defines TAD boundaries.
- mRNA processing: 5' cap (m$^7$G), splicing (spliceosome removes introns via two transesterification reactions), 3' poly(A) tail.
- The histone code: activating marks (H3K4me3, H3K27ac) and repressive marks (H3K27me3, H3K9me3) are written, read, and erased by specific enzymes.
- DNA methylation (5mC at CpG) silences genes and is maintained through replication by DNMT1; reversed by TET enzymes.
- Alternative splicing (~95% of multi-exon genes) vastly expands proteomic diversity; regulated by SR proteins and hnRNPs.
- Promoter-proximal pausing (NELF/DSIF) allows rapid gene induction upon signal; released by P-TEFb (CDK9).
- Pervasive transcription produces eRNAs (enhancer function), circRNAs (miRNA sponges), and NATs (antisense regulation).
- Mitochondrial transcription uses a single-subunit polymerase (POLRMT); regulated by TFAM abundance; mtDNA uses a variant genetic code.
- Epigenetic therapies (azacitidine, vorinostat, tazemetostat) exploit the reversibility of epigenetic marks in cancer treatment.