Part 8: Gene Expression Regulation

Controlling Gene Activity

Gene regulation determines when, where, and how much of a gene product is made. Every cell in a multicellular organism carries the same ~20,000 protein-coding genes, yet a hepatocyte, a neuron, and a lymphocyte express radically different proteomes. This chapter explores the multi-layered regulatory architecture that achieves such specificity — from prokaryotic operons to eukaryotic chromatin remodeling, epigenetic memory, post-transcriptional control, and signal transduction cascades that relay extracellular information to the genome.

Regulation occurs at every level: transcription initiation (the dominant control point), transcript elongation and processing, mRNA export and stability, translation initiation, and post-translational modification/degradation. Combinatorial logic allows ~1,500 transcription factors to specify ~200 distinct human cell types.

1. Prokaryotic Gene Regulation

1.1 The Lac Operon — Jacob-Monod Model

The lac operon of E. coli (Jacob & Monod, 1961; Nobel Prize 1965) is the paradigmatic example of inducible gene regulation. It encodes three structural genes needed for lactose catabolism:

lacZ

Beta-galactosidase: cleaves lactose into glucose + galactose. Also converts a fraction of lactose to allolactose (the true inducer). Active as a homotetramer of ~1,023-aa subunits.

lacY

Lactose permease — H⁺/lactose symporter, 12 transmembrane helices. Transports lactose into the cell against its concentration gradient.

lacA

Thiogalactoside transacetylase — acetylates non-metabolizable thiogalactosides for detoxification/export. Less essential than lacZ/lacY.

Regulatory Elements

lacI (repressor gene): Constitutively expressed from its own promoter (P_i). The LacI repressor is a homotetramer; each monomer has an N-terminal DNA-binding domain (HTH motif) and a C-terminal inducer-binding/oligomerization domain. The tetramer binds the operator (O₁) with K_d ~ 10^-13 M and can simultaneously contact auxiliary operators O₂ and O₃ via DNA looping, achieving ~70-fold tighter repression than O₁ alone.

Allolactose (inducer): Allolactose (1,6-O-beta-D-galactopyranosyl-D-glucose) binds the core domain of each LacI monomer, triggering a conformational change that reduces operator affinity by ~1,000-fold. IPTG (isopropyl-beta-D-thiogalactopyranoside) is a non-hydrolyzable synthetic inducer used in laboratories — it cannot be cleaved by beta-galactosidase, so induction is maintained at constant concentration.

CAP/cAMP positive regulation: Catabolite activator protein (CAP, also called CRP) is a homodimer that binds cAMP. When glucose is absent, adenylate cyclase produces cAMP; the CAP-cAMP complex binds upstream of the lac promoter (-61 to -72) and bends DNA ~90 degrees, making direct contacts with the alpha-CTD of RNA polymerase. This increases RNAP binding affinity ~20-50 fold.

Catabolite Repression and Diauxic Growth

When both glucose and lactose are present, glucose is used preferentially. Glucose transport via the PTS (phosphotransferase system) keeps EIIA^Glc in its dephosphorylated form, which (a) fails to activate adenylate cyclase (only phosphorylated EIIA^Glc stimulates it), keeping cAMP low, and (b) directly inhibits LacY permease via “inducer exclusion.” Only when glucose is exhausted does cAMP rise, CAP activate the lac promoter, and lactose enter, producing the characteristic diauxic growth curve with a lag phase between glucose and lactose consumption.

Four States of the Lac Operon

Glucose	Lactose	cAMP	Repressor	Transcription
+	-	Low	Bound	Off (no inducer, no CAP)
+	+	Low	Released	Basal (no CAP activation)
-	-	High	Bound	Off (repressor blocks)
-	+	High	Released	Maximal (full induction)
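The four rows above amount to a small decision rule, which can be sketched in Python. The function name `lac_operon_state` and the returned labels are illustrative choices for this sketch, not part of the Jacob-Monod model itself.

```python
def lac_operon_state(glucose: bool, lactose: bool) -> dict:
    """Qualitative lac operon state for a given sugar availability.

    Mirrors the four-row table: cAMP is high only when glucose is absent,
    and the repressor leaves the operator only when lactose
    (via allolactose) is present.
    """
    camp = "Low" if glucose else "High"
    repressor = "Released" if lactose else "Bound"
    if not lactose:
        transcription = "Off"      # repressor blocks the operator
    elif glucose:
        transcription = "Basal"    # induced, but no CAP activation
    else:
        transcription = "Maximal"  # induced and CAP-activated
    return {"cAMP": camp, "repressor": repressor, "transcription": transcription}


if __name__ == "__main__":
    for glucose in (True, False):
        for lactose in (False, True):
            print(f"glucose={glucose!s:5} lactose={lactose!s:5}",
                  lac_operon_state(glucose, lactose))
```

The rule makes the logic explicit: the repressor acts as a veto (no lactose means no transcription regardless of cAMP), while CAP-cAMP only sets the level of expression once the veto is lifted.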

1.2 The Trp Operon — Repression and Attenuation

The trp operon encodes five enzymes (TrpE, TrpD, TrpC, TrpB, TrpA) for tryptophan biosynthesis. Unlike the inducible lac operon, trp is repressible — it is normally ON and turned OFF when tryptophan is abundant. It uses dual control:

Tryptophan Repressor (TrpR)

TrpR is an aporepressor, inactive on its own. Tryptophan acts as a corepressor: binding to TrpR induces a conformational change that positions the HTH DNA-binding motif to fit the operator. The Trp-TrpR complex reduces transcription ~70-fold. This is negative feedback at the genetic level (end-product repression), distinct from feedback inhibition of enzyme activity.

Attenuation (~8-10 fold)

A 162-nt leader sequence (trpL) contains a short ORF encoding a 14-aa leader peptide with two tandem Trp codons. Four regions (1-2-3-4) can form alternative RNA secondary structures:

- High Trp: Ribosome translates quickly through Trp codons, covers region 2 → regions 3-4 form a terminator hairpin → transcription stops
- Low Trp: Ribosome stalls at Trp codons in region 1, region 2 is free → 2-3 antiterminator forms → RNAP reads through into structural genes

Combined repression (~70x) and attenuation (~8-10x) act multiplicatively, giving ~560-700-fold total regulation. Attenuation of this kind is unique to prokaryotes because it requires coupled transcription-translation (no nuclear membrane). Similar attenuation mechanisms regulate the his, leu, phe, and ilvGMEDA operons.
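The total range follows from multiplying the two quoted factors. A minimal sketch of the arithmetic is given below; it assumes the two layers act independently, and `combined_fold` is a hypothetical helper name.

```python
def combined_fold(repression_fold: float, attenuation_fold: float) -> float:
    """Total fold regulation when two independent control layers multiply."""
    return repression_fold * attenuation_fold


# Values quoted in the text: ~70-fold repression, ~8-10-fold attenuation.
low = combined_fold(70, 8)    # 560
high = combined_fold(70, 10)  # 700
print(f"Total regulation: ~{low:.0f}- to ~{high:.0f}-fold")
```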

1.3 Two-Component Signal Transduction

Bacteria use two-component systems (TCS) as the primary mechanism for sensing and responding to environmental stimuli. ~30 TCS pairs exist in E. coli; ~200 in Myxococcus xanthus.

Sensor Histidine Kinase (HK)

Typically a homodimeric transmembrane protein. Stimulus detection by the periplasmic sensor domain triggers autophosphorylation on a conserved His residue in the cytoplasmic kinase domain (ATP-dependent). Example: EnvZ senses osmolarity changes.

Response Regulator (RR)

Receives the phosphoryl group on a conserved Asp residue (receiver domain). Phosphorylation activates the effector domain (often a DNA-binding domain). Example: OmpR~P activates ompC (small porins, high osmolarity) and represses ompF (large porins). Phosphatase activity of the HK resets the system.

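More complex phosphorelays (His → Asp → His → Asp) operate in sporulation signaling (B. subtilis KinA/Spo0F/Spo0B/Spo0A), enabling additional checkpoints and integration of multiple signals. The chemotaxis system (CheA/CheY) is a different variant: it uses a single His → Asp transfer, but the kinase CheA is cytoplasmic and is coupled to separate membrane chemoreceptors rather than sensing the stimulus itself.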

2. Eukaryotic Transcriptional Regulation

2.1 Transcription Factors — Modular Architecture

Eukaryotic transcription factors (TFs) have a modular structure: a DNA-binding domain (DBD) and one or more activation/repression domains (AD). These domains function independently — demonstrated by Brent & Ptashne (1985) domain-swap experiments.

DNA-Binding Domains

Helix-Turn-Helix (HTH)

The recognition helix inserts into the major groove. Found in homeodomains (60 aa, three helices), which specify body plan in development (Hox genes). The third helix makes base-specific contacts.

Zinc Finger (C2H2)

~30 aa motif: Cys₂-His₂ coordinates Zn²⁺, forming a compact beta-beta-alpha fold. Each finger contacts ~3 bp. Tandem arrays (e.g., TFIIIA with 9 fingers) wrap around DNA. Most common DBD in human TFs (~700 genes).

Zinc Finger (C4 / Nuclear Receptor)

Two Cys₄ zinc modules. Found in steroid/thyroid hormone receptors (ER, GR, RAR). The first zinc module provides base-specific contacts; the second mediates dimerization. Receptors bind hormone response elements (HREs) as homo- or heterodimers.

Leucine Zipper / bZIP

Leucine residues every 7 aa form a coiled-coil dimerization interface. The basic region N-terminal to the zipper grips DNA like a pair of forceps. Examples: Jun/Fos (AP-1), CREB. Heterodimerization expands regulatory specificity.

Basic Helix-Loop-Helix (bHLH)

Two amphipathic helices connected by a loop mediate dimerization; the basic region binds E-box motifs (CANNTG). Critical in myogenesis (MyoD), neurogenesis (NeuroD), and circadian rhythms (CLOCK/BMAL1). Some HLH proteins (Id) lack the basic region and act as dominant-negative inhibitors.
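As an illustration of what an E-box motif means in sequence terms, the sketch below scans a DNA string for CANNTG. The example sequence is made up for demonstration. Because the reverse complement of CANNTG is again CANNTG, scanning one strand is sufficient.

```python
import re

# CANNTG, where N is any base. The lookahead allows overlapping sites to be reported.
EBOX = re.compile(r"(?=(CA[ACGT]{2}TG))")


def find_eboxes(seq: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched site) for every E-box in seq."""
    seq = seq.upper()
    return [(m.start(), m.group(1)) for m in EBOX.finditer(seq)]


example = "ttgCACGTGaaCAGCTGtt"  # hypothetical sequence with two E-boxes
print(find_eboxes(example))      # [(3, 'CACGTG'), (11, 'CAGCTG')]
```

A motif match alone does not imply binding in vivo; which bHLH dimer occupies a given E-box also depends on the central NN bases, flanking sequence, and chromatin context.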

Activation Domains

Acidic

Rich in Asp/Glu. VP16 (HSV) has a potent acidic AD that recruits TFIIB, TFIIH, and mediator. Often intrinsically disordered, folding upon binding.

Glutamine-Rich

Sp1 contains glutamine-rich ADs. These contact TAFs (TBP-associated factors) within TFIID. Moderate strength.

Proline-Rich

CTF/NF-1 uses a proline-rich AD. These form rigid, extended structures (PPII helix) that interact with coactivators.

2.2 Enhancers, Silencers, and Insulators

Enhancers are cis-regulatory elements that can activate transcription over large distances (up to ~1 Mb), in either orientation, independent of position relative to the promoter. Each enhancer is a cluster of TF binding sites (~200-500 bp).

Enhanceosome Model

The IFN-beta enhancer requires cooperative, ordered assembly of multiple TFs (NF-kappaB, IRFs, ATF-2/c-Jun) into a precise stereospecific complex. Even half-turn changes in helical spacing abolish activity, so the enhancer behaves as an all-or-none switch.

Billboard Model

Many developmental enhancers function additively — each bound TF contributes independently. The enhancer “displays” information that is read by the promoter. Partial occupancy gives graded output rather than binary switching.

Enhancer-Promoter Communication

Enhancers contact promoters via DNA looping, mediated by cohesin and the mediator complex. Chromosome conformation capture (3C/4C/Hi-C) has revealed that genomes are organized into topologically associating domains (TADs) (~100 kb – 1 Mb), within which enhancer-promoter interactions are favored.

Insulator Elements and CTCF

CTCF (CCCTC-binding factor) is a versatile 11-zinc-finger protein that defines TAD boundaries. It binds in an orientation-dependent manner and forms loops with cohesin (loop extrusion model). CTCF insulators can: (1) block enhancer-promoter communication when placed between them, and (2) act as barriers preventing heterochromatin spreading. The H19/Igf2 imprinting locus is a classic example where CTCF binding (maternal allele only, unmethylated) controls allele-specific enhancer access.

2.3 The Mediator Complex

Mediator is a ~1.4 MDa complex of ~26-30 subunits (in mammals) that serves as a bridge between gene-specific TFs and the RNA Polymerase II pre-initiation complex (PIC). It was first identified in yeast by Kornberg's lab.

Head Module

Contacts RNAP II and TFIIB/TFIIF. Med17 is essential and contacts Rpb3 subunit of Pol II. Core scaffold for PIC assembly.

Middle Module

Structural backbone connecting head and tail. Contains Med14 (scaffold subunit) and Med1 (nuclear receptor interaction).

Tail Module

Interfaces with gene-specific activators. Med15 contacts VP16, Gcn4 acidic ADs. Med23 contacts ELK1 (MAPK-responsive).

A dissociable CDK8 kinase module (Med12, Med13, CycC, CDK8) can associate reversibly and generally represses transcription by phosphorylating Pol II CTD at non-productive sites and preventing PIC assembly. CDK8 is an oncogene in colorectal cancer.

2.4 ATP-Dependent Chromatin Remodeling

Nucleosomes are barriers to transcription. ATP-dependent chromatin remodelers use the energy of ATP hydrolysis to alter histone-DNA contacts, thereby regulating DNA accessibility. All remodelers contain a conserved ATPase domain of the Snf2/SWI2 superfamily.

SWI/SNF Family (BAF/PBAF)

11-15 subunits. Catalytic subunit: BRG1 or BRM (human). Can slide nucleosomes along DNA, eject histones entirely, or restructure the octamer (e.g., create hexasomes). Contains a bromodomain (recognizes acetyl-lysine). Mutated in ~20% of human cancers (BAF complex is a major tumor suppressor).

ISWI Family

Primarily slides nucleosomes to create evenly spaced arrays (chromatin assembly/maturation). Senses linker DNA length via HAND-SANT-SLIDE domain. Important in DNA replication-coupled chromatin assembly (ACF/CHRAC complexes).

CHD Family

Contains tandem chromodomains (bind methylated histones). CHD1 recognizes H3K4me3 at active promoters. NuRD complex (CHD3/4) uniquely couples remodeling with HDAC activity — can both reposition and deacetylate nucleosomes.

INO80/SWR1 Family

Specialized in histone variant exchange. SWR1 replaces H2A with H2A.Z at promoters (destabilizes nucleosomes, facilitates TF access). INO80 can reverse this exchange. Also involved in DNA damage repair (remodels nucleosomes at double-strand breaks).

3. Epigenetic Regulation

3.1 Histone Post-Translational Modifications

The N-terminal tails of histones (especially H3 and H4) protrude from the nucleosome core and are subject to extensive covalent modifications. These modifications recruit effector proteins (“readers”) and alter chromatin structure.

Acetylation

Writers (HATs): p300/CBP (coactivator, acetylates H3K27 and many other sites), GCN5/PCAF (SAGA complex, H3K9/K14), Tip60/HBO1 (H4K5/K8/K12/K16). Acetylation neutralizes the positive charge on lysine, weakening histone-DNA electrostatic interactions and directly opening chromatin.

Erasers (HDACs): Class I (HDAC1/2/3/8, nuclear, Rpd3-like), Class II (HDAC4/5/6/7/9/10, shuttle in/out of nucleus), Class III (Sirtuins, NAD⁺-dependent), Class IV (HDAC11). HDAC inhibitors (vorinostat, romidepsin) are FDA-approved anticancer drugs.

Readers: Bromodomains (~60 in humans) recognize acetyl-lysine. BRD4 (member of BET family) binds acetylated histones and recruits P-TEFb to release paused Pol II. BET inhibitors (JQ1) are in clinical trials for cancer.

Methylation

Writers (HMTs): SET-domain methyltransferases (Su(var)3-9, E(z), Trithorax) use SAM as methyl donor. Lysine can be mono-, di-, or trimethylated (each state has distinct readers). DOT1L uniquely methylates H3K79 (globular domain) and lacks a SET domain. Arginine methylation by PRMTs (PRMT1, CARM1) adds another regulatory layer.

Erasers: LSD1/KDM1A (FAD-dependent amine oxidase, demethylates H3K4me1/me2 and H3K9me1/me2). JMJD family (Jumonji C domain, Fe²⁺/alpha-ketoglutarate-dependent dioxygenases) can remove all methylation states including trimethyl marks. ~30 JmjC demethylases in humans.

Readers: Chromodomains (HP1 binds H3K9me3, Polycomb binds H3K27me3), Tudor domains (53BP1 binds H4K20me2 at DNA damage sites), PHD fingers (ING2 binds H3K4me3 and recruits HDAC-containing Sin3a complex — linking activating marks to repression).

Other Modifications

Phosphorylation

H3S10ph (Aurora B kinase): chromosome condensation in mitosis. gamma-H2AX (H2AX-S139ph, ATM/ATR kinases): marks DNA double-strand breaks and recruits repair factors over megabase domains. H3T3ph (Haspin): recruits the chromosomal passenger complex to centromeres in mitosis.

Ubiquitination

H2AK119ub1 (PRC1/Ring1B) — Polycomb-mediated gene silencing. H2BK120ub1 (RNF20/40) — required for H3K4 and H3K79 methylation (trans-histone crosstalk), promotes transcription elongation. Deubiquitinases: USP22 (SAGA), BAP1 (PR-DUB).

3.2 The Histone Code Hypothesis

Proposed by Strahl & Allis (2000): combinations of histone modifications on one or multiple tails are read by effector proteins to produce distinct downstream outcomes. While the strict “code” analogy is debated, specific marks clearly correlate with chromatin states.

Mark	Location	Function	Key Writer/Reader
H3K4me3	Active promoters	Transcription initiation	SET1/MLL (writer), TAF3/ING (reader)
H3K4me1	Active/poised enhancers	Enhancer marking	MLL3/4 (writer)
H3K36me3	Gene bodies (transcribed)	Elongation, suppresses cryptic initiation	SETD2 (writer), DNMT3B (reader)
H3K27ac	Active enhancers/promoters	Distinguishes active from poised enhancers	p300/CBP (writer), BRD4 (reader)
H3K27me3	Polycomb-repressed regions	Facultative heterochromatin, developmental silencing	EZH2/PRC2 (writer), PRC1 chromo (reader)
H3K9me3	Constitutive heterochromatin	Pericentromeric silencing, TE suppression	SUV39H1/2 (writer), HP1 chromo (reader)
H3K4me3 + H3K27me3	Bivalent promoters (ESCs)	Poised genes, ready for activation or silencing	MLL + PRC2

3.3 DNA Methylation

In mammals, ~70-80% of CpG dinucleotides are methylated at the 5-position of cytosine (5mC). However, CpG islands (CGIs — regions >200 bp with CpG observed/expected >0.6) at promoters of ~60% of genes are typically unmethylated, allowing transcription.
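The observed/expected CpG ratio used in this definition can be computed directly from base counts. The sketch below uses the common formulation obs/exp = (number of CpG × length) / (number of C × number of G) and applies only the two criteria quoted above; the function names and test sequences are illustrative.

```python
def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (#CpG * N) / (#C * #G)."""
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0  # no CpG is possible without both bases
    return seq.count("CG") * len(seq) / (c * g)


def looks_like_cpg_island(seq: str, min_len: int = 200, min_ratio: float = 0.6) -> bool:
    """Apply the two criteria from the text: length > 200 bp and obs/exp > 0.6."""
    return len(seq) > min_len and cpg_obs_exp(seq) > min_ratio


cpg_rich = "CG" * 150      # artificial 300-bp sequence, every C followed by G
cpg_poor = "CAGT" * 100    # artificial 400-bp sequence with no CpG
print(cpg_obs_exp(cpg_rich), looks_like_cpg_island(cpg_rich))
print(cpg_obs_exp(cpg_poor), looks_like_cpg_island(cpg_poor))
```

Genome annotation pipelines usually evaluate these criteria in sliding windows and often add a GC-content threshold; that extra step is not shown here.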

DNA Methyltransferases (Writers)

DNMT1: Maintenance methyltransferase. Recruited to replication forks by UHRF1 (recognizes hemi-methylated CpG). Copies methylation pattern to the new strand. Essential for epigenetic inheritance through cell division.

DNMT3A/3B: De novo methyltransferases. Establish new methylation patterns during development (embryonic implantation, germ cell specification). DNMT3L is a catalytically inactive paralog that stimulates DNMT3A/3B and reads unmethylated H3K4 (linking histone state to DNA methylation).

TET Enzymes (Erasers)

TET1/2/3 are Fe²⁺/alpha-ketoglutarate-dependent dioxygenases that oxidize 5mC → 5-hydroxymethylcytosine (5hmC) → 5-formylcytosine (5fC) → 5-carboxylcytosine (5caC). 5fC/5caC are excised by thymine DNA glycosylase (TDG) and repaired by base excision repair (BER), completing active demethylation.

5hmC is enriched at enhancers and gene bodies of actively transcribed genes, particularly in neurons (in Purkinje cells, 5hmC is ~40% as abundant as 5mC). TET2 is one of the most commonly mutated genes in hematological malignancies.

Biological Roles

Genomic Imprinting

~100 imprinted genes in mammals show parent-of-origin-specific expression. Differentially methylated regions (DMRs) established in germ cells control allele-specific expression. Examples: Igf2/H19 (paternal/maternal expression), Prader-Willi/Angelman syndromes (chromosome 15q11-13 imprinted region).

X-Chromosome Inactivation

In female mammals, one X is silenced (Lyon hypothesis). Xist lncRNA coats the inactive X, recruiting PRC2 (H3K27me3) and DNMT3B (DNA methylation) for stable silencing. The Barr body is the cytological manifestation. ~15% of genes escape inactivation.

4. Post-Transcriptional Regulation

4.1 mRNA Stability and Degradation

mRNA half-lives range from minutes (c-fos, c-myc) to days (beta-globin). Stability is determined by cis-elements and trans-acting factors.

Protective Elements

The 5' m7G cap is bound by eIF4E, protecting from 5'→3' exonucleases (Xrn1). The 3' poly(A) tail (150-200 nt initially) is bound by PABPC1 proteins that circularize mRNA via eIF4G interaction, enhancing translation and stability.

Degradation Pathway

Major pathway: Deadenylation (CCR4-NOT complex, Pan2-Pan3) → Decapping (DCP1/DCP2 with activators Dhh1/Pat1) → 5'→3' exonucleolytic decay (Xrn1). Alternative: 3'→5' decay by the exosome complex (9-subunit core plus the Rrp44/Dis3 catalytic subunit).

AU-Rich Elements (AREs)

Located in 3' UTRs of many short-lived mRNAs (cytokines: TNF-alpha, IL-2; proto-oncogenes: c-fos, c-myc). Contain AUUUA pentamers (often clustered). Bound by destabilizing factors (TTP/tristetraprolin, AUF1/hnRNPD) that recruit the CCR4-NOT deadenylase, or by stabilizing factors (HuR/ELAVL1) that compete for ARE binding under stress conditions.

P-Bodies (Processing Bodies)

Cytoplasmic RNA-protein granules enriched in decay machinery (Dcp1/2, Xrn1, CCR4-NOT), translational repressors, and Argonaute. mRNAs in P-bodies are translationally silenced and may be degraded or returned to active translation. P-bodies form via liquid-liquid phase separation (LLPS) driven by multivalent RNA-protein interactions.

4.2 MicroRNA (miRNA) Pathway

MicroRNAs are ~22 nt non-coding RNAs that post-transcriptionally silence target mRNAs. ~2,600 mature miRNAs annotated in the human genome; each can regulate hundreds of targets. Over 60% of human protein-coding genes are predicted miRNA targets.

Step 1 — Transcription: miRNA genes are transcribed by Pol II as long primary transcripts (pri-miRNA) with 5' cap and poly(A) tail. Many are in introns of protein-coding genes (mirtrons can bypass Drosha processing).

Step 2 — Nuclear processing: The Microprocessor complex (Drosha RNase III + DGCR8/Pasha) cleaves the pri-miRNA stem-loop to release a ~65 nt pre-miRNA hairpin. DGCR8 recognizes the ssRNA-dsRNA junction. The pre-miRNA is then exported to the cytoplasm by Exportin-5/Ran-GTP.

Step 3 — Cytoplasmic processing: Dicer (RNase III + PAZ domain) cleaves the loop from the pre-miRNA, generating a ~22 bp miRNA duplex with 2-nt 3' overhangs.

Step 4 — RISC assembly: The guide strand (selected by thermodynamic asymmetry — less stable 5' end) is loaded into Argonaute (Ago2 in mammals). The passenger strand (*) is expelled and degraded. The seed sequence (nucleotides 2-8 from the 5' end) is critical for target recognition via Watson-Crick pairing to the 3' UTR.

Step 5 — Silencing: In animals, miRNAs typically cause translational repression (blocking eIF4A scanning or 60S joining) and mRNA deadenylation/decay via GW182/TNRC6 recruitment of CCR4-NOT. Perfect complementarity (rare in animals, common in plants) triggers Ago2 “slicer” endonucleolytic cleavage of the mRNA.
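The seed-pairing rule in Step 4 can be sketched as a simple string search: take nucleotides 2-8 of the miRNA, reverse-complement them, and look for that 7-mer in the 3' UTR. The let-7a guide sequence below is the standard annotated one; the UTR fragment is a made-up example. Real target prediction also weighs site context and conservation, which this sketch ignores.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed_site(mirna: str) -> str:
    """mRNA sequence (5'->3') that Watson-Crick pairs with miRNA nucleotides 2-8."""
    seed = mirna.upper().replace("T", "U")[1:8]   # positions 2-8 (1-based)
    return seed.translate(COMPLEMENT)[::-1]       # reverse complement


def find_seed_matches(mirna: str, utr: str) -> list[int]:
    """0-based start positions of perfect seed matches in a 3' UTR."""
    site = seed_site(mirna)
    utr = utr.upper().replace("T", "U")
    return [i for i in range(len(utr) - len(site) + 1) if utr.startswith(site, i)]


let7a = "UGAGGUAGUAGGUUGUAUAGUU"   # hsa-let-7a-5p guide strand
toy_utr = "AAACUACCUCAGGG"         # hypothetical UTR fragment
print(seed_site(let7a))                    # CUACCUC
print(find_seed_matches(let7a, toy_utr))   # [3]
```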

4.3 RNA Interference (RNAi)

RNAi was discovered by Fire & Mello (1998, Nobel 2006) in C. elegans. Exogenous long dsRNA is processed by Dicer into ~21 nt siRNAs that are loaded into RISC/Ago2. Unlike miRNAs, siRNAs have perfect complementarity to their targets and trigger endonucleolytic cleavage of the target mRNA opposite positions 10-11 of the guide strand.

Therapeutic Applications

Patisiran (Alnylam, 2018) — first FDA-approved RNAi drug. LNP-delivered siRNA targeting hepatocyte TTR mRNA for hereditary transthyretin amyloidosis. Inclisiran (2021) — GalNAc-conjugated siRNA targeting PCSK9 mRNA for hypercholesterolemia (twice-yearly dosing). Givosiran (2019) — targets ALAS1 for acute hepatic porphyria. The GalNAc-siRNA platform enables hepatocyte-specific delivery via the asialoglycoprotein receptor.

4.4 RNA-Binding Proteins (RBPs)

~1,500 RBPs in the human genome orchestrate every aspect of RNA metabolism. Key families:

Splicing Regulators

SR proteins (SRSF1-12): contain one or two RRM domains and an RS (arginine-serine) domain. Bind exonic splicing enhancers (ESEs) and promote exon inclusion by recruiting U2AF and U1 snRNP.

hnRNP proteins (A1, A2/B1, C, etc.): generally antagonize SR proteins. Bind exonic/intronic splicing silencers (ESS/ISS) to promote exon skipping. The SR/hnRNP ratio at a given exon determines inclusion/exclusion.

IRE/IRP System (Iron Homeostasis)

Iron Response Elements (IREs) are ~30-nt stem-loop structures in UTRs. When iron is low, Iron Regulatory Proteins (IRP1/2) bind IREs:

- 5' UTR IRE (ferritin, ferroportin): IRP binding blocks ribosome scanning → represses translation (stores less iron)
- 3' UTR IREs (transferrin receptor, DMT1): IRP binding stabilizes mRNA → increased protein → more iron uptake
- When iron is abundant, IRP1 assembles a [4Fe-4S] cluster and becomes cytosolic aconitase; IRP2 is ubiquitinated (FBXL5) and degraded

5. Signal Transduction to Gene Expression

5.1 MAPK Cascade: Ras → Raf → MEK → ERK

The mitogen-activated protein kinase (MAPK) cascade is a three-tiered kinase relay that amplifies extracellular growth factor signals and transmits them to the nucleus.

1. Receptor activation: Growth factor (e.g., EGF) binds receptor tyrosine kinase (EGFR) → dimerization → trans-autophosphorylation of cytoplasmic tails → SH2 domain of Grb2 binds phosphotyrosine → SOS (GEF) is recruited to the membrane.

2. Ras activation: SOS catalyzes GDP → GTP exchange on Ras (small GTPase, membrane-anchored via farnesyl group). Active Ras-GTP recruits Raf (MAPKKK) to the membrane, relieving its autoinhibition. RasGAPs (NF1) accelerate GTP hydrolysis to inactivate Ras. Oncogenic mutations (G12V, G13D, Q61L) impair GTPase activity — found in ~30% of cancers.

3. Kinase cascade: Raf (Ser/Thr kinase) phosphorylates and activates MEK1/2 (dual-specificity kinase, phosphorylates Thr and Tyr). MEK activates ERK1/2 (MAPK). Each kinase activates many molecules of the next — providing signal amplification (estimated 100-1000 fold per tier).

4. Nuclear translocation: Activated ERK dimerizes and translocates to the nucleus, where it phosphorylates transcription factors such as Elk-1 (ternary complex factor, activates c-fos) and c-Myc (stabilization). ERK also activates the kinase RSK, which in turn phosphorylates CREB. Immediate-early genes (c-fos, c-jun, Egr-1) are induced within minutes and encode TFs that activate delayed-early genes.

5.2 JAK-STAT Pathway

One of the most direct routes from membrane to gene activation: the transcription factor itself is phosphorylated at the receptor, with no second-messenger cascade in between. Used by cytokines (interferons, interleukins), growth hormone, and erythropoietin.

Activation

Cytokine binding induces receptor dimerization → associated JAKs (Janus kinases: JAK1/2/3, TYK2) trans-phosphorylate each other → JAKs phosphorylate receptor cytoplasmic tails → STAT proteins (STAT1-6) bind via SH2 domains → JAKs phosphorylate STAT on a conserved tyrosine → pSTATs dimerize (reciprocal SH2-pTyr interaction) → translocate to nucleus → bind GAS elements (gamma-activated sequence, TTN₅AA).

Negative Regulation

SOCS proteins (Suppressors of Cytokine Signaling): induced by STAT activation (negative feedback). SOCS1 directly inhibits JAK; SOCS3 binds phosphorylated receptor. Both recruit E3 ubiquitin ligase for proteasomal degradation. PIAS proteins: SUMOylate STATs, inhibiting DNA binding. SHP1/2 phosphatases: dephosphorylate JAKs and receptors.

5.3 Additional Signaling Pathways

Wnt/Beta-Catenin

Without Wnt: destruction complex (APC, Axin, GSK3-beta, CK1) phosphorylates beta-catenin → ubiquitination (beta-TrCP) → proteasomal degradation. With Wnt: Frizzled/LRP5/6 receptor engagement recruits Dishevelled, sequestering the destruction complex → beta-catenin accumulates → enters nucleus → displaces Groucho from TCF/LEF → activates target genes (c-myc, cyclin D1, Axin2). Constitutive activation (APC mutations) drives ~80% of colorectal cancers.

Notch Signaling

Juxtacrine signaling: Delta/Jagged ligand on one cell binds Notch receptor on adjacent cell → ADAM10/TACE metalloprotease (S2 cleavage) → gamma-secretase (presenilin, S3 cleavage) releases Notch intracellular domain (NICD) → NICD enters nucleus, binds CSL/RBP-Jkappa, recruits MAML coactivator → activates Hes/Hey target genes (lateral inhibition in neurogenesis, T-cell/B-cell fate decisions).

Hedgehog (Hh)

Without Hh: Patched (Ptc) inhibits Smoothened (Smo) → Gli transcription factors are phosphorylated by PKA/CK1/GSK3 and proteolytically processed to repressor forms (Gli-R). With Hh: Hh binds Ptc, relieving Smo inhibition → Smo accumulates in primary cilium → full-length Gli activators (Gli-A) enter nucleus. Targets: Ptc1, Gli1 (feedback), Cyclin D/E. Mutations cause basal cell carcinoma (Ptc loss) and medulloblastoma.

6. Combinatorial Control and Transcriptional Condensates

6.1 How ~1,500 TFs Regulate ~20,000 Genes

No single TF acts alone. Gene expression is determined by the combinatorial logic of multiple TFs binding to enhancers and promoters. This explains how a limited TF repertoire generates extraordinary regulatory diversity:

1. Cooperative binding: TFs bind DNA synergistically through protein-protein interactions (e.g., Oct4/Sox2 on the Nanog enhancer). Cooperativity sharpens the dose-response curve (Hill coefficient > 1).
2. Heterodimerization: bZIP and bHLH families use combinatorial dimerization. With N monomers, up to N(N+1)/2 distinct dimers are possible. Each dimer has different DNA-binding specificity and transcriptional activity.
3. Context-dependent activity: The same TF can activate or repress depending on cofactors, post-translational modifications, and chromatin context. E.g., glucocorticoid receptor activates anti-inflammatory genes but represses AP-1 targets via tethering.
4. Enhancer logic: Most developmental genes are controlled by multiple enhancers, each active in a different tissue/time. The even-skipped (eve) stripe 2 enhancer in Drosophila integrates inputs from Bicoid, Hunchback (activators) and Kruppel, Giant (repressors) to produce a sharp stripe of expression.
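The N(N+1)/2 count in the heterodimerization point can be checked by enumerating unordered pairs, homodimers included. The monomer names below are placeholders, and the count is an upper bound: in real families not every pairing forms a stable dimer.

```python
from itertools import combinations_with_replacement


def possible_dimers(monomers: list[str]) -> list[tuple[str, str]]:
    """All unordered pairs: homodimers (A-A) plus heterodimers (A-B)."""
    return list(combinations_with_replacement(monomers, 2))


def dimer_count(n: int) -> int:
    """Closed form for the number of unordered pairs with repetition."""
    return n * (n + 1) // 2


family = ["A", "B", "C", "D"]  # placeholder monomer names
dimers = possible_dimers(family)
print(dimers)
print(len(dimers), dimer_count(len(family)))  # 10 10
```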

6.2 Phase Separation in Transcription

A paradigm shift in understanding transcriptional regulation has emerged from the discovery that many transcriptional regulators undergo liquid-liquid phase separation (LLPS), forming membraneless condensates at active loci.

Super-Enhancer Condensates

Super-enhancers (SEs) are clusters of enhancers densely loaded with Med1, BRD4, and TFs. The intrinsically disordered regions (IDRs) of Med1, Oct4, GCN4, and the Pol II CTD form phase-separated condensates that concentrate the transcriptional machinery. These condensates are disrupted by 1,6-hexanediol and are sensitive to CDK7-mediated Pol II CTD phosphorylation (which transfers Pol II from the initiation condensate to an elongation condensate).

Biological Significance

Phase separation may explain: (a) how enhancers activate transcription at distance (condensate bridges the gap), (b) transcriptional bursting (condensate formation/dissolution), (c) SE sensitivity to perturbation (phase transitions are cooperative and switch-like), (d) oncogene addiction (cancer cells depend on SE condensates at driver oncogenes). The concept extends to heterochromatin (HP1alpha condensates), nucleoli, and Polycomb bodies.

7. Mathematical Models of Gene Regulation

7.1 Hill Equation for Cooperative TF Binding

When a transcription factor binds cooperatively to multiple sites on a promoter, the fractional occupancy follows the Hill equation:

$$f([TF]) = \frac{[TF]^n}{K_d^n + [TF]^n}$$

where f is the fraction of promoters occupied, [TF] is the TF concentration, K_d is the dissociation constant (TF concentration at half-maximal occupancy), and n is the Hill coefficient. When n = 1, binding is non-cooperative (hyperbolic); n > 1 indicates positive cooperativity (sigmoidal response); n < 1 indicates negative cooperativity.

For the lac repressor tetramer binding cooperatively to the operator, n ~ 2. For some developmental TFs, n can exceed 4, creating ultrasensitive switch-like responses.
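A short numerical sketch shows how the Hill coefficient controls switch sharpness. From the Hill equation, the fold change in [TF] needed to move occupancy from 10% to 90% is 81^(1/n). Function names and the chosen values of n are illustrative.

```python
def hill_occupancy(tf: float, kd: float, n: float) -> float:
    """Fractional promoter occupancy from the Hill equation."""
    return tf**n / (kd**n + tf**n)


def fold_change_10_to_90(n: float) -> float:
    """Fold increase in [TF] needed to go from 10% to 90% occupancy.

    Solving f = 0.1 and f = 0.9 gives [TF]^n = Kd^n / 9 and 9 * Kd^n,
    so the concentration ratio is 81 ** (1 / n).
    """
    return 81 ** (1 / n)


for n in (1, 2, 4):
    print(f"n = {n}: occupancy at [TF] = 2*Kd is {hill_occupancy(2.0, 1.0, n):.3f}, "
          f"10%->90% needs a {fold_change_10_to_90(n):.1f}-fold change in [TF]")
```

With n = 1 an 81-fold change in concentration is needed to switch the promoter; with n = 4 a 3-fold change suffices.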

Derivation: Hill Function from the MWC Allosteric Model

Starting from the Monod-Wyman-Changeux (MWC) concerted model of allostery, we show how the Hill equation emerges as an approximation.

Step 1: Define the MWC two-state model

A protein with n identical subunits exists in two conformations: T (tense, low affinity) and R (relaxed, high affinity), in equilibrium characterized by L = [T₀]/[R₀]:

$$L = \frac{[T_0]}{[R_0]} \qquad c = \frac{K_R}{K_T} \ll 1 \qquad \alpha = \frac{[S]}{K_R}$$

Step 2: Write the MWC binding function

Each subunit binds ligand independently within its state. The exact MWC fractional saturation is:

$$\bar{Y} = \frac{Lc\alpha(1+c\alpha)^{n-1} + \alpha(1+\alpha)^{n-1}}{L(1+c\alpha)^n + (1+\alpha)^n}$$

Step 3: Take the extreme cooperativity limit (c → 0)

When the T state has negligible ligand affinity (c ≈ 0), the T-state term in the numerator vanishes and the T-state term in the denominator reduces to L:

$$\bar{Y} \approx \frac{\alpha(1+\alpha)^{n-1}}{L + (1+\alpha)^n}$$

Step 4: Further simplify for saturating conditions

For large L (strong T-state preference), the transition region lies at α ∼ L^{1/n} ≫ 1, and the binding curve becomes switch-like. In that region (1+α)ⁿ ≈ αⁿ and α(1+α)ⁿ⁻¹ ≈ αⁿ, so:

$$\bar{Y} \approx \frac{\alpha^n}{L + \alpha^n} = \frac{[S]^n / K_R^n}{L + [S]^n / K_R^n} = \frac{[S]^n}{L \cdot K_R^n + [S]^n}$$

Step 5: Identify the Hill equation

Defining K_d,eff = L^{1/n} · K_R as the effective half-saturation constant:

$$\bar{Y} \approx \frac{[S]^n}{K_{d,\text{eff}}^n + [S]^n} \qquad \text{(Hill equation with } n_H = n\text{)}$$

Step 6: Interpret the Hill coefficient

In the MWC model, the apparent Hill coefficient n_H depends on L and c. Maximum cooperativity (n_H → n) is approached only when c → 0 and L is large; for finite L and c > 0, n_H < n. For hemoglobin (n = 4), n_H ≈ 2.8. The bound below applies to a single allosteric protein; the apparent Hill coefficient of a whole transcriptional response can exceed the number of binding sites when additional mechanisms such as DNA looping or multimerization contribute.

$$1 \leq n_H \leq n \qquad \text{(within the MWC model: between non-cooperative and fully concerted)}$$
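The derivation can be checked numerically. The sketch below evaluates the exact MWC saturation function, compares it with the Hill approximation at α = L^{1/n}, and estimates the apparent Hill coefficient as the steepest slope of the Hill plot, log(Y/(1−Y)) versus log α. The parameter values (n = 4, L = 10⁴ or 10⁸, c = 0 or 0.1) are arbitrary illustrative choices.

```python
import math


def mwc_saturation(alpha: float, L: float, c: float, n: int) -> float:
    """Exact MWC fractional saturation (Step 2)."""
    num = L * c * alpha * (1 + c * alpha) ** (n - 1) + alpha * (1 + alpha) ** (n - 1)
    den = L * (1 + c * alpha) ** n + (1 + alpha) ** n
    return num / den


def hill_approx(alpha: float, L: float, n: int) -> float:
    """Hill-form approximation (Step 4): alpha^n / (L + alpha^n)."""
    return alpha**n / (L + alpha**n)


def max_hill_slope(L: float, c: float, n: int,
                   lo: float = -3.0, hi: float = 6.0, steps: int = 2000) -> float:
    """Steepest finite-difference slope of log10(Y/(1-Y)) versus log10(alpha)."""
    h = (hi - lo) / steps
    best, prev = 0.0, None
    for i in range(steps + 1):
        a = 10 ** (lo + i * h)
        num = L * c * a * (1 + c * a) ** (n - 1) + a * (1 + a) ** (n - 1)
        # Y/(1-Y) = num / (den - num); den - num simplifies to the line below,
        # which avoids computing 1 - Y when Y is very close to 1.
        unbound = L * (1 + c * a) ** (n - 1) + (1 + a) ** (n - 1)
        logit = math.log10(num / unbound)
        if prev is not None:
            best = max(best, (logit - prev) / h)
        prev = logit
    return best


n = 4
print("exact vs Hill at alpha = L^(1/n), L = 1e8:",
      mwc_saturation(100.0, 1e8, 0.0, n), hill_approx(100.0, 1e8, n))
for L, c in [(1e4, 0.0), (1e8, 0.0), (1e4, 0.1)]:
    print(f"L = {L:g}, c = {c}: apparent n_H ~ {max_hill_slope(L, c, n):.2f}")
```

The estimated n_H stays between 1 and n in every case, approaches n as L grows with c = 0, and drops sharply when the T state retains appreciable affinity (c = 0.1).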

7.2 Thermodynamic Model of Gene Regulation

The thermodynamic (statistical mechanical) approach models gene expression by summing Boltzmann weights over all possible promoter states. For a promoter with an activator (A) and repressor (R), the probability of RNAP being bound is:

$$P_{\text{RNAP}} = \frac{\frac{[P]}{K_P}\left(1 + \frac{[A]}{K_A} \cdot \omega_{AP}\right)}{Z}$$

$$Z = 1 + \frac{[P]}{K_P} + \frac{[A]}{K_A} + \frac{[R]}{K_R} + \frac{[P][A]}{K_P K_A}\omega_{AP} + \frac{[P][R]}{K_P K_R}\omega_{RP} + \cdots$$

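Here Z is the partition function summing the statistical weights of all promoter configurations; K_P, K_A, and K_R are the dissociation constants for RNAP, activator, and repressor, respectively; and the ω terms are cooperativity factors, i.e. Boltzmann factors ω = exp(−ε/k_BT) of the pairwise interaction energies ε (ω > 1 for a favorable interaction, ω = 0 for mutually exclusive binding). A repressor that sterically excludes RNAP has ω_RP = 0, so the [P][R] term drops out of Z and no repressor-bound state appears in the numerator of P_RNAP. The rate of transcription is taken to be proportional to P_RNAP. This framework (Bintu, Buchler, Garcia et al., 2005) unifies activation, repression, and combinatorial regulation in a single formalism.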
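The state-counting recipe can be sketched in a few lines of Python. This is a hedged illustration rather than a reference implementation: `rnap_occupancy` is a hypothetical helper, the partition function is truncated at the terms written out above (states with activator and repressor bound together are ignored), and all numerical values are made up for the example.

```python
def rnap_occupancy(p, a, r, w_ap=1.0, w_rp=0.0):
    """P_RNAP from Boltzmann weights of the promoter states.

    p, a, r are the dimensionless ratios [P]/K_P, [A]/K_A, [R]/K_R.
    w_ap and w_rp are the cooperativity factors omega_AP and omega_RP.
    """
    with_rnap = p + p * a * w_ap + p * r * w_rp   # RNAP, RNAP+A, RNAP+R
    without_rnap = 1.0 + a + r                    # empty, A only, R only
    return with_rnap / (with_rnap + without_rnap)

p = 0.01  # weak promoter: RNAP alone is rarely bound (illustrative value)
basal = rnap_occupancy(p, a=0.0, r=0.0)
activated = rnap_occupancy(p, a=10.0, r=0.0, w_ap=100.0)
repressed = rnap_occupancy(p, a=0.0, r=100.0)   # w_rp = 0: repressor excludes RNAP

print(f"basal     P_RNAP = {basal:.4f}")
print(f"activated P_RNAP = {activated:.4f}")
print(f"repressed P_RNAP = {repressed:.6f}")
```

Writing the weights as dimensionless ratios keeps the function independent of concentration units; only the ratios to the dissociation constants matter.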

Derivation: Lac Operon Thermodynamic Model

Starting from statistical mechanics, we derive the probability of RNAP being bound to the lac promoter as a function of repressor and inducer concentrations.

Step 1: Enumerate all promoter states

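The simplified lac promoter model has four states: (1) empty, (2) RNAP bound, (3) repressor bound, and (4) RNAP bound together with CAP-cAMP (the activator A). Because the operator overlaps the promoter, RNAP and repressor binding are mutually exclusive, so there is no state with both bound; the CAP-only state is not included in this simplified model. Each state has a Boltzmann weight:

$$w_{\text{empty}} = 1, \quad w_{\text{RNAP}} = \frac{[P]}{K_P}, \quad w_{\text{Rep}} = \frac{[R]}{K_R}, \quad w_{\text{P+CAP}} = \frac{[P]}{K_P}\cdot\frac{[A]}{K_A}\cdot\omega_{AP}$$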

Step 2: Construct the partition function

The partition function Z sums over all possible promoter configurations:

$$Z = 1 + \frac{[P]}{K_P} + \frac{[R]}{K_R} + \frac{[P]}{K_P}\frac{[A]}{K_A}\omega_{AP}$$

Step 3: Calculate RNAP occupancy probability

The probability that RNAP is bound (transcription occurs) is the sum of all RNAP-containing states divided by Z:

$$P_{\text{RNAP}} = \frac{\frac{[P]}{K_P}\left(1 + \frac{[A]}{K_A}\omega_{AP}\right)}{Z}$$

Step 4: Incorporate the repressor-inducer equilibrium

The effective repressor concentration depends on inducer (IPTG or allolactose). Inducer binding lowers the repressor's affinity for operator DNA; this is modeled as a Hill-type decrease in the concentration of active repressor:

$$[R]_{\text{eff}} = \frac{[R]_{\text{total}}}{1 + ([I]/K_I)^n} \qquad \text{(Hill-like inducer response)}$$

Step 5: Include CAP-cAMP activation (catabolite repression)

CAP-cAMP activates transcription when glucose is low (cAMP is high). The CAP activation factor depends on glucose through its effect on cAMP:

$$[\text{cAMP}] = \frac{[\text{cAMP}]_{\text{basal}}}{1 + ([\text{Glucose}]/K_{\text{glu}})^2} \qquad f_{\text{CAP}} = \frac{[\text{cAMP}]}{K_{\text{cAMP}} + [\text{cAMP}]}$$

Step 6: Final expression rate

The transcription rate is proportional to RNAP occupancy, integrating all regulatory inputs. Here [R] is replaced by [R]_eff from Step 4, and the activator term [A]/K_A is replaced by the CAP activation factor f_CAP from Step 5:

$$\text{Rate} = k_{\text{esc}} \cdot P_{\text{RNAP}} = k_{\text{esc}} \cdot \frac{\frac{[P]}{K_P}(1 + f_{\text{CAP}}\cdot\omega_{AP})}{1 + \frac{[P]}{K_P}(1 + f_{\text{CAP}}\cdot\omega_{AP}) + \frac{[R]_{\text{eff}}}{K_R}}$$

This reproduces the known lac operon behavior: maximal expression requires both low glucose (high cAMP/CAP) and presence of inducer (low effective repressor). Neither condition alone is sufficient.
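To see the two-input behavior concretely, the final rate expression can be evaluated under the four combinations of inducer and glucose. This is a rough sketch under stated assumptions: `lac_rate` is a hypothetical helper, `p` stands for [P]/K_P, `r_total` for [R]_total/K_R, and every default value is illustrative rather than measured.

```python
def lac_rate(inducer, glucose, k_esc=1.0, p=0.01, omega=50.0, r_total=500.0,
             K_I=0.05, n=2.0, cAMP_basal=2.0, K_glu=1.0, K_cAMP=0.5):
    """Transcription rate from Steps 4-6 (all parameter values are assumptions)."""
    r_eff = r_total / (1.0 + (inducer / K_I) ** n)        # Step 4: active repressor
    cAMP = cAMP_basal / (1.0 + (glucose / K_glu) ** 2)    # Step 5: glucose lowers cAMP
    f_cap = cAMP / (K_cAMP + cAMP)                        # Step 5: CAP activation factor
    rnap_weight = p * (1.0 + f_cap * omega)               # Step 6: RNAP-containing states
    return k_esc * rnap_weight / (1.0 + rnap_weight + r_eff)

conditions = {
    "no inducer, glucose":    (0.0, 20.0),
    "no inducer, no glucose": (0.0, 0.0),
    "inducer, glucose":       (1.0, 20.0),
    "inducer, no glucose":    (1.0, 0.0),
}
rates = {name: lac_rate(i, g) for name, (i, g) in conditions.items()}
for name, rate in rates.items():
    print(f"{name:24s} rate = {rate:.5f}")
```

With these assumed parameters, only the last condition gives a high rate, so the model behaves approximately like an AND gate on "inducer present" and "glucose absent".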

7.3 RNAP-Promoter Binding Equilibrium

The simplest model of transcription initiation treats RNAP binding as a two-state equilibrium:

$$P + RNAP \underset{k_{\text{off}}}{\overset{k_{\text{on}}}{\rightleftharpoons}} P \cdot RNAP \xrightarrow{k_{\text{esc}}} P + RNAP_{\text{elongating}} + \text{mRNA}$$

$$\text{Rate} = k_{\text{esc}} \cdot \frac{[RNAP]/K_d}{1 + [RNAP]/K_d} \quad \text{where } K_d = \frac{k_{\text{off}}}{k_{\text{on}}}$$

The rate expression above assumes that binding equilibrates quickly relative to escape (k_esc ≪ k_off). For strong E. coli promoters (consensus -10 and -35 elements), K_d ~ 10 nM and the open complex forms rapidly (τ ~ seconds). Weak promoters may have K_d > 1 μM. Promoter escape (k_esc) is often rate-limiting and is regulated by sigma factor release and initial transcription (abortive cycling).
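As a quick worked example of the two-state rate law, the sketch below compares a strong promoter (K_d = 10 nM) with a weak one (K_d = 1 μM). The free RNAP concentration and the escape rate are hypothetical numbers chosen only to make the comparison concrete.

```python
def initiation_rate(rnap, k_d, k_esc):
    """Rate = k_esc * occupancy, with occupancy = ([RNAP]/K_d) / (1 + [RNAP]/K_d)."""
    occupancy = (rnap / k_d) / (1.0 + rnap / k_d)
    return k_esc * occupancy

free_rnap_nM = 100.0   # hypothetical free RNAP concentration
k_esc = 0.1            # hypothetical promoter escape rate (1/s)

strong = initiation_rate(free_rnap_nM, k_d=10.0, k_esc=k_esc)     # K_d = 10 nM
weak = initiation_rate(free_rnap_nM, k_d=1000.0, k_esc=k_esc)     # K_d = 1 uM

print(f"strong promoter: {strong:.4f} initiations/s")
print(f"weak promoter:   {weak:.4f} initiations/s")
print(f"ratio strong/weak: {strong / weak:.1f}")
```

Because the strong promoter is already close to saturation at this assumed RNAP level, a 100-fold difference in K_d translates into a smaller difference in rate.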

7.4 Gene Toggle Switch — Bistability

Two mutually repressing genes (Gardner et al., Nature 2000) form a bistable toggle switch, the simplest genetic memory element. The system is described by:

$$\frac{du}{dt} = \frac{\alpha_1}{1 + v^\beta} - u \qquad \frac{dv}{dt} = \frac{\alpha_2}{1 + u^\gamma} - v$$

For Hill coefficients β, γ > 1 and sufficiently large α₁, α₂, the nullclines intersect at three fixed points: two stable steady states and one unstable saddle. This creates a bistable switch in which the system remembers which gene was last activated, a foundation for synthetic biology circuits and cellular decision-making (e.g., lysogeny vs. lysis in phage lambda).

Derivation: Conditions for Bistability in a Genetic Toggle Switch

Starting from the mutual repression equations, we derive the conditions under which the system exhibits two stable steady states.

Step 1: Write the toggle switch ODEs

Two genes mutually repress each other with Hill-type repression (Gardner et al., 2000):

$$\frac{du}{dt} = \frac{\alpha_1}{1 + v^\beta} - u \qquad \frac{dv}{dt} = \frac{\alpha_2}{1 + u^\gamma} - v$$

Step 2: Find the nullclines (steady-state curves)

Set each derivative to zero to find the nullclines. The u-nullcline (du/dt = 0) and v-nullcline (dv/dt = 0) are:

$$u = \frac{\alpha_1}{1 + v^\beta} \qquad (\text{u-nullcline}) \qquad v = \frac{\alpha_2}{1 + u^\gamma} \qquad (\text{v-nullcline})$$

Step 3: Determine intersection conditions

Steady states occur where nullclines intersect. Substituting the v-nullcline into the u-nullcline gives a self-consistency equation:

$$u = \frac{\alpha_1}{1 + \left(\frac{\alpha_2}{1 + u^\gamma}\right)^\beta} \equiv F(u)$$

Step 4: Analyze the symmetric case (α₁ = α₂ = α, β = γ = n)

At the symmetric fixed point u* = v* = u_s, the self-consistency equation becomes:

$$u_s = \frac{\alpha}{1 + u_s^n} \implies u_s(1 + u_s^n) = \alpha$$

Step 5: Linearize and derive the bistability condition

The Jacobian at a fixed point (u*, v*) determines stability. Bistability requires that the symmetric fixed point be unstable (a saddle point). The condition for instability at the symmetric point is that the product of the nullcline slopes exceeds 1:

$$\left|\frac{du}{dv}\right|_{\text{u-null}} \times \left|\frac{dv}{du}\right|_{\text{v-null}} > 1 \implies \frac{\alpha_1 \beta\, v_s^{\beta-1}}{(1+v_s^\beta)^2} \cdot \frac{\alpha_2 \gamma\, u_s^{\gamma-1}}{(1+u_s^\gamma)^2} > 1$$

Step 6: Simplified bistability criterion

For the symmetric case (α₁ = α₂ = α, β = γ = n), substituting α = u_s(1 + u_sⁿ) from Step 4 into the slope condition of Step 5 reduces each slope factor to n·u_sⁿ/(1 + u_sⁿ). The symmetric point is therefore unstable, and the switch bistable, when (n − 1)·u_sⁿ > 1. Because u_s(1 + u_sⁿ) increases monotonically with u_s, this is equivalent to a threshold on the production rate:

$$n\,\frac{u_s^n}{1+u_s^n} > 1 \iff u_s^n > \frac{1}{n-1} \iff \alpha > \alpha_c(n) = \frac{n}{(n-1)^{1+1/n}} \qquad (n > 1)$$

In practice: for n ≥ 2, any α > 2 gives bistability (α_c = 2 at n = 2 and α_c ≈ 1.19 at n = 3). When n = 1 (no cooperativity), the condition can never be met: the nullclines intersect only once and the system is monostable. Cooperativity (n > 1) is essential for creating the sigmoidal nullclines that enable three intersections.
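The fixed-point analysis in Steps 3 to 5 can be reproduced numerically. The sketch below is one possible approach, assuming NumPy: it brackets roots of F(u) − u on a grid, refines them by bisection, and classifies each fixed point with the slope-product test from Step 5. `toggle_fixed_points` is a hypothetical helper, and a root falling exactly on a grid point would be missed by this simple bracketing.

```python
import numpy as np

def toggle_fixed_points(alpha1, alpha2, beta, gamma, grid=20001):
    """Return a list of (u, v, stable) for the toggle switch fixed points."""
    def g(u):
        # Self-consistency function F(u) - u from Step 3
        v = alpha2 / (1.0 + u ** gamma)
        return alpha1 / (1.0 + v ** beta) - u

    # All fixed points satisfy 0 < u < alpha1, so scan that interval
    u_grid = np.linspace(0.0, alpha1, grid)
    g_grid = g(u_grid)
    brackets = np.where(np.sign(g_grid[:-1]) * np.sign(g_grid[1:]) < 0)[0]

    points = []
    for i in brackets:
        lo, hi = float(u_grid[i]), float(u_grid[i + 1])
        for _ in range(60):                      # bisection refinement
            mid = 0.5 * (lo + hi)
            if g(lo) * g(mid) <= 0:
                hi = mid
            else:
                lo = mid
        u = 0.5 * (lo + hi)
        v = alpha2 / (1.0 + u ** gamma)
        # Magnitudes of the nullcline slopes from Step 5
        slope_u = alpha1 * beta * v ** (beta - 1) / (1.0 + v ** beta) ** 2
        slope_v = alpha2 * gamma * u ** (gamma - 1) / (1.0 + u ** gamma) ** 2
        points.append((u, v, bool(slope_u * slope_v < 1.0)))
    return points

for n_hill in (1.0, 3.0):
    pts = toggle_fixed_points(5.0, 5.0, n_hill, n_hill)
    print(f"n = {n_hill:g}: {len(pts)} fixed point(s)")
    for u, v, stable in pts:
        print(f"  u = {u:.3f}, v = {v:.3f}, {'stable' if stable else 'unstable'}")
```

The values α = 5 and n = 3 match the parameters used in the Fortran lab later in this chapter; n = 1 is included as the non-cooperative control.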

7.5 Noise in Gene Expression

Gene expression is inherently stochastic due to the small number of molecules involved (often fewer than 10 mRNA copies per gene in a bacterial cell). This stochasticity, or “noise,” has profound consequences for cellular behavior and can be quantified using the Fano factor.

$$F = \frac{\sigma^2}{\langle n \rangle} = 1 + b \qquad \text{where } b = \frac{k_p}{\delta_m} \text{ (burst size)}$$

Here F is the Fano factor (variance-to-mean ratio of protein copy number), b is the average number of proteins produced per mRNA lifetime (the “burst size”), k_p is the translation rate per mRNA, and δ_m is the mRNA degradation rate. For a Poisson process, F = 1; translational bursting yields F > 1 (super-Poissonian noise).

Derivation: Fano Factor for Gene Expression Noise from the Master Equation

Starting from the stochastic two-stage model of gene expression (mRNA → protein), we derive the Fano factor F = 1 + b.

Step 1: Define the two-stage model

mRNAs are produced at rate k_m and degraded at rate δ_m. Each mRNA produces proteins at rate k_p, and proteins are degraded at rate δ_p:

$$\emptyset \xrightarrow{k_m} m \xrightarrow{\delta_m} \emptyset \qquad m \xrightarrow{k_p} m + P \qquad P \xrightarrow{\delta_p} \emptyset$$

Step 2: Solve for mean mRNA and protein levels

At steady state, the mean copy numbers are:

$$\langle m \rangle = \frac{k_m}{\delta_m} \qquad \langle P \rangle = \frac{k_m k_p}{\delta_m \delta_p}$$

Step 3: Define the burst size

Since mRNA is short-lived compared to protein (δ_m >> δ_p), each mRNA produces a “burst” of proteins before being degraded. The average burst size is:

$$b = \frac{k_p}{\delta_m} \qquad \text{(proteins per mRNA lifetime)}$$

Step 4: Compute the protein variance from the master equation

Using the generating function method on the chemical master equation (Thattai & van Oudenaarden, 2001), the protein variance has two components, both intrinsic to the expression process: a Poisson term from random protein birth and death, and a burst term propagated from mRNA fluctuations:

$$\sigma_P^2 = \langle P \rangle + \frac{k_p}{\delta_m} \cdot \langle P \rangle \cdot \frac{1}{1 + \delta_p/\delta_m}$$

Step 5: Simplify in the limit δ_m >> δ_p

When mRNA degrades much faster than protein (typical in bacteria: mRNA half-life ~2–5 min, protein half-life ~hours), δ_p/δ_m → 0:

$$\sigma_P^2 \approx \langle P \rangle + b \cdot \langle P \rangle = \langle P \rangle(1 + b)$$

Step 6: Extract the Fano factor

The Fano factor is the variance-to-mean ratio:

$$F = \frac{\sigma_P^2}{\langle P \rangle} = 1 + b = 1 + \frac{k_p}{\delta_m}$$

When b → 0 (each mRNA yields far fewer than one protein, so there is no translational bursting), F = 1 (Poisson statistics). For typical E. coli genes with b ≈ 1–5, F ≈ 2–6, meaning the protein variance is 2–6× the Poisson value (the standard deviation is about 1.4–2.5× larger). This “burstiness” enables phenotypic heterogeneity even in clonal populations, driving phenomena like antibiotic persistence and competence switching.
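The variance formula in Step 4 can be cross-checked without a stochastic simulation. Because every reaction rate in the two-stage model is linear in copy number, the steady-state covariance matrix C satisfies the Lyapunov equation A C + C Aᵀ + D = 0, where A is the Jacobian of the mean dynamics and D collects the reaction fluxes. The sketch below, assuming NumPy, solves that equation directly; the rate constants are illustrative assumptions (per minute), not measured values.

```python
import numpy as np

def two_stage_moments(k_m, delta_m, k_p, delta_p):
    """Steady-state means and covariance of (mRNA, protein) copy numbers."""
    mean_m = k_m / delta_m
    mean_p = k_m * k_p / (delta_m * delta_p)
    # Jacobian of d<m>/dt = k_m - delta_m*m and d<p>/dt = k_p*m - delta_p*p
    A = np.array([[-delta_m, 0.0],
                  [k_p, -delta_p]])
    # Diffusion matrix: sum of production and degradation fluxes per species
    D = np.diag([k_m + delta_m * mean_m, k_p * mean_m + delta_p * mean_p])
    # Solve A C + C A^T + D = 0 by vectorizing (row-major) with Kronecker products
    I = np.eye(2)
    M = np.kron(A, I) + np.kron(I, A)
    C = np.linalg.solve(M, -D.flatten()).reshape(2, 2)
    return mean_m, mean_p, C

# Illustrative rate constants (per minute); these are assumptions
k_m, delta_m, k_p, delta_p = 0.2, 0.2, 0.8, 0.01
mean_m, mean_p, C = two_stage_moments(k_m, delta_m, k_p, delta_p)

b = k_p / delta_m            # burst size
eta = delta_p / delta_m      # ratio of protein to mRNA degradation rates
fano = C[1, 1] / mean_p

print(f"burst size b               = {b:.2f}")
print(f"Fano factor (Lyapunov)     = {fano:.4f}")
print(f"Fano factor 1 + b/(1+eta)  = {1 + b / (1 + eta):.4f}")
print(f"Fano factor 1 + b (limit)  = {1 + b:.4f}")
```

Shrinking δ_p relative to δ_m in this sketch moves the exact value toward the 1 + b limit of Step 5.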

8. Computational Lab: Lac Operon Simulation

Python: Diauxic Growth and Catabolite Repression

This simulation models beta-galactosidase activity as a function of IPTG inducer and glucose concentration, incorporating the Hill equation for cooperative induction, CAP-cAMP positive regulation, and a dynamic diauxic growth simulation showing the preferential use of glucose before lactose.

Lac Operon: Beta-Galactosidase Expression & Diauxic Growth

Python

Models induction, catabolite repression, and diauxic growth dynamics

script.py

#!/usr/bin/env python3
"""
Lac Operon Gene Expression Simulator
Models beta-galactosidase activity as a function of IPTG and glucose,
demonstrating diauxic growth and catabolite repression.
"""
import numpy as np
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt

# ------------------------------------------------------------------
# Model parameters
# ------------------------------------------------------------------
Vmax_lacZ   = 1000.0   # max beta-gal expression (Miller units)
K_IPTG      = 0.05     # half-max IPTG induction (mM)
n_IPTG      = 2.0      # Hill coefficient for IPTG (cooperativity)
K_glucose   = 2.0      # half-max glucose repression (mM)
n_glucose   = 2.0      # Hill coefficient for glucose repression
CAP_max     = 1.0      # max CAP activation factor
K_cAMP      = 0.5      # cAMP half-max for CAP activation (mM)

# Glucose inhibits adenylate cyclase => lowers cAMP
cAMP_basal  = 2.0      # cAMP when glucose = 0
K_glu_cAMP  = 1.0      # glucose conc for half-inhibition of cAMP

def cAMP_level(glucose):
    """cAMP concentration decreases as glucose increases (catabolite repression)."""
    return cAMP_basal / (1.0 + (glucose / K_glu_cAMP)**n_glucose)

def CAP_activation(glucose):
    """CAP-cAMP complex activation factor."""
    c = cAMP_level(glucose)
    return CAP_max * c / (K_cAMP + c)

def repressor_release(IPTG):
    """Fraction of operator sites free of repressor (inducer-dependent)."""
    return IPTG**n_IPTG / (K_IPTG**n_IPTG + IPTG**n_IPTG)

def beta_gal_activity(IPTG, glucose):
    """Beta-galactosidase activity: product of induction and CAP activation."""
    induction = repressor_release(IPTG)
    cap = CAP_activation(glucose)
    basal = 0.01  # leaky expression
    return Vmax_lacZ * (basal + (1 - basal) * induction * cap)

# ------------------------------------------------------------------
# Figure 1: 2D heat map — beta-gal vs IPTG & glucose
# ------------------------------------------------------------------
iptg_range    = np.linspace(0, 1.0, 200)   # mM
glucose_range = np.linspace(0, 20.0, 200)  # mM
IPTG_grid, GLU_grid = np.meshgrid(iptg_range, glucose_range)
Z = beta_gal_activity(IPTG_grid, GLU_grid)

fig, axes = plt.subplots(1, 3, figsize=(18, 5))

# Heat map
im = axes[0].pcolormesh(IPTG_grid, GLU_grid, Z, shading='auto', cmap='inferno')
axes[0].set_xlabel('IPTG (mM)', fontsize=12)
axes[0].set_ylabel('Glucose (mM)', fontsize=12)
axes[0].set_title('Beta-Galactosidase Activity\n(Miller Units)', fontsize=13)
fig.colorbar(im, ax=axes[0], label='Activity')

# Dose-response at fixed glucose levels
for glc, color, ls in [(0, '#00ff88', '-'), (2, '#ffaa00', '--'),
                        (5, '#ff4444', '-.'), (15, '#888888', ':')]:
    activity = [beta_gal_activity(ip, glc) for ip in iptg_range]
    axes[1].plot(iptg_range, activity, color=color, ls=ls,
                 lw=2, label=f'Glucose = {glc} mM')
axes[1].set_xlabel('IPTG (mM)', fontsize=12)
axes[1].set_ylabel('Beta-gal Activity', fontsize=12)
axes[1].set_title('Dose-Response Curves', fontsize=13)
axes[1].legend(fontsize=9)
axes[1].set_xlim(0, 1)

# ------------------------------------------------------------------
# Figure panel 3: Diauxic growth simulation
# ------------------------------------------------------------------
dt = 0.01   # hours
T  = 12.0
steps = int(T / dt)
time      = np.zeros(steps)
biomass   = np.zeros(steps)
glucose_t = np.zeros(steps)
lactose_t = np.zeros(steps)
bgal_t    = np.zeros(steps)

# Initial conditions
biomass[0]   = 0.05   # OD600
glucose_t[0] = 10.0   # mM
lactose_t[0] = 10.0   # mM

mu_max_glu = 0.8   # max growth rate on glucose (1/hr)
mu_max_lac = 0.45  # max growth rate on lactose (1/hr)
Ks_glu     = 0.5
Ks_lac     = 1.0
Yxs        = 0.5   # yield coefficient

for i in range(1, steps):
    time[i] = i * dt
    glu = glucose_t[i-1]
    lac = lactose_t[i-1]
    B   = biomass[i-1]

    # Catabolite repression: lactose utilization suppressed by glucose
    lac_repression = 1.0 / (1.0 + (glu / 1.0)**3)

    mu_glu = mu_max_glu * glu / (Ks_glu + glu)
    mu_lac = mu_max_lac * lac / (Ks_lac + lac) * lac_repression

    mu_total = mu_glu + mu_lac

    # Forward-Euler update of biomass and substrates
    biomass[i]   = B + mu_total * B * dt
    glucose_t[i] = max(0, glu - mu_glu * B / Yxs * dt)
    lactose_t[i] = max(0, lac - mu_lac * B / Yxs * dt)
    bgal_t[i]    = beta_gal_activity(lac * lac_repression * 0.1, glu) * B

ax3 = axes[2]
ax3.plot(time, biomass, 'w-', lw=2, label='Biomass (OD)')
ax3.set_xlabel('Time (hours)', fontsize=12)
ax3.set_ylabel('Biomass (OD600)', fontsize=12, color='white')
ax3.set_title('Diauxic Growth Simulation', fontsize=13)

ax3b = ax3.twinx()
ax3b.plot(time, glucose_t, '#ff6666', lw=1.5, ls='--', label='Glucose')
ax3b.plot(time, lactose_t, '#66aaff', lw=1.5, ls='--', label='Lactose')
ax3b.set_ylabel('Sugar (mM)', fontsize=12, color='#aaaaaa')

lines1, labels1 = ax3.get_legend_handles_labels()
lines2, labels2 = ax3b.get_legend_handles_labels()
ax3.legend(lines1 + lines2, labels1 + labels2, fontsize=9, loc='center right')

for ax in axes:
    ax.set_facecolor('#0a0a1a')
fig.patch.set_facecolor('#0a0a1a')
for ax in [axes[0], axes[1], axes[2], ax3b]:
    ax.tick_params(colors='#cccccc')
    ax.xaxis.label.set_color('#cccccc')
    ax.yaxis.label.set_color('#cccccc')
    ax.title.set_color('#ffffff')
    for spine in ax.spines.values():
        spine.set_color('#444444')

plt.tight_layout()
plt.savefig('output.png', dpi=150, bbox_inches='tight', facecolor='#0a0a1a')
plt.close()

print("=== Lac Operon Simulation Results ===")
print(f"Max beta-gal (no glucose, saturating IPTG): {beta_gal_activity(1.0, 0):.0f} Miller units")
print(f"Beta-gal at high glucose + IPTG:            {beta_gal_activity(1.0, 20):.0f} Miller units")
print(f"Repression ratio:                           {beta_gal_activity(1.0, 0)/beta_gal_activity(1.0, 20):.1f}x")
# Report the diauxic shift as the first time glucose falls below 0.01 mM
shift_idx = int(np.argmax(glucose_t < 0.01))
print(f"\nDiauxic shift (glucose exhausted) around t = {time[shift_idx]:.1f} hours")
print(f"Peak biomass: {biomass[-1]:.2f} OD600")


9. Computational Lab: Gene Toggle Switch

Fortran: Bistable Toggle Switch Dynamics

This Fortran program models a genetic toggle switch — two mutually repressing genes exhibiting bistability. Using RK4 integration, it computes trajectories from multiple initial conditions, demonstrating that the system converges to one of two stable steady states depending on initial conditions. Nullcline analysis identifies the fixed points.

Gene Toggle Switch: Bistability Analysis

Fortran

Two mutually repressing genes with RK4 integration and nullcline computation

toggle_switch.f90

program gene_toggle_switch
  ! =====================================================================
  ! Gene Regulatory Network: Bistable Toggle Switch
  ! Two mutually repressing genes (Gardner et al., Nature 2000)
  !
  !   du/dt = alpha1/(1 + v^beta) - u
  !   dv/dt = alpha2/(1 + u^gamma) - v
  !
  ! Demonstrates bistability via nullcline analysis and time integration.
  ! =====================================================================
  implicit none

  integer, parameter :: dp = selected_real_kind(15, 307)
  integer, parameter :: N_TRAJ  = 6        ! number of trajectories
  integer, parameter :: NSTEPS  = 50000
  integer, parameter :: NX_NULL = 500       ! nullcline resolution
  real(dp), parameter :: dt     = 0.005_dp
  real(dp), parameter :: alpha1 = 5.0_dp   ! max production rate gene u
  real(dp), parameter :: alpha2 = 5.0_dp   ! max production rate gene v
  real(dp), parameter :: beta_  = 3.0_dp   ! Hill coefficient for v repression
  real(dp), parameter :: gamma_ = 3.0_dp   ! Hill coefficient for u repression

  real(dp) :: u(N_TRAJ), v(N_TRAJ)
  real(dp) :: du, dv, t
  real(dp) :: u_null(NX_NULL), v_null1(NX_NULL), v_null2(NX_NULL)
  integer  :: i, j, ios
  real(dp) :: x_val

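  ! Initial conditions spanning the phase space.
  ! Note: trajectory 3 starts on the diagonal u = v. With alpha1 = alpha2 and
  ! beta = gamma it stays on the diagonal and approaches the unstable saddle
  ! rather than either stable state.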
  real(dp) :: u0(N_TRAJ), v0(N_TRAJ)
  data u0 / 0.1_dp, 4.5_dp, 2.5_dp, 0.5_dp, 4.0_dp, 2.0_dp /
  data v0 / 4.5_dp, 0.1_dp, 2.5_dp, 1.0_dp, 3.5_dp, 0.3_dp /

  ! ------------------------------------------------------------------
  ! 1. Compute nullclines for phase portrait
  ! ------------------------------------------------------------------
  open(unit=20, file='nullclines.dat', status='replace', iostat=ios)
  if (ios /= 0) stop 'Error opening nullclines.dat'

  do i = 1, NX_NULL
    x_val = 0.001_dp + (6.0_dp - 0.001_dp) * real(i-1, dp) / real(NX_NULL-1, dp)
    u_null(i)  = x_val
    ! u-nullcline: du/dt = 0 => u = alpha1/(1+v^beta), solve for v given u:
    !   u = alpha1/(1+v^beta) => v = ((alpha1/u) - 1)^(1/beta)
    if (alpha1/x_val - 1.0_dp > 0.0_dp) then
      v_null1(i) = (alpha1/x_val - 1.0_dp)**(1.0_dp/beta_)
    else
      v_null1(i) = 0.0_dp
    end if
    ! v-nullcline: dv/dt = 0 => v = alpha2/(1+u^gamma)
    v_null2(i) = alpha2 / (1.0_dp + x_val**gamma_)
    write(20, '(3ES16.8)') u_null(i), v_null1(i), v_null2(i)
  end do
  close(20)

  ! ------------------------------------------------------------------
  ! 2. Integrate trajectories using RK4
  ! ------------------------------------------------------------------
  u = u0
  v = v0

  open(unit=30, file='trajectories.dat', status='replace', iostat=ios)
  if (ios /= 0) stop 'Error opening trajectories.dat'

  do i = 1, NSTEPS
    t = real(i-1, dp) * dt   ! time of the state written below (before this step)
    ! Write every 50th step
    if (mod(i, 50) == 0) then
      write(30, '(13ES14.6)') t, (u(j), v(j), j=1, N_TRAJ)
    end if
    ! RK4 integration for each trajectory
    do j = 1, N_TRAJ
      call rk4_step(u(j), v(j), dt)
    end do
  end do
  close(30)

  ! ------------------------------------------------------------------
  ! 3. Print steady states and analysis
  ! ------------------------------------------------------------------
  print *, '============================================='
  print *, ' GENE TOGGLE SWITCH — BISTABILITY ANALYSIS'
  print *, '============================================='
  print *, ''
  print *, 'Parameters: alpha1 =', alpha1, ' alpha2 =', alpha2
  print *, '            beta   =', beta_,  ' gamma  =', gamma_
  print *, ''
  print *, 'Final steady states for each trajectory:'
  print *, '-----------------------------------------'
  do j = 1, N_TRAJ
    print '(A,I1,A,F6.3,A,F6.3,A,F6.3,A,F6.3,A)', &
      '  Traj ', j, ': (u0,v0)=(', u0(j), ',', v0(j), &
      ') -> (u,v)=(', u(j), ',', v(j), ')'
  end do
  print *, ''
  print *, 'Bistable states identified:'
  print *, '  State A (gene u HIGH): u ~ alpha1, v ~ 0'
  print *, '  State B (gene v HIGH): u ~ 0,      v ~ alpha2'
  print *, ''
  print *, 'The system exhibits bistability because the Hill'
  print *, 'coefficients (beta, gamma > 1) create cooperative'
  print *, 'repression, yielding three fixed points (two stable,'
  print *, 'one unstable saddle).'

contains

  subroutine rk4_step(uu, vv, h)
    real(dp), intent(inout) :: uu, vv
    real(dp), intent(in)    :: h
    real(dp) :: k1u, k1v, k2u, k2v, k3u, k3v, k4u, k4v

    call derivs(uu, vv, k1u, k1v)
    call derivs(uu + 0.5_dp*h*k1u, vv + 0.5_dp*h*k1v, k2u, k2v)
    call derivs(uu + 0.5_dp*h*k2u, vv + 0.5_dp*h*k2v, k3u, k3v)
    call derivs(uu + h*k3u, vv + h*k3v, k4u, k4v)

    uu = uu + h/6.0_dp * (k1u + 2.0_dp*k2u + 2.0_dp*k3u + k4u)
    vv = vv + h/6.0_dp * (k1v + 2.0_dp*k2v + 2.0_dp*k3v + k4v)
  end subroutine

  subroutine derivs(uu, vv, dudt, dvdt)
    real(dp), intent(in)  :: uu, vv
    real(dp), intent(out) :: dudt, dvdt
    dudt = alpha1 / (1.0_dp + vv**beta_) - uu
    dvdt = alpha2 / (1.0_dp + uu**gamma_) - vv
  end subroutine

end program gene_toggle_switch


Summary: Levels of Gene Regulation

Level	Mechanism	Key Players	Timescale
Chromatin	Remodeling, histone modification	SWI/SNF, HATs/HDACs, HMTs	Minutes to hours
Epigenetic	DNA methylation, histone code	DNMTs, TETs, PRC1/2	Cell generations
Transcription	TF binding, enhancer activation	TFs, Mediator, RNAP II	Minutes
RNA processing	Splicing, polyadenylation	SR proteins, hnRNPs, CPSF	Co-transcriptional
mRNA stability	Deadenylation, decapping, RNAi	CCR4-NOT, miRISC, P-bodies	Minutes to hours
Translation	Initiation control, uORFs	eIF4E, 4E-BP, mTOR, IRPs	Minutes
Post-translational	Modification, degradation	Kinases, ubiquitin, proteasome	Seconds to hours

Key Concepts and Connections

Negative vs. Positive Regulation

Negative regulators (repressors, HDACs, DNA methylation) silence genes by default; signals relieve repression. Positive regulators (activators, HATs, enhancers) actively recruit transcriptional machinery. Most genes use both mechanisms simultaneously.

Feedback Loops

Negative feedback (SOCS/JAK-STAT, trp repressor) maintains homeostasis. Positive feedback (Oct4 self-activation in stem cells, Ras/ERK/Elk-1/SOS) creates switch-like bistable responses. Combined feedforward/feedback motifs generate complex dynamics including oscillations (p53/Mdm2, NF-κB/IκB).

Disease Connections

Cancer: mutations in chromatin regulators (SWI/SNF ~20%, EZH2, DNMT3A, TET2, IDH1/2), signaling (Ras ~30%, Raf ~7%, EGFR), and TFs (p53 ~50%, Myc amplification). Imprinting disorders: Prader-Willi, Angelman, Beckwith-Wiedemann. Epigenetic drugs: HDAC inhibitors, DNMT inhibitors (azacitidine, decitabine), EZH2 inhibitors (tazemetostat), BET inhibitors.

Prokaryotic vs. Eukaryotic

Prokaryotes: operons, coupled transcription-translation, attenuation, two-component systems, sigma factor switching. Eukaryotes: chromatin barrier, combinatorial TF logic, long-range enhancers, extensive RNA processing, nuclear-cytoplasmic compartmentalization, epigenetic memory across cell divisions. Despite differences, core principles (cooperativity, combinatorial logic, feedback) are universal.
