Part 6: Translation and Protein Synthesis
RNA to Protein: The Central Dogma's Final Step
Translation is the process by which the nucleotide sequence of messenger RNA (mRNA) is decoded to produce a specific polypeptide chain. It takes place on the ribosome, which reads mRNA in the 5' to 3' direction while synthesizing protein from the N-terminus to the C-terminus. The process requires transfer RNAs (tRNAs) as adaptor molecules, aminoacyl-tRNA synthetases for charging tRNAs, and numerous protein factors that orchestrate initiation, elongation, and termination.
Translation consumes approximately 4 high-energy phosphate bonds per amino acid incorporated (2 for aminoacyl-tRNA synthesis, 1 for EF-Tu GTP hydrolysis during A-site delivery, 1 for EF-G GTP hydrolysis during translocation), making it one of the most energy-intensive processes in the cell. In rapidly growing E. coli, up to 80% of cellular energy is devoted to translation.
1. The Genetic Code
Properties of the Genetic Code
Fundamental Features
- Triplet: Each codon consists of 3 consecutive nucleotides, providing $4^3 = 64$ possible codons
- Degenerate (redundant): 64 codons encode only 20 amino acids + 3 stop signals. Most amino acids are specified by 2-6 synonymous codons
- Non-overlapping: Codons are read sequentially in a single reading frame without sharing nucleotides
- Comma-free: No punctuation between codons; the reading frame is set by the start codon
- Unambiguous: Each codon specifies exactly one amino acid (or stop)
- Nearly universal: Same code in virtually all organisms, with minor exceptions (mitochondria, Mycoplasma, some ciliates)
Degeneracy Pattern
1 codon: Met (AUG), Trp (UGG)
2 codons: Phe, Tyr, His, Gln, Asn, Lys, Asp, Glu, Cys
3 codons: Ile
4 codons: Val, Pro, Thr, Ala, Gly
6 codons: Leu, Ser, Arg
Most degeneracy occurs at the 3rd (wobble) position of the codon, where changes often do not alter the encoded amino acid.
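The degeneracy classes listed above can be checked directly against the standard codon table. The following is a minimal sketch; the names `CODON_TABLE`, `degeneracy`, and `by_class` are illustrative, and the table covers only the standard code (no mitochondrial or other variant codes).

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of synonymous codons per amino acid (stop codons excluded).
degeneracy = Counter(aa for aa in CODON_TABLE.values() if aa != "*")

# Group amino acids (one-letter codes) by how many codons encode them.
by_class = {}
for aa, n in sorted(degeneracy.items()):
    by_class.setdefault(n, []).append(aa)

for n in sorted(by_class):
    print(n, "codon(s):", " ".join(by_class[n]))
```

The grouping reproduces the 1, 2, 3, 4, and 6-codon classes, and the sense codons sum to 61.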
Start and Stop Codons
Start Codon: AUG
- Encodes methionine (Met) in eukaryotes
- Encodes N-formylmethionine (fMet) in prokaryotes
- Sets the reading frame for the entire ORF
- Internal AUGs encode regular Met residues
- Rare alternative starts: GUG (~8% in E. coli), UUG (~1%)
- The formyl group is removed by peptide deformylase; the N-terminal Met is then removed by methionine aminopeptidase (MAP) in many proteins
Stop Codons (Nonsense Codons)
- UAA (Ochre): Most common in E. coli; recognized by both RF1 and RF2
- UAG (Amber): Recognized by RF1; site for amber suppressor tRNAs
- UGA (Opal/Umber): Recognized by RF2; can encode selenocysteine (21st amino acid) via SECIS element
- Suppressor tRNAs carry anticodons that read stop codons, inserting an amino acid instead of terminating
- UAG is also recoded for pyrrolysine (22nd amino acid) in some methanogens
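How the start codon sets the reading frame and a stop codon ends it can be illustrated with a short sketch. It assumes the standard code, starts at the first AUG only, and ignores alternative start codons, fMet, and stop-codon recoding; the function name `translate_first_orf` and the example sequence are hypothetical.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_first_orf(mrna: str) -> str:
    """Translate from the first AUG to the first in-frame stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, nothing to translate
    protein = []
    # Step through the frame set by the start codon, one triplet at a time.
    for i in range(start, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            break  # stop codon: terminate without adding a residue
        protein.append(aa)
    return "".join(protein)


print(translate_first_orf("GGAUGGCUUGGUAAGG"))  # Met-Ala-Trp, then UAA
```

In the example the frame is AUG GCU UGG UAA, so the two leading and trailing G residues are never read as part of a codon.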
Wobble Hypothesis (Crick, 1966)
Francis Crick proposed that the first two positions of the codon form standard Watson-Crick base pairs with the anticodon, but the third codon position, which pairs with the 5' base of the anticodon (position 34), allows non-standard "wobble" pairing. This explains how fewer than 61 tRNAs can decode all 61 sense codons.
Wobble Base Pairing Rules
| 5' Anticodon Base | 3' Codon Base(s) Recognized | Notes |
|---|---|---|
| G | U or C | G-U wobble pair is thermodynamically stable |
| C | G only | Standard Watson-Crick only |
| A | U only | Standard Watson-Crick only |
| U | A or G | U-G wobble pair |
| I (Inosine) | U, C, or A | Inosine can pair with 3 bases; found in many tRNAs |
Inosine (I) is formed by deamination of adenosine and is the most versatile wobble base. A single tRNA with I at the wobble position can decode three different codons, significantly reducing the number of tRNA species required. The minimum number of tRNAs needed to decode all 61 sense codons is 31.
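The wobble table can be turned into a small lookup that lists the codons a given anticodon reads. This is a sketch under the rules in the table only; it ignores other modified wobble bases, and the function name `codons_decoded` is illustrative. Anticodons are written 5' to 3' (positions 34, 35, 36), so position 34 pairs with the third codon base.

```python
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}

# 5' anticodon base (position 34) -> 3' codon bases it can read (table above).
WOBBLE = {"G": "UC", "C": "G", "A": "U", "U": "AG", "I": "UCA"}


def codons_decoded(anticodon: str) -> list[str]:
    """Return the codons (5'->3') read by an anticodon written 5'->3'."""
    b34, b35, b36 = anticodon
    first = WATSON_CRICK[b36]   # codon position 1 pairs with anticodon base 36
    second = WATSON_CRICK[b35]  # codon position 2 pairs with anticodon base 35
    # Codon position 3 pairs with the wobble base 34.
    return sorted(first + second + third for third in WOBBLE[b34])


print(codons_decoded("GAA"))  # Phe codons UUC and UUU
print(codons_decoded("IGC"))  # three of the four Ala codons
```

With inosine at position 34, a single anticodon covers three codons, which is why the number of tRNA species can be well below 61.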
Codon Usage Bias
Although synonymous codons encode the same amino acid, organisms show strong preferences for certain codons over others. This codon usage bias correlates with the abundance of cognate tRNA species and affects translation speed and accuracy.
Biological Significance
- Highly expressed genes use "optimal" codons matched to abundant tRNAs
- Rare codons cause ribosome pausing/stalling
- Codon bias affects co-translational folding
- Selection maintains codon bias in large populations
- Important for heterologous gene expression (codon optimization)
Near-Cognate Misreading
- tRNAs with near-cognate anticodons can occasionally misread codons
- Error rate: ~$10^{-3}$ to $10^{-4}$ per codon
- Ribosome uses kinetic proofreading to reduce errors
- Two-step selection: initial selection + proofreading after GTP hydrolysis
- Aminoglycoside antibiotics increase misreading by distorting the A site
Codon Adaptation Index (CAI)
The CAI quantifies how well a gene's codon usage matches the optimal codons of the organism:
$$\text{CAI} = \left(\prod_{i=1}^{L} w_{c_i}\right)^{1/L}$$
where $L$ is the number of codons and $w_{c_i}$ is the relative adaptiveness of codon $c_i$, defined as the ratio of the observed frequency of that codon to the frequency of the most common synonymous codon for the same amino acid. CAI ranges from 0 to 1, with 1 indicating maximal codon optimization.
RSCU (Relative Synonymous Codon Usage) equals 1.0 for codons with no bias. Values above 1.0 indicate preferred codons; values below 1.0 indicate avoided codons.
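A short worked example may help make RSCU and CAI concrete. The codon counts below are hypothetical (they are not measured values for any organism), and the function names are illustrative. RSCU is computed as the observed count divided by the count expected under uniform use of the synonymous codons, and CAI as the geometric mean of relative adaptiveness values.

```python
import math

# Hypothetical codon counts in a reference set of highly expressed genes.
REFERENCE_COUNTS = {
    "K": {"AAA": 30, "AAG": 10},  # biased family
    "E": {"GAA": 20, "GAG": 20},  # unbiased family
}


def rscu(family: dict[str, int]) -> dict[str, float]:
    """Observed count divided by the count expected with no bias."""
    expected = sum(family.values()) / len(family)
    return {codon: count / expected for codon, count in family.items()}


def relative_adaptiveness(family: dict[str, int]) -> dict[str, float]:
    """Count relative to the most frequent synonymous codon."""
    top = max(family.values())
    return {codon: count / top for codon, count in family.items()}


def cai(codons: list[str], w: dict[str, float]) -> float:
    """Geometric mean of relative adaptiveness over the codons of a gene."""
    return math.exp(sum(math.log(w[c]) for c in codons) / len(codons))


w = {}
for family in REFERENCE_COUNTS.values():
    w.update(relative_adaptiveness(family))

print(rscu(REFERENCE_COUNTS["K"]))  # AAA above 1.0, AAG below 1.0
print(cai(["AAA", "AAG"], w))       # one optimal and one rare Lys codon
```

A gene built only from codons with $w = 1$ scores a CAI of exactly 1; every rare codon pulls the geometric mean down.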
Derivation: Codon Usage Bias and the tRNA Adaptation Index (tAI)
Starting from the observation that tRNA gene copy number correlates with tRNA abundance, we derive the tRNA Adaptation Index as a measure of translational efficiency.
Step 1: tRNA gene copy number as a proxy for abundance
In rapidly growing bacteria and yeast, tRNA abundance is approximately proportional to tRNA gene copy number. For each codon $c$, the available pool of cognate tRNAs includes exact-match (Watson-Crick) and wobble-pairing species. Define the absolute adaptiveness:
$$W_c = \sum_{j} (1 - s_{cj}) \cdot \text{tGCN}_j$$
where $\text{tGCN}_j$ is the gene copy number of tRNA isoacceptor $j$, and $s_{cj}$ is a penalty for wobble pairing (0 for Watson-Crick, 0.5 for G-U wobble, etc.).
Step 2: Normalize to get relative adaptiveness
For each amino acid, normalize by the maximum $W$ value among its synonymous codons:
$$w_c = \frac{W_c}{\max_{c' \in \text{syn}} W_{c'}}$$
This ensures $0 < w_c \leq 1$ for each codon. Codons with $w_c = 1$ are optimally adapted; lower values indicate slower decoding.
Step 3: Define the tAI for a gene
The tAI is the geometric mean of $w_c$ values over all $L$ codons in the coding sequence (analogous to CAI):
$$\text{tAI} = \left(\prod_{i=1}^{L} w_{c_i}\right)^{1/L} = \exp\left(\frac{1}{L}\sum_{i=1}^{L} \ln w_{c_i}\right)$$
Step 4: Relationship between tAI and translation speed
Since the elongation rate at each codon follows Michaelis-Menten kinetics with cognate tRNA as substrate, and tRNA abundance $\propto$ tGCN, the per-codon rate is approximately:
$$k_i \propto \frac{[\text{tRNA}]_{\text{cognate}}}{K_M + [\text{tRNA}]_{\text{cognate}}} \propto w_{c_i} \quad \text{(when not saturated)}$$
Step 5: tAI predicts protein abundance
The overall translation rate for a gene scales with the harmonic mean of per-codon rates (since slow codons bottleneck the ribosome). The tAI (geometric mean) approximates this and correlates strongly ($r \approx 0.6\text{--}0.7$) with measured protein abundance in E. coli and yeast.
Step 6: Comparison: tAI vs. CAI
CAI uses observed codon frequencies in highly expressed genes as a reference (empirical). tAI uses tRNA gene copy numbers and wobble rules (mechanistic). Both predict expression level, but tAI has a clearer biophysical basis. For E. coli: ribosomal protein genes have CAI $\approx 0.8\text{--}0.9$ and tAI $\approx 0.7\text{--}0.9$, while horizontally transferred genes have CAI $\approx 0.3\text{--}0.5$ and tAI $\approx 0.2\text{--}0.4$.
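Steps 1 to 3 of the derivation can be sketched numerically. The tRNA gene copy numbers below are hypothetical, and the 0.5 penalty is applied to both G-U and U-G wobble pairs as a simplifying assumption; published tAI implementations use fitted penalties per wobble type. The names `TGCN`, `DECODERS`, and `FAMILIES` are illustrative.

```python
import math

# Hypothetical tRNA gene copy numbers, keyed by anticodon (5'->3').
TGCN = {"GAA": 2, "UUU": 3, "CUU": 1}

# For each codon: the anticodons that read it and the wobble penalty s.
DECODERS = {
    "UUC": [("GAA", 0.0)],                # Watson-Crick
    "UUU": [("GAA", 0.5)],                # G-U wobble
    "AAA": [("UUU", 0.0)],                # Watson-Crick
    "AAG": [("CUU", 0.0), ("UUU", 0.5)],  # Watson-Crick plus U-G wobble
}
FAMILIES = {"F": ["UUU", "UUC"], "K": ["AAA", "AAG"]}


def absolute_adaptiveness(codon: str) -> float:
    """W_c = sum over decoding tRNAs of (1 - s) * tGCN."""
    return sum((1.0 - s) * TGCN[anticodon] for anticodon, s in DECODERS[codon])


def relative_adaptiveness() -> dict[str, float]:
    """w_c = W_c divided by the largest W among synonymous codons."""
    w = {}
    for codons in FAMILIES.values():
        top = max(absolute_adaptiveness(c) for c in codons)
        for c in codons:
            w[c] = absolute_adaptiveness(c) / top
    return w


def tai(codons: list[str], w: dict[str, float]) -> float:
    """Geometric mean of w over the codons of a gene."""
    return math.exp(sum(math.log(w[c]) for c in codons) / len(codons))


w = relative_adaptiveness()
print(w)
print(tai(["UUC", "UUU", "AAA", "AAG"], w))
```

In this toy pool, UUU is read only through a wobble pair and so receives half the adaptiveness of UUC, while AAG is partly rescued by its dedicated CUU tRNA.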
2. Transfer RNA (tRNA)
tRNA Structure
Transfer RNAs are small RNA molecules (76-90 nucleotides) that serve as the physical link between the nucleotide sequence of mRNA and the amino acid sequence of proteins. Each tRNA carries a specific amino acid and recognizes one or more codons through its anticodon triplet.
Cloverleaf Secondary Structure
- Acceptor Stem (7 bp): 5' and 3' ends base-pair; 3' end extends as single-stranded CCA-OH, the site of amino acid attachment (ester bond to 2' or 3'-OH of terminal adenosine)
- D-loop (Dihydrouridine arm): Contains dihydrouridine (D) modified bases; varies in length (8-12 nt loop); interacts with aminoacyl-tRNA synthetase
- Anticodon Loop: Always 7 nucleotides; anticodon at positions 34-36; position 34 is the wobble position; flanked by modified bases (position 37 often has a hypermodified purine)
- TΨC Loop (T-arm): Contains the conserved TΨC sequence (ribothymidine-pseudouridine-cytidine); interacts with the ribosome large subunit
- Variable Loop: 4-21 nucleotides; short in Class 1 tRNAs, long in Class 2 (tRNA$^{\text{Ser}}$, tRNA$^{\text{Leu}}$)
L-shaped 3D Structure
- The cloverleaf folds into a compact L-shaped tertiary structure (~60 x 20 Angstroms)
- The acceptor stem and TΨC arm stack coaxially to form one arm of the L
- The D-arm and anticodon arm stack coaxially to form the other arm
- The anticodon is at one end of the L; the amino acid attachment (CCA 3') is at the other end, ~75 Angstroms apart
- Tertiary interactions include base triples, non-Watson-Crick pairs, and intercalation
- Mg$^{2+}$ ions stabilize the tertiary fold
- The invariant G19-C56 pair connects the D-loop and T-loop at the corner of the L
Modified Bases in tRNA
tRNAs contain the highest density of modified nucleosides of any RNA class. Over 100 different modifications have been identified, with each tRNA containing an average of 12-14 modified bases.
Inosine (I)
- Formed by deamination of adenosine by ADAT enzymes
- Found at wobble position 34
- Can pair with U, C, or A
- Expands decoding capacity
Pseudouridine (Ψ)
- C-glycoside isomer of uridine (C-C bond instead of N-C)
- Found in TΨC loop (position 55)
- Stabilizes RNA structure via extra H-bond donor
- Most abundant tRNA modification
Dihydrouridine (D)
- Formed by reduction (saturation) of the C5-C6 double bond of uracil
- Primarily in D-loop (hence the name)
- Increases conformational flexibility
- Cannot stack efficiently
Other important modifications include m1A58 (stabilizes T-loop), t6A37 and i6A37 (adjacent to anticodon, prevent frameshifting), and 2'-O-methylation (protects against nucleases).
Aminoacyl-tRNA Synthetases (aaRS)
These essential enzymes catalyze the attachment of the correct amino acid to its cognate tRNA(s). There are 20 synthetases in most organisms (one per amino acid), and they must achieve extremely high fidelity since the ribosome cannot verify the amino acid identity — only the codon-anticodon match.
Class I Synthetases (10 enzymes)
- Rossmann fold catalytic domain
- HIGH and KMSKS signature motifs
- Aminoacylate 2'-OH of terminal A
- Approach tRNA from minor groove side
- Generally monomeric
- Amino acids: Met, Val, Ile, Leu, Cys, Arg, Glu, Gln, Tyr, Trp
Class II Synthetases (10 enzymes)
- Antiparallel beta-sheet catalytic domain
- Motifs 1, 2, 3 (distinct from Class I)
- Aminoacylate 3'-OH of terminal A
- Approach tRNA from major groove side
- Generally dimeric or tetrameric
- Amino acids: Gly, Ala, Pro, Ser, Thr, His, Asp, Asn, Lys, Phe
Two-Step Aminoacylation Reaction
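$$\text{Step 1 (activation):}\quad \text{AA} + \text{ATP} \rightarrow \text{AA-AMP} + \text{PP}_i$$
$$\text{Step 2 (transfer):}\quad \text{AA-AMP} + \text{tRNA} \rightarrow \text{AA-tRNA} + \text{AMP}$$
The aminoacyl-adenylate (AA-AMP) intermediate remains enzyme-bound. Pyrophosphatase hydrolyzes PP$_i$ to 2 P$_i$, driving the reaction forward (cost: 2 high-energy bonds per amino acid, since ATP is converted to AMP).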
Editing and Proofreading: The Double-Sieve Mechanism
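Some synthetases (e.g., IleRS, ValRS, LeuRS, ThrRS) have a separate editing domain (the CP1 domain in the class I enzymes IleRS, ValRS, and LeuRS; a distinct N-terminal editing domain in ThrRS) that hydrolyzes misactivated or mischarged amino acids. The double-sieve model (Fersht, 1977):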
- First sieve (synthetic site): Excludes amino acids larger than the cognate substrate based on steric fit. Cannot reject smaller amino acids (e.g., IleRS cannot exclude Val, which differs by one methyl group).
- Second sieve (editing site): Hydrolyzes amino acids smaller than the cognate substrate. The correctly charged product is too large to enter the editing site. The IleRS editing site accepts Val-tRNA$^{\text{Ile}}$ and hydrolyzes it, but Ile-tRNA$^{\text{Ile}}$ is sterically excluded.
Editing can occur pre-transfer (hydrolyzing AA-AMP) or post-transfer (hydrolyzing AA-tRNA). This reduces the error rate from ~$10^{-2}$ to ~$10^{-4}$ (1 in 10,000).
Wobble Pairing Energetics
The free energy contributions of wobble base pairs differ from those of standard Watson-Crick pairs.
The total codon-anticodon interaction energy is the sum of all three base pair contributions. A minimum threshold of approximately -6 kcal/mol is needed for stable ribosomal A-site binding.
3. Ribosome Structure
The ribosome is a massive ribonucleoprotein machine (2.5-4.2 MDa) that catalyzes peptide bond formation. The catalytic center is composed of RNA, making the ribosome a ribozyme. Atomic-resolution structures (Ramakrishnan, Steitz, Yonath; Nobel 2009) revealed that no protein comes within 18 Angstroms of the peptidyl transferase active site.
Prokaryotic Ribosome (70S, ~2.5 MDa)
30S Small Subunit
- 16S rRNA (1542 nt) — decoding center; monitors codon-anticodon complementarity; 3' end contains anti-Shine-Dalgarno sequence (CCUCCU)
- 21 proteins (S1-S21)
- Responsible for mRNA binding and decoding fidelity
- Head, body, platform, and shoulder domains
50S Large Subunit
- 23S rRNA (2904 nt) — contains peptidyl transferase center (PTC); domain V forms the active site; this is the ribozyme activity
- 5S rRNA (120 nt) — structural role; bridges to tRNA and factors
- ~34 proteins (L1-L36)
- Contains the peptide exit tunnel (~100 Angstroms long, ~15 Angstroms wide)
Eukaryotic Ribosome (80S, ~4.2 MDa)
40S Small Subunit
- 18S rRNA (~1900 nt) — decoding; contains expansion segments absent in prokaryotes
- ~33 proteins (eS and uS nomenclature)
- More complex than 30S; additional regulatory interactions
60S Large Subunit
- 28S rRNA (~4700 nt) — peptidyl transferase center (homologous to 23S)
- 5.8S rRNA (~160 nt) — homologous to prokaryotic 23S 5' end; H-bonded to 28S
- 5S rRNA (~120 nt) — structural role
- ~47 proteins (eL and uL nomenclature)
Functional Sites
A Site (Aminoacyl)
- Accepts incoming aminoacyl-tRNA
- Codon-anticodon recognition occurs here
- 30S decoding center monitors base pairing geometry
- A1492, A1493 (16S rRNA) flip out to sense minor groove of codon-anticodon helix
- G530 switches from syn to anti upon cognate tRNA binding
P Site (Peptidyl)
- Holds peptidyl-tRNA (growing chain)
- During initiation, fMet-tRNA binds directly here
- CCA end positioned at PTC for peptide bond formation
- P-site tRNA contacts both subunits extensively
E Site (Exit)
- Deacylated tRNA exits here after translocation
- Low affinity for aminoacyl-tRNA
- E-site tRNA release is coupled to A-site tRNA binding (allosteric coupling)
- Codon-anticodon interaction maintained in E site
Additional Structural Features
- mRNA Channel: Formed between head and body of small subunit; accommodates ~30 nt of mRNA; Shine-Dalgarno helix fits in the channel exit
- Peptide Exit Tunnel: ~100 Angstrom tunnel through 50S subunit; lined mostly with 23S rRNA (domain I-V); nascent chain can begin folding near the exit (SRP recognition occurs here); proteins L4 and L22 form a constriction point
- Factor Binding Center: GTPase-associated center (GAC) on 50S; sarcin-ricin loop (SRL, nt 2653-2667 of 23S) is essential for stimulating GTP hydrolysis by EF-Tu and EF-G
- Inter-subunit Bridges: ~12 bridges connecting 30S and 50S; mostly RNA-RNA contacts; include B2a (the largest, involving h44 of 16S and H69 of 23S)
4. Translation Initiation
Prokaryotic Initiation
Prokaryotic initiation is directed by the Shine-Dalgarno (SD) sequence, a purine-rich region (consensus: AGGAGG) located 5-10 nucleotides upstream of the AUG start codon. The SD sequence base-pairs with the complementary anti-SD sequence (3'-AUUCCUCCACUAG-5') at the 3' end of 16S rRNA.
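The spacing rule above can be sketched as a simple motif search. This is only an illustration: real start-site prediction weighs SD-anti-SD pairing energy and mRNA structure, whereas the sketch counts matches to the AGGAGG consensus. The function name `find_sd_starts`, the `min_match` threshold, and the example mRNA are hypothetical, and the gap is taken here as the number of nucleotides between the end of the SD motif and the A of AUG.

```python
SD_CONSENSUS = "AGGAGG"


def find_sd_starts(mrna, min_gap=5, max_gap=10, min_match=5):
    """Return (aug_index, gap, sd_window) for AUGs preceded by an SD-like motif."""
    hits = []
    for aug in range(len(mrna) - 2):
        if mrna[aug:aug + 3] != "AUG":
            continue
        for gap in range(min_gap, max_gap + 1):
            sd_start = aug - gap - len(SD_CONSENSUS)
            if sd_start < 0:
                continue  # not enough upstream sequence for this spacing
            window = mrna[sd_start:sd_start + len(SD_CONSENSUS)]
            matches = sum(a == b for a, b in zip(window, SD_CONSENSUS))
            if matches >= min_match:
                hits.append((aug, gap, window))
                break  # report the closest acceptable spacing only
    return hits


# Hypothetical mRNA: SD motif, 7-nt spacer, then the start codon.
mrna = "GCUAGGAGGUAACAUUAUGGCUAAA"
print(find_sd_starts(mrna))
```

An AUG with no SD-like motif in the 5-10 nt window is skipped, which mirrors why internal AUGs are not normally used as start sites.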
Step 1: 30S Pre-Initiation Complex
- IF3 binds free 30S subunit, prevents premature 50S joining; also performs a proofreading role for start codon selection
- IF1 binds at the A site of 30S, blocks tRNA entry; enhances IF2 and IF3 activities
- mRNA binds through SD-anti-SD base pairing, positioning AUG at the P site
Step 2: 30S Initiation Complex
- IF2-GTP delivers fMet-tRNA$^{\text{fMet}}$ to the P site
- IF2 is a GTPase that specifically recognizes the formyl group on fMet (discriminates initiator from elongator Met-tRNA)
- The initiator tRNA has unique features: 3 consecutive G-C pairs in the anticodon stem, no Watson-Crick pair at 1:72 position
Step 3: 70S Initiation Complex
- 50S subunit joins, triggering GTP hydrolysis by IF2
- IF1, IF2-GDP, and IF3 dissociate
- fMet-tRNA$^{\text{fMet}}$ is positioned in the P site, ready for elongation
- The A site is now empty and ready to accept the first elongator aminoacyl-tRNA
Eukaryotic Initiation (Cap-Dependent Scanning)
Eukaryotic initiation is far more complex, involving at least 12 initiation factors (eIFs) and a scanning mechanism to locate the start codon. The process is the primary point of translational regulation.
Step 1: 43S Pre-Initiation Complex (PIC) Formation
- eIF2-GTP-Met-tRNA$_i^{\text{Met}}$ ternary complex forms (eIF2 is a heterotrimeric GTPase: alpha, beta, gamma subunits)
- Ternary complex joins 40S subunit along with eIF1 (fidelity), eIF1A (A-site occupation, like IF1), eIF3 (13-subunit complex, anti-association), and eIF5 (GAP for eIF2)
- This forms the 43S PIC in an "open" conformation capable of scanning
Step 2: mRNA Activation and 48S Complex
- eIF4F complex binds the 5' m7G cap:
- - eIF4E: Cap-binding protein (regulated by 4E-BPs)
- - eIF4G: Large scaffold protein; bridges eIF4E to eIF3 (40S recruitment); also binds PABP (circularizes mRNA)
- - eIF4A: DEAD-box RNA helicase; unwinds 5'-UTR secondary structure; stimulated by eIF4B
- eIF4H acts as an additional cofactor for the eIF4A helicase
- 43S PIC is recruited to the mRNA 5' end via eIF3-eIF4G interaction, forming the 48S complex
Step 3: Scanning and Start Codon Recognition
- 48S complex scans 5' to 3' along the 5'-UTR, powered by eIF4A helicase
- Scans for the first AUG in a favorable Kozak consensus context:
- 5'-gcc(A/G)ccAUGG-3'
- Position -3 (purine, especially A) and +4 (G) are most critical
- AUG recognition triggers conformational change: eIF1 displacement, "closed" complex, eIF2 GTP hydrolysis (stimulated by eIF5)
- Poor Kozak context allows "leaky scanning" to downstream AUGs
- Upstream open reading frames (uORFs) in the 5'-UTR can regulate reinitiation
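The scanning rule can be illustrated with a deterministic sketch. Real leaky scanning is probabilistic, so treating a weak AUG as always skipped is a simplifying assumption, as are the three strength labels. Positions follow the convention above: the A of AUG is +1, so position -3 is three nucleotides upstream and +4 is the base right after the AUG. The function names and the example 5' region are hypothetical.

```python
PURINES = {"A", "G"}


def kozak_strength(mrna: str, aug: int) -> str:
    """Classify the context of the AUG that starts at index `aug`."""
    minus3 = aug >= 3 and mrna[aug - 3] in PURINES      # purine at -3
    plus4 = aug + 3 < len(mrna) and mrna[aug + 3] == "G"  # G at +4
    return {2: "strong", 1: "adequate", 0: "weak"}[int(minus3) + int(plus4)]


def scan_for_start(mrna: str):
    """Scan 5'->3'; return (start_index, skipped_weak_augs)."""
    skipped = []
    for i in range(len(mrna) - 2):
        if mrna[i:i + 3] == "AUG":
            if kozak_strength(mrna, i) == "weak":
                skipped.append(i)  # bypassed by leaky scanning in this sketch
            else:
                return i, skipped
    return None, skipped


# Hypothetical 5' region: a weak upstream AUG, then an AUG in good context.
mrna = "GCCUCCAUGCGCCACCAUGGCU"
print(scan_for_start(mrna))
```

Here the first AUG has U at -3 and C at +4, so the sketch passes over it and selects the downstream AUG embedded in ACCAUGG.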
Step 4: 60S Joining (80S Formation)
- eIF5B-GTP (homolog of prokaryotic IF2) promotes 60S subunit joining
- eIF2-GDP, eIF1, eIF3, eIF5 dissociate
- eIF5B GTP hydrolysis triggers release of eIF5B-GDP and eIF1A
- eIF2B (guanine nucleotide exchange factor) recycles eIF2-GDP to eIF2-GTP (rate-limiting; regulated by phosphorylation of eIF2-alpha by kinases: HRI, PKR, PERK, GCN2 = integrated stress response)
IRES-Mediated Internal Initiation
Internal Ribosome Entry Sites (IRESes) are structured RNA elements that recruit ribosomes directly to an internal position in the mRNA, bypassing the need for a 5' cap and scanning. Originally discovered in picornavirus RNAs (poliovirus, EMCV), IRESes are also found in some cellular mRNAs.
Types of IRESes
- Type I (Picornavirus): Requires most eIFs except eIF4E (e.g., poliovirus)
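- Type II (EMCV, FMDV): Factor requirements similar to Type I (most eIFs except eIF4E), but the ribosome is recruited at or near the start codon with little or no scanning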
- Type III (HCV): Binds 40S directly via RNA structure; needs only eIF3 and ternary complex
- Type IV (Cricket paralysis virus): Requires NO initiation factors; RNA structure mimics tRNA in the P site
Biological Significance
- Allows translation during stress when cap-dependent translation is inhibited
- Viral strategy: many viruses cleave eIF4G (by viral proteases) to shut off host translation while using IRES for their own mRNAs
- Some cellular mRNAs use IRES during apoptosis, mitosis, hypoxia
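- The HCV IRES has been explored as an antiviral drug target, although approved direct-acting antivirals such as sofosbuvir target the viral NS5B polymerase instead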
5. Translation Elongation
Elongation is a cyclic process with three major steps: aminoacyl-tRNA delivery, peptide bond formation, and translocation. Each cycle adds one amino acid to the growing polypeptide and moves the ribosome by one codon (3 nucleotides) along the mRNA.
Step 1: Aminoacyl-tRNA Delivery (Decoding)
- EF-Tu-GTP-aa-tRNA ternary complex delivers aminoacyl-tRNA to the ribosomal A site. EF-Tu (in eukaryotes: eEF1A) is the most abundant protein in the cell (~5-10% of total protein in E. coli).
- Initial selection: Codon-anticodon base pairing is monitored by 16S rRNA nucleotides A1492, A1493, and G530. Cognate tRNA induces a conformational change (domain closure of 30S) that activates the GTPase center on 50S.
- GTP hydrolysis: Triggered by the sarcin-ricin loop (SRL) of 23S rRNA interacting with EF-Tu. GTP hydrolysis causes a conformational change in EF-Tu (switch I and II regions), releasing EF-Tu-GDP from the ribosome.
- Proofreading: After GTP hydrolysis but before peptide bond formation, the aa-tRNA can still dissociate if the codon-anticodon interaction is incorrect (kinetic proofreading, Hopfield 1974). Near-cognate tRNAs are rejected at this stage.
- Accommodation: Cognate aa-tRNA swings its CCA end into the PTC of the 50S A site (from the A/T state to the A/A state). This involves a ~70-Angstrom movement of the acceptor end.
- EF-Ts (eEF1B in eukaryotes) serves as the guanine nucleotide exchange factor (GEF) for EF-Tu, recycling EF-Tu-GDP back to EF-Tu-GTP.
Step 2: Peptide Bond Formation
Peptide bond formation is catalyzed by the peptidyl transferase center (PTC), which is composed entirely of 23S rRNA — making the ribosome a ribozyme. The reaction is an aminolysis: the alpha-amino group of the A-site aminoacyl-tRNA attacks the carbonyl carbon of the ester bond linking the peptide to the P-site tRNA.
- Substrate-assisted catalysis: The 2'-OH of the P-site tRNA A76 ribose participates directly in catalysis, acting as a proton shuttle. Mutation to 2'-deoxy reduces rate ~$10^6$-fold.
- Entropy reduction: The ribosome achieves most of its catalytic power (~$10^7$-fold rate enhancement) by precisely positioning the substrates, reducing the entropic cost of the reaction. The PTC provides a pre-organized environment.
- Rate: Peptide bond formation itself is very fast (~50-300 s$^{-1}$), not rate-limiting in elongation. The chemical step may be preceded by rate-limiting accommodation.
- Key 23S rRNA residues: A2451, U2506, U2585, A2602 form the PTC walls; A2451 was initially proposed as a general acid-base catalyst but is now thought to play a structural/positioning role.
Step 3: Translocation
- Hybrid states: After peptide bond formation, tRNAs spontaneously adopt hybrid states: the deacylated tRNA moves to P/E (P-site on 30S, E-site on 50S) and the peptidyl-tRNA moves to A/P (A-site on 30S, P-site on 50S). This is driven by the thermodynamics of the CCA end interactions.
- EF-G-GTP binding: EF-G (eEF2 in eukaryotes) binds to the ribosome at the A site; its domain IV mimics the shape of tRNA ("molecular mimicry"). GTP hydrolysis by EF-G accelerates translocation ~50-fold relative to EF-G bound to non-hydrolyzable GTP analogs.
- Ratchet-like motion: GTP hydrolysis by EF-G drives a ratchet-like rotation of the 30S subunit relative to 50S (~6 degree counterclockwise rotation), coupled with swiveling of the 30S head domain (~18 degrees). This moves the mRNA-tRNA complex by exactly one codon.
- Post-translocation state: After translocation, deacylated tRNA is in the E site (released on next cycle), peptidyl-tRNA is in the P site, and the A site is empty and ready for the next aa-tRNA. EF-G-GDP dissociates.
Translation Speed and Energetics
Elongation Rates
- E. coli (37 °C): ~15-20 amino acids/second
- Eukaryotes: ~5-6 amino acids/second
- Mitochondria: ~1-2 amino acids/second
- Complete 300-aa protein in E. coli: ~15-20 seconds
- Same protein in eukaryotes: ~50-60 seconds
Energy Cost Per Amino Acid
Each high-energy bond provides ~7.3 kcal/mol. Total cost per amino acid: ~29.2 kcal/mol, making translation thermodynamically highly favorable and effectively irreversible.
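The arithmetic behind these figures is simple enough to check in a few lines. The sketch below uses only numbers stated in this section (4 bonds per residue, ~7.3 kcal/mol per bond, and the quoted elongation rates); the variable and function names are illustrative.

```python
BOND_ENERGY_KCAL_PER_MOL = 7.3

# High-energy phosphate bonds consumed per amino acid incorporated.
BONDS_PER_RESIDUE = {
    "aminoacyl-tRNA synthesis (ATP -> AMP + 2 Pi)": 2,
    "EF-Tu GTP hydrolysis (A-site delivery)": 1,
    "EF-G GTP hydrolysis (translocation)": 1,
}


def cost_per_residue() -> float:
    """Energy cost per amino acid in kcal/mol."""
    return sum(BONDS_PER_RESIDUE.values()) * BOND_ENERGY_KCAL_PER_MOL


def synthesis_time(n_residues: int, rate_aa_per_s: float) -> float:
    """Seconds needed to elongate a chain of n_residues at a constant rate."""
    return n_residues / rate_aa_per_s


print(cost_per_residue())       # 4 bonds x 7.3 kcal/mol
print(synthesis_time(300, 15))  # slow end of the E. coli range
print(synthesis_time(300, 20))  # fast end of the E. coli range
print(synthesis_time(300, 5))   # slow end of the eukaryotic range
```

This count covers elongation only; initiation and termination each add a small fixed cost per protein that is not included here.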
Overall translation rate considering tRNA competition:
$$k_{\text{elong}} = \frac{k_{\text{cat}}}{1 + \frac{K_M}{[\text{aa-tRNA}_{\text{cognate}}]} \left(1 + \frac{[\text{aa-tRNA}_{\text{near-cognate}}]}{K_I}\right)}$$
This Michaelis-Menten-like expression shows that the elongation rate at each codon depends on the concentration of cognate aminoacyl-tRNA and competition from near-cognate species. Rare codons with low cognate tRNA concentrations have slower elongation, leading to ribosome pausing.
Derivation: Ribosome Elongation Rate from Michaelis-Menten tRNA Selection
Starting from the kinetic scheme for aminoacyl-tRNA selection at the ribosomal A site, we derive the codon-specific elongation rate.
Step 1: Define the ternary complex delivery scheme
EF-Tu-GTP-aa-tRNA ternary complexes sample the A site. The cognate complex binds with association rate $k_1$ and either dissociates ($k_{-1}$) or triggers GTP hydrolysis ($k_2$):
$$\text{Ribosome} + \text{TC}_{\text{cog}} \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} \text{Initial complex} \xrightarrow{k_2} \text{GTP hydrolysis} \xrightarrow{k_3} \text{Accommodation}$$
Step 2: Effective M-M rate for cognate tRNA selection
Applying the steady-state approximation to the initial recognition complex and combining the subsequent steps into an effective $k_{\text{cat}}$:
$$k_{\text{elong}} = \frac{k_{\text{cat}} \cdot [\text{TC}_{\text{cog}}]}{K_M + [\text{TC}_{\text{cog}}]}$$
where $K_M = (k_{-1} + k_2)/k_1$ and $k_{\text{cat}}$ combines GTP hydrolysis, proofreading, accommodation, peptide bond formation, and translocation.
Step 3: Include competitive inhibition by near-cognate tRNAs
Near-cognate ternary complexes compete for the A site but are mostly rejected during initial selection and proofreading. They act as competitive inhibitors with inhibition constant $K_I$:
$$k_{\text{elong}} = \frac{k_{\text{cat}} \cdot [\text{TC}_{\text{cog}}]}{K_M\left(1 + \frac{[\text{TC}_{\text{near}}]}{K_I}\right) + [\text{TC}_{\text{cog}}]}$$
Step 4: Kinetic proofreading (Hopfield, 1974)
After GTP hydrolysis, near-cognate tRNAs have a second chance to dissociate before accommodation (proofreading step). This adds an irreversible energy-consuming step that amplifies discrimination beyond thermodynamic equilibrium:
$$\text{Selectivity} = \underbrace{\frac{(k_2)_{\text{cog}}}{(k_2)_{\text{near}}}}_{\text{initial selection}} \times \underbrace{\frac{(k_3)_{\text{cog}}}{(k_3)_{\text{near}}}}_{\text{proofreading}} \approx 10^{2} \times 10^{1} = 10^{3}$$
Overall error rate per codon: ~$10^{-3}\text{--}10^{-4}$.
Step 5: Codon-specific rate variation
Since $[\text{TC}_{\text{cog}}] \propto$ tRNA abundance, which varies 10-fold between common and rare codons, the elongation rate varies accordingly. At a rare codon ($[\text{TC}] \ll K_M$):
$$k_{\text{elong}}^{\text{rare}} \approx \frac{k_{\text{cat}}}{K_M} \cdot [\text{TC}_{\text{rare}}] \quad \text{(first-order, slow)}$$
At an optimal codon ($[\text{TC}] \gg K_M$):
$$k_{\text{elong}}^{\text{optimal}} \approx k_{\text{cat}} \quad \text{(zero-order, maximal speed)}$$
Step 6: Overall translation rate for an mRNA
The total time to translate an mRNA of $L$ codons is the sum of per-codon dwell times. The overall rate is limited by the slowest codons (harmonic mean):
$$v_{\text{overall}} = \frac{L}{\sum_{i=1}^{L} 1/k_i} = L \cdot \left(\sum_{i=1}^{L} \frac{1}{k_i}\right)^{-1}$$
For E. coli: optimal codons give $k \approx 20$ aa/s, rare codons $k \approx 2$ aa/s. A cluster of rare codons can cause ribosome pausing, traffic jams, and co-translational folding pauses that may be biologically functional.
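The rate expressions from Steps 3 and 6 can be evaluated numerically. In the sketch below all parameter values ($k_{\text{cat}}$, $K_M$, $K_I$, and the concentrations) are arbitrary illustrative numbers in consistent units, not measured constants, and the function names are hypothetical.

```python
def k_elong(k_cat, K_M, tc_cog, tc_near=0.0, K_I=1.0):
    """Per-codon rate with competitive inhibition by near-cognate complexes."""
    return k_cat * tc_cog / (K_M * (1.0 + tc_near / K_I) + tc_cog)


def overall_rate(rates):
    """Harmonic-mean rate: codons translated per unit of total dwell time."""
    return len(rates) / sum(1.0 / k for k in rates)


# Limiting cases from Step 5 (illustrative numbers only).
print(k_elong(20.0, 1.0, 0.01))   # rare codon: close to (k_cat / K_M) * [TC]
print(k_elong(20.0, 1.0, 1000.0)) # abundant tRNA: approaches k_cat

# Near-cognate competition raises the apparent K_M and slows decoding.
print(k_elong(20.0, 1.0, 1.0))               # no competitor
print(k_elong(20.0, 1.0, 1.0, tc_near=1.0))  # competitor at [TC_near] = K_I

# One slow codon among fast ones drags down the overall rate.
print(overall_rate([20.0, 20.0, 2.0, 20.0]))
```

The last line shows the bottleneck effect: a single codon at 2 aa/s among three at 20 aa/s brings the overall rate to roughly 6 aa/s, far below the arithmetic mean of the four rates.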
6. Translation Termination
Termination occurs when a stop codon (UAA, UAG, or UGA) enters the ribosomal A site. Since no aminoacyl-tRNA has an anticodon complementary to stop codons, protein release factors recognize them instead and trigger hydrolysis of the peptidyl-tRNA bond.
Prokaryotic Release Factors
- RF1: Recognizes UAA and UAG. Contains the PxT tripeptide motif in the anticodon-mimicking domain; GGQ motif catalyzes peptidyl-tRNA hydrolysis
- RF2: Recognizes UAA and UGA. Contains SPF tripeptide for stop codon recognition; also has GGQ motif
- RF3: GTPase that accelerates RF1/RF2 dissociation from the ribosome after peptide release. RF3-GTP binding promotes RF1/RF2 release; GTP hydrolysis releases RF3 itself.
Eukaryotic Release Factors
- eRF1: Recognizes all three stop codons (single omnipotent factor). NIKS motif in domain 1 for stop codon recognition; GGQ motif in domain 2 for peptidyl-tRNA hydrolysis; domain 3 interacts with eRF3
- eRF3: GTPase (translational GTPase superfamily). eRF3-GTP stimulates eRF1 activity; GTP hydrolysis triggers conformational changes for efficient peptide release. Also involved in NMD pathway.
The GGQ Motif: Catalytic Mechanism
The universally conserved Gly-Gly-Gln (GGQ) motif in both RF1/RF2 and eRF1 is positioned in the PTC to catalyze the hydrolysis of the ester bond between the peptide and the P-site tRNA. The glutamine backbone NH positions a water molecule for nucleophilic attack on the ester carbonyl. The glutamine sidechain is methylated post-translationally (by PrmC/HemK) in prokaryotes, enhancing activity. Mutation of either Gly to any other amino acid is lethal.
Ribosome Recycling
- Prokaryotes: Ribosome Recycling Factor (RRF) + EF-G-GTP split the 70S ribosome into subunits. RRF mimics the shape of tRNA and binds to the A site. EF-G-mediated GTP hydrolysis drives subunit dissociation. IF3 then prevents 30S-50S reassociation. Deacylated tRNA and mRNA are released.
- Eukaryotes: ABCE1 (Rli1), an ABC-type ATPase, is the primary recycling factor. ABCE1 uses ATP hydrolysis to mechanically split the 80S ribosome after eRF1-mediated peptide release. Ligatin (eIF2D) and MCT-1/DENR can also promote recycling and reinitiation.
7. Post-Translational Modifications (PTMs)
Most proteins undergo covalent modifications after (or during) translation that are essential for their function, localization, and regulation. The proteome is vastly more complex than the genome due to combinatorial PTMs.
Signal Peptide Cleavage
The signal recognition particle (SRP) recognizes hydrophobic signal peptides (typically 16-30 residues at the N-terminus) as they emerge from the ribosome exit tunnel. SRP directs the ribosome-nascent chain complex to the ER membrane (eukaryotes) or plasma membrane (prokaryotes). After translocation through the Sec61/SecYEG translocon, signal peptidase cleaves the signal peptide.
N-linked Glycosylation (ER)
- Occurs co-translationally in the ER lumen
- Oligosaccharyltransferase (OST) transfers a preassembled 14-sugar core glycan (Glc$_3$Man$_9$GlcNAc$_2$) from dolichol-PP to Asn in the sequon N-X-S/T (X is not Pro)
- Glucose residues trimmed by glucosidases I and II; calnexin/calreticulin cycle ensures proper folding
- Further processed in Golgi (trimming, addition of GlcNAc, Gal, sialic acid, fucose)
- Critical for glycoprotein folding, stability, cell-cell recognition
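The N-X-S/T sequon rule lends itself to a simple pattern search. The sketch below only finds candidate sites; whether a given sequon is actually glycosylated depends on accessibility and folding, which the pattern cannot capture. The function name and the example peptide are hypothetical.

```python
import re

# N, then any residue except P, then S or T. The lookahead lets
# overlapping sequons (e.g. N-N-T-S) each be reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(protein: str) -> list[int]:
    """Return 0-based indices of Asn residues in N-X-S/T sequons (X != Pro)."""
    return [m.start() for m in SEQUON.finditer(protein)]


# Hypothetical peptide: N-G-S (match), N-P-T (excluded), N-N-T and N-T-S (overlapping).
print(find_sequons("MNGSANPTQNNTS"))
```

A plain `N[^P][ST]` pattern without the lookahead would miss the second of two overlapping sites, because regular expression matches do not overlap by default.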
O-linked Glycosylation (Golgi)
- Occurs post-translationally in the Golgi apparatus
- Sugars added one at a time to Ser or Thr hydroxyl groups
- No consensus sequence (unlike N-linked); often in Ser/Thr-rich regions
- Common core: GalNAc (mucin-type); also O-GlcNAc (cytoplasmic/nuclear; regulatory; competes with phosphorylation)
- Important for mucins, proteoglycans, signaling
Phosphorylation
- Kinases transfer gamma-phosphate from ATP to Ser, Thr, or Tyr
- ~518 kinases in human genome (kinome)
- Reversed by phosphatases
- Major regulatory switch in signaling
- ~30% of all proteins are phosphorylated
Ubiquitination
- 76-residue ubiquitin conjugated to Lys residues
- E1 (activating) → E2 (conjugating) → E3 (ligase) cascade
- K48-linked polyUb: proteasome degradation
- K63-linked polyUb: signaling, DNA repair
- MonoUb: endocytosis, histone regulation
- Reversed by deubiquitinases (DUBs)
SUMOylation
- Small Ubiquitin-like Modifier (~100 aa)
- Conjugated to Lys in ΨKxE consensus
- E1 (SAE1/SAE2) → E2 (Ubc9) → E3 ligases
- Regulates nuclear transport, transcription
- Often antagonistic to ubiquitination
- SUMO-1, SUMO-2/3 have distinct targets
8. Translation Quality Control
Cells have evolved multiple surveillance pathways to detect and deal with aberrant mRNAs and stalled ribosomes. These quality control mechanisms prevent the accumulation of potentially toxic truncated or aberrant proteins.
Nonsense-Mediated Decay (NMD)
- Trigger: Premature termination codon (PTC) located >50 nt upstream of an exon-exon junction
- Mechanism: UPF1 (RNA helicase) interacts with eRF3 during termination. If UPF1 encounters a downstream exon junction complex (EJC, deposited during splicing), it triggers NMD. UPF2 and UPF3 bridge UPF1 to the EJC.
- Outcome: SMG1 kinase phosphorylates UPF1, recruiting SMG5/6/7 which activate mRNA decapping (Dcp1/Dcp2) and deadenylation; also endonucleolytic cleavage by SMG6
- Significance: Degrades ~5-10% of all mRNAs; important for eliminating PTC-containing transcripts from nonsense mutations; also regulates normal gene expression
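The positional trigger described above (the "50-nt rule") can be written as a one-line check. The sketch below is a simplification that considers only stop codon and junction positions. The function name `predicts_nmd` and the coordinate convention (0-based mRNA coordinates, `stop_end` being the first nucleotide after the stop codon, each junction given as the first nucleotide of the downstream exon) are assumptions made for illustration.

```python
def predicts_nmd(stop_end: int, junctions: list[int], threshold: int = 50) -> bool:
    """Apply the 50-nt rule to one spliced transcript.

    stop_end  : index of the first nucleotide after the stop codon
    junctions : indices of exon-exon junctions on the mature mRNA
    threshold : minimum distance (nt) between stop and a downstream junction

    Returns True if any junction lies more than `threshold` nt downstream
    of the stop codon, i.e. an EJC is expected to remain on the mRNA.
    """
    return any(junction - stop_end > threshold for junction in junctions)


if __name__ == "__main__":
    junctions = [120, 400]                # hypothetical junction positions
    print(predicts_nmd(300, junctions))   # True: stop is 100 nt before junction 400
    print(predicts_nmd(380, junctions))   # False: only 20 nt before junction 400
    print(predicts_nmd(900, junctions))   # False: stop lies in the last exon
```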
No-Go Decay (NGD)
- Trigger: Ribosome stalling due to mRNA secondary structure, rare codons, damaged bases, or poly(A) sequences within the ORF
- Mechanism: Stalled ribosomes are sensed by Dom34 (Pelota in mammals) and Hbs1 (HBS1L), which mimic eRF1 and eRF3 respectively. Dom34 lacks the GGQ motif and does not trigger peptide release.
- Outcome: Endonucleolytic cleavage of mRNA near the stall site (the endonuclease Cue2 has been identified in yeast); ribosome splitting by ABCE1; fragments degraded by Xrn1 (5' to 3') and the exosome (3' to 5')
Non-Stop Decay (NSD)
- Trigger: mRNAs lacking a stop codon (e.g., premature polyadenylation within ORF, or endonucleolytic cleavage)
- Mechanism: Ribosome translates into the poly(A) tail, producing poly-lysine (AAA = Lys). Poly(A) in the mRNA channel triggers stalling. Ski7 (GTPase) or Dom34/Hbs1 recognize the stalled ribosome.
- Outcome: mRNA degraded by the exosome (recruited by Ski complex: Ski2/Ski3/Ski8). Nascent peptide targeted for proteasomal degradation.
Ribosome-Associated Quality Control (RQC)
- Trigger: Stalled 60S-peptidyl-tRNA complex remaining after ribosome splitting (by Dom34/Hbs1/ABCE1) during NGD or NSD
- Key factor: Listerin (Ltn1/NEMF pathway) — an E3 ubiquitin ligase that ubiquitinates the nascent chain on the stalled 60S subunit
- RQC2 (NEMF): Stabilizes tRNA in the P site of the 60S; remarkably, can add C-terminal Ala-Thr extensions (CATylation or "CAT tails") to the nascent chain without mRNA template, exposing Lys residues buried in the exit tunnel for Ltn1 ubiquitination
- Vms1 (ANKZF1): Releases peptidyl-tRNA from the 60S when Ltn1 pathway is overwhelmed; acts as a backup
- Outcome: Ubiquitinated nascent chain extracted by Cdc48/p97 (AAA-ATPase) and delivered to the 26S proteasome for degradation
- Failure consequences: RQC defects linked to neurodegeneration; Ltn1 mutation causes protein aggregation in mice (cerebellar neurodegeneration)
Prokaryotic Rescue: tmRNA (SsrA)
In bacteria, stalled ribosomes on truncated mRNAs are rescued by the tmRNA (transfer-messenger RNA) system. tmRNA mimics both a tRNA (alanyl-tRNA at its 5' end) and an mRNA (contains a short ORF encoding a degradation tag). When a ribosome stalls at the 3' end of a truncated mRNA, SmpB protein delivers tmRNA to the A site. The ribosome switches templates from the broken mRNA to the tmRNA ORF, adding the tag sequence (AANDENYALAA in E. coli) to the C-terminus of the nascent chain. This tagged protein is then recognized and degraded by ClpXP, ClpAP, FtsH, and Tsp proteases. The ribosome terminates normally at the stop codon within the tmRNA ORF.
Python: Translation Simulation with tRNA Competition
This simulation models a ribosome translating an mRNA codon-by-codon. The elongation rate at each codon depends on the abundance of the cognate tRNA species (based on E. coli codon usage data). Rare codons with low-abundance tRNAs cause ribosome pausing, while common codons with abundant tRNAs are decoded quickly. The stochastic waiting times follow an exponential distribution, reflecting the random arrival of ternary complexes at the A site.
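A minimal sketch of such a simulation is given here. The abundance values are hypothetical placeholders in arbitrary units, not measured E. coli data, and `MAX_RATE`, `DEFAULT_ABUNDANCE`, and the function names are assumptions chosen for illustration. Each codon's decoding rate is taken to be proportional to the abundance of its cognate tRNA, and the waiting time is drawn from an exponential distribution with that rate.

```python
import random

# Hypothetical relative tRNA abundances (arbitrary units, NOT measured data).
TRNA_ABUNDANCE = {
    "CUG": 1.00, "GAA": 0.80, "AAA": 0.70, "GCG": 0.60,  # assumed "common"
    "CUA": 0.08, "AGG": 0.05,                            # assumed "rare"
}
DEFAULT_ABUNDANCE = 0.30  # assumed value for codons not listed above
MAX_RATE = 20.0           # assumed decoding rate (codons/s) at abundance 1.0
STOP_CODONS = {"UAA", "UAG", "UGA"}


def codon_rate(codon: str) -> float:
    """Decoding rate in codons/s, proportional to cognate tRNA abundance."""
    return MAX_RATE * TRNA_ABUNDANCE.get(codon, DEFAULT_ABUNDANCE)


def sense_codons(mrna: str) -> list[str]:
    """Split an mRNA (read from its first nucleotide) up to the first stop."""
    codons = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3].upper()
        if codon in STOP_CODONS:
            break
        codons.append(codon)
    return codons


def simulate_once(mrna: str, rng: random.Random) -> list[float]:
    """One stochastic pass: an exponential waiting time (s) at each codon."""
    return [rng.expovariate(codon_rate(c)) for c in sense_codons(mrna)]


def mean_dwell_times(mrna: str, n_runs: int = 2000, seed: int = 0) -> list[float]:
    """Average waiting time per codon over n_runs independent ribosomes."""
    rng = random.Random(seed)
    totals = [0.0] * len(sense_codons(mrna))
    for _ in range(n_runs):
        for k, wait in enumerate(simulate_once(mrna, rng)):
            totals[k] += wait
    return [total / n_runs for total in totals]


if __name__ == "__main__":
    mrna = "AUGCUGAGGGAAUAA"  # hypothetical ORF: Met-Leu-Arg-Glu-stop
    for codon, dwell in zip(sense_codons(mrna), mean_dwell_times(mrna)):
        print(f"{codon}: mean dwell {dwell:.3f} s (expected {1 / codon_rate(codon):.3f} s)")
```

With these placeholder values the rare codon AGG has an expected dwell time of 1 s, compared with 0.05 s for CUG, so the simulated ribosome spends most of its transit time paused at that one position.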
Fortran: Codon Usage Statistics and CAI Calculator
This Fortran program analyzes an input DNA sequence to compute codon frequencies, Relative Synonymous Codon Usage (RSCU), relative adaptiveness values, and the Codon Adaptation Index (CAI). The sample sequence is from the E. coli lacZ gene, a highly expressed gene with strong codon bias toward translationally optimal codons.
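The same quantities can be sketched in Python. This is an illustrative outline under stated assumptions, not the program described above: the function names are hypothetical, the reference set is whatever sequence the caller supplies, and codons absent from the reference are given an assumed pseudocount of 0.5 so that the logarithm stays defined. RSCU is the observed count divided by the mean count within a synonymous family, relative adaptiveness $w$ is the count divided by the family maximum, and CAI is the geometric mean of $w$ over the gene's codons (Met, Trp, and stop codons excluded).

```python
from collections import Counter
from itertools import product
from math import exp, log

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Group synonymous codons by the amino acid (or stop, "*") they encode.
FAMILIES: dict[str, list[str]] = {}
for _codon, _aa in CODON_TABLE.items():
    FAMILIES.setdefault(_aa, []).append(_codon)


def codon_counts(dna: str) -> Counter:
    """Count in-frame codons, reading from the first base; drop a partial tail."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return Counter(dna[i:i + 3] for i in range(0, usable, 3))


def rscu(counts: Counter) -> dict[str, float]:
    """RSCU = observed count / mean count within the synonymous family."""
    values = {}
    for family in FAMILIES.values():
        total = sum(counts[c] for c in family)
        if total:
            for c in family:
                values[c] = counts[c] * len(family) / total
    return values


def relative_adaptiveness(counts: Counter, pseudocount: float = 0.5) -> dict[str, float]:
    """w = count / family maximum; single-codon families and stops are skipped."""
    weights = {}
    for aa, family in FAMILIES.items():
        if aa == "*" or len(family) == 1:
            continue
        adjusted = {c: counts[c] or pseudocount for c in family}
        top = max(adjusted.values())
        for c in family:
            weights[c] = adjusted[c] / top
    return weights


def cai(gene_dna: str, weights: dict[str, float]) -> float:
    """Geometric mean of w over all scored codons of the gene."""
    scored = {c: n for c, n in codon_counts(gene_dna).items() if c in weights}
    total = sum(scored.values())
    if total == 0:
        return float("nan")
    return exp(sum(n * log(weights[c]) for c, n in scored.items()) / total)


if __name__ == "__main__":
    reference = "CTGCTGCTGCTA"  # toy reference: Leu codons only (hypothetical)
    w = relative_adaptiveness(codon_counts(reference))
    print(rscu(codon_counts(reference))["CTG"])  # 4.5
    print(round(cai("CTGCTA", w), 4))            # 0.5774
```

In practice the weights would be derived from a reference set of highly expressed genes rather than from a toy sequence, and the gene of interest would be scored against those weights.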
Summary: Key Factors in Translation
| Phase | Prokaryotic Factors | Eukaryotic Factors |
|---|---|---|
| Initiation | IF1, IF2 (GTPase), IF3; SD sequence; fMet-tRNAf | eIF1, 1A, 2, 2B, 3, 4A, 4B, 4E, 4G, 5, 5B; cap + scanning; Kozak; Met-tRNAi |
| Elongation | EF-Tu (GTPase), EF-Ts (GEF), EF-G (GTPase) | eEF1A (GTPase), eEF1B (GEF), eEF2 (GTPase) |
| Termination | RF1 (UAA/UAG), RF2 (UAA/UGA), RF3 (GTPase) | eRF1 (all stops), eRF3 (GTPase) |
| Recycling | RRF + EF-G + IF3 | ABCE1 (Rli1), Ligatin |
| Quality Control | tmRNA/SmpB, ArfA, ArfB | NMD (UPF1/2/3), NGD (Dom34/Hbs1), NSD, RQC (Ltn1/RQC2) |
Antibiotics Targeting Translation
The structural differences between bacterial 70S and eukaryotic 80S ribosomes make translation a prime target for antibiotics. Many clinically important antibiotics exploit these differences.
30S Subunit Targets
- Tetracyclines: Block A-site tRNA binding
- Aminoglycosides (streptomycin, gentamicin): Bind 16S rRNA near A site; cause misreading by distorting decoding center (A1492/A1493 locked in flipped-out state)
- Spectinomycin: Inhibits EF-G-driven translocation; binds 30S head (h34)
- Kasugamycin: Blocks initiator tRNA binding to P site
50S Subunit Targets
- Chloramphenicol: Binds PTC A-site crevice; blocks aminoacyl-tRNA accommodation
- Macrolides (erythromycin, azithromycin): Bind in the peptide exit tunnel (near L4/L22 constriction); block elongation after 6-8 amino acids
- Lincosamides (clindamycin): Overlap with macrolide binding site in PTC
- Oxazolidinones (linezolid): Bind 50S A site; interfere with initiator tRNA positioning
- Fusidic acid: Prevents EF-G-GDP release; blocks translocation
Polyribosomes and Translational Regulation
Polyribosomes (Polysomes)
- Multiple ribosomes simultaneously translate a single mRNA
- Inter-ribosome spacing: ~80-100 nt (one ribosome per ~30 codons)
- Densest packing typically observed: ~1 ribosome per 80 nt of ORF (the steric limit, set by the ~30 nt ribosome footprint, is about 1 per 30 nt)
- Free polysomes: synthesize cytoplasmic/nuclear proteins
- Membrane-bound polysomes (rough ER): synthesize secreted, membrane, and organellar proteins
- In prokaryotes: coupled transcription-translation (RNA polymerase leads, ribosomes follow immediately)
Key Regulatory Mechanisms
- mTOR pathway: Phosphorylates 4E-BP (releases eIF4E) and S6K (activates eIF4B, S6)
- eIF2-alpha phosphorylation: ISR kinases (GCN2, PERK, HRI, PKR) inhibit global translation but activate ATF4 translation via uORFs
- microRNAs: Recruit RISC/Argonaute to mRNA 3'-UTR; repress translation and promote decay
- Iron response elements: IRE-IRP system; IRP binding to 5'-UTR IRE blocks scanning (ferritin); IRP binding to 3'-UTR stabilizes mRNA (TfR)
- Ribosome heterogeneity: Specialized ribosomes with distinct rRNA/protein compositions may preferentially translate subsets of mRNAs
Derivation: Polysome Density from Initiation/Elongation Ratio
Starting from the rates of translation initiation and elongation, we derive the number of ribosomes simultaneously translating a single mRNA (polysome density).
Step 1: Define the ribosome loading rate
Ribosomes initiate translation at the 5′ end of the mRNA at rate $k_{\text{init}}$ (ribosomes per second). Once initiated, each ribosome moves along the ORF at elongation speed $v_{\text{elong}}$ (codons per second).
Step 2: Ribosome transit time
For an ORF of length $L$ codons, the time for a ribosome to traverse the entire mRNA is:
$$\tau_{\text{transit}} = \frac{L}{v_{\text{elong}}}$$
For a 300-codon ORF at 6 aa/s (eukaryotic): $\tau = 50$ s.
Step 3: Steady-state number of ribosomes per mRNA
By Little's law (queueing theory), the average number of ribosomes on an mRNA equals the loading rate times the transit time:
$$\langle N_{\text{rib}} \rangle = k_{\text{init}} \times \tau_{\text{transit}} = \frac{k_{\text{init}} \cdot L}{v_{\text{elong}}}$$
Step 4: Linear density of ribosomes
The linear density (ribosomes per codon of mRNA) is:
$$\rho = \frac{k_{\text{init}}}{v_{\text{elong}}} \quad \text{(ribosomes per codon)}$$
A ribosome occupies ~30 nt (10 codons) of mRNA, so the maximum density is $\rho_{\max} = 1/10$ per codon. Initiation cannot exceed $k_{\text{init}}^{\max} = v_{\text{elong}}/10$ without causing queuing.
Step 5: Protein production rate per mRNA
The rate of completed proteins from one mRNA equals the initiation rate (in steady state, each initiated ribosome eventually produces one protein):
$$\frac{d[\text{protein}]}{dt}\bigg|_{\text{per mRNA}} = k_{\text{init}}$$
Total cellular protein production rate: $k_{\text{init}} \times [\text{mRNA}]$.
Step 6: Numerical examples
Highly expressed mRNA in E. coli: $k_{\text{init}} \approx 1$ s$^{-1}$, $v_{\text{elong}} = 15$ aa/s, $L = 300$ codons. Then $\tau_{\text{transit}} = 300/15 = 20$ s, so $\langle N_{\text{rib}} \rangle = 1 \times 20 = 20$ ribosomes per mRNA and $\rho = 1/15 \approx 0.07$ per codon, below $\rho_{\max} = 0.1$. Electron micrographs of polysomes show 10-70 ribosomes on highly expressed mRNAs, consistent with this estimate. Eukaryotic average for the same $L = 300$ codons: $k_{\text{init}} \approx 0.1$ s$^{-1}$, $v_{\text{elong}} = 6$ aa/s, so $\tau_{\text{transit}} = 50$ s and $\langle N_{\text{rib}} \rangle = 0.1 \times 50 = 5$ ribosomes per mRNA.
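The relations from Steps 2 to 4 can be collected in a few lines of Python to reproduce these numbers. This is a sketch: the names `PolysomeEstimate` and `polysome_estimate` are hypothetical, and the default footprint of 10 codons is the value assumed in Step 4.

```python
from typing import NamedTuple


class PolysomeEstimate(NamedTuple):
    transit_time_s: float      # tau_transit = L / v_elong
    ribosomes_per_mrna: float  # <N_rib> = k_init * tau_transit (Little's law)
    density_per_codon: float   # rho = k_init / v_elong
    max_init_rate: float       # k_init_max = v_elong / footprint
    queuing_expected: bool     # True if k_init exceeds k_init_max


def polysome_estimate(k_init: float, v_elong: float, length_codons: float,
                      footprint_codons: float = 10.0) -> PolysomeEstimate:
    """Steady-state polysome quantities for one mRNA.

    k_init           : initiation rate (ribosomes per second)
    v_elong          : elongation speed (codons per second)
    length_codons    : ORF length L (codons)
    footprint_codons : mRNA length covered by one ribosome (codons)
    """
    transit = length_codons / v_elong
    max_init = v_elong / footprint_codons
    return PolysomeEstimate(
        transit_time_s=transit,
        ribosomes_per_mrna=k_init * transit,
        density_per_codon=k_init / v_elong,
        max_init_rate=max_init,
        queuing_expected=k_init > max_init,
    )


if __name__ == "__main__":
    print(polysome_estimate(k_init=1.0, v_elong=15.0, length_codons=300))  # E. coli example
    print(polysome_estimate(k_init=0.1, v_elong=6.0, length_codons=300))   # eukaryotic example
```

For the E. coli parameters the estimate gives 20 ribosomes per mRNA and a maximum non-queuing initiation rate of 1.5 s$^{-1}$, so an initiation rate of 1 s$^{-1}$ is close to, but still below, the queuing limit.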
Derivation: Energy Cost of Protein Synthesis
Starting from the individual steps of translation, we derive the total number of high-energy phosphate bonds consumed per amino acid incorporated into a polypeptide.
Step 1: Aminoacyl-tRNA synthesis (charging)
Aminoacyl-tRNA synthetase activates the amino acid using ATP:
$$\text{AA} + \text{ATP} \rightarrow \text{AA-AMP} + \text{PP}_i$$
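The activated aminoacyl group is then transferred to the 3′ end of the cognate tRNA: $\text{AA-AMP} + \text{tRNA} \rightarrow \text{AA-tRNA} + \text{AMP}$. The PP$_i$ is hydrolyzed by pyrophosphatase ($\text{PP}_i \rightarrow 2\text{P}_i$), which makes charging effectively irreversible. Net cost: 2 high-energy phosphate bonds (ATP $\rightarrow$ AMP + 2P$_i$, equivalent to 2 ATP $\rightarrow$ 2 ADP).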
Step 2: EF-Tu GTP hydrolysis (A-site delivery)
The EF-Tu-GTP-aa-tRNA ternary complex delivers the aminoacyl-tRNA to the ribosomal A site. Codon recognition triggers GTP hydrolysis:
$$\text{EF-Tu-GTP} \rightarrow \text{EF-Tu-GDP} + \text{P}_i$$
Cost: 1 high-energy phosphate bond. EF-Ts then recycles EF-Tu-GDP back to EF-Tu-GTP (no additional NTP cost).
Step 3: EF-G GTP hydrolysis (translocation)
After peptide bond formation, EF-G-GTP binds the ribosome and hydrolyzes GTP to drive translocation of the mRNA-tRNA complex by one codon:
$$\text{EF-G-GTP} \rightarrow \text{EF-G-GDP} + \text{P}_i$$
Cost: 1 high-energy phosphate bond.
Step 4: Total per amino acid
Summing all steps for one elongation cycle:
$$\text{Total} = \underbrace{2}_{\text{charging}} + \underbrace{1}_{\text{EF-Tu}} + \underbrace{1}_{\text{EF-G}} = 4 \text{ high-energy phosphate bonds per amino acid}$$
Step 5: Energy in thermodynamic terms
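Hydrolysis of each high-energy phosphate bond has a standard free energy change of $\Delta G^{\circ\prime} \approx -7.3$ kcal/mol; under cellular conditions the value is larger in magnitude (roughly $-12$ kcal/mol), so the figure below is a conservative estimate. Using the standard value, the total energy invested per amino acid is: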
$$\Delta G_{\text{total}} = 4 \times 7.3 = 29.2 \text{ kcal/mol per amino acid}$$
Since peptide bond formation itself is only modestly endergonic (standard free energy change of a few kcal/mol, far less than the energy invested), the process is driven far from equilibrium, making translation essentially irreversible.
Step 6: Cost of a complete protein and cellular energy budget
For a typical 300-amino-acid protein:
$$\text{Cost} = 300 \times 4 = 1{,}200 \text{ NTP equivalents} \approx 8{,}760 \text{ kcal/mol}$$
An E. coli cell growing with a 30-minute doubling time synthesizes ~$2 \times 10^6$ proteins per generation, consuming ~$2.4 \times 10^9$ ATP equivalents. This represents ~75% of cellular energy expenditure, explaining why translation is the dominant energy sink and why translational regulation is crucial for cellular economy.
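The bookkeeping in Steps 4 to 6 can be checked with a short script. It is a sketch that uses only the numbers given in this derivation: the function names are hypothetical, the 7.3 kcal/mol per bond is the value used above, and costs of initiation and termination are ignored, as they are in the derivation.

```python
# High-energy phosphate bonds consumed per amino acid, by step (from Steps 1-3).
BONDS_PER_RESIDUE = {
    "tRNA charging (ATP -> AMP + 2 Pi)": 2,
    "EF-Tu A-site delivery (GTP)": 1,
    "EF-G translocation (GTP)": 1,
}
KCAL_PER_BOND = 7.3  # kcal/mol per bond, the value used in Step 5


def bonds_per_residue() -> int:
    """Total high-energy phosphate bonds per amino acid incorporated."""
    return sum(BONDS_PER_RESIDUE.values())


def protein_cost(n_residues: int) -> tuple[int, float]:
    """Return (NTP equivalents, kcal/mol) for a protein of n_residues."""
    bonds = n_residues * bonds_per_residue()
    return bonds, bonds * KCAL_PER_BOND


def generation_cost(n_proteins: int, n_residues: int) -> int:
    """ATP equivalents spent on elongation per cell generation."""
    return n_proteins * protein_cost(n_residues)[0]


if __name__ == "__main__":
    bonds, kcal = protein_cost(300)
    print(bonds, round(kcal))               # 1200 8760
    print(generation_cost(2_000_000, 300))  # 2400000000
```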