Protein Folding Dynamics
From two-state folding kinetics and chevron analysis through phi-value mapping of transition states to the Zimm-Bragg helix-coil theory and chaperone mechanisms.
Derivation 1: Two-State Folding & the Chevron Plot
Many small single-domain proteins fold in a two-state manner, with only the native (N) and unfolded (U) states significantly populated:
$$\text{U} \underset{k_u}{\overset{k_f}{\rightleftharpoons}} \text{N}$$
Equilibrium Thermodynamics
The equilibrium constant and free energy of folding are:
$$K = \frac{[\text{N}]}{[\text{U}]} = \frac{k_f}{k_u}, \qquad \Delta G = RT \ln K$$
where $\Delta G = G_U - G_N > 0$ for a stable protein. At the midpoint denaturant concentration $C_m$, $\Delta G = 0$ and $K = 1$ ($k_f = k_u$).
The fraction unfolded at equilibrium is:
$$f_U = \frac{[\text{U}]}{[\text{U}] + [\text{N}]} = \frac{1}{1 + e^{\Delta G / RT}}$$
Kinetics: The Chevron Plot
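For a two-state protein, relaxation to equilibrium follows single-exponential kinetics with observed rate:
$$k_{\text{obs}} = k_f + k_u$$
Both rates depend on denaturant concentration [D] according to linear free energy relationships:
$$\ln k_f = \ln k_f^{\text{H}_2\text{O}} + m_f[\text{D}], \qquad \ln k_u = \ln k_u^{\text{H}_2\text{O}} + m_u[\text{D}]$$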
where $m_f < 0$ (folding slows with denaturant) and $m_u > 0$ (unfolding accelerates). A plot of $\ln k_{\text{obs}}$ vs [D] produces the characteristic V-shaped chevron plot: the left arm reflects folding and the right arm reflects unfolding.
Tanford $\beta$-Value
The Tanford $\beta$ value quantifies the position of the transition state on the folding reaction coordinate:
$$\beta_T = \frac{m_f}{m_{\text{eq}}} = \frac{m_f}{m_f - m_u}$$
where $m_{\text{eq}} = m_f - m_u$ is the equilibrium m-value. A $\beta_T \approx 0.7$ indicates that the transition state buries ~70% of the solvent-accessible surface area that becomes buried between the unfolded and native states; it is compact and native-like.
Linear Extrapolation Method (LEM)
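The free energy of folding in water is obtained by linear extrapolation of the stability measured at each denaturant concentration back to $[\text{D}] = 0$:
$$\Delta G([\text{D}]) = \Delta G^{\text{H}_2\text{O}} + RT\, m_{\text{eq}}[\text{D}]$$
With $m_{\text{eq}} = m_f - m_u < 0$ expressed in M$^{-1}$, stability falls linearly with denaturant. At $C_m$, $\Delta G = 0$, so $\Delta G^{\text{H}_2\text{O}} = -RT\, m_{\text{eq}} \cdot C_m$. Kinetic and equilibrium measurements must agree: $\Delta G^{\text{H}_2\text{O}} = RT\ln(k_f^{\text{H}_2\text{O}}/k_u^{\text{H}_2\text{O}})$, which is positive for a stable protein because $\Delta G = G_U - G_N$.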
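As a hedged illustration of how kinetic parameters fix the equilibrium quantities, the sketch below computes the stability in water, the midpoint, and the Tanford value. It assumes the convention $\Delta G = G_U - G_N$ and m-values taken as slopes of $\ln k$ versus [D]; the function name and all numerical parameters are hypothetical.

```python
import math

R = 8.314e-3  # gas constant in kJ/(mol K)

def stability_from_kinetics(kf_w, ku_w, m_f, m_u, temp_k=298.15):
    """Return (dG in water [kJ/mol], C_m [M], beta_T) for a two-state protein.

    kf_w, ku_w: folding and unfolding rate constants in water (s^-1).
    m_f, m_u:   slopes of ln k_f and ln k_u versus [D] (M^-1).
    """
    m_eq = m_f - m_u                                 # negative for a normal chevron
    dg_water = R * temp_k * math.log(kf_w / ku_w)    # G_U - G_N at [D] = 0
    c_m = math.log(kf_w / ku_w) / (m_u - m_f)        # [D] at which k_f = k_u
    beta_t = m_f / m_eq
    return dg_water, c_m, beta_t

# Hypothetical parameters, chosen only for illustration.
dg, cm, beta = stability_from_kinetics(kf_w=100.0, ku_w=0.01, m_f=-2.1, m_u=0.9)
print(f"dG(H2O) = {dg:.1f} kJ/mol, C_m = {cm:.2f} M, beta_T = {beta:.2f}")
```

With these inputs the kinetic estimate of the stability equals $-RT\,m_{\text{eq}}\,C_m$, which is the consistency check between kinetics and equilibrium described above.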
Derivation 2: $\Phi$-Value Analysis
$\Phi$-value analysis, developed by Alan Fersht, is one of the most powerful experimental methods for mapping the structure of the transition state (TS) at residue-level resolution.
Definition
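For a mutation that destabilizes the native state by $\Delta\Delta G°$ and raises the folding barrier by $\Delta\Delta G^\ddagger$:
$$\Phi = \frac{\Delta\Delta G^\ddagger}{\Delta\Delta G°}$$
where the individual terms are computed from kinetic data (wt: wild type, mut: mutant):
$$\Delta\Delta G^\ddagger = RT \ln\frac{k_f^{\text{wt}}}{k_f^{\text{mut}}}, \qquad \Delta\Delta G° = RT \ln\frac{k_f^{\text{wt}}\,k_u^{\text{mut}}}{k_u^{\text{wt}}\,k_f^{\text{mut}}}$$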
Interpretation
- $\Phi = 1$: The mutation affects the TS as much as it affects the native state. The residue is fully structured (native-like) in the TS. The mutation slows folding but does not affect unfolding.
- $\Phi = 0$: The mutation does not affect the TS at all. The residue is fully unstructured in the TS. The mutation accelerates unfolding but does not affect folding.
- $0 < \Phi < 1$: Partial structure formation at the TS. Could reflect fractional native contacts, or an average over parallel pathways.
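The two limiting cases above can be checked numerically. The following is a minimal sketch, assuming $\Delta\Delta G^\ddagger$ is taken from the change in folding rate and $\Delta\Delta G°$ from the change in $k_f/k_u$; the function name and the rate constants are hypothetical.

```python
import math

R = 8.314e-3  # gas constant in kJ/(mol K)

def phi_value(kf_wt, ku_wt, kf_mut, ku_mut, temp_k=298.15):
    """Return (phi, ddG_ts, ddG_eq), energies in kJ/mol, from rate constants in water."""
    rt = R * temp_k
    # Rise in the folding barrier caused by the mutation.
    ddg_ts = rt * math.log(kf_wt / kf_mut)
    # Loss of native-state stability caused by the mutation.
    ddg_eq = rt * math.log((kf_wt / ku_wt) / (kf_mut / ku_mut))
    return ddg_ts / ddg_eq, ddg_ts, ddg_eq

# Hypothetical mutant that folds 5x slower with unchanged unfolding.
phi_fold_only, _, _ = phi_value(100.0, 0.01, 20.0, 0.01)
# Hypothetical mutant that unfolds 5x faster with unchanged folding.
phi_unfold_only, _, _ = phi_value(100.0, 0.01, 100.0, 0.05)
print(round(phi_fold_only, 3), round(phi_unfold_only, 3))
```

A mutation that changes only the folding rate gives $\Phi$ close to 1, and one that changes only the unfolding rate gives $\Phi$ close to 0, matching the interpretation in the list.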
Requirements for Reliable $\Phi$-Values
Valid $\Phi$-value analysis requires:
- Conservative mutations (e.g., $\text{Val} \rightarrow \text{Ala}$, deletion of a methyl group) that remove non-covalent interactions without introducing new ones
- Sufficiently large $\Delta\Delta G° > 2\;\text{kJ/mol}$ to avoid noise artifacts
- The protein must remain two-state (no change in folding mechanism upon mutation)
- Linear chevron plots without rollover (curvature indicates non-two-state behavior)
Key Findings from $\Phi$-Value Studies
Decades of $\Phi$-value analysis across many proteins have revealed that:
- Transition states are heterogeneous but compact: generally 60–80% of native contacts are formed
- The nucleus (high-$\Phi$ residues) is typically formed by residues from multiple secondary structure elements, confirming the nucleation-condensation mechanism
- Proteins with similar topology tend to have similar $\Phi$-value patterns, supporting the idea that topology determines the folding mechanism
Derivation 3: Zimm-Bragg Helix-Coil Transition
The Zimm-Bragg model (1959) provides an exact statistical mechanical treatment of the helix-coil transition in polypeptides using a transfer matrix formalism.
The Two Parameters
- $s$ (propagation parameter): The equilibrium constant for adding a helical residue to an existing helix: $\cdots\text{hh}\text{c} \rightleftharpoons \cdots\text{hh}\text{h}$. When $s > 1$, helix propagation is favorable.
- $\sigma$ (nucleation parameter): The equilibrium constant penalty for initiating a new helix segment: $\cdots\text{cc}\text{c} \rightleftharpoons \cdots\text{cc}\text{h}$. The statistical weight for nucleation is $\sigma s$. For polypeptides, $\sigma \approx 10^{-3}$ to $10^{-4}$, reflecting the entropic cost of fixing three consecutive residues to form the first H-bond.
Transfer Matrix Formulation
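Each residue is in state c (coil) or h (helix). The statistical weight of each pair transition defines the transfer matrix:
$$\mathbf{M} = \begin{pmatrix} 1 & \sigma s \\ 1 & s \end{pmatrix}$$
The rows index the state of residue $i$ (top: c, bottom: h) and the columns index the state of residue $i+1$. The partition function for a chain of $N$ residues is:
$$Z = \mathbf{e}_1^{\mathsf{T}}\, \mathbf{M}^N\, \mathbf{e}_2$$
where $\mathbf{e}_1$ and $\mathbf{e}_2$ are appropriate boundary vectors (for example, $\mathbf{e}_1 = (1, 0)^{\mathsf{T}}$ and $\mathbf{e}_2 = (1, 1)^{\mathsf{T}}$ for a chain preceded by a coil residue).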
Eigenvalue Solution
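The eigenvalues of $\mathbf{M}$ are:
$$\lambda_{1,2} = \frac{(1 + s) \pm \sqrt{(1 - s)^2 + 4\sigma s}}{2}$$
For large $N$, $Z \approx \lambda_1^N$ (the larger eigenvalue dominates), and the helix fraction is:
$$\theta = \frac{1}{N}\frac{\partial \ln Z}{\partial \ln s} \approx \frac{s}{\lambda_1}\frac{\partial \lambda_1}{\partial s}$$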
Key features of the Zimm-Bragg model:
- The transition midpoint occurs at $s = 1$
- The sharpness of the transition is governed by $\sigma$: smaller $\sigma$ gives a sharper (more cooperative) transition
- The width of the transition scales as $\Delta s \sim \sqrt{\sigma}$
- Longer chains show sharper transitions (finite-size effects)
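The first three features can be checked with a few lines of code. This is a sketch for the long-chain limit only, assuming the larger eigenvalue of the transfer matrix with weights $1$, $\sigma s$, $1$, $s$ is $\lambda_1 = \tfrac{1}{2}\left[(1+s) + \sqrt{(1-s)^2 + 4\sigma s}\right]$ and that $\theta = \partial \ln \lambda_1 / \partial \ln s$; the function names are illustrative.

```python
import math

def lambda_max(s, sigma):
    """Larger eigenvalue of the 2x2 Zimm-Bragg transfer matrix."""
    return 0.5 * ((1 + s) + math.sqrt((1 - s) ** 2 + 4 * sigma * s))

def helix_fraction(s, sigma, h=1e-6):
    """Long-chain helix fraction, d ln(lambda_1) / d ln(s), by central difference."""
    up = math.log(lambda_max(s * math.exp(h), sigma))
    down = math.log(lambda_max(s * math.exp(-h), sigma))
    return (up - down) / (2 * h)

for sigma in (1e-2, 1e-3, 1e-4):
    row = [helix_fraction(s, sigma) for s in (0.95, 1.00, 1.05)]
    print(f"sigma = {sigma:g}: " + ", ".join(f"{theta:.3f}" for theta in row))
```

Every row passes through $\theta = 0.5$ at $s = 1$, and the values at $s = 0.95$ and $s = 1.05$ move closer to 0 and 1 as $\sigma$ decreases, which is the sharpening described above.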
Temperature Dependence
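The propagation parameter $s$ depends on temperature through:
$$s(T) = \exp\left[-\frac{\Delta H_{\text{res}}}{R}\left(\frac{1}{T} - \frac{1}{T_m}\right)\right]$$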
where $\Delta H_{\text{res}} \approx -4\;\text{kJ/mol}$ is the enthalpy change per residue for helix formation and $T_m$ is the melting temperature. At $T = T_m$, $s = 1$ and the helix fraction is 0.5.
Derivation 4: Chaperone Mechanisms
The GroEL/GroES System
GroEL is a tetradecameric (14-subunit) chaperonin arranged as two stacked heptameric rings, forming a barrel with a central cavity (~45 Å diameter). GroES is a heptameric co-chaperonin lid.
Mechanism (Iterative Annealing):
- Capture: Unfolded/misfolded substrate binds to the hydrophobic inner surface of the open (trans) ring of GroEL
- Encapsulation: ATP binding triggers a conformational change; GroES caps the ring, creating an enclosed, hydrophilic chamber (~65 Å diameter, ~175,000 Å$^3$ volume)
- Folding: The substrate folds in the chamber for ~10 s (the time for ATP hydrolysis), protected from aggregation. The confined space may accelerate folding by limiting the conformational search
- Release: ATP hydrolysis weakens GroES binding. ATP binding to the opposite (trans) ring triggers GroES and substrate release
- Reiteration: If not yet native, the substrate can rebind for another round. This is the "iterative annealing" mechanism
The energetic cost is 7 ATP per folding cycle. Approximately 10–15% of E. coli proteins are GroEL substrates, primarily those with complex $\alpha/\beta$ topologies (TIM barrels, Rossmann folds).
The Hsp70 System
Hsp70 (DnaK in bacteria) is among the most abundant cellular chaperones. It works with the co-chaperone Hsp40 (DnaJ) and a nucleotide exchange factor (GrpE in bacteria, BAG-domain proteins in eukaryotes). The mechanism:
- ATP-bound state: Substrate-binding domain is open (lid up), fast on/off kinetics, low affinity
- ATP hydrolysis (stimulated by Hsp40): Lid closes, trapping the substrate. High affinity, slow off-rate
- Nucleotide exchange (by GrpE/BAG): ADP is replaced by ATP, lid opens, substrate is released
Hsp70 recognizes short hydrophobic segments (~5 residues) that are exposed in unfolded or misfolded proteins. By repeatedly binding and releasing, Hsp70 prevents aggregation and gives the substrate multiple opportunities to fold correctly, which is another form of iterative annealing.
Kinetic Partitioning Model
The competition between productive folding and aggregation can be described by kinetic partitioning:
Chaperones act by reducing $k_{\text{agg}}$ (sequestering aggregation-prone intermediates) and effectively increasing the folding yield. This is particularly important during heat shock, where the concentration of unfolded proteins rises dramatically.
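To make the partitioning concrete, here is a minimal sketch under an assumed rate law: folding is first order in unfolded protein ($k_{\text{fold}}$) and aggregation is second order ($k_{\text{agg}}$), so $d[U]/dt = -k_{\text{fold}}[U] - k_{\text{agg}}[U]^2$. That rate law, the symbol $k_{\text{fold}}$, and the numbers are illustrative assumptions, not taken from the text. Under this assumption the final folded fraction has the closed form $\ln(1 + x)/x$ with $x = k_{\text{agg}}[U]_0 / k_{\text{fold}}$.

```python
import math

def folding_yield(k_fold, k_agg, u0):
    """Fraction of chains reaching the native state.

    Assumes first-order folding competing with aggregation that is
    second order in unfolded protein (an illustrative rate law).
    k_fold in s^-1, k_agg in M^-1 s^-1, u0 = initial unfolded concentration in M.
    """
    x = k_agg * u0 / k_fold
    if x == 0:
        return 1.0
    return math.log1p(x) / x

# Hypothetical numbers: raising the unfolded concentration (as in heat shock)
# lowers the yield; lowering k_agg (chaperone sequestration) restores it.
for u0 in (1e-7, 1e-6, 1e-5):
    print(f"[U]0 = {u0:.0e} M: yield = {folding_yield(1.0, 1e6, u0):.2f}")
print(f"with 100x lower k_agg: yield = {folding_yield(1.0, 1e4, 1e-5):.2f}")
```

In this picture a chaperone that lowers the effective $k_{\text{agg}}$ raises the yield most when the unfolded concentration is high, which is the heat-shock regime.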
Applications: Protein Misfolding & Disease
Amyloid Diseases
Amyloidoses are characterized by the deposition of cross-$\beta$ fibrils: highly ordered, insoluble protein aggregates with a shared structural core, a stack of $\beta$-strands running perpendicular to the fibril axis with inter-strand H-bonds running parallel to the axis (the cross-$\beta$ motif).
Major Amyloid Diseases
- Alzheimer's disease: $\text{A}\beta_{42}$ peptide and tau protein fibrils
- Parkinson's disease: $\alpha$-synuclein Lewy body fibrils
- Type 2 diabetes: IAPP (amylin) fibrils in pancreatic islets
- Huntington's disease: polyglutamine (polyQ) expansion in huntingtin protein
- ALS: SOD1, TDP-43, FUS aggregates
- Systemic amyloidosis: immunoglobulin light chain (AL), transthyretin (ATTR)
The thermodynamic driving force for amyloid formation is that the cross-$\beta$ structure is often the global free energy minimum for polypeptide chains. Native protein structures are kinetically trapped metastable states separated from the amyloid state by large barriers.
Prion Diseases
Prions ($\text{PrP}^{\text{Sc}}$) are infectious misfolded forms of the prion protein ($\text{PrP}^{\text{C}}$). The protein-only hypothesis (Prusiner, Nobel 1997) states that $\text{PrP}^{\text{Sc}}$ propagates by templating the conversion of normal $\text{PrP}^{\text{C}}$ to the misfolded form. The conversion involves a dramatic structural rearrangement: $\text{PrP}^{\text{C}}$ is predominantly $\alpha$-helical, while $\text{PrP}^{\text{Sc}}$ is rich in $\beta$-sheet. The mechanism follows a nucleated polymerization model:
where the first term describes elongation (fast) and the second describes de novo nucleation (extremely slow, accounting for long incubation periods).
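The effect of a slow nucleation term next to a fast elongation term can be sketched numerically. The rate law below, $d[\text{PrP}^{\text{Sc}}]/dt = k_{\text{el}}[\text{PrP}^{\text{C}}][\text{PrP}^{\text{Sc}}] + k_{\text{nuc}}[\text{PrP}^{\text{C}}]^n$, is an assumed form chosen to match that description. The symbols $k_{\text{el}}$, $k_{\text{nuc}}$, $n$, the dimensionless units, and all parameter values are hypothetical.

```python
def time_to_half_conversion(seed_fraction, k_el=1.0, k_nuc=1e-8, n=2,
                            dt=0.01, t_max=100.0):
    """Euler-integrate the assumed rate law with [C] + [Sc] = 1 (dimensionless).

    d[Sc]/dt = k_el * [C] * [Sc]  (templated elongation)
             + k_nuc * [C]**n     (de novo nucleation)
    Returns the first time at which half the protein is converted,
    or None if that does not happen before t_max.
    """
    sc = seed_fraction
    t = 0.0
    while t < t_max:
        if sc >= 0.5:
            return t
        c = 1.0 - sc
        sc += dt * (k_el * c * sc + k_nuc * c ** n)
        t += dt
    return None

t_unseeded = time_to_half_conversion(0.0)   # must wait for spontaneous nucleation
t_seeded = time_to_half_conversion(0.01)    # 1% pre-formed misfolded seed
print("unseeded:", t_unseeded, "seeded:", t_seeded)
```

Under these assumptions the unseeded run shows a long lag before conversion takes off, while a small seed bypasses nucleation, which is qualitatively the behaviour expected of templated propagation.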
Drug Design Targeting Misfolded Proteins
- Kinetic stabilizers: Tafamidis stabilizes the native tetrameric form of transthyretin (TTR), preventing dissociation to monomers that form amyloid. FDA-approved for ATTR cardiomyopathy.
- Anti-amyloid antibodies: Lecanemab and aducanumab target $\text{A}\beta$ aggregates. Lecanemab (FDA-approved 2023) shows modest but significant slowing of cognitive decline.
- Chemical chaperones: Small molecules (e.g., 4-phenylbutyrate) that stabilize native protein conformations, investigated in cystic fibrosis (stabilizing $\Delta\text{F508}$ CFTR).
- Aggregation inhibitors: Compounds that cap growing fibril ends or redirect aggregation pathways toward off-pathway, non-toxic species.
Python Simulation: Two-State Chevron Plot & Kinetics
This simulation generates the chevron plot showing how $\ln(k_{\text{obs}})$ varies with denaturant concentration, along with the equilibrium denaturation curve and the free energy dependence on [denaturant].
Two-State Folding: Chevron Plot, Free Energy, and Denaturation Curve
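A minimal stand-in for this simulation is sketched below. It tabulates the three plotted quantities rather than drawing them, so it needs only the standard library. It assumes the linear free energy relationships for $\ln k_f$ and $\ln k_u$, $k_{\text{obs}} = k_f + k_u$, and $\Delta G = G_U - G_N = RT\ln(k_f/k_u)$; all parameter values are hypothetical.

```python
import math

R = 8.314e-3                # gas constant, kJ/(mol K)
T = 298.15                  # temperature, K
KF_W, KU_W = 100.0, 0.01    # hypothetical rate constants in water, s^-1
M_F, M_U = -2.1, 0.9        # hypothetical kinetic m-values, M^-1
C_M = math.log(KF_W / KU_W) / (M_U - M_F)   # midpoint, where k_f = k_u

def chevron_point(d):
    """Return (ln k_obs, dG in kJ/mol, fraction unfolded) at denaturant [D] = d."""
    kf = KF_W * math.exp(M_F * d)
    ku = KU_W * math.exp(M_U * d)
    ln_k_obs = math.log(kf + ku)
    delta_g = R * T * math.log(kf / ku)
    f_unfolded = ku / (kf + ku)
    return ln_k_obs, delta_g, f_unfolded

print(f"C_m = {C_M:.2f} M")
print(" [D]/M  ln k_obs  dG/(kJ/mol)   f_U")
for step in range(0, 17):
    d = 0.5 * step
    ln_k, dg, fu = chevron_point(d)
    print(f"{d:6.1f}  {ln_k:8.2f}  {dg:11.2f}  {fu:5.3f}")
```

The `ln k_obs` column falls along the folding arm, passes through a minimum near the midpoint, and rises along the unfolding arm; the `f_U` column is the sigmoidal denaturation curve and `dG` is linear in [D].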
Python Simulation: Zimm-Bragg Helix-Coil Transition
Exploring the Zimm-Bragg model with the transfer matrix eigenvalue solution. The three panels show the effects of the nucleation parameter $\sigma$, chain length $N$, and temperature on the sharpness of the helix-coil transition.
Zimm-Bragg Helix-Coil Transition: Nucleation, Chain Length, and Temperature
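A compact stand-in for this simulation, printing numbers instead of panels, might look like the sketch below. It assumes a transfer matrix with weights $1$, $\sigma s$ (coil row) and $1$, $s$ (helix row), a virtual coil residue before the chain, and a van 't Hoff form for $s(T)$ with $\Delta H_{\text{res}} = -4$ kJ/mol. The value $T_m = 300$ K and the function names are illustrative assumptions.

```python
import math

R = 8.314e-3  # gas constant, kJ/(mol K)

def log_partition_function(n, s, sigma):
    """ln Z for n residues via repeated row-vector x transfer-matrix products."""
    c, h, log_z = 1.0, 0.0, 0.0          # start from a virtual coil residue
    for _ in range(n):
        c, h = c + h, sigma * s * c + s * h
        norm = c + h                      # rescale to avoid overflow for long chains
        c, h, log_z = c / norm, h / norm, log_z + math.log(norm)
    return log_z

def helix_fraction(n, s, sigma, eps=1e-5):
    """theta = (1/n) d ln Z / d ln s, by central difference."""
    up = log_partition_function(n, s * math.exp(eps), sigma)
    down = log_partition_function(n, s * math.exp(-eps), sigma)
    return (up - down) / (2 * eps * n)

def s_of_temperature(temp_k, t_m=300.0, dh_res=-4.0):
    """Propagation parameter s(T); s = 1 at T = t_m."""
    return math.exp(-(dh_res / R) * (1.0 / temp_k - 1.0 / t_m))

sigma = 1e-3
for n in (20, 100, 500):
    thetas = [helix_fraction(n, s_of_temperature(t), sigma) for t in (280, 300, 320)]
    print(f"N = {n:3d}: " + ", ".join(f"{th:.3f}" for th in thetas))
```

Each row lists the helix fraction at 280, 300, and 320 K. Short chains stay mostly coil even below $T_m$ because end effects and the nucleation penalty are not yet outweighed, whereas longer chains approach the sharp long-chain transition.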
Python Simulation: Folding Kinetics & $\Phi$-Value Analysis
This simulation shows: (1) two-state folding kinetics at different denaturant concentrations, (2) a simulated $\Phi$-value analysis scatter plot mapping transition state structure, and (3) free energy profiles comparing two-state, three-state, and downhill folding mechanisms.
Folding Kinetics, Phi-Value Analysis, and Free Energy Profiles
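For the kinetics panel, a minimal sketch is to integrate the two-state rate equation directly and confirm that the relaxation is single-exponential with rate $k_f + k_u$. The simple Euler integrator, the function names, and the rate constants below are illustrative assumptions.

```python
import math

def relax_numeric(kf, ku, f_native0, t_end, dt=1e-4):
    """Euler-integrate d f_N / dt = kf * (1 - f_N) - ku * f_N up to t_end."""
    f = f_native0
    for _ in range(int(round(t_end / dt))):
        f += dt * (kf * (1.0 - f) - ku * f)
    return f

def relax_analytic(kf, ku, f_native0, t):
    """Single-exponential solution with observed rate k_obs = kf + ku."""
    f_eq = kf / (kf + ku)
    return f_eq + (f_native0 - f_eq) * math.exp(-(kf + ku) * t)

# Hypothetical refolding jump: start fully unfolded, kf = 5 s^-1, ku = 1 s^-1.
for t in (0.1, 0.25, 0.5, 1.0):
    num = relax_numeric(5.0, 1.0, 0.0, t)
    ana = relax_analytic(5.0, 1.0, 0.0, t)
    print(f"t = {t:4.2f} s: numeric {num:.4f}, analytic {ana:.4f}")
```

The numeric and analytic fractions of native protein agree to within the integration error, and both approach the equilibrium value $k_f/(k_f + k_u)$.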
Folding Rate Predictors & Contact Order
Relative Contact Order (RCO)
Plaxco, Simons, and Baker (1998) discovered that the folding rates of two-state proteins correlate remarkably well with a simple topological property, the relative contact order:
$$\text{RCO} = \frac{1}{L \cdot N_c}\sum_{\text{contacts}} |i - j|$$
where $L$ is the total number of residues, $N_c$ is the number of native contacts, and $|i - j|$ is the sequence separation between contacting residues $i$ and $j$. The logarithm of the folding rate decreases approximately linearly with RCO:
$$\ln k_f \approx a - b \cdot \text{RCO}$$
where $a$ and $b > 0$ are empirical fit constants, with correlation coefficient $r \approx 0.8$. Proteins with predominantly local contacts ($\alpha$-helical, low RCO) fold faster than those with many non-local contacts ($\beta$-sheet rich, high RCO). This supports the idea that topology is the primary determinant of folding rate.
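As a small worked example, RCO is the mean sequence separation of native contacts divided by the chain length. The sketch below computes it from a contact list; the two toy contact maps are hypothetical and only meant to contrast local with non-local topologies.

```python
def relative_contact_order(contacts, length):
    """RCO = sum(|i - j|) / (length * number_of_contacts).

    contacts: iterable of (i, j) residue-index pairs in contact in the native state.
    length:   total number of residues L.
    """
    contacts = list(contacts)
    if not contacts:
        raise ValueError("at least one native contact is required")
    total_separation = sum(abs(i - j) for i, j in contacts)
    return total_separation / (length * len(contacts))

# Toy 20-residue chains (hypothetical contact maps):
helix_like = [(i, i + 4) for i in range(1, 17)]      # only local i, i+4 contacts
hairpin_like = [(i, 21 - i) for i in range(1, 11)]   # strand pairing across the chain

print("helix-like RCO:  ", relative_contact_order(helix_like, 20))
print("hairpin-like RCO:", relative_contact_order(hairpin_like, 20))
```

The helix-like map gives the lower RCO, in line with the observation that topologies dominated by local contacts tend to fold faster.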
Chain Length Dependence
For two-state folders, the folding rate also depends on chain length:
The exponent $\sim 0.6$ is consistent with polymer theory: the conformational search time scales with the number of effective chain segments raised to a power related to the Flory exponent.
Nucleation-Condensation Mechanism
The dominant mechanism for two-state folding is nucleation-condensation (Fersht, 1995). In contrast to the earlier framework model (secondary structure forms first) and hydrophobic collapse model (collapse occurs first), nucleation-condensation proposes that:
- A diffuse folding nucleus forms in the transition state, comprising elements of secondary and tertiary structure simultaneously
- The nucleus is stabilized by a combination of local (secondary structure) and non-local (tertiary) interactions
- Once the nucleus forms, the rest of the chain rapidly condenses around it
- $\Phi$-values for nucleus residues are typically 0.3–0.7 (fractional, not fully native-like)
Diffusion-Collision Model
For larger proteins, the diffusion-collision model (Karplus and Weaver) describes folding as a hierarchical process: pre-formed microdomains (secondary structure elements) diffuse and collide to form the native tertiary structure. The rate depends on the diffusion rate of the microdomains and the probability that a collision is productive:
where $P_{\text{productive}}$ is the probability that colliding microdomains are correctly oriented and $P_{\text{correct}}$ accounts for the combinatorics of forming all necessary contacts.
Experimental Methods for Studying Folding
Stopped-Flow Kinetics
The workhorse for measuring folding/unfolding kinetics on the millisecond timescale. Two solutions (protein + denaturant at different concentrations) are rapidly mixed (dead time ~1 ms), and the signal (fluorescence, CD, absorbance) is monitored as a function of time. The observed rate constant is extracted by fitting to single or multi-exponential functions:
$$S(t) = S_\infty + \sum_i A_i\, e^{-k_i t}$$
where $S_\infty$ is the final signal and $A_i$ and $k_i$ are the amplitude and rate constant of kinetic phase $i$.
Temperature Jump (T-Jump)
Ultrafast heating (nanoseconds) using infrared laser pulses or electrical discharge perturbs the folding equilibrium, enabling the study of folding dynamics on the microsecond timescale. The temperature change is typically 5–15 °C. Combined with fluorescence or IR spectroscopy, T-jump can reveal the earliest events in folding: helix formation, hydrophobic collapse, and the formation of the folding nucleus.
Hydrogen/Deuterium Exchange (HDX)
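Backbone amide hydrogens exchange with solvent D$_2$O at rates that depend on their structural environment. In a native protein, amides involved in H-bonds or buried in the core exchange slowly (protection factors of $10^3$ to $10^8$). The exchange rate is:
$$k_{\text{ex}} = \frac{k_{\text{op}}\, k_{\text{int}}}{k_{\text{op}} + k_{\text{cl}} + k_{\text{int}}}$$
where $k_{\text{int}}$ is the intrinsic (unprotected) exchange rate, $k_{\text{op}}$ and $k_{\text{cl}}$ are the local opening and closing rates, and the protection factor $P_f = k_{\text{cl}}/k_{\text{op}}$. Under EX2 conditions ($k_{\text{cl}} \gg k_{\text{int}}$, the usual case near physiological pH), a protected amide with $k_{\text{cl}} \gg k_{\text{op}}$ exchanges at:
$$k_{\text{ex}} \approx \frac{k_{\text{op}}}{k_{\text{cl}}}\, k_{\text{int}} = \frac{k_{\text{int}}}{P_f}$$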
HDX monitored by NMR provides residue-level information; HDX-MS provides peptide-level resolution for larger proteins and complexes.
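The EX2 approximation can be checked against the full steady-state expression. The sketch below assumes the Linderstrøm-Lang form $k_{\text{ex}} = k_{\text{op}}k_{\text{int}}/(k_{\text{op}} + k_{\text{cl}} + k_{\text{int}})$ and the standard relation $\Delta G_{\text{op}} = RT \ln P_f$ for the opening free energy. That last relation is an addition not stated in the text, and all rate constants are hypothetical.

```python
import math

R = 8.314e-3  # gas constant, kJ/(mol K)

def k_exchange(k_op, k_cl, k_int):
    """Steady-state exchange rate for the open/closed (Linderstrom-Lang) scheme."""
    return k_op * k_int / (k_op + k_cl + k_int)

def k_exchange_ex2(k_op, k_cl, k_int):
    """EX2 limit (k_cl >> k_int and k_cl >> k_op): k_ex = k_int / P_f."""
    return (k_op / k_cl) * k_int

def opening_free_energy(protection_factor, temp_k=298.15):
    """Free energy of the opening reaction implied by P_f, in kJ/mol."""
    return R * temp_k * math.log(protection_factor)

# Hypothetical amide: k_op = 1 s^-1, k_cl = 1e5 s^-1, k_int = 10 s^-1.
k_op, k_cl, k_int = 1.0, 1e5, 10.0
p_f = k_cl / k_op
print("full expression:", k_exchange(k_op, k_cl, k_int))
print("EX2 limit:      ", k_exchange_ex2(k_op, k_cl, k_int))
print("dG_op (kJ/mol): ", round(opening_free_energy(p_f), 1))
```

When closing is much faster than intrinsic exchange, the two rates agree closely, which is why measured protection factors can be read as local stabilities under EX2 conditions.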
Single-Molecule FRET
Fluorescence resonance energy transfer between donor and acceptor dyes attached to specific sites on the protein reports on intramolecular distances in real time. The FRET efficiency is:
$$E = \frac{1}{1 + (r/R_0)^6}$$
where $r$ is the donor-acceptor distance and $R_0$ is the Förster radius (typically 40–60 Å). Single-molecule experiments reveal conformational heterogeneity and rare folding intermediates that are hidden in ensemble-averaged experiments.
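The steep distance dependence is easy to see numerically. This sketch assumes the standard Förster relation $E = 1/[1 + (r/R_0)^6]$ and a hypothetical $R_0$ of 50 Å; the function names are illustrative.

```python
def fret_efficiency(r, r0):
    """Transfer efficiency for donor-acceptor distance r and Forster radius r0 (same units)."""
    return 1.0 / (1.0 + (r / r0) ** 6)

def distance_from_efficiency(e, r0):
    """Invert the Forster relation; valid for 0 < e < 1."""
    if not 0.0 < e < 1.0:
        raise ValueError("efficiency must lie strictly between 0 and 1")
    return r0 * (1.0 / e - 1.0) ** (1.0 / 6.0)

R0 = 50.0  # hypothetical Forster radius in angstroms
for r in (30.0, 40.0, 50.0, 60.0, 70.0):
    print(f"r = {r:4.0f} A: E = {fret_efficiency(r, R0):.3f}")
```

Efficiency is 0.5 at $r = R_0$ and changes most rapidly around that distance, so a dye pair is most informative when the distances of interest lie near its Förster radius.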
Key Equations Summary
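Two-State Equilibrium
$$K = \frac{k_f}{k_u}, \qquad \Delta G = G_U - G_N = RT \ln K, \qquad k_{\text{obs}} = k_f + k_u$$
$\Phi$-Value Definition
$$\Phi = \frac{\Delta\Delta G^\ddagger}{\Delta\Delta G°}$$
Zimm-Bragg Helix Fraction
$$\theta \approx \frac{s}{\lambda_1}\frac{\partial \lambda_1}{\partial s}, \qquad \lambda_1 = \frac{(1 + s) + \sqrt{(1 - s)^2 + 4\sigma s}}{2}$$
Tanford $\beta$-Value
$$\beta_T = \frac{m_f}{m_{\text{eq}}}, \qquad m_{\text{eq}} = m_f - m_u$$
FRET Efficiency
$$E = \frac{1}{1 + (r/R_0)^6}$$
Relative Contact Order
$$\text{RCO} = \frac{1}{L \cdot N_c}\sum_{\text{contacts}} |i - j|$$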