Part VI: Computational & Systems Biophysics | Chapter 2

Mass Spectrometry in Biophysics

Mass-to-charge analysis, electrospray ionization, tandem MS fragmentation, hydrogen-deuterium exchange, and native mass spectrometry

Mass Spectrometry: Measuring Molecular Mass with Extraordinary Precision

Mass spectrometry (MS) determines the mass-to-charge ratio ($m/z$) of ions, providing molecular mass, sequence information, and structural insights for biomolecules ranging from small metabolites to megadalton complexes. The development of soft ionization methods — electrospray ionization (ESI, Nobel Prize to Fenn, 2002) and matrix-assisted laser desorption ionization (MALDI, Nobel Prize to Tanaka, 2002) — revolutionized the analysis of biological macromolecules.

Modern mass spectrometry achieves mass accuracy of parts per million, resolving power exceeding $10^6$, and sensitivity at the attomole level, making it indispensable in proteomics, metabolomics, structural biology, and clinical diagnostics.

1. Principles of Mass Analysis

All mass analyzers separate ions based on their mass-to-charge ratio $m/z$. The two fundamental approaches — magnetic sector and time-of-flight — illustrate the physics of mass separation.

Derivation: Magnetic Sector Mass Analyzer

An ion of mass $m$, charge $q = ze$ (where $z$ is the charge state and $e$ the elementary charge) is first accelerated through a potential difference $V$, gaining kinetic energy:

$$zeV = \frac{1}{2}mv^2 \implies v = \sqrt{\frac{2zeV}{m}}$$

The ion then enters a magnetic field $B$ perpendicular to its velocity. The Lorentz force provides centripetal acceleration:

$$zevB = \frac{mv^2}{r}$$

Solving for $r$: $r = mv/(zeB)$. Substituting the expression for $v$:

$$r = \frac{m}{zeB}\sqrt{\frac{2zeV}{m}} = \frac{1}{B}\sqrt{\frac{2mV}{ze}}$$

Solving for $m/z$:

$$\boxed{\frac{m}{z} = \frac{B^2 r^2 e}{2V}}$$

By scanning $B$ or $V$ while keeping $r$ fixed (at the detector position), different $m/z$ values are selected sequentially.
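
As a quick numerical check of the boxed result, the sketch below converts between $m/z$ and the magnetic field that steers an ion onto a fixed radius. The radius (0.30 m) and accelerating potential (5 kV) are illustrative assumptions rather than values from a specific instrument, and the function names are hypothetical.

```python
import math

E_CHARGE = 1.602176634e-19    # elementary charge, C
DALTON = 1.66053906660e-27    # unified atomic mass unit, kg

def sector_mz(B, r, V):
    """m/z (Da per unit charge) transmitted at field B (T), radius r (m), potential V (V)."""
    return B**2 * r**2 * E_CHARGE / (2.0 * V) / DALTON

def sector_field(mz, r, V):
    """Magnetic field (T) that places an ion of the given m/z (Da) on radius r."""
    return math.sqrt(2.0 * V * mz * DALTON / E_CHARGE) / r

# Assumed geometry: r = 0.30 m, V = 5 kV (illustrative only)
R_SECTOR, V_ACC = 0.30, 5000.0
for mz in (100.0, 500.0, 1000.0):
    B = sector_field(mz, R_SECTOR, V_ACC)
    print(f"m/z = {mz:7.1f}  ->  B = {B:.3f} T")
```

Because $B \propto \sqrt{m/z}$ at fixed $r$ and $V$, quadrupling $m/z$ doubles the required field.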

Derivation: Time-of-Flight (TOF) Mass Analyzer

In a TOF analyzer, all ions are accelerated through the same potential $V$ and then drift through a field-free region of length $L$. The drift time depends on the ion velocity, which in turn depends on $m/z$.

From $zeV = \frac{1}{2}mv^2$, the velocity is $v = \sqrt{2zeV/m}$. The flight time through the drift tube is:

$$t = \frac{L}{v} = L\sqrt{\frac{m}{2zeV}}$$

Solving for $m/z$:

$$\boxed{\frac{m}{z} = \frac{2eVt^2}{L^2}}$$

Mass resolution: The resolving power is defined as $R = m/\Delta m$, where $\Delta m$ is the smallest mass difference that can be distinguished. For TOF:

$$R = \frac{m}{\Delta m} = \frac{t}{2\Delta t}$$

where $\Delta t$ is the temporal peak width. Modern reflectron TOF instruments achieve $R > 50{,}000$ by correcting for initial kinetic energy spread. Orbitrap and FT-ICR analyzers achieve $R > 10^6$.
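
The flight-time and resolving-power expressions can be evaluated directly. This is a minimal sketch: the 20 kV potential, 1 m drift length, and 1 ns peak width are assumed round numbers, and the function names are hypothetical. The factor of 2 in $R$ follows from $m \propto t^2$, so that $\Delta m/m = 2\Delta t/t$.

```python
import math

E_CHARGE = 1.602176634e-19    # elementary charge, C
DALTON = 1.66053906660e-27    # unified atomic mass unit, kg

def tof_flight_time(mz, V, L):
    """Drift time (s) for an ion of m/z (Da) after acceleration through V (V) over length L (m)."""
    return L * math.sqrt(mz * DALTON / (2.0 * E_CHARGE * V))

def tof_resolving_power(t, dt):
    """R = m / dm = t / (2 dt) for a peak of temporal width dt."""
    return t / (2.0 * dt)

# Assumed instrument parameters (illustrative only)
V_ACC, L_DRIFT, PEAK_WIDTH = 20e3, 1.0, 1e-9
for mz in (500.0, 1000.0, 4000.0):
    t = tof_flight_time(mz, V_ACC, L_DRIFT)
    R = tof_resolving_power(t, PEAK_WIDTH)
    print(f"m/z = {mz:7.1f}  t = {t * 1e6:6.2f} us  R = {R:8.0f}")
```

At a fixed peak width, heavier ions arrive later and are therefore resolved better in this simple model; real instruments broaden $\Delta t$ through the initial energy spread that the reflectron is designed to correct.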

2. Electrospray Ionization

Electrospray ionization (ESI) gently transfers proteins and other biomolecules from solution into the gas phase as multiply charged ions, preserving noncovalent interactions under native conditions. Understanding the physics of charged droplet fission is essential.

Derivation: The Rayleigh Limit

A charged droplet of radius $R$ carrying charge $q$ experiences two competing forces: surface tension $\gamma$ tends to minimize the surface area (keep the droplet spherical), while Coulomb repulsion tends to tear it apart.

The surface energy of a sphere is:

$$E_{\text{surface}} = 4\pi R^2 \gamma$$

The electrostatic self-energy of a uniformly charged sphere is:

$$E_{\text{Coulomb}} = \frac{q^2}{8\pi\varepsilon_0 R}$$

Rayleigh's stability analysis considers the energy change when the sphere deforms into a prolate spheroid. The droplet becomes unstable when the Coulomb energy equals twice the surface energy (for the $l = 2$ deformation mode). Setting $\partial^2 E / \partial \epsilon^2 = 0$ for the deformation parameter $\epsilon$ gives the Rayleigh limit:

$$\boxed{q_R = 8\pi\sqrt{\varepsilon_0 \gamma R^3}}$$

When the droplet charge exceeds $q_R$, Coulomb fission occurs: the droplet ejects small, highly charged offspring droplets containing approximately 2% of the parent mass but 15% of the charge. This process repeats as solvent evaporates, ultimately yielding bare, multiply charged analyte ions.
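
The two statements above (instability when the Coulomb energy is twice the surface energy, and fission removing about 2% of the mass with 15% of the charge) can be checked numerically. The sketch below assumes a 1 µm water droplet with $\gamma = 0.072$ N/m and constant density, and treats the fission as a single lumped event; the function names are hypothetical.

```python
import math

EPS0 = 8.8541878128e-12   # vacuum permittivity, F/m

def rayleigh_charge(R, gamma):
    """Rayleigh limit q_R (C) for a droplet of radius R (m) and surface tension gamma (N/m)."""
    return 8.0 * math.pi * math.sqrt(EPS0 * gamma * R**3)

def surface_energy(R, gamma):
    return 4.0 * math.pi * R**2 * gamma

def coulomb_energy(q, R):
    return q**2 / (8.0 * math.pi * EPS0 * R)

R, gamma = 1.0e-6, 0.072          # assumed droplet radius (m) and water surface tension (N/m)
q_R = rayleigh_charge(R, gamma)
ratio = coulomb_energy(q_R, R) / surface_energy(R, gamma)
print(f"E_Coulomb / E_surface at the Rayleigh limit = {ratio:.3f}")

# Parent droplet after one fission: keeps 98% of its mass and 85% of its charge
R_after = R * 0.98 ** (1.0 / 3.0)   # radius scales as mass^(1/3) at constant density
fraction_of_limit = 0.85 * q_R / rayleigh_charge(R_after, gamma)
print(f"Parent charge after fission = {100 * fraction_of_limit:.1f}% of its new Rayleigh limit")
```

Under these assumptions the parent drops to roughly 86% of its Rayleigh limit, so further evaporation is needed before the next fission event.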

Charge state distribution: For compact, natively folded globular proteins, the average charge state follows an empirical relationship with molecular weight (MW in Da) that tracks the Rayleigh limit of a water droplet of the same size:

$$z_{\text{avg}} \approx 0.0778 \sqrt{MW}$$

A 50 kDa protein thus acquires approximately $0.0778 \sqrt{50000} \approx 17$ charges. For a native protein, the maximum charge is limited by the solvent-accessible surface area of the compact fold. Under denaturing conditions, the unfolded chain exposes more surface area and more protonation sites, so the same protein acquires substantially more charges.

3. Tandem MS and Peptide Sequencing

Tandem mass spectrometry (MS/MS) fragments selected precursor ions and measures the masses of the resulting product ions, providing sequence information for peptides and structural information for other biomolecules.

Derivation: Collision-Induced Dissociation Energy

In collision-induced dissociation (CID), a precursor ion collides with a neutral gas (N$_2$, Ar, or He). The maximum energy available for internal excitation in a single collision is limited by the center-of-mass collision energy:

$$E_{\text{cm}} = E_{\text{lab}} \cdot \frac{m_{\text{gas}}}{m_{\text{ion}} + m_{\text{gas}}}$$

where $E_{\text{lab}}$ is the laboratory-frame kinetic energy of the ion. For a peptide of mass 1000 Da colliding with N$_2$ (28 Da) at $E_{\text{lab}} = 30$ eV:

$$E_{\text{cm}} = 30 \times \frac{28}{1000 + 28} = 0.82 \text{ eV}$$

Multiple collisions deposit enough internal energy to break covalent bonds. For peptides, the weakest bonds along the backbone are the amide bonds, leading to characteristic fragment ion series.
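
The center-of-mass expression shows why heavier collision gases and lighter precursors give more energetic collisions. The sketch below tabulates $E_{\text{cm}}$ for the three gases named above; the gas masses are standard average values and the precursor masses are illustrative assumptions.

```python
# Collision gas masses in Da (average atomic/molecular masses)
GAS_MASS = {"He": 4.0026, "N2": 28.0134, "Ar": 39.948}

def center_of_mass_energy(e_lab, m_ion, m_gas):
    """Maximum energy (same units as e_lab) convertible to internal energy in one collision."""
    return e_lab * m_gas / (m_ion + m_gas)

E_LAB = 30.0  # eV, laboratory-frame energy used in the worked example
for gas, m_gas in GAS_MASS.items():
    for m_ion in (500.0, 1000.0, 2000.0):   # assumed precursor masses, Da
        e_cm = center_of_mass_energy(E_LAB, m_ion, m_gas)
        print(f"{gas:>3}  m_ion = {m_ion:6.0f} Da  E_cm = {e_cm:5.2f} eV")
```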

Peptide fragmentation nomenclature: Along the peptide backbone, cleavage of the amide bond C(O)–NH generates:

b ions: N-terminal fragments retaining the charge on the N-terminal side. $b_n$ contains residues 1 through $n$.
y ions: C-terminal fragments retaining the charge on the C-terminal side. $y_n$ contains the last $n$ residues.

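For a peptide of $n$ residues, complementary singly charged fragments satisfy $b_i + y_{n-i} = M + 2H$, that is, $[M+H]^+$ plus one proton (where $M$ is the neutral peptide mass and $H$ the proton mass, 1.00728 Da). The mass differences between consecutive $b$ or $y$ ions correspond to amino acid residue masses: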

$$b_{i+1} - b_i = \text{residue mass of amino acid } (i+1)$$

$$y_{i+1} - y_i = \text{residue mass of amino acid from C-terminus}$$

By reading the mass ladder, the peptide sequence can be determined de novo or matched against a protein database. The mass accuracy of modern instruments ($< 5$ ppm) enables unambiguous identification of most amino acids (except the isobaric pair Leu/Ile at 113.084 Da).

Detailed Example: De Novo Sequencing from an MS/MS Spectrum

Consider a peptide with [M+2H]$^{2+}$ at $m/z = 530.28$, giving $M = 2 \times 530.28 - 2 \times 1.008 = 1058.54$ Da. The MS/MS spectrum shows the following dominant y-ion series (singly charged):

y$_1$ = 175.12, y$_2$ = 304.16, y$_3$ = 417.24, y$_4$ = 530.33, y$_5$ = 627.38, y$_6$ = 740.47, y$_7$ = 869.51, y$_8$ = 940.55

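The y$_1$ ion at 175.12 identifies a C-terminal Arg (R, residue mass 156.101, plus 18.011 for water and 1.007 for the proton). The mass differences between consecutive y ions reveal the rest of the sequence (reading from the C-terminus):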

y$_2$ - y$_1$ = 129.04 $\to$ Glu (E, 129.043)
y$_3$ - y$_2$ = 113.08 $\to$ Leu/Ile (L/I, 113.084)
y$_4$ - y$_3$ = 113.09 $\to$ Leu/Ile (L/I, 113.084)
y$_5$ - y$_4$ = 97.05 $\to$ Pro (P, 97.053)
y$_6$ - y$_5$ = 113.09 $\to$ Leu/Ile (L/I, 113.084)
y$_7$ - y$_6$ = 129.04 $\to$ Glu (E, 129.043)
y$_8$ - y$_7$ = 71.04 $\to$ Ala (A, 71.037)

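Reading from the C-terminus: R-E-L-L-P-L-E-A-... The remaining mass at the N-terminus ($[M+H]^+ - y_8 = 1059.55 - 940.55 = 119.00$ Da) is the combined residue mass of whatever lies N-terminal to Ala. It does not match any single unmodified residue (Thr, for instance, is 101.048 Da), so the y-series alone cannot assign the N-terminus; complementary b ions or database searching are needed to complete and confirm the peptide.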
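
The ladder-reading step in this example can be sketched in a few lines. The residue masses are standard monoisotopic values; the 0.02 Da matching tolerance is an assumption chosen to suit the two-decimal peak list, and the function names are hypothetical.

```python
# Monoisotopic residue masses in Da; Leu and Ile share one entry because they are isomers
RESIDUE_MASS = {
    'G': 57.02146, 'A': 71.03711, 'S': 87.03203, 'P': 97.05276,
    'V': 99.06841, 'T': 101.04768, 'C': 103.00919, 'L/I': 113.08406,
    'N': 114.04293, 'D': 115.02694, 'Q': 128.05858, 'K': 128.09496,
    'E': 129.04259, 'M': 131.04049, 'H': 137.05891, 'F': 147.06841,
    'R': 156.10111, 'Y': 163.06333, 'W': 186.07931,
}
WATER, PROTON = 18.01056, 1.00728

def match_residue(delta, tol=0.02):
    """Return every residue whose mass lies within tol of delta, or '?' if none does."""
    hits = [aa for aa, mass in RESIDUE_MASS.items() if abs(mass - delta) <= tol]
    return "|".join(hits) if hits else "?"

def read_y_ladder(y_ions, tol=0.02):
    """Assign residues from a singly charged y-series, starting at the C-terminus."""
    calls = [match_residue(y_ions[0] - WATER - PROTON, tol)]   # y1 = residue + H2O + H+
    for lower, upper in zip(y_ions, y_ions[1:]):
        calls.append(match_residue(upper - lower, tol))
    return calls

y_series = [175.12, 304.16, 417.24, 530.33, 627.38, 740.47, 869.51, 940.55]
sequence_from_c_term = read_y_ladder(y_series)
print("-".join(sequence_from_c_term))
```

A residue reported as `Q|K` or `?` would flag a step where the tolerance is too loose or a peak is missing; neither occurs for this peak list.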

Mass accuracy requirements: At 5 ppm mass accuracy on a 1000 Da peptide, the mass tolerance is 0.005 Da. Since the closest amino acid pair (Gln/Lys) differs by 0.036 Da, this accuracy is sufficient for unambiguous identification of all residues except Leu/Ile (identical mass to 4 decimal places, distinguishable only by specialized fragmentation methods such as ECD or ETD).
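
The tolerance arithmetic above generalizes to any mass and accuracy. The sketch below converts ppm to an absolute window and expresses the Gln/Lys gap in ppm at a given peptide mass; the helper names are hypothetical.

```python
def ppm_to_da(mass, ppm):
    """Absolute mass tolerance (Da) corresponding to a relative accuracy in ppm."""
    return mass * ppm * 1e-6

def gap_in_ppm(mass_a, mass_b, at_mass):
    """Size of the difference between two candidate masses, in ppm of at_mass."""
    return abs(mass_a - mass_b) / at_mass * 1e6

GLN, LYS = 128.05858, 128.09496   # monoisotopic residue masses, Da

print(f"5 ppm at 1000 Da = {ppm_to_da(1000.0, 5.0):.4f} Da")
for peptide_mass in (1000.0, 3000.0, 10000.0):
    gap = gap_in_ppm(GLN, LYS, peptide_mass)
    print(f"Gln/Lys gap at {peptide_mass:7.0f} Da = {gap:5.1f} ppm")
```

The same 0.036 Da gap shrinks in relative terms as the ion gets heavier, which is one reason fragment-level rather than intact-mass measurements are used to distinguish these residues.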

Mass Analyzer Comparison

Different mass analyzers offer distinct trade-offs between resolution, mass range, speed, and sensitivity:

Quadrupole: Four parallel rods with oscillating RF/DC fields act as a mass filter. Resolution $R \sim 1{,}000\text{--}4{,}000$. Fast scanning, robust, widely used for quantitation (SRM/MRM mode).
Time-of-flight (TOF): Resolution $R \sim 20{,}000\text{--}60{,}000$ with reflectron. Theoretically unlimited mass range. Very fast (all ions detected simultaneously). Ideal for MALDI and intact protein analysis.
Orbitrap: Ions orbit around a central spindle electrode. Frequency of axial oscillation: $\omega_z = \sqrt{(z/m) \cdot k}$ where $k$ is a field constant. Resolution $R \sim 100{,}000\text{--}1{,}000{,}000$. Image current detection enables exquisite mass accuracy ($< 1$ ppm).
FT-ICR: Ions orbit in a magnetic field at the cyclotron frequency $\omega_c = zeB/m$. Highest resolution ($R > 10^7$) and mass accuracy ($< 0.1$ ppm). Requires superconducting magnets (7–15 T).
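
To give a feel for the frequencies involved in FT-ICR detection, the sketch below evaluates $\omega_c = zeB/m$ and converts it to hertz by dividing by $2\pi$. The field strengths and $m/z$ values are illustrative assumptions, and the function name is hypothetical.

```python
import math

E_CHARGE = 1.602176634e-19    # elementary charge, C
DALTON = 1.66053906660e-27    # unified atomic mass unit, kg

def cyclotron_frequency_hz(mz, B):
    """Cyclotron frequency (Hz) for an ion of m/z (Da per unit charge) in field B (T)."""
    return E_CHARGE * B / (2.0 * math.pi * mz * DALTON)

for B in (7.0, 15.0):                  # assumed magnet strengths, T
    for mz in (500.0, 1000.0, 2000.0): # assumed m/z values
        f = cyclotron_frequency_hz(mz, B)
        print(f"B = {B:4.1f} T  m/z = {mz:6.0f}  f_c = {f / 1e3:7.1f} kHz")
```

The frequency scales linearly with $B$ and inversely with $m/z$, so a stronger magnet spreads neighboring masses further apart in frequency.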

ESI Charge States, Isotope Patterns, and Peptide Fragmentation


import numpy as np

# Mass Spectrometry Simulations
np.random.seed(42)

print("=== Electrospray Ionization: Charge State Distribution ===")
print()

# Rayleigh limit calculation
epsilon_0 = 8.854e-12  # F/m
gamma_water = 0.072     # N/m (surface tension of water)

print("1. Rayleigh Limit for Charged Droplets")
print(f"   Surface tension (water): {gamma_water} N/m")
print(f"{'Radius (um)':>12} {'q_R (C)':>14} {'q_R (charges)':>15} {'V_R (kV)':>10}")
for R_um in [0.1, 0.5, 1.0, 2.0, 5.0, 10.0]:
    R = R_um * 1e-6  # convert to meters
    q_R = 8 * np.pi * np.sqrt(epsilon_0 * gamma_water * R**3)
    n_charges = q_R / 1.602e-19
    V_R = q_R / (4 * np.pi * epsilon_0 * R)
    print(f"{R_um:>12.1f} {q_R:>14.3e} {n_charges:>15.0f} {V_R/1000:>10.2f}")

# ESI charge state distribution for proteins
print()
print("2. ESI Charge State Distribution for Proteins")
print(f"   Empirical: z_avg = 0.0778 * sqrt(MW)")
print()

proteins = [
    ("Ubiquitin", 8565),
    ("Cytochrome c", 12384),
    ("Lysozyme", 14313),
    ("Myoglobin", 16951),
    ("Carbonic Anh.", 29024),
    ("BSA", 66430),
    ("IgG antibody", 148000),
]

print(f"{'Protein':>16} {'MW (Da)':>10} {'z_avg':>7} {'m/z range':>15}")
for name, mw in proteins:
    z_avg = 0.0778 * np.sqrt(mw)
    z_min = max(1, int(z_avg - 3))
    z_max = int(z_avg + 3)
    mz_min = (mw + z_max * 1.008) / z_max
    mz_max = (mw + z_min * 1.008) / z_min
    print(f"{name:>16} {mw:>10} {z_avg:>7.1f} {mz_min:>7.0f}-{mz_max:.0f}")

# Simulate ESI spectrum for a protein
print()
print("3. Simulated ESI Spectrum for Myoglobin (MW = 16951 Da)")
MW = 16951
z_avg = 0.0778 * np.sqrt(MW)
z_sigma = 1.5

charge_states = list(range(8, 25))
print(f"{'Charge (z)':>11} {'m/z':>10} {'Relative Intensity':>20}")
intensities = []
for z in charge_states:
    mz = (MW + z * 1.00794) / z
    intensity = np.exp(-0.5 * ((z - z_avg) / z_sigma)**2)
    intensities.append(intensity)
    if intensity > 0.05:
        print(f"{z:>11} {mz:>10.2f} {intensity:>20.4f}")

# Isotope pattern calculation
print()
print("=== Isotope Pattern Calculation ===")
print()

# Average isotopic composition of amino acids
# Averagine: C4.9384 H7.7583 N1.3577 O1.4773 S0.0417
# We will compute isotope pattern using binomial/multinomial expansion

def isotope_pattern_peptide(MW, max_peaks=10):
    """Approximate isotope pattern using Poisson model for large molecules"""
    # Average number of each element per Da
    # Averagine composition per residue (avg MW 111.1254)
    n_residues = MW / 111.1254
    n_C = 4.9384 * n_residues
    n_H = 7.7583 * n_residues
    n_N = 1.3577 * n_residues
    n_O = 1.4773 * n_residues
    n_S = 0.0417 * n_residues

    # Average number of heavy isotopes (Poisson parameter)
    # 13C: 1.07%, 2H: 0.012%, 15N: 0.37%, 17O: 0.038%, 18O: 0.205%, 33S: 0.75%, 34S: 4.25%
    lambda_val = (n_C * 0.0107 + n_H * 0.00012 + n_N * 0.0037 +
                  n_O * (0.00038 + 0.00205) + n_S * (0.0075 + 0.0425))

    # Poisson distribution for number of heavy isotopes
    pattern = []
    for k in range(max_peaks):
        # Poisson probability
        if k == 0:
            prob = np.exp(-lambda_val)
        else:
            prob = np.exp(-lambda_val) * lambda_val**k
            for j in range(1, k+1):
                prob /= j
        pattern.append(prob)

    # Normalize to the most abundant peak
    max_p = max(pattern)
    pattern = [p/max_p for p in pattern]
    return pattern, lambda_val

print("Isotope Patterns for Peptides/Proteins (Averagine model):")
print()
test_masses = [500, 1000, 2000, 5000, 10000, 50000]
for mw in test_masses:
    pattern, lam = isotope_pattern_peptide(mw, max_peaks=8)
    mono_idx = 0
    max_idx = pattern.index(max(pattern))
    print(f"MW = {mw:>6} Da, lambda = {lam:.2f}, most abundant isotope: M+{max_idx}")
    print(f"  Peaks: ", end="")
    for i, p in enumerate(pattern):
        if p > 0.01:
            print(f"M+{i}:{p:.3f} ", end="")
    print()
    print()

# Peptide fragmentation (b and y ions)
print("=== Tandem MS: Peptide Fragmentation ===")
print()

# Amino acid residue masses (monoisotopic)
aa_masses = {
    'G': 57.02146, 'A': 71.03711, 'V': 99.06841, 'L': 113.08406,
    'I': 113.08406, 'P': 97.05276, 'F': 147.06841, 'W': 186.07931,
    'M': 131.04049, 'S': 87.03203, 'T': 101.04768, 'C': 103.00919,
    'Y': 163.06333, 'H': 137.05891, 'D': 115.02694, 'E': 129.04259,
    'N': 114.04293, 'Q': 128.05858, 'K': 128.09496, 'R': 156.10111
}

peptide = "GASPVFIHDK"
print(f"Peptide sequence: {peptide}")
print(f"Length: {len(peptide)} residues")

# Calculate fragment ions
total_mass = sum(aa_masses[aa] for aa in peptide) + 18.01056  # add water
print(f"Monoisotopic mass [M]: {total_mass:.5f} Da")
print(f"[M+H]+: {total_mass + 1.00728:.5f}")
print(f"[M+2H]2+: {(total_mass + 2*1.00728)/2:.5f}")
print()

# b ions (N-terminal fragments)
print(f"{'Ion':>6} {'Sequence':>12} {'m/z (z=1)':>12}")
b_mass = 0
for i in range(1, len(peptide)):
    b_mass += aa_masses[peptide[i-1]]
    print(f"{'b'+str(i):>6} {peptide[:i]:>12} {b_mass + 1.00728:>12.5f}")

print()
# y ions (C-terminal fragments)
y_mass = 18.01056  # water
print(f"{'Ion':>6} {'Sequence':>12} {'m/z (z=1)':>12}")
for i in range(1, len(peptide)):
    y_mass_i = sum(aa_masses[aa] for aa in peptide[len(peptide)-i:]) + 18.01056
    print(f"{'y'+str(i):>6} {peptide[len(peptide)-i:]:>12} {y_mass_i + 1.00728:>12.5f}")

# Verify: b_i + y_(n-i) = [M+H]+ + H, so subtracting one proton recovers [M+H]+
print()
print("Verification: b_i + y_(n-i) - H should equal [M+H]+ = {:.5f}".format(total_mass + 1.00728))
b_mass = 0
for i in range(1, len(peptide)):
    b_mass += aa_masses[peptide[i-1]]
    b_i = b_mass + 1.00728
    y_ni = sum(aa_masses[aa] for aa in peptide[i:]) + 18.01056 + 1.00728
    total = b_i + y_ni - 1.00728
    print(f"  b{i} + y{len(peptide)-i} = {b_i:.3f} + {y_ni:.3f} - H = {total:.3f}")


4. Hydrogen-Deuterium Exchange Mass Spectrometry

Hydrogen-deuterium exchange (HDX) monitored by mass spectrometry reports on the solvent accessibility and structural dynamics of proteins. Backbone amide hydrogens that are exposed to solvent and not engaged in hydrogen bonds exchange rapidly with deuterium from D$_2$O, while protected hydrogens exchange slowly.

Derivation: HDX Exchange Kinetics

The exchange of a backbone amide hydrogen is catalyzed by both acid and base:

$$k_{\text{int}} = k_A[\text{H}^+] + k_B[\text{OH}^-] + k_W$$

where $k_{\text{int}}$ is the intrinsic (unprotected) exchange rate. At physiological pH ($\sim 7$), base catalysis dominates, giving rates of $\sim 1\text{--}100$ s$^{-1}$ for unstructured peptides.

In a folded protein, an amide hydrogen must undergo a local unfolding event before exchange can occur. In the EX2 regime (dominant under most conditions):

$$\text{NH}_{\text{closed}} \underset{k_{cl}}{\overset{k_{op}}{\rightleftharpoons}} \text{NH}_{\text{open}} \xrightarrow{k_{\text{int}}} \text{ND}$$

When $k_{cl} \gg k_{\text{int}}$ (EX2 limit), the opening/closing equilibrium is established before exchange occurs, and the observed exchange rate is:

$$k_{\text{ex}} = k_{\text{int}} \cdot \frac{K_{\text{op}}}{K_{\text{op}} + 1} \approx k_{\text{int}} \cdot K_{\text{op}}$$

where $K_{\text{op}} = k_{\text{op}} / k_{\text{cl}} \ll 1$ for well-folded regions. The protection factor quantifies the slowing of exchange:

$$\boxed{P = \frac{k_{\text{int}}}{k_{\text{ex}}} = \frac{1}{K_{\text{op}}} = \frac{k_{\text{cl}}}{k_{\text{op}}}}$$

The protection factor is related to the free energy of local unfolding:

$$\Delta G_{\text{op}} = -RT\ln K_{\text{op}} = RT\ln P$$

Protection factors range from $\sim 1$ (fully exposed loops) to $> 10^8$ (buried core residues), corresponding to local stability differences of 0 to $> 11$ kcal/mol.
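
How good is the approximation $k_{\text{ex}} \approx k_{\text{int}} K_{\text{op}}$? The sketch below compares it with the full EX2 expression for several values of $K_{\text{op}}$ and converts each to $\Delta G_{\text{op}}$. The value $k_{\text{int}} = 10$ s$^{-1}$ and $T = 298$ K are assumptions for illustration, and the function names are hypothetical.

```python
import math

R_GAS = 1.987e-3   # gas constant, kcal/(mol K)

def k_ex_exact(k_int, K_op):
    """EX2 observed rate: k_int * K_op / (K_op + 1)."""
    return k_int * K_op / (K_op + 1.0)

def k_ex_approx(k_int, K_op):
    """Small-K_op limit: k_int * K_op."""
    return k_int * K_op

def delta_g_open(K_op, T=298.0):
    """Free energy of local opening (kcal/mol): -RT ln K_op."""
    return -R_GAS * T * math.log(K_op)

K_INT = 10.0  # s^-1, assumed intrinsic rate
for K_op in (1e-1, 1e-2, 1e-4, 1e-8):
    exact = k_ex_exact(K_INT, K_op)
    approx = k_ex_approx(K_INT, K_op)
    rel_err = (approx - exact) / exact   # algebraically equal to K_op
    P = K_INT / exact
    print(f"K_op = {K_op:7.0e}  P = {P:11.4g}  "
          f"rel. error = {rel_err:7.1e}  dG_op = {delta_g_open(K_op):5.2f} kcal/mol")
```

The relative error of the approximation equals $K_{\text{op}}$ itself, so it is negligible for well-protected amides and only matters for nearly unprotected sites where $P$ approaches 1.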

HDX-MS workflow: Protein in H$_2$O is diluted into D$_2$O, incubated for various times (10 s to hours), quenched at pH 2.5/0°C (slowing exchange $\sim 10^6$-fold), digested with pepsin, and peptides analyzed by LC-MS. The mass increase of each peptide ($+1.006$ Da per deuterium) reports on exchange kinetics in that region.

HDX-MS Kinetics and Protection Factor Analysis


import numpy as np

# HDX-MS Kinetics Simulation
np.random.seed(42)

print("=== Hydrogen-Deuterium Exchange (HDX) Kinetics ===")
print()

# Intrinsic exchange rates as function of pH
print("1. pH Dependence of Intrinsic Exchange Rate")
# Simplified model: k_int = k_A * [H+] + k_B * [OH-]
# For poly-DL-alanine at 20C (reference values)
k_A = 41.7      # M-1 min-1 (acid-catalyzed)
k_B = 1.12e10   # M-1 min-1 (base-catalyzed)
k_W = 3.16e-2   # min-1 (water-catalyzed)
Kw = 1e-14      # water autoionization

print(f"{'pH':>5} {'k_int (min-1)':>14} {'Half-life':>15} {'Dominant':>12}")
for pH in [2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]:
    H_conc = 10**(-pH)
    OH_conc = Kw / H_conc
    k_acid = k_A * H_conc
    k_base = k_B * OH_conc
    k_int = k_acid + k_base + k_W
    half_life = np.log(2) / k_int

    if half_life < 1:
        hl_str = f"{half_life*60:.1f} s"
    elif half_life < 60:
        hl_str = f"{half_life:.1f} min"
    elif half_life < 1440:
        hl_str = f"{half_life/60:.1f} hr"
    else:
        hl_str = f"{half_life/1440:.1f} days"

    dominant = "acid" if k_acid > k_base else "base"
    print(f"{pH:>5.1f} {k_int:>14.4f} {hl_str:>15} {dominant:>12}")

# HDX kinetics with protection factors
print()
print("2. HDX Kinetics with Different Protection Factors")
k_int_ref = 10.0  # min-1, illustrative round value (the pH 7 model in part 1 gives ~1.1e3 min-1)
time_points = [0, 0.17, 0.5, 1, 2, 5, 10, 30, 60, 120, 360, 1440]  # minutes

protection_factors = [1, 10, 100, 1000, 10000, 100000]
print(f"{'Time':>8}", end="")
for P in protection_factors:
    print(f"{'P='+str(P):>10}", end="")
print("  (fractional deuteration)")

for t in time_points:
    if t < 1:
        t_str = f"{t*60:.0f}s"
    elif t < 60:
        t_str = f"{t:.0f}min"
    else:
        t_str = f"{t/60:.0f}hr"
    print(f"{t_str:>8}", end="")
    for P in protection_factors:
        k_ex = k_int_ref / P
        D_frac = 1 - np.exp(-k_ex * t)
        print(f"{D_frac:>10.3f}", end="")
    print()

# Simulated HDX-MS mass shift for a protein region
print()
print("3. Simulated HDX-MS for a 15-Residue Peptide")
print("   (Different structural regions)")
n_amides = 14  # 15 residues, 14 backbone amides (minus first)

# Assign protection factors based on structure
regions = {
    "Alpha helix core": [1e5, 5e4, 1e5, 2e5, 1e5, 5e4, 1e5, 2e5, 5e4, 1e5, 2e5, 5e4, 1e5, 1e5],
    "Flexible loop":    [1, 2, 5, 3, 1, 2, 8, 3, 1, 5, 2, 3, 1, 2],
    "Beta sheet":       [1e3, 5e2, 1e4, 5e3, 1e3, 5e2, 1e4, 5e3, 1e3, 5e2, 1e4, 5e3, 1e3, 5e2],
    "Mixed (helix+loop)": [1e5, 1e5, 5e4, 1e4, 100, 5, 2, 5, 100, 1e4, 5e4, 1e5, 1e5, 1e5],
}

hdx_times = [0.167, 1, 10, 60, 360, 1440]  # minutes

for region_name, pf_list in regions.items():
    print(f"\n  {region_name}:")
    print(f"    {'Time':>8} {'Deuterons':>11} {'Mass shift (Da)':>16} {'% exchanged':>13}")
    for t in hdx_times:
        n_exchanged = 0
        for pf in pf_list:
            k_ex = k_int_ref / pf
            n_exchanged += 1 - np.exp(-k_ex * t)
        mass_shift = n_exchanged * 1.00628  # Da per deuterium
        pct = 100 * n_exchanged / n_amides

        if t < 1:
            t_str = f"{t*60:.0f} s"
        elif t < 60:
            t_str = f"{t:.0f} min"
        else:
            t_str = f"{t/60:.0f} hr"
        print(f"    {t_str:>8} {n_exchanged:>11.2f} {mass_shift:>16.3f} {pct:>13.1f}")

# Protection factor to free energy
print()
print("4. Protection Factor to Local Stability")
R = 1.987e-3  # kcal/mol/K
T = 298  # K
print(f"   T = {T} K, DG = RT * ln(P)")
print(f"{'log10(P)':>10} {'P':>12} {'DG (kcal/mol)':>16} {'Structural context':>25}")
contexts = ["Surface loop", "Beta turn", "Helix cap", "Beta edge", "Helix middle", "Sheet core", "Buried core", "Hydrophobic core"]
for i, logP in enumerate([0, 1, 2, 3, 4, 5, 6, 7]):
    P = 10**logP
    DG = R * T * np.log(P)
    ctx = contexts[i] if i < len(contexts) else ""
    print(f"{logP:>10} {P:>12.0f} {DG:>16.2f} {ctx:>25}")


5. Native Mass Spectrometry and Ion Mobility

Native mass spectrometry preserves noncovalent interactions during the electrospray process, enabling the study of intact protein complexes, ligand binding, and conformational states. Combined with ion mobility spectrometry (IMS), it provides information about the size and shape of gas-phase ions.

Derivation: Charge State and Compactness

Under native conditions (aqueous, neutral pH, ammonium acetate buffer), proteins retain their folded conformation during electrospray. The maximum charge a protein can acquire is limited by the solvent-accessible surface area (SASA). For a roughly spherical protein of radius $R$, the Rayleigh charge limit is:

$$z_{\max} \propto R^{3/2} \propto MW^{1/2}$$

Under denaturing conditions, the unfolded protein exposes more surface area, acquiring more charges. The charge state ratio between native and denatured forms reports on structural compactness:

$$\frac{z_{\text{native}}}{z_{\text{denatured}}} \approx \left(\frac{R_{\text{native}}}{R_{\text{denatured}}}\right)^{3/2} \approx \left(\frac{R_g^{\text{native}}}{R_g^{\text{denatured}}}\right)^{3/2}$$

Typically, $z_{\text{native}} / z_{\text{denatured}} \approx 0.5\text{--}0.7$.

Derivation: Collision Cross-Section from Ion Mobility

In ion mobility spectrometry, ions drift through a buffer gas (typically N$_2$ or He) under a weak electric field $E$. The drift velocity $v_d$ is proportional to $E$:

$$v_d = KE$$

where $K$ is the ion mobility. The Mason-Schamp equation relates $K$ to the rotationally averaged collision cross-section $\Omega$:

$$\boxed{\Omega = \frac{3ze}{16N}\sqrt{\frac{2\pi}{\mu k_B T}} \cdot \frac{1}{K}}$$

where $N$ is the buffer gas number density, $\mu = m_{\text{ion}}m_{\text{gas}}/(m_{\text{ion}} + m_{\text{gas}})$ is the reduced mass, and $k_B T$ is the thermal energy. The collision cross-section $\Omega$ provides a measure of the ion's projected area.

For globular proteins, $\Omega$ scales with molecular weight approximately as:

$$\Omega \propto MW^{2/3}$$

consistent with a sphere whose volume is proportional to mass. Deviations from this scaling indicate non-spherical shapes or conformational changes. Unfolded proteins show $\Omega$ values 50–100% larger than their native forms.
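
The Mason-Schamp relation can be applied in either direction. The sketch below is a minimal illustration, assuming a ubiquitin-like ion (8565 Da, $z = 6$) in helium at 298 K and 400 Pa, with the gas number density taken from the ideal gas law; these conditions and the function names are assumptions, not values from a specific experiment.

```python
import math

E_CHARGE = 1.602176634e-19    # elementary charge, C
K_B = 1.380649e-23            # Boltzmann constant, J/K
DALTON = 1.66053906660e-27    # unified atomic mass unit, kg

def reduced_mass_kg(m_ion, m_gas):
    """Reduced mass in kg from ion and gas masses in Da."""
    return m_ion * m_gas / (m_ion + m_gas) * DALTON

def mason_schamp_prefactor(z, m_ion, m_gas, T, N):
    """(3 z e / 16 N) * sqrt(2 pi / (mu k_B T)), which equals K * Omega."""
    mu = reduced_mass_kg(m_ion, m_gas)
    return 3.0 * z * E_CHARGE / (16.0 * N) * math.sqrt(2.0 * math.pi / (mu * K_B * T))

def ccs_from_mobility(K, z, m_ion, m_gas, T, N):
    """Collision cross-section (m^2) from mobility K (m^2 V^-1 s^-1)."""
    return mason_schamp_prefactor(z, m_ion, m_gas, T, N) / K

def mobility_from_ccs(omega, z, m_ion, m_gas, T, N):
    """Mobility (m^2 V^-1 s^-1) from collision cross-section omega (m^2)."""
    return mason_schamp_prefactor(z, m_ion, m_gas, T, N) / omega

# Assumed conditions: helium buffer gas, 298 K, 400 Pa
T, P_GAS, M_HE = 298.0, 400.0, 4.0026
N = P_GAS / (K_B * T)                 # number density from the ideal gas law, m^-3
OMEGA = 1000.0 * 1e-20                # 1000 A^2 expressed in m^2

K = mobility_from_ccs(OMEGA, 6, 8565.0, M_HE, T, N)
omega_back = ccs_from_mobility(K, 6, 8565.0, M_HE, T, N)
print(f"K = {K:.4f} m^2/(V s),  recovered CCS = {omega_back * 1e20:.0f} A^2")
```

Since $K\Omega$ is fixed by the prefactor, doubling the cross-section at constant charge halves the mobility, which is the basis for separating compact and extended conformers of the same $m/z$.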

Derivation: Mason-Schamp Equation from Kinetic Theory

The Mason-Schamp equation can be derived from the kinetic theory of ion transport. An ion drifting through a buffer gas at low field experiences a balance between the electric force and the frictional drag from collisions:

$$zeE = \frac{m_{\text{ion}}v_d}{\tau}$$

where $\tau$ is the mean time between collisions. From the kinetic theory of gases, the collision frequency is $1/\tau = N\Omega\bar{v}$, where $\bar{v}$ is the mean relative velocity. For a Maxwell-Boltzmann distribution of relative velocities:

$$\bar{v} = \sqrt{\frac{8k_BT}{\pi\mu}}$$

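Substituting, with the reduced mass $\mu$ in place of $m_{\text{ion}}$ (the momentum exchanged per collision is set by $\mu$), gives the simple estimate $K \approx \frac{ze}{N\Omega}\sqrt{\frac{\pi}{8\mu k_BT}}$. Rigorous kinetic theory (the first-order Chapman-Enskog approximation) multiplies this estimate by a factor of $3/4$: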

$$K = \frac{v_d}{E} = \frac{3ze}{16N\Omega}\sqrt{\frac{2\pi}{\mu k_BT}}$$

Rearranging for $\Omega$ gives the Mason-Schamp equation. The key physical insight is that larger ions (larger $\Omega$) experience more drag and drift more slowly, arriving at the detector later. This separates ions by shape and size in addition to $m/z$, adding an orthogonal dimension of structural information.

Typical CCS values for proteins: Ubiquitin (8.6 kDa): $\Omega \approx 1{,}000$ Å$^2$; cytochrome c (12.4 kDa): $\sim 1{,}500$ Å$^2$; BSA (66.4 kDa): $\sim 5{,}000$ Å$^2$; GroEL (800 kDa): $\sim 25{,}000$ Å$^2$. These values agree well with CCS computed from crystal structures using trajectory or projection methods.
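
As a rough consistency check on the $\Omega \propto MW^{2/3}$ scaling, the sketch below fits a straight line to $\log \Omega$ versus $\log MW$ for the four rounded values quoted above. With only four approximate points this is indicative at best, and the function name is hypothetical.

```python
import math

# (MW in Da, CCS in A^2): the rounded typical values quoted in the text
ccs_data = [(8.6e3, 1.0e3), (12.4e3, 1.5e3), (66.4e3, 5.0e3), (800e3, 25.0e3)]

def loglog_slope(points):
    """Least-squares slope of log10(y) against log10(x)."""
    xs = [math.log10(x) for x, _ in points]
    ys = [math.log10(y) for _, y in points]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    return s_xy / s_xx

exponent = loglog_slope(ccs_data)
print(f"Fitted exponent = {exponent:.2f} (sphere model predicts {2 / 3:.2f})")
```

The fitted exponent lands close to the value expected for compact, roughly spherical ions.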

Native MS Applications in Structural Biology

Stoichiometry determination: Native MS directly measures the mass of intact complexes, revealing subunit stoichiometry. For example, native MS showed that the 20S proteasome is a 28-subunit complex ($\alpha_7\beta_7\beta_7\alpha_7$, ~700 kDa), and that the GroEL-GroES complex consists of GroEL$_{14}$-GroES$_7$ (~870 kDa).

Binding affinity measurement: By quantifying the relative intensities of free and bound species at different ligand concentrations, native MS can determine dissociation constants $K_d$ for protein-ligand interactions. The ratio of bound to free protein is:

$$\frac{[\text{PL}]}{[\text{P}]} = \frac{[\text{L}]}{K_d}$$
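
The relation above can be turned into a single-point $K_d$ estimate. The sketch below assumes 1:1 binding, equal ionization and detection efficiency for free and bound protein, and no dissociation in the gas phase; here $[\text{L}]$ is the free ligand concentration, obtained from the total by mass balance. The intensities and concentrations are made-up illustrative numbers, and the function name is hypothetical.

```python
def kd_from_intensities(i_free, i_bound, p_total, l_total):
    """Estimate K_d (same units as the concentrations) from native MS peak intensities.

    Assumes 1:1 binding and equal response factors for free (P) and bound (PL) protein.
    """
    fraction_bound = i_bound / (i_free + i_bound)
    pl = fraction_bound * p_total        # [PL] from protein mass balance
    l_free = l_total - pl                # free ligand from ligand mass balance
    return l_free * i_free / i_bound     # K_d = [L] [P] / [PL]

# Illustrative inputs: 10 uM protein, 20 uM ligand, summed peak intensities 40 (P) and 60 (PL)
kd = kd_from_intensities(i_free=40.0, i_bound=60.0, p_total=10.0, l_total=20.0)
print(f"Estimated K_d = {kd:.2f} uM")
```

In practice a titration over several ligand concentrations is fitted rather than relying on one point, which also helps reveal nonspecific binding during electrospray.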

Native MS is particularly powerful for studying heterogeneous mixtures where multiple binding partners or states coexist, as each species appears at a distinct $m/z$. It has been used to study membrane protein complexes (solubilized in detergent micelles or lipid nanodiscs), viral capsids ($> 10$ MDa), and the assembly pathways of large macromolecular machines.

6. Applications of Mass Spectrometry in Biophysics

Proteomics

Bottom-up proteomics (protein digestion, LC-MS/MS of peptides, database searching) routinely identifies and quantifies thousands of proteins in a single experiment. Label-free quantification, TMT/iTRAQ isobaric labeling, and SILAC provide quantitative comparisons across conditions. Top-down proteomics analyzes intact proteins, preserving information about proteoforms (splice variants, post-translational modifications, mutations).

Structural Biology and Drug Binding

Native MS reveals stoichiometry and binding constants for protein-ligand, protein-protein, and protein-nucleic acid complexes. HDX-MS maps conformational changes upon ligand binding. Cross-linking MS (XL-MS) provides distance restraints for integrative structural modeling. These MS-based approaches complement crystallography, cryo-EM, and NMR, especially for dynamic, heterogeneous systems.

Clinical Diagnostics and Metabolomics

MALDI-TOF MS identifies bacteria in clinical microbiology within minutes (replacing overnight culture-based methods). LC-MS/MS is the gold standard for clinical measurement of steroid hormones, drugs of abuse, vitamin D, and newborn screening. Metabolomics by MS provides comprehensive snapshots of cellular metabolism for biomarker discovery and disease diagnosis.

Chapter Summary

• Mass analysis separates ions by $m/z$: magnetic sector ($m/z = B^2r^2e/2V$), TOF ($m/z = 2eVt^2/L^2$).
• ESI produces multiply charged ions; the Rayleigh limit $q_R = 8\pi\sqrt{\varepsilon_0\gamma R^3}$ governs charged droplet stability.
• Tandem MS fragments peptides into b/y ion ladders whose mass differences reveal the amino acid sequence.
• HDX-MS measures solvent accessibility and dynamics via protection factors $P = k_{\text{int}}/k_{\text{ex}}$.
• Native MS and ion mobility preserve complexes and measure collision cross-sections via the Mason-Schamp equation.
