Biophysics of Vision
Ocular optics, photoreceptor physics, phototransduction cascades, colour vision theory, and spatial resolution: the quantitative science of seeing
Why Vision Biophysics?
The human eye is an optical instrument of extraordinary performance. Rods can detect individual photons (Hecht, Shlaer & Pirenne 1942), and an absorbed photon isomerises rhodopsin with a quantum yield of ~0.67. The retina processes $\sim 10^9$ bits/s, comparable to an Ethernet connection. Colour vision arises from just three cone types, yet we perceive millions of distinct hues. Understanding vision requires optics, photochemistry, signal transduction, and neural computation.
This chapter derives the optics of the eye, the physics of single-photon detection, the amplification cascade of phototransduction, the mathematical basis of colour vision, and the limits of spatial resolution set by diffraction and photoreceptor sampling.
1. Optics of the Eye
The eye forms images on the retina using a two-element optical system: the cornea and the crystalline lens. The cornea provides most of the refractive power because it has the largest refractive index difference (air $n = 1.0$ to cornea $n = 1.376$).
Derivation: The Thin Lens Equation and Ocular Optics
For a thin lens in air, Snell's law applied at two spherical surfaces gives the lensmaker's equation. For a single refracting surface of radius $R$ separating media of indices $n_1$ and $n_2$:
$$\frac{n_1}{d_o} + \frac{n_2}{d_i} = \frac{n_2 - n_1}{R}$$
For a thin lens with two surfaces (radii $R_1, R_2$) in air:
$$\boxed{\frac{1}{f} = \frac{1}{d_o} + \frac{1}{d_i} = (n-1)\left(\frac{1}{R_1} - \frac{1}{R_2}\right)}$$
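Optical power is measured in dioptres: $P = 1/f$ (with $f$ in metres). The anterior corneal surface has $R \approx 7.8$ mm and provides:
$$P_{\text{anterior}} = \frac{n_{\text{cornea}} - n_{\text{air}}}{R} = \frac{1.376 - 1.000}{0.0078} \approx 48 \; \text{D}$$
The posterior corneal surface, where the index drops from cornea ($n = 1.376$) to aqueous humour ($n \approx 1.336$), contributes about $-6$ D, so the net corneal power is $P_{\text{cornea}} \approx 43$ D.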
The crystalline lens adds ~17 D, giving a total power of approximately:
$$P_{\text{total}} \approx 60 \; \text{D} \quad \Rightarrow \quad f \approx 17 \; \text{mm}$$
(Note: the image forms in the vitreous humour with $n = 1.336$, so the retinal image distance is $d_i = n_{\text{vitreous}}/P \approx 22.3$ mm, close to the ~24 mm axial length of the eye.)
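The figures above can be checked numerically. The following is a minimal sketch, not an eye model; the helper name `surface_power` is illustrative. Applied to the anterior cornea alone, the single-surface formula gives about 48 D, while the 43 D corneal figure used in the power budget is the net value for both corneal surfaces.

```python
def surface_power(n1, n2, radius_m):
    """Paraxial power (dioptres) of one spherical refracting surface."""
    return (n2 - n1) / radius_m

N_AIR, N_CORNEA, N_VITREOUS = 1.000, 1.376, 1.336  # indices quoted in the text

# Anterior corneal surface alone, R = 7.8 mm.
p_anterior = surface_power(N_AIR, N_CORNEA, 7.8e-3)

# Whole-eye budget quoted in the text: cornea 43 D + lens 17 D.
p_total = 43.0 + 17.0
f_air_mm = 1000.0 / p_total                 # focal length referred to air
d_image_mm = 1000.0 * N_VITREOUS / p_total  # image distance in the vitreous

print(f"anterior surface: {p_anterior:.1f} D")
print(f"total: {p_total:.0f} D, f = {f_air_mm:.1f} mm, d_i = {d_image_mm:.1f} mm")
```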
Derivation: Accommodation and Presbyopia
To focus on near objects, the ciliary muscle contracts, releasing tension on the zonular fibres, allowing the elastic lens to become more convex. This increases its optical power. The accommodation amplitude is:
$$\Delta P = P_{\text{near}} - P_{\text{far}} = \frac{1}{d_{\text{near}}} - \frac{1}{d_{\text{far}}}$$
For a young adult with near point $d_{\text{near}} = 10$ cm and far point at infinity:
$$\Delta P = \frac{1}{0.10} - 0 = 10 \; \text{D}$$
With age, the lens stiffens (presbyopia). The accommodation amplitude decreases as:
$$\Delta P(\text{age}) \approx 14 - 0.25 \times \text{age (years)}$$
By age 50, $\Delta P \approx 1.5$ D, giving a near point of 67 cm, hence the need for reading glasses. The corrective lens power needed is:
$$P_{\text{correction}} = \frac{1}{d_{\text{desired}}} - \frac{1}{d_{\text{near, actual}}}$$
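As a worked example of the accommodation formulas, the sketch below tabulates amplitude, near point, and the reading addition for several ages. The 25 cm desired reading distance and the clamping of the linear rule at zero are assumptions made for illustration, not values from the text.

```python
def accommodation_amplitude(age_years):
    """Linear rule from the text, clamped at zero (the clamp is an assumption)."""
    return max(0.0, 14.0 - 0.25 * age_years)

def near_point_m(amplitude_d):
    """Near point for an eye whose far point is at infinity."""
    return float("inf") if amplitude_d == 0 else 1.0 / amplitude_d

def reading_add(desired_m, near_actual_m):
    """Lens power (D) that brings the near point in to desired_m."""
    return 1.0 / desired_m - 1.0 / near_actual_m

DESIRED_M = 0.25  # assumed comfortable reading distance

for age in (20, 40, 50, 60):
    amp = accommodation_amplitude(age)
    near = near_point_m(amp)
    add = max(0.0, reading_add(DESIRED_M, near))  # no lens needed if near point is closer
    print(f"age {age}: amplitude {amp:.1f} D, near point {near:.2f} m, add {add:.1f} D")
```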
2. Retinal Photoreceptor Physics
The retina contains ~120 million rods (scotopic vision) and ~6 million cones (photopic/colour vision). Rods are exquisitely sensitive: under optimal conditions, a dark-adapted human observer can reliably detect a flash in which only 5–14 photons are absorbed across several hundred rods, meaning individual rods respond to single photons.
Derivation: Photon Absorption Probability
A rod outer segment is a cylindrical stack of ~1,000 membrane discs, each loaded with rhodopsin at concentration $c \approx 3$ mM. The outer segment has length $L \approx 25 \; \mu$m. By Beer's law, the probability that a photon traversing the outer segment is absorbed is:
$$\boxed{P_{\text{abs}} = 1 - e^{-\alpha c L}}$$
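where $\alpha$ is the molecular absorption cross-section of rhodopsin at the wavelength of peak absorption ($\lambda_{\max} = 498$ nm) and $c$ is expressed as a number density (3 mM $\approx 1.8 \times 10^{18}$ molecules/cm$^3$). With the decadic molar extinction coefficient $\epsilon \approx 40{,}000$ M$^{-1}$cm$^{-1}$ and 1 L $= 10^3$ cm$^3$:
$$\alpha = \frac{10^3 \, \epsilon \cdot \ln 10}{N_A} \approx \frac{10^3 \times 40{,}000 \times 2.303}{6.02 \times 10^{23}} \approx 1.5 \times 10^{-16} \; \text{cm}^2$$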
With these values the exponent is $\alpha c L \approx 0.69$, equivalent to a decadic optical density $\text{OD} = \epsilon c L \approx 0.3$, giving $P_{\text{abs}} = 1 - 10^{-0.3} \approx 0.5$ for axially incident light. Waveguiding and pigment orientation modify the effective path length, but taking this estimate the total quantum efficiency of isomerisation is approximately:
$$\eta = P_{\text{abs}} \times P_{\text{isom}} \approx 0.5 \times 0.67 \approx 0.33$$
where $P_{\text{isom}} = 0.67$ is the quantum yield of rhodopsin photoisomerisation (11-cis to all-trans retinal).
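The Beer's law estimate can be reproduced directly from the quoted parameters. This is a sketch under the stated values only; it ignores waveguiding, pigment orientation, and off-axis incidence. The factor of 1000 converts litres to cm$^3$ so that the cross-section comes out in cm$^2$.

```python
import math

N_A = 6.022e23       # Avogadro's number, 1/mol
EPSILON = 40_000.0   # decadic molar extinction coefficient, L/(mol cm)
C_MOLAR = 3e-3       # rhodopsin concentration, mol/L
LENGTH_CM = 25e-4    # outer segment length: 25 um expressed in cm
P_ISOM = 0.67        # quantum yield of photoisomerisation

alpha = EPSILON * 1000.0 * math.log(10) / N_A   # cross-section, cm^2
number_density = C_MOLAR * N_A / 1000.0         # molecules per cm^3
exponent = alpha * number_density * LENGTH_CM   # dimensionless alpha * c * L
p_abs = 1.0 - math.exp(-exponent)               # Beer's law absorption probability
eta = p_abs * P_ISOM                            # probability of isomerisation

print(f"alpha = {alpha:.2e} cm^2, alpha*c*L = {exponent:.2f}")
print(f"P_abs = {p_abs:.2f}, eta = {eta:.2f}")
```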
The Single-Photon Response (Hecht, Shlaer & Pirenne 1942)
In their landmark experiment, subjects viewed dim flashes in complete darkness. They found that the minimum detectable flash contained 54–148 photons at the cornea. After accounting for reflection and absorption in the ocular media and for incomplete absorption in the retina, only 5–14 of these photons were absorbed by rods. Because the flash was spread over roughly 500 rods, the chance of any one rod absorbing two photons was negligible. This showed that a single photon can trigger a rod response.
The single-photon response in a primate rod produces a current suppression of ~1 pA lasting ~200 ms (from a dark current of ~20 pA). The signal-to-noise ratio of the single-photon response is $\text{SNR} \approx 4$–$8$, remarkable for a molecular-scale detector at room temperature.
3. Phototransduction Cascade
The phototransduction cascade amplifies the absorption of a single photon into a macroscopic electrical signal through a G-protein signalling pathway of extraordinary efficiency.
Derivation: Amplification Gain at Each Step
The cascade proceeds through sequential amplification stages:
- Step 1 (rhodopsin activation): One photon isomerises one rhodopsin molecule ($R \to R^*$). Gain = 1.
- Step 2 (G-protein activation): Each $R^*$ catalyses nucleotide exchange on ~500 transducin molecules ($G_t$) during its active lifetime (~80 ms): $k_{\text{cat}} \approx 6000$ s$^{-1}$. Gain = ~500.
- Step 3 (PDE activation): Each $G_t^*$ activates one PDE subunit. Each PDE hydrolyses cGMP at a rate of $k_{\text{cat}}^{\text{PDE}} \approx 2000$ s$^{-1}$. Gain = ~1 (per G-protein, but each PDE is very fast).
- Step 4 (cGMP hydrolysis): During the ~200 ms signal, each PDE hydrolyses ~400 cGMP molecules. Total cGMP reduction per photon: ~500 PDE $\times$ 400 cGMP = $2 \times 10^5$ cGMP molecules.
The CNG (cyclic nucleotide-gated) channels close cooperatively with the fall in cGMP. The current depends on cGMP concentration via a Hill equation:
$$\boxed{I(t) = I_{\text{dark}} \cdot \left[\frac{[\text{cGMP}](t)}{[\text{cGMP}]_{\text{dark}}}\right]^n}$$
where $n \approx 3$ is the Hill coefficient (each CNG channel requires ~3 cGMP molecules to open). The dark current is $I_{\text{dark}} \approx 20$ pA. For a small fractional change in cGMP the relation can be linearised. A single photon reduces the mean cGMP level by roughly 1.5–2%, giving:
$$\frac{\Delta I}{I_{\text{dark}}} \approx n \cdot \frac{\Delta[\text{cGMP}]}{[\text{cGMP}]_{\text{dark}}} \approx 3 \times 0.017 \approx 0.05$$
So $\Delta I \approx 1$ pA from a single photon, consistent with the measured single-photon response and comfortably above the noise.
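The stage gains and the Hill relation can be put together in a few lines. The sketch below multiplies out the cascade using the rates quoted in the steps above, then compares the exact current suppression $1 - (1 - x)^n$ with the linearised estimate $n x$ for a few illustrative fractional cGMP reductions $x$. The particular values of $x$ are chosen for illustration and are not measurements.

```python
TRANSDUCIN_PER_RSTAR = 500   # G_t activated per R* (from the text)
PDE_KCAT = 2000.0            # cGMP hydrolysed per PDE per second (from the text)
RESPONSE_S = 0.2             # duration of the response, seconds (from the text)
HILL_N = 3                   # Hill coefficient of the CNG channel
I_DARK_PA = 20.0             # dark current, pA

cgmp_per_pde = PDE_KCAT * RESPONSE_S                   # ~400 per PDE
cgmp_per_photon = TRANSDUCIN_PER_RSTAR * cgmp_per_pde  # ~2e5 per photon

def current_drop_exact(frac):
    """Fractional current suppression from I = I_dark * (1 - frac)**n."""
    return 1.0 - (1.0 - frac) ** HILL_N

def current_drop_linear(frac):
    """First-order approximation, valid for small frac."""
    return HILL_N * frac

print(f"cGMP hydrolysed per photon: {cgmp_per_photon:.0f}")
for frac in (0.01, 0.02, 0.05):  # illustrative fractional cGMP reductions
    exact = current_drop_exact(frac) * I_DARK_PA
    linear = current_drop_linear(frac) * I_DARK_PA
    print(f"x = {frac:.2f}: exact {exact:.2f} pA, linear {linear:.2f} pA")
```

The linear form slightly overestimates the suppression, and the gap grows with the size of the cGMP change.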
Derivation: Light Adaptation via Calcium Feedback
In steady light, the photoreceptor adapts by reducing its gain. The key mediator is intracellular Ca$^{2+}$. In darkness, Ca$^{2+}$ enters through CNG channels and is extruded by the Na$^+$/Ca$^{2+}$-K$^+$ exchanger (NCKX). When channels close in light, Ca$^{2+}$ influx falls but extrusion continues:
$$\frac{d[\text{Ca}^{2+}]}{dt} = \frac{f_{\text{Ca}} \cdot I_{\text{CNG}}}{2FV_{\text{os}}} - k_{\text{ex}}[\text{Ca}^{2+}]$$
where $f_{\text{Ca}} \approx 0.12$ is the fraction of CNG current carried by Ca$^{2+}$. The falling Ca$^{2+}$ has three major effects that together reduce the gain and speed recovery:
- Guanylate cyclase acceleration: Low Ca$^{2+}$, acting via GCAPs, increases the cGMP synthesis rate 5–10 fold
- Channel affinity increase: Ca$^{2+}$-calmodulin dissociates from the CNG channels, increasing their cGMP affinity
- Rhodopsin kinase acceleration: Recoverin releases from RK, speeding R* shutoff
Together these feedbacks allow rods to operate over ~3 log units and cones over ~7 log units of light intensity, an enormous adaptation range.
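The calcium balance equation can be explored numerically once it is normalised. In darkness influx equals extrusion, so dividing by the dark values gives $dc/dt = k_{\text{ex}}(i - c)$ with $c = [\text{Ca}^{2+}]/[\text{Ca}^{2+}]_{\text{dark}}$ and $i = I_{\text{CNG}}/I_{\text{dark}}$. The sketch below integrates this with a forward Euler step after the current is halved. The rate constant `K_EX` is an illustrative assumption, not a measured value, and buffering and the feedback onto cGMP synthesis are ignored.

```python
import math

K_EX = 10.0  # exchanger rate constant, 1/s (illustrative assumption)
DT = 1e-4    # Euler time step, s

def simulate_calcium(current_fraction, t_end=1.0, c0=1.0):
    """Integrate dc/dt = K_EX * (i - c) for constant i; returns c at t_end."""
    c = c0
    for _ in range(int(round(t_end / DT))):
        c += DT * K_EX * (current_fraction - c)
    return c

# Step of light that halves the CNG current.
c_end = simulate_calcium(0.5, t_end=1.0)

# Analytic solution for comparison: exponential relaxation towards i.
c_exact = 0.5 + 0.5 * math.exp(-K_EX * 1.0)

print(f"Euler: {c_end:.6f}, analytic: {c_exact:.6f}")
```

Calcium relaxes exponentially to the new current fraction with time constant $1/k_{\text{ex}}$, which sets how quickly the feedback mechanisms listed above engage.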
4. Colour Vision
Derivation: Trichromatic Theory
Young (1802) and Helmholtz (1867) proposed that colour vision arises from three types of cone photoreceptors with overlapping spectral sensitivities. Each cone integrates all wavelengths weighted by its spectral sensitivity:
$$L = \int_0^\infty S_L(\lambda) \cdot \Phi(\lambda) \, d\lambda, \quad M = \int_0^\infty S_M(\lambda) \cdot \Phi(\lambda) \, d\lambda, \quad S = \int_0^\infty S_S(\lambda) \cdot \Phi(\lambda) \, d\lambda$$
where $S_L, S_M, S_S$ are the spectral sensitivities of the L (long, peak ~565 nm), M (medium, ~535 nm), and S (short, ~420 nm) cones, and $\Phi(\lambda)$ is the spectral power distribution of the stimulus. Any spectrum is reduced to a three-dimensional colour signal $(L, M, S)$. This means very different spectra can produce the same $(L, M, S)$ values; such spectra are called metamers.
The CIE colour matching functions $\bar{x}(\lambda), \bar{y}(\lambda), \bar{z}(\lambda)$ are linear transformations of $S_L, S_M, S_S$ chosen so that $\bar{y}(\lambda)$ matches the luminosity function. The chromaticity coordinates are:
$$x = \frac{X}{X+Y+Z}, \quad y = \frac{Y}{X+Y+Z}$$
Plotting $(x, y)$ gives the CIE chromaticity diagram. Monochromatic lights trace the horseshoe-shaped spectrum locus; all perceivable colours lie within it.
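A short sketch of the chromaticity projection follows. It shows that $(x, y)$ depends only on the ratios of the tristimulus values, so scaling the intensity leaves the chromaticity unchanged. The D65 tristimulus values used here are the standard published ones and are not taken from the text.

```python
def chromaticity(X, Y, Z):
    """Project tristimulus values onto the (x, y) chromaticity plane."""
    total = X + Y + Z
    if total <= 0:
        raise ValueError("tristimulus sum must be positive")
    return X / total, Y / total

x_e, y_e = chromaticity(1.0, 1.0, 1.0)               # equal-energy white
x_d65, y_d65 = chromaticity(95.047, 100.0, 108.883)  # D65 white point
x_dim, y_dim = chromaticity(9.5047, 10.0, 10.8883)   # same light, one tenth the intensity

print(f"equal-energy white: ({x_e:.4f}, {y_e:.4f})")
print(f"D65:                ({x_d65:.4f}, {y_d65:.4f})")
print(f"D65 dimmed:         ({x_dim:.4f}, {y_dim:.4f})")
```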
Derivation: Colour Opponency (Hering)
Hering (1878) proposed that colour is coded not as three independent channels but as opponent pairs. Retinal ganglion cells compute:
$$\text{Red-Green:} \quad \text{RG} = L - M$$
$$\text{Blue-Yellow:} \quad \text{BY} = S - (L + M)$$
$$\text{Luminance:} \quad \text{Lum} = L + M$$
This recoding is a linear transformation of the cone signals:
$$\begin{pmatrix} \text{Lum} \\ \text{RG} \\ \text{BY} \end{pmatrix} = \begin{pmatrix} 1 & 1 & 0 \\ 1 & -1 & 0 \\ -1 & -1 & 1 \end{pmatrix} \begin{pmatrix} L \\ M \\ S \end{pmatrix}$$
This explains why we perceive "reddish-green" as impossible (opponent channels cannot be simultaneously positive and negative) while "reddish-blue" (purple) is perfectly possible.
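The recoding can be written as a small matrix product. The sketch below follows the three channel definitions (Lum $= L + M$, RG $= L - M$, BY $= S - (L + M)$) with unit weights; real ganglion cells weight the cone inputs unequally, so the coefficients are a simplification and the cone excitations are hypothetical.

```python
OPPONENT = [
    [1, 1, 0],    # Lum = L + M
    [1, -1, 0],   # RG  = L - M
    [-1, -1, 1],  # BY  = S - (L + M)
]

def to_opponent(lms):
    """Map cone signals (L, M, S) to opponent channels (Lum, RG, BY)."""
    return tuple(sum(w * c for w, c in zip(row, lms)) for row in OPPONENT)

reddish = (0.8, 0.3, 0.1)   # hypothetical cone excitations, L-dominated
greenish = (0.3, 0.8, 0.1)  # same stimulus with L and M swapped

for name, lms in (("reddish", reddish), ("greenish", greenish)):
    lum, rg, by = to_opponent(lms)
    print(f"{name}: Lum = {lum:.2f}, RG = {rg:+.2f}, BY = {by:+.2f}")
```

Swapping L and M flips the sign of RG while leaving Lum and BY unchanged, which is the sense in which red and green are opposite poles of a single channel.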
Derivation: Colour Blindness
Colour vision deficiencies arise from missing or altered cone pigments. The most common forms are X-linked (affecting ~8% of males):
- Protanopia (missing L cones): colour space collapses to the 2D space $(M, S)$; reds are confused with greens
- Deuteranopia (missing M cones): colour space collapses to $(L, S)$, with similar confusions
- Tritanopia (missing S cones, rare): colour space collapses to $(L, M)$; blues are confused with yellows
In dichromatic vision, all spectra project onto a 2D subspace. Lines of confusion in the chromaticity diagram pass through the copunctal point (the chromaticity of a hypothetical stimulus that excites only the missing cone type). Any two colours along a confusion line are metameric for the dichromat.
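The collapse to two dimensions can be illustrated by simply dropping the missing coordinate. In the sketch below, two stimuli that differ only in their L-cone excitation are identical for a protanope but not for a deuteranope. The cone excitation values are hypothetical, and the model ignores any post-receptoral differences.

```python
MISSING_CONE = {"protanopia": 0, "deuteranopia": 1, "tritanopia": 2}  # index into (L, M, S)

def dichromat_signal(lms, kind):
    """Drop the missing cone's coordinate, leaving a 2D signal."""
    missing = MISSING_CONE[kind]
    return tuple(v for i, v in enumerate(lms) if i != missing)

def confused(lms_a, lms_b, kind, tol=1e-9):
    """True if two stimuli are indistinguishable for the given dichromat."""
    a = dichromat_signal(lms_a, kind)
    b = dichromat_signal(lms_b, kind)
    return all(abs(x - y) <= tol for x, y in zip(a, b))

stim_1 = (0.80, 0.40, 0.10)  # hypothetical cone excitations
stim_2 = (0.30, 0.40, 0.10)  # same M and S, different L

for kind in MISSING_CONE:
    print(f"{kind}: confused = {confused(stim_1, stim_2, kind)}")
```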
5. Spatial Resolution
Derivation: Rayleigh Criterion for the Eye
The pupil acts as a circular aperture, producing an Airy diffraction pattern on the retina. Two point sources are just resolved when the central maximum of one falls on the first minimum of the other (Rayleigh criterion):
$$\boxed{\theta_{\min} = 1.22 \frac{\lambda}{D}}$$
For $\lambda = 555$ nm (peak of photopic vision) and pupil diameter $D = 2.5$ mm (typical bright-light pupil):
$$\theta_{\min} = 1.22 \times \frac{555 \times 10^{-9}}{2.5 \times 10^{-3}} = 2.7 \times 10^{-4} \; \text{rad} \approx 0.93 \; \text{arcmin}$$
This is remarkably close to 1 arcminute, the clinical standard for 20/20 (6/6) vision. On the retina, this corresponds to a spot size of:
$$\Delta x = f \cdot \theta_{\min} = 17 \times 10^{-3} \times 2.7 \times 10^{-4} \approx 4.6 \; \mu\text{m}$$
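The Rayleigh calculation is easy to repeat for other pupil sizes. The sketch below evaluates the diffraction limit only; at pupils larger than about 3 mm, aberrations rather than diffraction usually dominate, so the large-pupil numbers should be read as an idealised bound. The extra pupil diameters are illustrative.

```python
import math

RAD_TO_ARCMIN = 180.0 * 60.0 / math.pi
FOCAL_M = 17e-3  # focal length used in the text

def rayleigh_angle_rad(wavelength_m, pupil_m):
    """Minimum resolvable angle for a circular aperture."""
    return 1.22 * wavelength_m / pupil_m

theta = rayleigh_angle_rad(555e-9, 2.5e-3)
theta_arcmin = theta * RAD_TO_ARCMIN
spot_um = FOCAL_M * theta * 1e6  # separation on the retina, micrometres

print(f"D = 2.5 mm: {theta:.2e} rad = {theta_arcmin:.2f} arcmin, {spot_um:.1f} um on retina")
for pupil_mm in (2.0, 4.0, 6.0):  # illustrative pupil diameters
    t = rayleigh_angle_rad(555e-9, pupil_mm * 1e-3) * RAD_TO_ARCMIN
    print(f"D = {pupil_mm:.1f} mm: {t:.2f} arcmin")
```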
Derivation: Nyquist Sampling Limit
The foveal cone mosaic has a centre-to-centre spacing of approximately $s \approx 2.5 \; \mu$m (in the foveal centre), corresponding to a density of ~200,000 cones/mm$^2$ or ~120 cones/degree. By the Nyquist-Shannon sampling theorem, the maximum spatial frequency that can be faithfully represented is:
$$f_{\text{Nyquist}} = \frac{1}{2s} \approx \frac{1}{2 \times 2.5 \; \mu\text{m}} = 200 \; \text{mm}^{-1}$$
Converting to angular frequency (one degree of visual angle spans $\pi f/180 \approx 0.30$ mm on the retina for a focal length $f = 17$ mm):
$$f_{\text{Nyquist}} = \frac{\pi f}{180} \times \frac{1}{2s} \approx 0.30 \; \text{mm/degree} \times 200 \; \text{mm}^{-1} \approx 60 \; \text{cycles/degree}$$
This is close to the resolution limit set by diffraction for a 2.5 mm pupil, showing that the foveal cone mosaic is well matched to the optics, a striking example of evolutionary optimisation.
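The conversion from cone spacing to angular Nyquist frequency is sensitive to the spacing assumed. The sketch below evaluates it for 2.0 and 2.5 $\mu$m, two values chosen here to bracket commonly quoted foveal spacings, using the 17 mm focal length from the text. It treats the mosaic as a one-dimensional row of samples, which is a simplification of the real hexagonal packing.

```python
import math

FOCAL_MM = 17.0
MM_PER_DEG = FOCAL_MM * math.pi / 180.0  # retinal distance per degree of visual angle

def nyquist_cpd(spacing_um):
    """Nyquist frequency in cycles/degree for a given cone spacing."""
    cycles_per_mm = 1.0 / (2.0 * spacing_um * 1e-3)
    return cycles_per_mm * MM_PER_DEG

for spacing in (2.0, 2.5):
    cones_per_deg = MM_PER_DEG / (spacing * 1e-3)
    print(f"s = {spacing} um: {cones_per_deg:.0f} cones/degree, "
          f"Nyquist = {nyquist_cpd(spacing):.0f} cycles/degree")
```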
Derivation: Contrast Sensitivity Function
The contrast sensitivity function (CSF) describes the eye's sensitivity to sinusoidal gratings as a function of spatial frequency. It is the product of the optical transfer function (OTF) and the neural transfer function (NTF):
$$\text{CSF}(f) = \text{OTF}(f) \times \text{NTF}(f)$$
The OTF is approximately Gaussian for small aberrations: $\text{OTF}(f) \propto \exp(-\pi^2 \sigma_{\text{blur}}^2 f^2)$. The NTF is bandpass due to lateral inhibition, peaking at ~3–5 cycles/degree. The combined CSF peaks at ~3–5 cpd and falls off at both low frequencies (neural attenuation from lateral inhibition) and high frequencies (optical blur + receptor spacing). An empirical fit is:
$$\text{CSF}(f) = af^b \cdot \exp(-cf)$$
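with typical values $a \approx 75$, $b \approx 0.2$, $c \approx 0.06$ (cpd units). Setting the derivative to zero shows that the fit peaks at $f = b/c \approx 3.3$ cpd. Measured peak sensitivities in human observers (~200–500) occur near 3–5 cpd.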
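The empirical fit can be evaluated directly to see its bandpass shape. The sketch below uses the quoted parameter values and locates the maximum analytically at $f = b/c$. With these particular values the fitted peak height is lower than the measured peak sensitivities mentioned above, so the parameters are best read as describing the shape of the curve rather than its absolute scale.

```python
import math

A, B, C = 75.0, 0.2, 0.06  # fit parameters quoted in the text (cpd units)

def csf(f_cpd):
    """Empirical contrast sensitivity fit a * f**b * exp(-c * f)."""
    return A * f_cpd ** B * math.exp(-C * f_cpd)

f_peak = B / C  # from d/df [b ln f - c f] = 0

print(f"peak at {f_peak:.2f} cpd, sensitivity {csf(f_peak):.1f}")
for f in (0.5, 1, 3, 10, 30, 60):
    print(f"f = {f:>4} cpd: CSF = {csf(f):6.1f}")
```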
Derivation: Retinal Image Quality and Aberrations
Real eyes deviate from the ideal diffraction-limited system due to optical aberrations. The wavefront error $W(\rho, \theta)$ across the pupil can be decomposed into Zernike polynomials:
$$W(\rho, \theta) = \sum_{n,m} c_n^m Z_n^m(\rho, \theta)$$
The dominant aberrations of the eye include:
- Defocus ($Z_2^0$): Myopia or hyperopia, the most common refractive error
- Astigmatism ($Z_2^{-2}, Z_2^2$): Different curvatures in orthogonal meridians
- Spherical aberration ($Z_4^0$): Positive in young eyes, causes halos around bright lights at large pupils
- Coma ($Z_3^{-1}, Z_3^1$): Off-axis aberration from non-centred optics
The Strehl ratio quantifies image quality relative to diffraction-limited performance:
$$S = \exp(-\sigma_W^2) \approx 1 - \sigma_W^2 \quad \text{(for small } \sigma_W \text{)}$$
where $\sigma_W^2 = (2\pi/\lambda)^2 \langle W^2 \rangle$ is the wavefront variance in radians$^2$. For $S > 0.8$ (Maréchal criterion, near-diffraction-limited), the RMS wavefront error $\sqrt{\langle W^2 \rangle}$ must be below $\lambda/14 \approx 40$ nm. Most young eyes have RMS wavefront error ~0.1–0.3 $\mu$m at 3 mm pupil, increasing to ~0.5–1.0 $\mu$m at 6 mm pupil.
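The Strehl formula makes the Maréchal threshold easy to verify. The sketch below evaluates $S = \exp(-\sigma_W^2)$ at an RMS error of $\lambda/14$ and at a few larger values. The exponential form is an approximation that becomes unreliable for large aberrations, so the values for RMS errors of 0.1 $\mu$m and above are indicative only. The 555 nm wavelength is carried over from the resolution section as an assumption.

```python
import math

WAVELENGTH_M = 555e-9  # assumed reference wavelength

def strehl(rms_m, wavelength_m=WAVELENGTH_M):
    """Strehl ratio from RMS wavefront error via S = exp(-sigma^2)."""
    sigma_rad = 2.0 * math.pi * rms_m / wavelength_m
    return math.exp(-sigma_rad ** 2)

marechal_rms = WAVELENGTH_M / 14.0
print(f"RMS = lambda/14 = {marechal_rms * 1e9:.0f} nm -> S = {strehl(marechal_rms):.2f}")
for rms_um in (0.05, 0.1, 0.3):
    print(f"RMS = {rms_um} um -> S = {strehl(rms_um * 1e-6):.3g}")
```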
Scotopic vs. Photopic Vision
The visual system operates in two distinct regimes. In scotopic (dark-adapted) vision, only rods are active. Rods have peak sensitivity at $\lambda = 498$ nm and saturate above ~100 photons/rod/s. In photopic vision, cones operate from ~1 to $\sim 10^6$ cd/m$^2$. The transition (mesopic range) occurs between 0.001 and 3 cd/m$^2$.
The Purkinje shift describes the change in peak spectral sensitivity from 555 nm (photopic, $V(\lambda)$) to 507 nm (scotopic, $V'(\lambda)$) as rods take over from cones. This is why red flowers appear dark in twilight while blue flowers remain bright.
Rod spatial resolution is much poorer than cone resolution because of convergence: ~120 rods feed onto a single ganglion cell (spatial summation for sensitivity), giving a resolution of only ~1 degree in scotopic vision, compared to ~1 arcminute in photopic foveal vision.
Retinal Processing: Centre-Surround Receptive Fields
Retinal ganglion cells do not simply relay photoreceptor signals. They compute contrast through centre-surround antagonism, described by a difference of Gaussians:
$$\text{RF}(r) = \frac{K_c}{\pi r_c^2} \exp\!\left(-\frac{r^2}{r_c^2}\right) - \frac{K_s}{\pi r_s^2} \exp\!\left(-\frac{r^2}{r_s^2}\right)$$
where $r_c$ and $r_s$ are the centre and surround radii, and $K_c$, $K_s$ are their strengths. The response to a sinusoidal grating of spatial frequency $f$ is:
$$R(f) = K_c \exp(-\pi^2 r_c^2 f^2) - K_s \exp(-\pi^2 r_s^2 f^2)$$
This produces a bandpass spatial frequency response, peaking at a frequency that depends on the receptive field size. Small foveal ganglion cells have centre radii $r_c \approx 2$ arcmin and peak at ~10 cpd. Large peripheral cells ($r_c \approx 20$ arcmin) peak at ~1 cpd. This is the neural basis of the low-frequency attenuation in the CSF.
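The bandpass behaviour can be checked by evaluating the grating response on a frequency grid. The sketch below assumes a surround five times wider than the centre and a strength ratio $K_s/K_c = 0.9$; both are illustrative choices, not values from the text, and the peak location depends strongly on them. With these choices the peaks fall at a few cpd for the small cell and a few tenths of a cpd for the large one, somewhat below the rough figures quoted above. The robust result is the scaling: a tenfold larger receptive field peaks at a tenfold lower frequency.

```python
import math

def dog_response(f_cpd, r_c, r_s, k_c=1.0, k_s=0.9):
    """Difference-of-Gaussians response to a grating; radii in degrees."""
    centre = k_c * math.exp(-(math.pi * r_c * f_cpd) ** 2)
    surround = k_s * math.exp(-(math.pi * r_s * f_cpd) ** 2)
    return centre - surround

def peak_frequency(r_c, r_s, n=20000):
    """Grid search for the frequency of maximum response on (0, 1/r_c]."""
    f_max = 1.0 / r_c
    freqs = [f_max * (i + 1) / n for i in range(n)]
    return max(freqs, key=lambda f: dog_response(f, r_c, r_s))

ARCMIN = 1.0 / 60.0  # degrees per arcminute

for r_c_arcmin in (2.0, 20.0):
    r_c = r_c_arcmin * ARCMIN
    r_s = 5.0 * r_c  # assumed surround-to-centre size ratio
    print(f"r_c = {r_c_arcmin:4.1f} arcmin -> peak near {peak_frequency(r_c, r_s):.2f} cpd")
```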
6. Applications
Corrective Lenses and LASIK
Myopia (nearsightedness) occurs when the eye is too long for its optical power: the focal point falls in front of the retina. Correction requires a diverging lens: $P = -1/d_{\text{far}}$. LASIK ablates corneal tissue to flatten the central cornea, reducing $P_{\text{cornea}}$ by the needed amount. The ablation depth at the centre for correction of $\Delta P$ dioptres is:
$$t = \frac{d^2 \cdot \Delta P}{8(n_{\text{cornea}} - 1)}$$
where $d$ is the ablation zone diameter (typically 6 mm). For -3 D correction: $t \approx 36 \; \mu$m.
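The ablation depth formula scales with the square of the zone diameter, which the sketch below makes explicit. The function name and the extra zone diameters are illustrative; the formula is a paraxial approximation and is not a surgical planning tool.

```python
def ablation_depth_um(zone_mm, correction_d, n_cornea=1.376):
    """Central ablation depth (micrometres) from t = d^2 * |dP| / (8 (n - 1))."""
    d_m = zone_mm * 1e-3
    depth_m = d_m ** 2 * abs(correction_d) / (8.0 * (n_cornea - 1.0))
    return depth_m * 1e6

print(f"6 mm zone, -3 D: {ablation_depth_um(6.0, -3.0):.1f} um")
for zone in (5.0, 6.0, 7.0):  # illustrative zone diameters
    print(f"zone {zone} mm, -6 D: {ablation_depth_um(zone, -6.0):.0f} um")
```

Enlarging the zone from 6 to 7 mm raises the depth by about a third for the same correction, which is why zone size trades off against residual corneal thickness.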
Why RGB Displays Work
Because colour vision is trichromatic, a wide range of perceived colours can be reproduced by a suitable mixture of just three primary lights. RGB displays exploit this: each pixel has red (~615 nm), green (~530 nm), and blue (~465 nm) sub-pixels. The gamut of reproducible colours is the triangle in the CIE diagram with vertices at the primary chromaticities. All colours within this gamut can be created by additive mixing, exploiting the linearity of cone responses.
The gamut area of a display in the CIE diagram is:
$$A_{\text{gamut}} = \frac{1}{2} |x_R(y_G - y_B) + x_G(y_B - y_R) + x_B(y_R - y_G)|$$
Standard sRGB covers about 35% of the CIE horseshoe. Wide-gamut displays (P3, Rec.2020) use primaries at more saturated chromaticities to cover 45–75%. However, no three primaries can reproduce all perceivable colours, since the spectrum locus is curved: a fundamental limitation of trichromatic technology.
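The triangle area formula can be applied to published primaries. The sketch below uses the standard CIE 1931 $(x, y)$ chromaticities of the sRGB and Rec.2020 primaries, which are taken from the respective standards rather than from the text. It computes triangle areas only; converting these to a percentage of the horseshoe would require the area enclosed by the spectrum locus, which is not computed here.

```python
def gamut_area(red, green, blue):
    """Area of the gamut triangle in the (x, y) chromaticity plane."""
    (xr, yr), (xg, yg), (xb, yb) = red, green, blue
    return 0.5 * abs(xr * (yg - yb) + xg * (yb - yr) + xb * (yr - yg))

# Primary chromaticities (x, y) from the sRGB and Rec.2020 standards.
SRGB = ((0.640, 0.330), (0.300, 0.600), (0.150, 0.060))
REC2020 = ((0.708, 0.292), (0.170, 0.797), (0.131, 0.046))

area_srgb = gamut_area(*SRGB)
area_2020 = gamut_area(*REC2020)

print(f"sRGB area:     {area_srgb:.4f}")
print(f"Rec.2020 area: {area_2020:.4f}")
print(f"ratio:         {area_2020 / area_srgb:.2f}")
```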
Retinal Implants and Visual Prosthetics
For patients with photoreceptor degeneration (retinitis pigmentosa), electronic retinal implants can restore limited vision. The Argus II device uses a 60-electrode array on the epiretinal surface. The current spread from each electrode stimulates a region of retinal ganglion cells, producing a phosphene (perceived spot of light) of ~2–5 degrees visual angle. With 60 electrodes arranged in a 6$\times$10 grid, the effective resolution is at most 60 pixels: enough for navigation and large-letter reading, but far below normal acuity.
7. Historical Development
- Thomas Young (1802) & Hermann von Helmholtz (1867): Proposed the trichromatic theory, in which three types of colour receptors suffice for full colour vision. Confirmed in the 1960s by microspectrophotometry.
- Ewald Hering (1878): Proposed opponent-process theory, in which colour is coded as opposing pairs. Both theories are correct at different levels of the visual pathway.
- George Wald (Nobel 1967): Elucidated the photochemistry of rhodopsin: the chromophore retinal undergoes 11-cis to all-trans isomerisation upon absorbing a photon, triggering the entire transduction cascade.
- Hecht, Shlaer & Pirenne (1942): Demonstrated that human rods can detect single photons, placing a biological sensor at the quantum limit.
- Hubel & Wiesel (Nobel 1981): Discovered orientation-selective neurons in the visual cortex, showing how the brain constructs visual features from retinal input.
8. Python Simulations
Vision Biophysics: Cone Spectra, Color Matching & Phototransduction
Photon Absorption, Hecht Experiment & Color Blindness
Chapter Summary
- Ocular optics: The cornea (43 D) + lens (17 D) = 60 D total power. Accommodation changes lens curvature; presbyopia from lens stiffening follows $\Delta P \approx 14 - 0.25 \times \text{age}$.
- Photon detection: Absorption probability $P = 1 - e^{-\alpha c L}$; quantum yield of isomerisation ~0.67. Rods detect single photons with SNR ~4–8.
- Phototransduction: 1 photon $\to$ 500 transducins $\to$ $2 \times 10^5$ cGMP hydrolysed. Current: $I \propto [\text{cGMP}]^3$. Ca$^{2+}$ feedback enables light adaptation over ~3 log units in rods and ~7 in cones.
- Colour vision: Trichromacy from L, M, S cones; opponent processing (L-M, S-(L+M), L+M). Dichromats lose one cone type, reducing colour space to 2D.
- Spatial resolution: Rayleigh limit $\theta = 1.22\lambda/D \approx 1$ arcmin. The foveal cone Nyquist limit of ~60 cpd is close to the optical limit, so the mosaic is well matched to the optics.