Neural Coding
Spike trains, rate coding, temporal coding, and the information-theoretic foundations of neural communication
How Neurons Encode Information
The brain processes information through electrical impulses called action potentials, or spikes. Understanding how sensory stimuli, motor commands, and cognitive states are represented in the patterns of these spikes is the central question of neural coding. Adrian's pioneering recordings in 1926 demonstrated that stimulus intensity is encoded in firing rate, launching a century of investigation into the neural code.
This chapter explores two fundamental coding paradigms — rate coding and temporal coding — and applies Shannon's information theory to quantify how much information neural responses carry about stimuli. We derive key results for spike train statistics, mutual information, and population coding efficiency.
1. Spike Trains and Point Processes
A neuron's output is a sequence of action potentials — stereotyped voltage pulses of roughly 1 ms duration and ~100 mV amplitude. Because individual spikes are nearly identical, all information is carried in their timing. We represent a spike train as a sum of Dirac delta functions:
The Spike Train Representation
Given spike times $\{t_1, t_2, \ldots, t_n\}$, the spike train is:
$$\rho(t) = \sum_{i=1}^{n} \delta(t - t_i)$$
The instantaneous firing rate is obtained by averaging over many trials (ensemble average) or by smoothing with a kernel $K(t)$:
$$r(t) = \langle \rho(t) \rangle = \lim_{\Delta t \to 0} \frac{\langle n(t, t+\Delta t) \rangle}{\Delta t}$$
For a Poisson process with constant rate $\lambda$, the interspike intervals (ISIs) follow an exponential distribution: $P(\tau) = \lambda e^{-\lambda \tau}$ with mean $\langle \tau \rangle = 1/\lambda$ and coefficient of variation $\text{CV} = 1$.
The Fano factor $F = \sigma_n^2 / \langle n \rangle$ measures spike count variability. For a Poisson process, $F = 1$. Cortical neurons typically show $F \approx 1\text{--}1.5$, suggesting near-Poisson variability, though this is debated. Refractoriness introduces negative correlations that reduce $F$ below 1 at short timescales.
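These statistics are easy to check numerically. The sketch below is illustrative (the rate, duration, and counting-window size are arbitrary choices): it simulates a homogeneous Poisson spike train from exponential ISIs and confirms that both the CV and the Fano factor come out near 1.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a homogeneous Poisson spike train: exponential ISIs with rate lam.
lam = 20.0        # firing rate (Hz) -- illustrative value
T = 1000.0        # total duration (s), long so the statistics converge
isis = rng.exponential(1.0 / lam, size=int(lam * T * 1.2))
spike_times = np.cumsum(isis)
spike_times = spike_times[spike_times < T]

# Coefficient of variation of the ISIs: ~1 for a Poisson process.
isi = np.diff(spike_times)
cv = isi.std() / isi.mean()

# Fano factor: variance/mean of spike counts in 100 ms windows, also ~1.
counts, _ = np.histogram(spike_times, bins=np.arange(0, T, 0.1))
fano = counts.var() / counts.mean()

print(f"CV = {cv:.3f}, Fano factor = {fano:.3f}")
```

Adding a refractory period (e.g. discarding ISIs shorter than a few ms) pushes both statistics below 1, as the text notes.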
2. Rate Coding
Rate coding posits that information is carried in the mean firing rate over some time window. Adrian and Zotterman (1926) showed that muscle spindle afferents increase their firing rate proportionally to applied force. The neural response function, or tuning curve, relates stimulus features to mean firing rate.
Derivation 1: Tuning Curves and the Cosine Model
Georgopoulos et al. (1982) found that motor cortex neurons have cosine tuning to movement direction. For a neuron with preferred direction $\theta_{\text{pref}}$:
$$r(\theta) = r_0 + r_{\max} \cos(\theta - \theta_{\text{pref}})$$
where $r_0$ is the baseline rate and $r_{\max}$ is the modulation depth. For a population of $N$ neurons with uniformly distributed preferred directions, the population vector estimate of movement direction is:
$$\hat{\theta}_{\text{pop}} = \arg\left(\sum_{i=1}^{N} r_i \, e^{i \theta_{\text{pref},i}}\right)$$
As $N \to \infty$, this estimator converges to the true direction with variance scaling as $\sigma^2 \propto 1/N$. The population vector averages out noise across neurons, achieving a population-level signal-to-noise ratio that grows as $\sqrt{N}$.
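The population vector estimate above can be checked on a toy cosine-tuned population (the population size, baseline rate, and modulation depth below are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)

# Cosine-tuned population: N neurons, preferred directions uniform on [0, 2*pi).
N = 128
theta_pref = np.linspace(0, 2 * np.pi, N, endpoint=False)
r0, rmax = 20.0, 15.0               # baseline rate and modulation depth (Hz)

def rates(theta):
    """Mean rates r(theta) = r0 + rmax * cos(theta - theta_pref)."""
    return r0 + rmax * np.cos(theta - theta_pref)

# Noisy single-trial response: Poisson spike counts in a 1 s window.
theta_true = 1.0
n = rng.poisson(rates(theta_true))

# Population vector: argument of the rate-weighted sum of unit vectors.
z = np.sum(n * np.exp(1j * theta_pref))
theta_hat = np.angle(z) % (2 * np.pi)

print(f"true = {theta_true:.3f} rad, decoded = {theta_hat:.3f} rad")
```

With this population size the single-trial error is typically small (a few degrees); halving $N$ roughly scales the error by $\sqrt{2}$, consistent with the $1/\sqrt{N}$ precision scaling.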
Sensory neurons display diverse tuning curve shapes: Gaussian tuning in orientation selectivity ($r(\theta) = r_{\max} \exp[-((\theta - \theta_0)/\sigma)^2 / 2]$), sigmoidal tuning in intensity coding, and bell-shaped tuning in frequency selectivity. The choice of tuning curve shape profoundly affects the information capacity of the neural population.
3. Temporal Coding
Temporal coding proposes that precise spike timing carries information beyond what is captured by firing rate. Evidence comes from multiple systems: auditory neurons phase-lock to sound waveforms with sub-millisecond precision, retinal ganglion cells encode visual stimuli with precise first-spike latencies, and hippocampal place cells exhibit theta-phase precession.
Derivation 2: Spike Timing Precision and the Victor-Purpura Metric
To quantify temporal coding, we need a distance metric between spike trains. The Victor-Purpura (1996) metric defines the distance as the minimum cost to transform one spike train into another using three operations:
- Insert a spike: cost = 1
- Delete a spike: cost = 1
- Move a spike by $\Delta t$: cost = $q |\Delta t|$
The parameter $q$ (units: 1/time) sets the temporal resolution. When $q \to 0$, the metric counts only spike number differences (rate code). When $q \to \infty$, it counts coincident spikes only (temporal code). The metric is computed via dynamic programming with complexity $O(n_1 n_2)$, where $n_1, n_2$ are spike counts:
$$D(i,j) = \min\begin{cases} D(i-1, j) + 1 \\ D(i, j-1) + 1 \\ D(i-1, j-1) + q|t_i^{(1)} - t_j^{(2)}| \end{cases}$$
Applying this metric to neural data and varying $q$ reveals the timescale at which temporal information becomes relevant. For cortical neurons, $q_{\text{opt}} \sim 10\text{--}100$ s$^{-1}$, corresponding to temporal precision of 10–100 ms.
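The recurrence above translates directly into code. A minimal implementation (the example spike trains and $q$ values are chosen for illustration):

```python
import numpy as np

def victor_purpura(t1, t2, q):
    """Victor-Purpura spike train distance via dynamic programming.
    t1, t2: sorted spike time arrays (s); q: cost per unit time shift (1/s)."""
    n1, n2 = len(t1), len(t2)
    D = np.zeros((n1 + 1, n2 + 1))
    D[:, 0] = np.arange(n1 + 1)      # deleting i spikes costs i
    D[0, :] = np.arange(n2 + 1)      # inserting j spikes costs j
    for i in range(1, n1 + 1):
        for j in range(1, n2 + 1):
            D[i, j] = min(
                D[i - 1, j] + 1,                                # delete spike
                D[i, j - 1] + 1,                                # insert spike
                D[i - 1, j - 1] + q * abs(t1[i - 1] - t2[j - 1]),  # shift spike
            )
    return D[n1, n2]

a = np.array([0.10, 0.25, 0.40])
b = np.array([0.12, 0.45])

print(victor_purpura(a, b, q=0.0))     # q -> 0: counts spikes only, |3-2| = 1
print(victor_purpura(a, b, q=10.0))    # intermediate q: shifts + one deletion
print(victor_purpura(a, b, q=1000.0))  # large q: cheaper to delete/insert all
```

At $q = 1000$ s$^{-1}$ every shift exceeds the delete-plus-insert cost of 2, so the distance saturates at $n_1 + n_2 = 5$.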
3.1 Phase Coding in the Hippocampus
Hippocampal place cells fire at progressively earlier phases of the theta rhythm (6–10 Hz) as an animal traverses a place field — the phenomenon of theta phase precession discovered by O'Keefe and Recce (1993). The phase encodes position within the place field:
$$\phi(x) = \phi_0 - 2\pi \frac{x - x_{\text{start}}}{x_{\text{end}} - x_{\text{start}}}$$
This dual code (firing rate encodes location, phase encodes position within the field) provides a higher-resolution spatial signal than rate alone, with theoretical precision improvements of up to an order of magnitude.
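The linear phase-position relation can be sketched in a few lines (the field boundaries and the entry phase $\phi_0 = 2\pi$ are illustrative assumptions, not measured values):

```python
import numpy as np

def precession_phase(x, x_start, x_end, phi0=2 * np.pi):
    """Theta phase of firing at position x inside a place field, following
    phi(x) = phi0 - 2*pi*(x - x_start)/(x_end - x_start)."""
    return phi0 - 2 * np.pi * (x - x_start) / (x_end - x_start)

# Phase advances linearly as the animal crosses a (hypothetical) 1 m field:
xs = np.linspace(0.0, 1.0, 5)
phases = precession_phase(xs, 0.0, 1.0)
print(np.degrees(phases))   # 360 deg at field entry down to 0 deg at exit
```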
4. Information Theory of Neural Coding
Shannon's information theory provides a rigorous framework for quantifying neural coding efficiency. The mutual information between stimulus $S$ and response $R$ is:
$$I(S; R) = H(R) - H(R|S) = \sum_{s,r} P(s,r) \log_2 \frac{P(s,r)}{P(s)P(r)}$$
Derivation 3: Information in a Poisson Spike Count
Consider a neuron with Poisson statistics whose rate $\lambda(s)$ depends on the stimulus $s$. The spike count $n$ in a window $T$ follows $P(n|s) = \frac{(\lambda(s) T)^n e^{-\lambda(s) T}}{n!}$. The response entropy is:
$$H(R) = -\sum_n P(n) \log_2 P(n)$$
where $P(n) = \langle P(n|s) \rangle_s$. The noise entropy is:
$$H(R|S) = \left\langle -\sum_n P(n|s) \log_2 P(n|s) \right\rangle_s$$
For a Poisson distribution with mean $\mu = \lambda T$, the entropy is approximately:
$$H_{\text{Poisson}}(\mu) \approx \frac{1}{2} \log_2(2\pi e \mu) \quad \text{for } \mu \gg 1$$
The mutual information is then bounded by the difference between the total response variability and the intrinsic noise: $I(S;R) \leq \frac{1}{2}\log_2\left(1 + \frac{\text{Var}[\lambda]}{\langle \lambda \rangle}\, T\right)$. Information therefore grows only logarithmically, both with the time window $T$ and with the rate signal-to-noise ratio $\text{Var}[\lambda]/\langle \lambda \rangle$.
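The entropies in this derivation can be evaluated numerically for a discrete stimulus set. A sketch assuming two equiprobable stimuli with illustrative rates (the rates, window, and count cutoff are arbitrary choices):

```python
import numpy as np
from math import exp, factorial

def poisson_pmf(n, mu):
    return mu**n * exp(-mu) / factorial(n)

# Two equiprobable stimuli driving different rates, counted over a window T.
lams = np.array([5.0, 20.0])   # Hz -- illustrative
T = 0.5                        # s
n_max = 60                     # truncation; Poisson tails beyond are negligible

# P(n|s) for each stimulus, and the marginal P(n) under equal priors.
Pn_s = np.array([[poisson_pmf(n, lam * T) for n in range(n_max)] for lam in lams])
Pn = Pn_s.mean(axis=0)

def entropy(p):
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

H_R = entropy(Pn)                            # total response entropy H(R)
H_RS = np.mean([entropy(p) for p in Pn_s])   # noise entropy H(R|S)
I = H_R - H_RS
print(f"I(S;R) = {I:.3f} bits (max possible = 1 bit)")
```

Rerunning with a larger $T$ separates the count distributions further and pushes $I$ toward the 1-bit ceiling, illustrating the logarithmic saturation.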
Derivation 4: The Direct Method for Entropy Estimation
Strong et al. (1998) developed the "direct method" for estimating information in spike trains without assuming a coding model. Discretize time into bins of width $\Delta t$, encoding the spike train as a binary string. The total entropy of response words of length $L$ is:
$$H_{\text{total}}(L) = -\sum_{\text{words } w} P(w) \log_2 P(w)$$
The noise entropy uses stimulus-locked responses:
$$H_{\text{noise}}(L) = \left\langle -\sum_w P(w|s) \log_2 P(w|s) \right\rangle_s$$
The information rate is:
$$\dot{I} = \frac{H_{\text{total}}(L) - H_{\text{noise}}(L)}{L \cdot \Delta t} \quad \text{(bits/s)}$$
Extrapolation to $L \to \infty$ and correction for finite data bias (quadratic extrapolation in $1/N$) yields the true information rate. Fly H1 neurons transmit ~90 bits/s, approaching 50% of the channel capacity, indicating remarkably efficient coding.
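A toy version of the direct method can be run on simulated data. Everything below (bin width, rate profile, trial count, word length) is an illustrative assumption, and no bias correction or $L \to \infty$ extrapolation is applied, so the estimate is optimistic:

```python
import numpy as np
from collections import Counter

rng = np.random.default_rng(2)

def word_entropy(words):
    """Plug-in entropy (bits) of a list of binary response words."""
    counts = np.array(list(Counter(words).values()), dtype=float)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

# Toy data: 100 repeats of a 2 s stimulus, spikes in 2 ms bins,
# with the rate modulated by a (hypothetical) 5 Hz stimulus.
dt, n_bins = 0.002, 1000
rate = 20 + 15 * np.sin(2 * np.pi * 5 * np.arange(n_bins) * dt)   # Hz
trials = rng.random((100, n_bins)) < rate * dt                    # binary spikes

L = 8  # word length in bins

# Total entropy: pool words from all times and all trials.
all_words = [tuple(tr[i:i + L]) for tr in trials for i in range(0, n_bins - L, L)]
H_total = word_entropy(all_words)

# Noise entropy: entropy across trials at each stimulus-locked time, averaged.
H_noise = np.mean([word_entropy([tuple(tr[i:i + L]) for tr in trials])
                   for i in range(0, n_bins - L, L)])

info_rate = (H_total - H_noise) / (L * dt)
print(f"information rate ~ {info_rate:.1f} bits/s (uncorrected plug-in estimate)")
```

In practice the quadratic extrapolation in $1/N$ mentioned above is essential: with only 100 samples per time bin, the noise entropy here is biased low and the information rate correspondingly high.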
Derivation 5: Fisher Information and the Cramér-Rao Bound
Fisher information quantifies the precision of neural population codes. For a population of $N$ neurons with tuning curves $f_i(s)$ and independent Poisson noise, the Fisher information is:
$$J(s) = \sum_{i=1}^{N} \frac{[f_i'(s)]^2}{f_i(s)}$$
The Cramér-Rao bound sets a lower limit on the variance of any unbiased estimator $\hat{s}$ of the stimulus:
$$\text{Var}(\hat{s}) \geq \frac{1}{J(s)}$$
For Gaussian tuning curves $f_i(s) = A \exp[-(s - s_i)^2 / (2\sigma^2)]$ with uniformly spaced preferred stimuli (spacing $\Delta s$), the total Fisher information is:
$$J_{\text{total}} = \frac{N A}{\sigma^2} \cdot C(\sigma / \Delta s)$$
where $C$ is a coverage factor that depends on the ratio of tuning width to spacing. Optimal coding requires $\sigma / \Delta s \approx 1$, balancing resolution against coverage. This result predicts that wider tuning curves are preferred when noise is high.
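The Fisher information sum can be evaluated directly for such a Gaussian population. One assumption below: rates are treated as expected spike counts (i.e. a 1 s counting window), so $J$ has units of deg$^{-2}$.

```python
import numpy as np

# Gaussian-tuned population tiling a 360 deg periodic stimulus.
N, A, sigma = 100, 50.0, 10.0
ds = 360.0 / N                       # preferred-stimulus spacing (3.6 deg)
s_pref = np.arange(N) * ds

def fisher_info(s):
    """J(s) = sum_i f_i'(s)^2 / f_i(s) for independent Poisson neurons,
    with rates read as expected counts (assumed 1 s window)."""
    d = (s - s_pref + 180.0) % 360.0 - 180.0   # wrapped stimulus distance
    f = A * np.exp(-d**2 / (2 * sigma**2))     # Gaussian tuning curves
    fp = -d / sigma**2 * f                     # tuning curve derivatives
    return np.sum(fp**2 / f)

J = fisher_info(17.0)                # nearly uniform in s for dense packing
sigma_min = 1.0 / np.sqrt(J)         # Cramer-Rao lower bound on decoding std
print(f"J = {J:.2f} deg^-2, minimum decoding std = {sigma_min:.2f} deg")
```

Varying `sigma` while holding the spacing fixed shows the resolution-coverage trade-off: very narrow curves leave gaps in $J(s)$, very broad ones flatten the derivatives.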
5. Historical Development
- 1926: Edgar Adrian records single-unit activity from sensory neurons, discovers rate coding — firing rate increases with stimulus intensity (Nobel Prize, 1932).
- 1948: Claude Shannon publishes "A Mathematical Theory of Communication," providing the mathematical framework for quantifying information.
- 1959: Hubel and Wiesel discover orientation-selective neurons in primary visual cortex, revealing feature-selective coding (Nobel Prize, 1981).
- 1982: Georgopoulos introduces the population vector model for motor cortex direction coding.
- 1993: O'Keefe and Recce discover theta phase precession in hippocampal place cells, demonstrating temporal coding of position.
- 1996: Rieke, Warland, de Ruyter van Steveninck, and Bialek publish "Spikes," establishing information-theoretic analysis of neural coding.
- 1998: Strong et al. develop the direct method for estimating information rates in spike trains, showing fly neurons operate near channel capacity.
6. Applications
Brain-Computer Interfaces
Population vector decoding and Bayesian decoders use tuning curve models to interpret neural activity for prosthetic control. Understanding rate vs. temporal coding determines optimal decoding strategies and required recording bandwidth.
Cochlear Implants
Temporal coding principles guide the design of electrode stimulation patterns. Phase-locking in auditory nerve fibers up to ~5 kHz means temporal coding strategies can restore pitch perception for lower frequencies.
Retinal Prosthetics
Information-theoretic analysis determines the number of electrodes needed to restore useful vision. Fisher information calculations predict the spatial resolution achievable with a given electrode array geometry.
Machine Learning
Neural coding principles inspire efficient encoding in spiking neural networks. Population coding and sparse coding ideas have influenced compressed sensing algorithms and neuromorphic computing architectures.
7. Computational Exploration
Neural Coding: Spike Trains, Tuning Curves, and Information Theory
Chapter Summary
- Spike trains are modeled as point processes; Poisson statistics give $\text{CV} = 1$ and Fano factor $F = 1$, roughly consistent with cortical recordings.
- Rate coding via cosine tuning curves enables population vector decoding with precision scaling as $1/\sqrt{N}$.
- Temporal coding carries information in spike timing; the Victor-Purpura metric quantifies temporal precision at different timescales.
- Mutual information $I(S;R) = H(R) - H(R|S)$ quantifies stimulus information in neural responses, growing logarithmically with integration time.
- Fisher information sets fundamental limits on decoding precision via the Cramér-Rao bound: $\text{Var}(\hat{s}) \geq 1/J(s)$.
Practice Problems
Problem 1: Firing Rate from a Spike Train
A neuron produces 45 spikes in a 1.5-second recording window during sustained stimulation. Compute the mean firing rate, the average inter-spike interval, and the expected Fano factor if the spike train is a Poisson process.
Solution:
1. The mean firing rate is the spike count divided by the observation window: $r = n/T = 45 / 1.5\,\text{s} = 30$ Hz.
2. The average inter-spike interval (ISI) is the reciprocal of the firing rate: $\langle \tau \rangle = 1/r = 1/30\,\text{Hz} \approx 33.3$ ms.
3. For a Poisson process, the spike count in any window has variance equal to its mean. The Fano factor is defined as $F = \text{Var}(N)/\langle N \rangle$.
4. For a homogeneous Poisson process, $\text{Var}(N) = \langle N \rangle = \lambda T$, so $F = \lambda T / \lambda T = 1$.
5. This is a hallmark of Poisson statistics. Deviations from $F = 1$ indicate sub-Poisson ($F < 1$, more regular) or super-Poisson ($F > 1$, more bursty) firing.
Problem 2: Mutual Information Between Stimulus and Response
A neuron responds to two equiprobable stimuli. Under stimulus A the spike count is 0 or 1 with probabilities 0.8 and 0.2; under stimulus B the count is 0 or 1 with probabilities 0.3 and 0.7. Calculate the mutual information $I(S;R)$ in bits.
Solution:
1. Since $P(A) = P(B) = 0.5$, the marginal response probabilities are $P(n{=}0) = 0.5(0.8 + 0.3) = 0.55$ and $P(n{=}1) = 0.5(0.2 + 0.7) = 0.45$.
2. The response entropy is $H(R) = -0.55 \log_2 0.55 - 0.45 \log_2 0.45 \approx 0.993$ bits.
3. The conditional entropy (noise entropy) is the average over stimuli: $H(R|S) = 0.5\,H(0.2) + 0.5\,H(0.3) \approx 0.5(0.722) + 0.5(0.881) = 0.802$ bits, where $H(p)$ is the binary entropy.
4. The mutual information is $I(S;R) = H(R) - H(R|S) \approx 0.993 - 0.802 = 0.191$ bits.
5. Since the stimuli are equiprobable, the maximum possible information is $H(S) = 1$ bit. The neuron transmits about 19% of the available stimulus information in a single spike-count observation.
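This arithmetic is quick to verify in code:

```python
from math import log2

def H(ps):
    """Shannon entropy (bits) of a discrete distribution."""
    return -sum(p * log2(p) for p in ps if p > 0)

# Conditional spike-count distributions P(n | s) for n = 0, 1.
P_A = [0.8, 0.2]
P_B = [0.3, 0.7]

# Equal priors -> marginal response distribution.
P_R = [(a + b) / 2 for a, b in zip(P_A, P_B)]

I = H(P_R) - 0.5 * (H(P_A) + H(P_B))
print(f"I(S;R) = {I:.3f} bits")   # ~0.191 bits, about 19% of H(S) = 1 bit
```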
Problem 3: Fano Factor from Repeated Trials
A neuron is recorded over 8 identical stimulus presentations (250 ms windows). The spike counts are: 5, 7, 4, 6, 8, 5, 6, 7. Calculate the Fano factor and comment on the firing regularity.
Solution:
1. Compute the mean spike count across trials: $\langle n \rangle = (5+7+4+6+8+5+6+7)/8 = 48/8 = 6$.
2. Compute the variance (unbiased, dividing by $n-1$): the squared deviations are $1, 1, 4, 0, 4, 1, 0, 1$, summing to 12, so $\sigma_n^2 = 12/7 \approx 1.71$.
3. The Fano factor is $F = \sigma_n^2 / \langle n \rangle = 1.71 / 6 \approx 0.29$.
4. Since $F < 1$, the spike train is sub-Poisson, meaning the neuron fires more regularly than a Poisson process would predict.
5. This sub-Poisson behavior (often seen in auditory nerve fibers and during sustained stimulation) suggests refractoriness or inhibitory feedback that regularizes spike timing.
Problem 4: Population Vector Direction
Three motor cortex neurons have preferred directions $\theta_1 = 0°$, $\theta_2 = 120°$, $\theta_3 = 240°$ and fire at rates $r_1 = 40$, $r_2 = 25$, $r_3 = 10$ Hz during a reaching movement. Compute the population vector direction.
Solution:
1. The population vector is the firing-rate-weighted sum of preferred direction unit vectors: $\vec{P} = \sum_i r_i (\cos\theta_i, \sin\theta_i)$.
2. Compute the x-component: $P_x = 40\cos 0° + 25\cos 120° + 10\cos 240° = 40 - 12.5 - 5 = 22.5$.
3. Compute the y-component: $P_y = 40\sin 0° + 25\sin 120° + 10\sin 240° = 0 + 21.65 - 8.66 \approx 12.99$.
4. The population vector direction is $\hat{\theta} = \arctan(P_y / P_x) = \arctan(12.99 / 22.5) = \arctan(0.577) = 30°$.
5. The predicted reach direction is 30° from the horizontal. The magnitude $|\vec{P}| = \sqrt{22.5^2 + 12.99^2} \approx 26.0$ reflects the overall drive strength. The population vector correctly biases toward neuron 1's preferred direction since it fires most strongly.
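The same computation, as a numerical check:

```python
import numpy as np

theta = np.radians([0.0, 120.0, 240.0])   # preferred directions
r = np.array([40.0, 25.0, 10.0])          # firing rates (Hz)

Px = np.sum(r * np.cos(theta))            # x-component, 22.5
Py = np.sum(r * np.sin(theta))            # y-component, ~12.99
direction = np.degrees(np.arctan2(Py, Px))
print(f"population vector direction = {direction:.1f} deg")   # 30.0 deg
```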
Problem 5: Fisher Information for Gaussian Tuning Curves
A population of $N = 100$ neurons has Gaussian tuning curves $f_i(s) = A\exp[-(s-s_i)^2/(2\sigma^2)]$ with peak rate $A = 50$ Hz, tuning width $\sigma = 10°$, and uniform spacing $\Delta s = 3.6°$ over 360°. Estimate the Fisher information and the minimum achievable standard deviation for stimulus estimation.
Solution:
1. For independent Poisson neurons with tuning curves $f_i(s)$, the Fisher information is $J(s) = \sum_{i=1}^{N} [f_i'(s)]^2 / f_i(s)$ (treating rates as expected counts, i.e. a 1 s window).
2. For a Gaussian tuning curve, $f_i'(s) = -\frac{(s-s_i)}{\sigma^2}f_i(s)$, so each neuron's contribution is $\frac{(s-s_i)^2}{\sigma^4} f_i(s)$.
3. For densely packed neurons ($\Delta s \ll \sigma$), we convert the sum to an integral. The total Fisher information becomes approximately $J \approx \frac{1}{\Delta s}\int \frac{x^2}{\sigma^4} A\, e^{-x^2/(2\sigma^2)}\, dx = \frac{\sqrt{2\pi}\, A}{\sigma\, \Delta s}$.
4. Plugging in the values: $J \approx \frac{2.507 \times 50}{10° \times 3.6°} \approx 3.48\ \text{deg}^{-2}$.
5. The Cramér-Rao bound gives the minimum standard deviation: $\sigma_{\min} = 1/\sqrt{J} \approx 0.54°$.
This sub-degree precision is consistent with experimental orientation discrimination thresholds and demonstrates how large neural populations achieve high precision despite noisy individual neurons.
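A quick numerical check of the dense-population approximation (the 1 s counting window is an assumption, and $\sqrt{2\pi}A/(\sigma\Delta s)$ is the standard dense-packing result for Gaussian tuning with Poisson noise):

```python
import numpy as np

A, sigma, ds = 50.0, 10.0, 3.6    # peak rate (Hz), tuning width, spacing (deg)

# Dense-population approximation: J ~ sqrt(2*pi) * A / (sigma * ds),
# in deg^-2 given counts over an assumed 1 s window.
J = np.sqrt(2 * np.pi) * A / (sigma * ds)
sigma_min = 1.0 / np.sqrt(J)      # Cramer-Rao bound on decoding std (deg)
print(f"J ~ {J:.2f} deg^-2, sigma_min ~ {sigma_min:.2f} deg")
```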