Part II: Systems Neuroscience | Chapter 2

Learning & Memory

LTP/LTD, Hebbian learning rules, memory consolidation, and the hippocampal memory system

The Neural Basis of Memory

Learning and memory are among the most fundamental capabilities of the nervous system. At the cellular level, information is stored through changes in synaptic strength — a process first hypothesized by Hebb (1949) and experimentally confirmed with the discovery of long-term potentiation (LTP) by Bliss and Lømo (1973). These synaptic modifications, occurring across trillions of connections, give rise to the diverse forms of memory we experience.

This chapter covers the biophysical mechanisms of synaptic plasticity (LTP and LTD), their formalization as Hebbian learning rules, the systems-level process of memory consolidation involving hippocampal-cortical interactions, and the computational models that link synaptic changes to memory storage and retrieval.

1. Long-Term Potentiation and Depression

Long-term potentiation (LTP) is a persistent increase in synaptic strength following high-frequency stimulation of a synapse. At excitatory synapses in the hippocampus, LTP requires activation of NMDA receptors, which serve as molecular coincidence detectors: they open only when the postsynaptic membrane is depolarized (removing the Mg$^{2+}$ block) AND glutamate is bound (presynaptic activity).

Derivation 1: NMDA Receptor as Coincidence Detector

The NMDA receptor current depends on both glutamate binding and voltage-dependent Mg$^{2+}$ block. The current is:

$$I_{\text{NMDA}} = \bar{g}_{\text{NMDA}} \cdot s(t) \cdot B(V) \cdot (V - E_{\text{rev}})$$

where $s(t)$ is the fraction of open channels (glutamate-gated) and $B(V)$ is the voltage-dependent Mg$^{2+}$ block factor:

$$B(V) = \frac{1}{1 + [\text{Mg}^{2+}]_o / 3.57 \cdot \exp(-0.062 \, V)}$$

The calcium influx through NMDA receptors, $J_{\text{Ca}} \propto I_{\text{NMDA}}$, triggers LTP when it exceeds a high threshold $\theta_+$ (activating CaMKII) or LTD when it is between a lower threshold $\theta_-$ and $\theta_+$ (activating calcineurin):

$$\Delta w = \begin{cases} +\eta_+ & \text{if } [\text{Ca}^{2+}] > \theta_+ \\ -\eta_- & \text{if } \theta_- < [\text{Ca}^{2+}] < \theta_+ \\ 0 & \text{if } [\text{Ca}^{2+}] < \theta_- \end{cases}$$

This calcium-based model (Shouval et al., 2002) unifies LTP and LTD under a single mechanism and explains frequency-dependent plasticity: high-frequency stimulation produces large Ca$^{2+}$ transients (LTP), while low-frequency stimulation produces moderate Ca$^{2+}$ elevations (LTD).
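The coincidence-detection logic can be sketched in a few lines of Python. The `mg_block` function implements the $B(V)$ expression above ($V$ in mV, $[\text{Mg}^{2+}]_o$ in mM); the threshold values $\theta_\pm$ and learning rates $\eta_\pm$ in `weight_change` are illustrative choices, not measured constants:

```python
import numpy as np

def mg_block(V, mg_out=1.0):
    """Voltage-dependent Mg2+ block factor B(V); V in mV, [Mg2+]o in mM."""
    return 1.0 / (1.0 + mg_out / 3.57 * np.exp(-0.062 * V))

def weight_change(ca, theta_minus=0.35, theta_plus=0.55,
                  eta_plus=0.01, eta_minus=0.005):
    """Calcium-threshold rule: LTP above theta_plus, LTD in between."""
    if ca > theta_plus:
        return +eta_plus      # large Ca2+ transient -> CaMKII -> LTP
    elif ca > theta_minus:
        return -eta_minus     # moderate Ca2+ -> calcineurin -> LTD
    return 0.0                # sub-threshold Ca2+ -> no change

# Depolarization relieves the Mg2+ block, so Ca2+ flows only when
# presynaptic glutamate and postsynaptic depolarization coincide.
print(mg_block(-70.0))   # mostly blocked at rest
print(mg_block(0.0))     # mostly open when depolarized
```

At resting potential the block factor is only a few percent, while near 0 mV most channels conduct — the multiplicative structure of $I_{\text{NMDA}}$ is exactly what makes the receptor a coincidence detector.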

2. Hebbian Learning Rules

Hebb's postulate (1949) states: "When an axon of cell A is near enough to excite cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A's efficiency, as one of the cells firing B, is increased." This qualitative idea has been formalized in several mathematical learning rules.

Derivation 2: BCM Theory and Sliding Threshold

The Bienenstock-Cooper-Munro (BCM) rule (1982) addresses the instability of pure Hebbian learning by introducing a sliding modification threshold. The weight change is:

$$\frac{dw_i}{dt} = \eta \, x_i \, y \, (y - \theta_M)$$

where $x_i$ is presynaptic activity, $y$ is postsynaptic activity, and $\theta_M$ is the sliding threshold. The threshold adjusts based on the time-averaged postsynaptic activity:

$$\theta_M = \langle y^2 \rangle / y_0$$

When $y > \theta_M$: LTP occurs (pre-post correlation strengthens the synapse). When $0 < y < \theta_M$: LTD occurs. The key stability property is that the threshold slides upward when the neuron is too active, preventing runaway excitation:

$$\frac{d\theta_M}{dt} = \frac{1}{\tau_\theta}\left(\frac{y^2}{y_0} - \theta_M\right)$$

BCM theory predicts ocular dominance plasticity: if one eye is deprived, its synapses weaken (LTD) while the open eye's synapses strengthen (LTP), consistent with monocular deprivation experiments.
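A minimal simulation shows the stabilizing effect of the sliding threshold. For a single linear neuron ($y = wx$) with illustrative values for $\eta$, $\tau_\theta$, and $y_0$, the dynamics settle at the nontrivial fixed point $y = y_0$ rather than growing without bound:

```python
# BCM dynamics for one synapse; eta, tau_theta, y0 are illustrative values.
eta, tau_theta, y0 = 0.01, 50.0, 1.0
w, theta_M = 0.5, 0.1
x = 1.0                                   # constant presynaptic drive

for _ in range(5000):
    y = w * x                             # linear postsynaptic response
    w += eta * x * y * (y - theta_M)      # BCM weight update
    w = max(w, 0.0)                       # weights stay non-negative
    theta_M += (y**2 / y0 - theta_M) / tau_theta   # sliding threshold

print(w, theta_M)   # both approach 1.0: y has converged to y0
```

Because $\theta_M$ slides upward as activity rises, potentiation self-limits: the fixed point satisfies $y = \theta_M = y^2/y_0$, i.e. $y = y_0$.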

Derivation 3: Spike-Timing-Dependent Plasticity (STDP)

STDP (Markram et al., 1997; Bi and Poo, 1998) formalizes the temporal asymmetry of Hebbian plasticity. The weight change depends on the relative timing of pre- and postsynaptic spikes:

$$\Delta w = \begin{cases} A_+ \exp(-\Delta t / \tau_+) & \text{if } \Delta t > 0 \text{ (pre before post, LTP)} \\ -A_- \exp(\Delta t / \tau_-) & \text{if } \Delta t < 0 \text{ (post before pre, LTD)} \end{cases}$$

where $\Delta t = t_{\text{post}} - t_{\text{pre}}$. Typical values are $\tau_+ \approx 20$ ms, $\tau_- \approx 20$ ms, with $A_- > A_+$ to maintain stability. The STDP rule can be derived from the calcium model: pre-before-post timing produces maximal NMDA receptor activation and Ca$^{2+}$ influx, while post-before-pre timing produces only moderate Ca$^{2+}$. The net effect over Poisson spike trains with rates $r_{\text{pre}}, r_{\text{post}}$ gives the expected weight change per unit time:

$$\langle \Delta w \rangle = r_{\text{pre}} r_{\text{post}} (A_+ \tau_+ - A_- \tau_-)$$

When $A_+\tau_+ > A_-\tau_-$, the rule is net potentiating for correlated firing; when $A_+\tau_+ < A_-\tau_-$, there is a net depression that stabilizes weights.
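The sign of this net drift follows directly from the areas under the two kernel branches. A short check with illustrative parameter values in which $A_-\tau_-$ slightly exceeds $A_+\tau_+$:

```python
import numpy as np

A_plus, A_minus = 0.005, 0.00525
tau_plus = tau_minus = 0.020            # seconds

def stdp_kernel(dt):
    """W(dt): LTP branch for dt > 0 (pre before post), LTD for dt < 0."""
    return np.where(dt > 0,
                    A_plus * np.exp(-dt / tau_plus),
                    -A_minus * np.exp(dt / tau_minus))

# Closed-form area under the kernel: A+ tau+ - A- tau-
kernel_area = A_plus * tau_plus - A_minus * tau_minus
r_pre = r_post = 10.0                   # Hz, uncorrelated Poisson rates
drift = r_pre * r_post * kernel_area
print(drift)    # negative: uncorrelated firing yields net depression
```

Only when pre-before-post pairings are statistically over-represented (causally correlated inputs) does the LTP branch outweigh this built-in depression bias.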

3. Memory Consolidation

The complementary learning systems (CLS) theory (McClelland et al., 1995) proposes that the hippocampus rapidly encodes episodic memories, which are then gradually consolidated into neocortical long-term storage during sleep. This two-stage process solves the stability-plasticity dilemma: the hippocampus can learn quickly without catastrophically interfering with existing cortical memories.

Derivation 4: Hippocampal Replay and Systems Consolidation

During slow-wave sleep, hippocampal place cells replay recent experiences in compressed form (sharp-wave ripples, 150–250 Hz). Model the hippocampal memory as a pattern $\boldsymbol{\xi}$ stored via one-shot Hebbian learning in a weight matrix:

$$\mathbf{W}_{\text{hipp}} = \frac{1}{N} \boldsymbol{\xi} \boldsymbol{\xi}^T$$

Consolidation transfers this memory to the cortical network through repeated replay. The cortical weights update incrementally with each replay:

$$\Delta \mathbf{W}_{\text{ctx}} = \frac{\epsilon}{N} \boldsymbol{\xi}_{\text{replay}} \boldsymbol{\xi}_{\text{replay}}^T$$

where $\epsilon \ll 1$ is the slow cortical learning rate. After $K$ replay events, the cortical memory strength is:

$$\text{SNR}_{\text{ctx}} \approx K \epsilon \cdot \text{SNR}_{\text{hipp}}$$

The gradual interleaving of old and new memories during replay prevents catastrophic forgetting. The consolidation timescale $\tau_c \sim 1/(r_{\text{replay}} \epsilon)$ predicts that stronger hippocampal traces (more replays) consolidate faster.
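The replay-transfer argument can be verified numerically. This sketch (with illustrative $N$, $\epsilon$, and $K$) stores one pattern in the hippocampal matrix and accumulates it in the cortical matrix over $K$ replay events:

```python
import numpy as np

rng = np.random.default_rng(1)
N, eps, K = 100, 0.05, 40
xi = rng.choice([-1.0, 1.0], size=N)       # episodic pattern

W_hipp = np.outer(xi, xi) / N              # one-shot Hebbian storage
W_ctx = np.zeros((N, N))
for _ in range(K):                         # each replay nudges cortex by eps
    W_ctx += (eps / N) * np.outer(xi, xi)

# Cortical trace grows linearly with replay count: W_ctx = K * eps * W_hipp
print(np.allclose(W_ctx, K * eps * W_hipp))   # True
```

In a fuller model the cortical updates would interleave replays of many patterns; the small, linear increment per replay is what lets slow interleaved learning avoid catastrophic interference.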

Derivation 5: Hopfield Network Memory Capacity

The Hopfield network (1982) stores binary patterns $\boldsymbol{\xi}^\mu \in \{-1, +1\}^N$ using Hebbian weights:

$$w_{ij} = \frac{1}{N} \sum_{\mu=1}^{P} \xi_i^\mu \xi_j^\mu$$

The signal-to-noise ratio for recalling pattern $\mu$ is the alignment field versus the cross-talk noise. At neuron $i$:

$$h_i = \underbrace{\xi_i^\mu}_{\text{signal}} + \underbrace{\frac{1}{N}\sum_{\nu \neq \mu}\sum_j \xi_i^\nu \xi_j^\nu \xi_j^\mu}_{\text{crosstalk noise}}$$

The crosstalk term is a sum of $N(P-1)$ approximately independent terms, each with zero mean and variance $1/N^2$. By the central limit theorem:

$$\text{noise} \sim \mathcal{N}\left(0, \frac{P-1}{N}\right)$$

For reliable recall (bit-error rate below ~1%), we need the noise to be much smaller than the signal: $\sqrt{(P-1)/N} \ll 1$. A full analysis of the recall dynamics sharpens this signal-to-noise argument into the classic capacity bound $P_{\max} \approx 0.138 N$ (Amit et al., 1985). Beyond this limit, catastrophic forgetting occurs as stored patterns become mutually corrupted.
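The capacity argument is easy to probe numerically. This sketch stores $P = 20$ random patterns in a network of $N = 500$ neurons ($\alpha = 0.04$, well below 0.138) and recovers a pattern from a corrupted cue with a single synchronous update:

```python
import numpy as np

rng = np.random.default_rng(2)
N, P = 500, 20                       # load alpha = P/N = 0.04 << 0.138
patterns = rng.choice([-1.0, 1.0], size=(P, N))

W = patterns.T @ patterns / N        # w_ij = (1/N) sum_mu xi_i^mu xi_j^mu
np.fill_diagonal(W, 0.0)             # no self-connections

cue = patterns[0].copy()             # corrupt pattern 0: flip 10% of bits
flip = rng.choice(N, size=N // 10, replace=False)
cue[flip] *= -1

state = np.sign(W @ cue)             # one synchronous update step
overlap = np.mean(state == patterns[0])
print(overlap)                       # close to 1.0: pattern recovered
```

Raising $P$ toward $0.138N \approx 69$ makes the crosstalk noise comparable to the signal, and retrieval degrades sharply.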

4. Historical Development

  • 1949: Donald Hebb publishes "The Organization of Behavior," proposing that coincident neural activity strengthens synaptic connections.
  • 1957: Scoville and Milner describe patient H.M., demonstrating that the hippocampus is essential for forming new declarative memories.
  • 1973: Bliss and Lømo discover long-term potentiation (LTP) in the hippocampus of anesthetized rabbits.
  • 1982: Hopfield introduces the associative memory network; Bienenstock, Cooper, and Munro propose the BCM sliding threshold rule.
  • 1992: Dudek and Bear demonstrate homosynaptic long-term depression (LTD) in hippocampal slices.
  • 1994: Wilson and McNaughton demonstrate hippocampal replay during sleep, supporting systems consolidation theory.
  • 1997: Markram et al. discover spike-timing-dependent plasticity (STDP) in cortical neurons.
  • 2014: O'Keefe and the Mosers receive the Nobel Prize for discovering place cells and grid cells in the hippocampal formation.

5. Applications

Alzheimer's Disease

Understanding LTP/LTD mechanisms guides therapeutic targets. Amyloid-beta oligomers impair LTP and enhance LTD, explaining early memory deficits. NMDA receptor modulators (e.g., memantine) aim to restore synaptic plasticity balance.

Deep Learning

STDP-inspired local learning rules enable neuromorphic computing. Hopfield networks have been revitalized as modern Hopfield networks with exponentially larger capacity, connecting to transformer attention mechanisms.

Memory Prosthetics

Hippocampal prosthetics aim to restore memory formation by mimicking the input-output transformation of damaged hippocampal circuits. STDP principles guide stimulation protocols for memory enhancement.

Sleep and Learning

Consolidation theory explains why sleep deprivation impairs memory. Targeted memory reactivation during sleep (re-presenting learning-associated cues) enhances consolidation, with applications in education and therapy.

6. Computational Exploration

Learning and Memory: LTP/LTD, STDP, Hopfield Networks, and Consolidation


Chapter Summary

  • LTP and LTD are calcium-dependent: high Ca$^{2+}$ triggers LTP via CaMKII, moderate Ca$^{2+}$ triggers LTD via calcineurin.
  • BCM theory stabilizes Hebbian learning with a sliding threshold $\theta_M = \langle y^2 \rangle / y_0$.
  • STDP introduces temporal asymmetry: pre-before-post timing yields LTP, post-before-pre yields LTD.
  • Hopfield networks store memories as attractor states with capacity $P_{\max} \approx 0.138N$.
  • Systems consolidation transfers hippocampal memories to cortex via sleep replay, preventing catastrophic forgetting.

Video Lectures

Smell and Memory — Boston Children's Hospital

How We Learn to Read — Boston Children's Hospital

Practice Problems

Problem 1: STDP Weight Change Calculation

A presynaptic neuron fires at $t_{\text{pre}} = 100$ ms and the postsynaptic neuron fires at $t_{\text{post}} = 110$ ms. The STDP rule is $\Delta w = A_+\exp(-|\Delta t|/\tau_+)$ for $\Delta t > 0$ and $\Delta w = -A_-\exp(-|\Delta t|/\tau_-)$ for $\Delta t < 0$, with $A_+ = 0.01$, $A_- = 0.012$, $\tau_+ = 20$ ms, $\tau_- = 20$ ms. Calculate the weight change.

Solution:

1. Compute the spike timing difference:

$$\Delta t = t_{\text{post}} - t_{\text{pre}} = 110 - 100 = +10\;\text{ms}$$

2. Since $\Delta t > 0$ (pre-before-post), we use the LTP branch:

$$\Delta w = A_+ \exp\!\left(-\frac{|\Delta t|}{\tau_+}\right) = 0.01 \exp\!\left(-\frac{10}{20}\right)$$

3. Evaluate the exponential:

$$\Delta w = 0.01 \times e^{-0.5} = 0.01 \times 0.6065$$

4. The weight change is:

$$\boxed{\Delta w = +0.00607}$$

5. This positive weight change means the synapse is potentiated (LTP). The pre-before-post timing is causally consistent — the presynaptic spike may have contributed to the postsynaptic firing — so the synapse is strengthened. With the asymmetric $A_-/A_+ = 1.2$ ratio, net depression occurs when pre- and post-synaptic neurons fire at equal rates with random timing.
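A one-line numeric check of the calculation above:

```python
import math

A_plus, tau_plus = 0.01, 20.0        # tau in ms
dt = 110.0 - 100.0                   # +10 ms: pre-before-post, LTP branch
dw = A_plus * math.exp(-dt / tau_plus)
print(round(dw, 5))                  # 0.00607
```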

Problem 2: Hebbian Learning Rule Application

A network has 3 input neurons with activities $x = (0.8, 0.2, 0.5)$ and one output neuron with activity $y = 0.6$. The current weight vector is $w = (0.3, 0.4, 0.1)$. Using the basic Hebbian rule $\Delta w_i = \eta\, x_i\, y$ with learning rate $\eta = 0.1$, compute the updated weights.

Solution:

1. Apply the Hebbian rule $\Delta w_i = \eta\, x_i\, y$ for each synapse:

$$\Delta w_1 = 0.1 \times 0.8 \times 0.6 = 0.048$$
$$\Delta w_2 = 0.1 \times 0.2 \times 0.6 = 0.012$$
$$\Delta w_3 = 0.1 \times 0.5 \times 0.6 = 0.030$$

2. Update the weights:

$$w_1' = 0.3 + 0.048 = 0.348$$
$$w_2' = 0.4 + 0.012 = 0.412$$
$$w_3' = 0.1 + 0.030 = 0.130$$

3. The updated weight vector is:

$$\boxed{w' = (0.348,\; 0.412,\; 0.130)}$$

4. Note that the most active input ($x_1 = 0.8$) receives the largest weight increase, reinforcing the correlation between input and output.

5. The basic Hebbian rule has no mechanism for weight decrease, leading to unbounded growth. This motivates modifications like Oja's rule ($\Delta w_i = \eta\,y(x_i - y\,w_i)$) which normalizes weights and extracts the first principal component.
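The update (and the Oja variant mentioned in step 5) can be checked directly:

```python
import numpy as np

x = np.array([0.8, 0.2, 0.5])
w = np.array([0.3, 0.4, 0.1])
y, eta = 0.6, 0.1

w_new = w + eta * x * y              # basic Hebbian update
print(w_new)                         # approximately [0.348, 0.412, 0.130]

w_oja = w + eta * y * (x - y * w)    # Oja's rule: growth is self-limiting
print(w_oja)
```

The $-\eta\,y^2 w_i$ decay term in Oja's rule makes each update slightly smaller than the pure Hebbian one, which is what bounds the weights.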

Problem 3: BCM Theory Sliding Threshold

In the BCM model, the modification threshold is $\theta_M = \langle y^2 \rangle$, where $y$ is the postsynaptic activity. If the neuron's recent activity over 5 time steps is $y = (1.0, 2.0, 0.5, 1.5, 3.0)$, compute $\theta_M$ and determine whether a current response of $y = 1.8$ leads to potentiation or depression.

Solution:

1. Compute the sliding threshold as the time-averaged squared activity:

$$\theta_M = \langle y^2 \rangle = \frac{1}{5}(1.0^2 + 2.0^2 + 0.5^2 + 1.5^2 + 3.0^2)$$

2. Evaluate:

$$\theta_M = \frac{1 + 4 + 0.25 + 2.25 + 9}{5} = \frac{16.5}{5} = 3.3$$

3. The BCM weight change rule is $\Delta w \propto y(y - \theta_M) \cdot x$. With current $y = 1.8$:

$$y - \theta_M = 1.8 - 3.3 = -1.5 < 0$$

4. Since $y > 0$ but $y < \theta_M$, the product $y(y - \theta_M) < 0$:

$$\boxed{\Delta w \propto 1.8 \times (-1.5) = -2.7 \implies \text{Depression (LTD)}}$$

5. The sliding threshold ensures homeostatic stability: when the neuron has been very active (high $\theta_M$), moderate responses now cause depression, preventing runaway excitation. Potentiation requires $y > \theta_M = 3.3$ — only strong responses are reinforced.
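A numeric check of the threshold computation:

```python
import numpy as np

y_history = np.array([1.0, 2.0, 0.5, 1.5, 3.0])
theta_M = np.mean(y_history**2)      # sliding threshold <y^2> (y0 = 1)
y = 1.8
change = y * (y - theta_M)           # sign determines LTP vs LTD
print(theta_M, round(change, 2))     # 3.3 -2.7
```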

Problem 4: Hopfield Network Storage Capacity

A Hopfield network has $N = 200$ neurons. Using the theoretical capacity limit, how many random binary patterns can be reliably stored? If 30 patterns are stored, estimate the probability of a single-bit retrieval error.

Solution:

1. If every bit of every stored pattern must be recalled essentially without error, the number of storable patterns is bounded by

$$p_{\max} = \frac{N}{2\ln N}$$

2. For $N = 200$:

$$\boxed{p_{\max} = \frac{200}{2\ln 200} = \frac{200}{2 \times 5.30} \approx 18.9, \;\text{i.e. about } 18\;\text{patterns}}$$

3. The rule-of-thumb capacity from the signal-to-noise analysis is $p_{\max} \approx 0.138N \approx 27.6$ patterns; this limit tolerates a small, stable fraction of bit errors in the retrieved states rather than requiring perfect recall.

4. For $p = 30$ stored patterns, the load parameter is $\alpha = p/N = 30/200 = 0.15$. The single-bit error probability from the signal-to-noise analysis is:

$$P_{\text{error}} = \frac{1}{2}\text{erfc}\!\left(\frac{1}{\sqrt{2\alpha}}\right) = \frac{1}{2}\text{erfc}\!\left(\frac{1}{\sqrt{0.3}}\right) \approx \frac{1}{2}\text{erfc}(1.826) \approx 0.005$$

5. At $\alpha = 0.15 > 0.138$, the network is above the critical capacity. Although the initial single-bit error probability is only about 0.5%, the probability of retrieving the entire pattern with no bit errors is $(1 - 0.005)^{200} \approx 0.37$, and above $\alpha_c \approx 0.138$ errors compound during the recall dynamics until the retrieved state loses its resemblance to the stored pattern. This sharp transition at $\alpha_c$ is the storage catastrophe.
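The erfc expression can be evaluated directly with the standard library:

```python
import math

alpha = 30 / 200                                   # load parameter
p_err = 0.5 * math.erfc(1 / math.sqrt(2 * alpha))  # single-bit error prob.
p_full = (1 - p_err) ** 200                        # no bit errors in N = 200
print(round(p_err, 4), round(p_full, 2))
```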

Problem 5: Net STDP Effect from Spike Pair Interactions

Pre and post neurons both fire as Poisson processes at rate $\nu = 10$ Hz. Using the STDP kernel $W(\Delta t) = A_+e^{-\Delta t/\tau_+}$ for $\Delta t > 0$ and $W(\Delta t) = -A_-e^{+\Delta t/\tau_-}$ for $\Delta t < 0$, with $A_+ = 0.005$, $A_- = 0.00525$, $\tau_+ = \tau_- = 20$ ms, find the net weight change rate $\dot{w}$.

Solution:

1. For uncorrelated Poisson processes, pre–post spike pairs with timing difference in $[\Delta t, \Delta t + d(\Delta t)]$ occur at rate $\nu^2\, d(\Delta t)$ per unit time. The mean weight change per unit time therefore integrates the STDP kernel over all timing differences:

$$\dot{w} = \nu^2 \int_{-\infty}^{\infty} W(\Delta t)\,d(\Delta t)$$

2. Evaluate the integral for each branch:

$$\int_0^{\infty} A_+ e^{-\Delta t/\tau_+}\,d(\Delta t) = A_+\tau_+ = 0.005 \times 0.020 = 1.0 \times 10^{-4}\;\text{s}$$
$$\int_{-\infty}^{0} (-A_-) e^{\Delta t/\tau_-}\,d(\Delta t) = -A_-\tau_- = -0.00525 \times 0.020 = -1.05 \times 10^{-4}\;\text{s}$$

3. Sum the contributions:

$$\int_{-\infty}^{\infty} W(\Delta t)\,d(\Delta t) = (1.0 - 1.05) \times 10^{-4} = -0.05 \times 10^{-4}\;\text{s}$$

4. Multiply by $\nu^2 = (10)^2 = 100$ Hz$^2$:

$$\boxed{\dot{w} = 100 \times (-5 \times 10^{-6}) = -5 \times 10^{-4}\;\text{s}^{-1}}$$

5. The net effect is slow depression, because $A_-\tau_- > A_+\tau_+$. This ensures that uncorrelated firing leads to synaptic weakening, providing a natural homeostatic mechanism. Only causally correlated spike pairs (where pre-before-post timing is statistically enriched) can overcome this depression bias to produce net potentiation.
