
Neuroscience of Music Perception

How does a pressure wave in air become a Beethoven symphony in your mind? This module traces the complete neural pathway from the outer ear to the motor system, explores the brain as a prediction machine that rewards itself for anticipating musical events, and reveals why music triggers dopamine, chills, and tears.

1. The Auditory Pathway

Sound travels through eight distinct processing stages before it becomes a conscious musical experience. Each stage performs a specific mathematical transformation. Click any card to expand its full detail.

Sound to Perception: 8 Stages of Auditory Processing

1. Outer Ear
2. Middle Ear
3. Cochlea
4. Auditory Nerve
5. Brainstem
6. Auditory Cortex
7. Limbic System
8. Motor System

2. The Basilar Membrane

The cochlea performs a biological Fourier transform. The basilar membrane maps frequency to position via the Greenwood function:

\( f(x) = 165.4\bigl(10^{0.06\,x} - 0.88\bigr) \)

where \(x\) is the distance from the apex in mm (0 mm = apex, low frequencies; 35 mm = base, high frequencies).
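A minimal Python sketch of this map and its inverse, using the constants given above (the example notes are arbitrary):

```python
import numpy as np

def greenwood(x_mm):
    """Greenwood (1990) frequency-position map for the human cochlea.
    x_mm: distance from the apex in mm (0 = apex, ~35 = base)."""
    return 165.4 * (10.0 ** (0.06 * x_mm) - 0.88)

def greenwood_inverse(f_hz):
    """Position (mm from apex) of the peak excitation for a frequency."""
    return np.log10(f_hz / 165.4 + 0.88) / 0.06

print(f"f(0)  = {greenwood(0.0):8.1f} Hz  (apex)")
print(f"f(35) = {greenwood(35.0):8.1f} Hz  (base, ~20 kHz)")
for f in [261.63, 440.0, 1000.0, 4000.0]:   # C4, A4, 1 kHz, 4 kHz
    print(f"{f:8.2f} Hz -> {greenwood_inverse(f):5.2f} mm from apex")
```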

Click a note or chord below to hear it and see the excitation pattern on the unrolled membrane. Notice how closer frequencies create overlapping patterns (roughness/dissonance).

[Interactive diagram: the unrolled basilar membrane, from base (high frequencies, ~20 kHz) down to apex (low frequencies, ~200 Hz). Click a note or chord to see its excitation pattern.]

Single Notes (C4 to C5)

Chords (multi-peak excitation)

Key insight: When two frequencies are close together (like C4 and C#4), their excitation patterns overlap on the basilar membrane. This overlap causes the sensation of roughness or beating, which the brain interprets as dissonance. The critical bandwidth (roughly a minor third in the mid range) defines the frequency separation below which two tones excite overlapping regions and sound rough, rather than being resolved as two smooth, separate pitches.
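A toy sketch of this idea: map two frequencies to membrane positions with the inverse Greenwood function above, then score the overlap of their excitation peaks. The Gaussian overlap and the ~1.5 mm spread are illustrative assumptions standing in for the critical bandwidth, not a calibrated roughness model:

```python
import numpy as np

def position_mm(f_hz):
    """Inverse Greenwood map: mm from apex of the excitation peak."""
    return np.log10(f_hz / 165.4 + 0.88) / 0.06

def overlap_roughness(f1, f2, spread_mm=1.5):
    """Toy roughness index: Gaussian overlap of two excitation peaks.
    spread_mm is an assumed stand-in for the critical bandwidth."""
    d = abs(position_mm(f1) - position_mm(f2))
    return np.exp(-(d / spread_mm) ** 2)

c4 = 261.63
for name, semitones in [("C4+C#4 (minor 2nd)", 1), ("C4+D4 (major 2nd)", 2),
                        ("C4+E4 (major 3rd)", 4), ("C4+G4 (perfect 5th)", 7)]:
    r = overlap_roughness(c4, c4 * 2 ** (semitones / 12))
    print(f"{name:20s} roughness = {r:.3f}")
```

Wider intervals land farther apart on the membrane, so the overlap score falls off, matching the ordering of perceived roughness described above.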

3. The Predictive Brain

The brain is not a passive receiver of sound. Following the framework of Karl Friston (free energy principle) and rooted in Helmholtz's idea of "unconscious inference," the auditory cortex is a prediction machine that constantly generates expectations about the next musical event and forwards only the prediction error to higher levels.

Core Equations of Predictive Coding

Prediction error: \( \varepsilon = s - \mu \) where \( s \) is the sensory input and \( \mu \) is the top-down prediction.

Free energy (to be minimised): \( F = \sum_i \left( \frac{\varepsilon_i^2}{2\sigma_i^2} + \ln \sigma_i \right) \)

Dopamine response: \( \Delta D \propto |\varepsilon| \times S \) where \( S \) is salience. But crucially, surprise must be learnable to be rewarding.
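A numeric toy illustrating the three equations above for a listener expecting A4 = 440 Hz; the prediction, its uncertainty, and the salience value are invented for illustration:

```python
import numpy as np

# Toy numbers for the three equations above: a listener predicts the
# next pitch mu with uncertainty sigma; sensory input s arrives.
mu, sigma = 440.0, 5.0      # top-down prediction and its uncertainty
salience = 1.0              # assumed salience weighting S

for s in [440.0, 446.0, 480.0]:   # expected / mildly / very surprising
    eps = s - mu                                   # prediction error
    F = eps**2 / (2 * sigma**2) + np.log(sigma)    # free-energy term
    dopamine = abs(eps) * salience                 # Delta D proxy
    print(f"s={s:6.1f}  eps={eps:+6.1f}  F={F:8.3f}  ~Delta D={dopamine:5.1f}")
```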

Click each harmonic event below to hear the chord transition and see the prediction error and reward dynamics. Notice: maximum reward comes not from zero surprise (boring) or maximum surprise (noise), but from the sweet spot where surprise is high but comprehensible.

The Wundt Curve: Berlyne (1971) revived Wundt's inverted-U relationship between stimulus complexity and hedonic value. Too simple = boring, too complex = aversive, optimal complexity = maximally rewarding. The predictive coding framework provides the neural mechanism: reward peaks when prediction error is large enough to trigger dopamine but small enough to be rapidly resolved by model updating.

4. Emotion & Reward

Music is one of the most potent activators of the brain reward system. These four findings reveal why a sequence of pressure waves can make you cry, dance, or shiver.

Music & Language: Shared Neural Architecture

Music and language are the two most complex auditory abilities unique to humans. They share surprising neural overlap — and illuminating differences.

LANGUAGE: Broca area (syntax), Wernicke area (semantics), left planum temporale, phonological processing.

MUSIC: right auditory cortex (pitch), right planum temporale, tonal contour processing, harmonic structure.

SHARED: auditory cortex, working memory, syntax processing, basal ganglia (timing).

Both domains require hierarchical structure, temporal prediction, and emotional processing.

Shared Syntactic Integration Resource Hypothesis (SSIRH)

Patel (2003) proposed that music and language share syntactic processing resources in the inferior frontal gyrus (Broca area). Musically unexpected chords (e.g., Neapolitan sixth in C major) cause the same ERP component (ERAN — Early Right Anterior Negativity) as syntactically unexpected words. Musical training enhances this shared resource, improving grammatical processing in both domains.

The Musicophilia Spectrum

Amusia (tone-deafness) affects ~4% of the population. Congenital amusics cannot distinguish pitch differences smaller than 2 semitones. Remarkably, their language prosody is often preserved — suggesting partially independent pitch-processing pathways. On the other end, absolute pitch (~1 in 10,000) involves categorical pitch perception: each frequency maps to a named note, as effortlessly as colour maps to a name. Both conditions are partially genetic.

Musical Training Effects on the Brain

Professional musicians are the neuroscientist's favourite subjects. Years of intensive practice produce measurable structural and functional brain changes:

  • Corpus Callosum: 10-15% larger anterior section. Bimanual coordination requires massive interhemispheric communication. (Schlaug et al., 1995)
  • Auditory Cortex: 130% more grey matter in Heschl gyrus. Thousands of hours of fine-grained pitch discrimination. (Schneider et al., 2002)
  • Motor Cortex: expanded hand representation (homunculus). Fine motor skills: a pianist executes ~1,800 notes/minute in a Liszt etude. (Elbert et al., 1995)
  • Cerebellum: 5% larger volume. Rhythm, timing, and motor sequence coordination. (Hutchinson et al., 2003)
  • Planum Temporale: stronger leftward asymmetry. Enhanced frequency discrimination and absolute pitch. (Keenan et al., 2001)
  • Arcuate Fasciculus: larger and more myelinated. Stronger connection between auditory and motor regions. (Halwani et al., 2011)
  • Hippocampus: larger volume in older musicians. Music as cognitive reserve against age-related decline. (Hanna-Pladdy & MacKay, 2011)
  • Prefrontal Cortex: enhanced executive function networks. Working memory, attentional control, and planning during performance. (Zuk et al., 2014)
  • Broca Area: enhanced syntactic processing. Hierarchical structure processing shared with language (OPERA hypothesis). (Patel, 2011)

Critical Period & Sensitive Period

Musicians who begin training before age 7 show significantly larger structural brain changes than those who start later — even when total years of training are matched. This suggests a sensitive period for music-driven neuroplasticity, analogous to the critical period for language acquisition (Lenneberg, 1967). The sensitive period is not absolute: adults can still learn music and show brain changes, but the magnitude is smaller.

Absolute pitch provides the clearest evidence: virtually all absolute pitch possessors began training before age 6. After age 12, absolute pitch acquisition becomes extremely rare regardless of training intensity. The genetic component (familial clustering) interacts with early exposure — genes load the gun, but early training pulls the trigger.

The Wundt Curve: Optimal Complexity

The German psychologist Wilhelm Wundt (1874) proposed an inverted-U relationship between stimulus complexity and hedonic value. Applied to music (Berlyne, 1971): pieces that are too simple (low information content) are boring; pieces that are too complex (high information content) are perceived as noise. Maximum pleasure occurs at an intermediate complexity level that matches the listener's current model — complex enough to generate prediction errors, but structured enough for those errors to be learnable.

[Figure: the Wundt curve, an inverted U. X-axis: stimulus complexity / information content; Y-axis: pleasure / hedonic value. The curve rises from BORING (too predictable) to an optimal sweet spot, then falls to NOISE (too unpredictable). Example stimuli along the axis: nursery rhyme, Mozart, late Beethoven, Stravinsky, free noise.]

Crucially, the optimal point shifts with expertise. A trained musician finds a Mozart sonata too predictable (understimulating), while a Schoenberg piece that seems like noise to a novice reveals its structure after study. The Wundt curve is not fixed — it moves rightward with musical training, as the brain's predictive model becomes more sophisticated and can extract regularity from increasingly complex stimuli.

This connects directly to the information theory of Module 5: the optimal point on the Wundt curve corresponds to the complexity level that maximizes \(\Delta D \propto \text{prediction error} \times \text{learnability}\). Shannon entropy alone is insufficient — random noise has maximum entropy but zero musical value. The key is structured complexity: high entropy at the surface, but with learnable deep structure.
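A minimal sketch of this reward model. The functional forms for prediction error and learnability below are illustrative assumptions chosen only to produce an inverted U; scaling complexity by expertise shifts the peak rightward, as described above:

```python
import numpy as np

def wundt_reward(complexity, expertise=1.0):
    """Toy inverted-U: reward = prediction error x learnability.
    The functional forms are illustrative assumptions only."""
    c = complexity / expertise          # expertise rescales complexity
    prediction_error = 1 - np.exp(-c)   # grows with complexity, saturates
    learnability = np.exp(-c**2 / 2)    # collapses when complexity outruns the model
    return prediction_error * learnability

cs = np.linspace(0.0, 4.0, 81)
novice = wundt_reward(cs, expertise=1.0)
expert = wundt_reward(cs, expertise=2.0)
print(f"novice peak at complexity {cs[np.argmax(novice)]:.2f}")
print(f"expert peak at complexity {cs[np.argmax(expert)]:.2f}  (shifted right)")
```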

Music Therapy: Clinical Applications

The neuroscience of music perception has direct clinical applications. Music activates distributed brain networks — auditory, motor, emotional, memory — making it a uniquely powerful therapeutic tool:

  • Parkinson Disease: Rhythmic Auditory Stimulation (RAS). An external beat bypasses damaged basal ganglia timing circuits; gait velocity improves 10-15% and stride length increases 12%. (Thaut et al., 1996; McIntosh et al., 1997)
  • Stroke Aphasia: Melodic Intonation Therapy (MIT). Singing activates right-hemisphere homologues of left-hemisphere language areas; patients who cannot speak can often sing, then gradually shift from singing to speaking. (Schlaug et al., 2010)
  • Alzheimer Disease: Familiar Music Listening. Musical memories are stored in medial prefrontal cortex and cerebellum, among the last regions affected by Alzheimer; late-stage patients who cannot recognize family members can still sing songs from their youth. (Jacobsen et al., 2015)
  • Depression & Anxiety: Active Music-Making / Guided Imagery. Music modulates cortisol, oxytocin, and serotonin; group drumming reduces cortisol by 28% and increases natural killer cell activity, and listening to self-selected music reduces anxiety scores by 65% in pre-surgical patients. (Bittman et al., 2001; Nilsson, 2008)
  • Chronic Pain: Music-Assisted Relaxation. Music activates descending pain-inhibition pathways via the periaqueductal gray; preferred music reduces pain ratings by 21% and opioid consumption by 38% post-surgery. (Hole et al., 2015, systematic review)
  • Autism Spectrum: Improvisational Music Therapy. Musical interaction provides a non-verbal communication channel; shared rhythm and turn-taking in improvisation develop social reciprocity, joint attention, and emotional expression. (Geretsegger et al., 2014, Cochrane review)

Python: Auditory Perception Models

This simulation models the Greenwood tonotopic function, the Wundt curve, and auditory nerve firing patterns for different musical stimuli.

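The original script.py is not reproduced here. The Greenwood map and the Wundt curve are sketched in the snippets above; what follows is a minimal sketch of the remaining ingredient, an auditory-nerve firing model. The rate-coding tuning curve and Poisson spiking below are illustrative assumptions, not the module's actual code:

```python
import numpy as np

rng = np.random.default_rng(0)

def firing_rate(f_stim, cf, max_rate=200.0, spont=5.0, q=8.0):
    """Toy rate-coding tuning curve for one auditory nerve fiber.
    cf = characteristic frequency; tuning is Gaussian in log-frequency.
    max_rate, spont, and q are illustrative parameters."""
    octaves = np.log2(f_stim / cf)
    return spont + max_rate * np.exp(-(q * octaves) ** 2)

def spike_count(rate_hz, duration_s=0.1):
    """Sample a spike count from a homogeneous Poisson process."""
    return rng.poisson(rate_hz * duration_s)

cfs = 440.0 * 2.0 ** np.linspace(-1, 1, 9)   # fibers tuned from 220 to 880 Hz
for cf in cfs:
    r = firing_rate(440.0, cf)               # population response to an A4 tone
    print(f"CF {cf:6.1f} Hz  rate {r:6.1f} sp/s  spikes/100 ms: {spike_count(r)}")
```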

The Bayesian Brain & Music

The most powerful framework for understanding music perception is the Bayesian brain hypothesis: the brain is not a passive receiver of sensory input but an active prediction machine that continuously generates probabilistic models of the world and updates them when predictions fail.

Bayes' Theorem as the Brain's Operating System

At every moment, the auditory cortex maintains a prior distribution over what it expects to hear next, based on everything it has learned about musical structure. When a new sound arrives, the brain computes the posterior via Bayes' rule:

\( P(\text{hypothesis} \mid \text{data}) = \frac{P(\text{data} \mid \text{hypothesis}) \cdot P(\text{hypothesis})}{P(\text{data})} \)

posterior = likelihood × prior / evidence

In musical terms: the prior is what you expect (the next chord in a progression), the likelihood is how well the actual sound matches that expectation, and the posterior is your updated belief after hearing it. The prediction error \(\varepsilon = s - \mu\) drives learning and emotional response.
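A minimal sketch of one such update after a dominant chord in C major. The candidate chords, prior, and likelihood values are invented for illustration:

```python
import numpy as np

# Which chord follows V in C major? Candidates, prior, and likelihood
# are invented illustration values, loosely shaped like tonal grammar.
chords = ["I", "vi", "IV", "bVI"]
prior = np.array([0.70, 0.15, 0.10, 0.05])          # P(hypothesis)
likelihood = np.array([0.05, 0.80, 0.10, 0.05])     # P(data | hypothesis):
                                                    # the ear reports vi-like pitches
posterior = likelihood * prior
posterior /= posterior.sum()                        # divide by evidence P(data)

for c, pr, po in zip(chords, prior, posterior):
    print(f"{c:>4}: prior {pr:.2f} -> posterior {po:.2f}")
# A deceptive cadence: posterior mass shifts from I to vi.
```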

Predictive Coding Hierarchy

Karl Friston's free energy principle extends Bayesian inference to a hierarchical predictive coding framework. The brain minimizes variational free energy \(F \geq -\ln P(\text{data})\) — an upper bound on surprise. Applied to music:

Level 3: Musical form (sonata, ABA, verse-chorus); timescale 5-60 seconds
  ↓ predictions down / errors up ↑
Level 2: Harmonic context (key, chord progression); timescale 200 ms - 5 s
  ↓ predictions down / errors up ↑
Level 1: Acoustic features (pitch, timbre, onset); timescale 10-50 ms

Higher levels send top-down predictions to lower levels; lower levels send bottom-up prediction errors to higher levels. Music manipulates this hierarchy: a deceptive cadence generates a prediction error at Level 2 (harmonic) while Level 3 (form) may have predicted the surprise. The interaction between levels creates the rich emotional landscape of musical experience.

Statistical Learning in Music

Infants as young as 8 months can extract statistical regularities from tone sequences after just 2 minutes of exposure (Saffran et al., 1999). The brain builds implicit probabilistic models of musical structure without conscious effort. By age 5, children have internalized the basic harmonic grammar of their culture. This implicit statistical learning is the mechanism behind the priors that drive Bayesian inference in adult music listening.

Information-Theoretic Surprise

The Bayesian framework connects directly to Shannon information theory. The surprisal of an event is \(I(x) = -\log_2 P(x)\) bits — the information content of hearing chord \(x\) given your current model. A V→I cadence carries ~0.5 bits (highly expected); a chromatic mediant carries ~3-4 bits (surprising but learnable); a random atonal cluster carries ~6+ bits (noise). The emotional response maps onto this information-theoretic quantity via the dopamine system.
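The conversion between probability and bits is direct; in this sketch the probabilities are back-calculated from the bit values quoted above, so they are illustrative only:

```python
import numpy as np

# Surprisal I(x) = -log2 P(x). Probabilities are back-calculated from
# the bit values quoted above, so they are illustrative only.
events = {"V -> I cadence":        0.71,    # ~0.5 bits
          "chromatic mediant":     0.08,    # ~3.6 bits
          "random atonal cluster": 0.015}   # ~6 bits

for name, p in events.items():
    print(f"{name:22s} P = {p:.3f}  surprisal = {-np.log2(p):4.1f} bits")
```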

Huron's ITPRA Theory

David Huron's ITPRA model (2006) decomposes the emotional response to a musical event into five temporally ordered components:

  • I = Imagination (before the event)
  • T = Tension (just before)
  • P = Prediction (at the event)
  • R = Reaction (immediate)
  • A = Appraisal (after)

The P (Prediction) component is the Bayesian core: the brain rewards itself for correct predictions and generates a penalty signal for prediction errors. But the R (Reaction) component is pre-cognitive — the brainstem responds to sudden loud sounds or dissonance before the cortex has time to analyze them. The full emotional response to music is the sum of all five components, unfolding over 0.1 to 5 seconds.

Bayesian Modeling of Musical Expectations

Based on Leistikow, R.J. (2006). Bayesian Modeling of Musical Expectations via Maximum Entropy Stochastic Grammars. Ph.D. dissertation, Stanford University. Advisor: Jonathan Berger.

The Leistikow dissertation presents a rigorous computational framework for modeling how listeners form, update, and violate musical expectations — using dynamic Bayesian networks and maximum entropy distributions. The core insight: musical style can be encoded as a set of parameterized rules (e.g., "a large upward interval tends to be followed by a smaller downward interval"), and the maximum entropy rate distribution satisfying those rules is the uniquely correct choice for inference, because it encodes everything known while carefully avoiding any unintended bias.

Dynamic Bayesian Networks for Melody

A melody is modeled as a first-order Markov chain of notes \(N_1, N_2, \ldots, N_K\), where each note depends on its predecessor. The joint distribution factors as \(P(N_{1:K}) = P(N_1)\prod_{i=2}^{K} P(N_i \mid N_{i-1})\). Adding a hidden state \(S_i\) (musical mode, active rule, harmony) creates an autoregressive hidden Markov model (AR-HMM) that fuses bottom-up (data-driven) and top-down (schema-driven) processes — exactly matching the dual-process architecture proposed by Narmour and validated by Krumhansl.
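A minimal sketch of the Markov part of this model: estimate a transition matrix from a tiny invented corpus (with add-one smoothing), then report the surprisal of each transition in a test melody. The pentatonic note set and corpus are illustrative assumptions, not the dissertation's data:

```python
import numpy as np

# A tiny first-order Markov model of melody over a pentatonic note set.
# The "corpus" melody is invented for illustration.
notes = ["C", "D", "E", "G", "A"]
idx = {n: i for i, n in enumerate(notes)}

corpus = "C D E G E D C D E D C A C D E G A G E D C".split()
T = np.ones((5, 5))                    # add-one smoothing: no zero probabilities
for a, b in zip(corpus, corpus[1:]):
    T[idx[a], idx[b]] += 1
T /= T.sum(axis=1, keepdims=True)      # rows are P(next | current)

melody = "C D E G C A".split()
for a, b in zip(melody, melody[1:]):
    s = -np.log2(T[idx[a], idx[b]])    # surprisal of each transition
    print(f"{a} -> {b}: {s:4.2f} bits")
```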

Maximum Entropy Rate Principle

Music theory rules are inherently incomplete — they say "should" and "tends to" but never give exact probabilities. The solution: encode each rule as a linear constraint on the transition matrix, then maximize the entropy rate \(H_r = -\sum_{k,l} \mu_k T_{k,l} \log_2 T_{k,l}\) subject to those constraints. This yields the distribution that is "as uniform as possible given the rules" — encoding everything known while assuming nothing else. The asymptotic equipartition property (AEP) guarantees this maximizes the number of typical musical sequences.
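The full entropy-rate maximization couples every row of \(T\) through the stationary distribution \(\mu\); as a simplified sketch, the following maximizes the entropy of a single row \(P(\text{next} \mid \text{current})\) under one rule-style constraint, a target expected interval size. The maximum-entropy solution then takes the exponential (Gibbs) form, and the multiplier is found by bisection. The interval range and target value are illustrative assumptions:

```python
import numpy as np

# Simplified maximum-entropy sketch: one row P(next | current), expressed
# over signed intervals, constrained to a target mean absolute interval.
intervals = np.arange(-7, 8)           # candidate moves: -7..+7 semitones

def maxent_row(target_mean_abs, lo=-5.0, hi=5.0, iters=60):
    """Max-entropy distribution with E[|interval|] = target_mean_abs.
    The solution has Gibbs form P(k) ~ exp(-lam * |k|); lam is found
    by bisection (mean |interval| decreases as lam grows)."""
    def mean_abs(lam):
        w = np.exp(-lam * np.abs(intervals))
        return (w / w.sum()) @ np.abs(intervals)
    for _ in range(iters):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if mean_abs(mid) > target_mean_abs else (lo, mid)
    w = np.exp(-lo * np.abs(intervals))
    return w / w.sum()

p = maxent_row(2.0)                    # rule: "prefer steps over leaps"
print("P(step of +1):", round(p[intervals.tolist().index(1)], 3))
print("P(leap of +7):", round(p[intervals.tolist().index(7)], 3))
print("entropy:", round(-(p * np.log2(p)).sum(), 2), "bits")
```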

Surprisal & Information-Theoretic Listening

At each note, the system computes a predictive distribution \(P(N_{i+1} \mid n_{1:i})\) and, after observing \(n_{i+1}\), measures the surprisal: \(-\log_2 P(n_{i+1} \mid n_{1:i})\) bits. High surprisal = unexpected note = strong emotional response. The entropy of the predictive distribution measures the uncertainty of the expectation. The "Shave and a Haircut" example in the dissertation shows how F# following G generates 6.6 bits of surprise, while the expected G following F# generates only 0.7 bits.

Inferring Rule Activation & Violation

The hidden state can be a switching variable selecting which rule governs each note transition. Bayesian filtering computes \(P(R_i \mid x_{1:i})\) — the posterior probability of each rule being active at time \(i\). This reveals which musical “forces” (gravity, magnetism, inertia) are responsible for each note, and identifies moments of surprise as rule violations. The dissertation encodes Larson's musical forces: gravity (notes above stable pitches descend), magnetism (unstable notes resolve to nearest stable pitch, with inverse-square distance), and inertia (melodies continue in the same direction).

The Hierarchical Model: Harmony, Meter & Beat Position

Chapter 7 of the dissertation extends the basic model to include hidden variables for meter \(L_i\), beat position \(B_i\), harmony \(H_i\), and note duration \(D_i\). Two key musical tendencies are encoded:

  1. Chord changes occur more frequently on strong beats than on weak beats
  2. Notes on strong beats are more likely to be chord members than notes on weak beats

Bayesian inference inverts these generative relationships: from the sequence of observed notes, the system infers harmony, meter, and beat position simultaneously. Applied to Bach's Fugue in A minor (BWV 543), the system demonstrates “foot-tapping” — gradually locking onto the correct beat position as evidence accumulates, then retrospectively sharpening its earlier estimates via backward smoothing.
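A toy version of this foot-tapping inference, reduced to tendency 1 from the list above: a forward filter over the four possible downbeat phases, updated by where chord changes fall. The change probabilities are invented, and this is far simpler than the dissertation's full dynamic Bayesian network:

```python
import numpy as np

# Toy "foot-tapping": infer the downbeat phase of a 4-beat meter from
# where chord changes fall. The change probabilities are invented.
P_STRONG, P_WEAK = 0.6, 0.1   # P(chord change | strong beat / weak beat)

def filter_phase(changes):
    """Forward (Bayesian) filter: posterior over the 4 downbeat phases."""
    post = np.full(4, 0.25)                  # uniform prior
    for t, c in enumerate(changes):
        for phase in range(4):
            strong = (t - phase) % 4 == 0    # is beat t a downbeat here?
            p = P_STRONG if strong else P_WEAK
            post[phase] *= p if c else (1 - p)
        post /= post.sum()                   # normalise at each step
    return post

# Chord changes actually fall on beats 2, 6, 10, 14 (true phase = 2).
changes = [1 if t % 4 == 2 else 0 for t in range(16)]
print("posterior over phases 0-3:", np.round(filter_phase(changes), 3))
```

As evidence accumulates, the posterior locks onto the correct phase, mirroring the gradual beat-finding behaviour described above.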

Fusing Symbolic & Signal Layers

Chapter 8 shows how the symbolic expectation models can be seamlessly integrated with audio signal processing. The signal layer extracts STFT peaks and segments the audio into note events; the symbolic layer encodes musical expectations about note transitions. At "note activation frames" (detected onsets), the full expectation hierarchy is activated. Between onsets, all symbolic variables are "latched" — memorizing their values. This creates a system where musical knowledge improves signal processing (resolving octave ambiguities, for example) and signal features inform musical inference (aspects of performance practice not present in any symbolic score).

Python: Bayesian Musical Expectations

This simulation implements the core framework from Leistikow (2006): a first-order Markov model of melody, surprisal computation at each note, maximum entropy rate transition distributions under musical constraints, and a comparison of rule-constrained vs data-driven expectations.

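The original script.py is not included; the Markov/surprisal and maximum-entropy pieces are sketched above. The following minimal sketch covers the remaining piece, the comparison of rule-constrained vs data-driven expectations. Both models, the corpus, the test melody, and the exp(-0.5 |interval|) rule weighting are invented stand-ins for the dissertation's much richer framework:

```python
import numpy as np

# Rule-constrained vs data-driven expectation, compared on one melody.
notes = list(range(60, 73))               # MIDI C4..C5

def rule_model(current):
    """Rule-constrained: P(next) ~ exp(-0.5 |interval|), the Gibbs form
    of a 'prefer small intervals' maximum-entropy constraint."""
    w = np.array([np.exp(-0.5 * abs(n - current)) for n in notes])
    return w / w.sum()

def data_model(corpus):
    """Data-driven: add-one-smoothed bigram counts from a corpus."""
    T = np.ones((len(notes), len(notes)))
    for a, b in zip(corpus, corpus[1:]):
        T[notes.index(a), notes.index(b)] += 1
    return T / T.sum(axis=1, keepdims=True)

corpus = [60, 62, 64, 65, 67, 65, 64, 62, 60, 64, 67, 72, 67, 64, 60]
T = data_model(corpus)

melody = [60, 62, 64, 72, 71, 60]         # contains an unexpected leap
for a, b in zip(melody, melody[1:]):
    s_rule = -np.log2(rule_model(a)[notes.index(b)])
    s_data = -np.log2(T[notes.index(a), notes.index(b)])
    print(f"{a} -> {b}: rule {s_rule:5.2f} bits | data {s_data:5.2f} bits")
```

The two surprisal traces diverge exactly where rule and corpus disagree, which is the comparison the original script is described as performing.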

References

  • Salimpoor, V. N. et al. (2011). Anatomically distinct dopamine release during anticipation and experience of peak emotion to music. Nature Neuroscience, 14(2), 257-262.
  • Blood, A. J. et al. (1999). Emotional responses to pleasant and unpleasant music correlate with activity in paralimbic brain regions. Nature Neuroscience, 2(4), 382-387.
  • Patel, A. D. (2011). Why would musical training benefit the neural encoding of speech? The OPERA hypothesis. Frontiers in Psychology, 2, 142.
  • Huron, D. (2011). Why is sad music pleasurable? A possible role for prolactin. Musicae Scientiae, 15(2), 146-158.
  • Friston, K. (2010). The free-energy principle: a unified brain theory? Nature Reviews Neuroscience, 11(2), 127-138.
  • Greenwood, D. D. (1990). A cochlear frequency-position function for several species. JASA, 87(6), 2592-2605.
  • Zatorre, R. J. & Salimpoor, V. N. (2013). From perception to pleasure: Music and its neural substrates. PNAS, 110(Supplement 2), 10430-10437.
  • Berlyne, D. E. (1971). Aesthetics and Psychobiology. Appleton-Century-Crofts.
  • Jeffress, L. A. (1948). A place theory of sound localization. Journal of Comparative and Physiological Psychology, 41(1), 35-39.
  • Koelsch, S. (2014). Brain correlates of music-evoked emotions. Nature Reviews Neuroscience, 15(3), 170-180.
  • Patel, A. D. (2003). Language, music, syntax, and the brain. Nature Neuroscience, 6(7), 674-681.
  • Schlaug, G. et al. (1995). Increased corpus callosum size in musicians. Neuropsychologia, 33(8), 1047-1055.
  • Saffran, J. R. et al. (1999). Statistical learning of tone sequences by human infants and adults. Cognition, 70(1), 27-52.
  • Huron, D. (2006). Sweet Anticipation: Music and the Psychology of Expectation. MIT Press.
  • Thaut, M. H. et al. (1996). Rhythmic auditory stimulation in gait training for Parkinson disease patients. Movement Disorders, 11(2), 193-200.
  • Schlaug, G. et al. (2010). From singing to speaking: facilitating recovery from nonfluent aphasia. Future Neurology, 5(5), 657-665.
  • Jacobsen, J. H. et al. (2015). Why musical memory can be preserved in advanced Alzheimer disease. Brain, 138(8), 2438-2450.
  • Leistikow, R. J. (2006). Bayesian Modeling of Musical Expectations via Maximum Entropy Stochastic Grammars. Ph.D. dissertation, Stanford University. Advisor: Jonathan Berger.
  • Meyer, L. B. (1956). Emotion and Meaning in Music. Chicago: University of Chicago Press.
  • Narmour, E. (1990). The Analysis and Cognition of Basic Melodic Structures: The Implication-Realization Model. Chicago: University of Chicago Press.
  • Larson, S. (2004). Musical forces and melodic expectation. Music Perception, 21(4), 457-498.