Part III: Galaxies & Cosmology | Chapter 4

Large-Scale Structure

The cosmic web: the matter power spectrum, baryon acoustic oscillations, correlation functions, and N-body simulations of structure formation

Overview

On scales of tens to hundreds of megaparsecs, the Universe displays a remarkable filamentary structure — the "cosmic web" — consisting of galaxy clusters connected by filaments of galaxies, with vast voids in between. This structure grew from tiny primordial density fluctuations (\(\delta\rho/\rho \sim 10^{-5}\)) imprinted during cosmic inflation. The statistical properties of this structure, quantified by the power spectrum and correlation functions, encode fundamental cosmological information.

In this chapter we derive the matter power spectrum from the growth of perturbations, explain baryon acoustic oscillations as a standard ruler, develop the two-point correlation function formalism, and discuss N-body simulation techniques.

1. The Matter Power Spectrum

The density contrast field \(\delta(\mathbf{x}) = (\rho(\mathbf{x}) - \bar{\rho})/\bar{\rho}\) is characterized by its Fourier transform. The power spectrum is defined as the ensemble average of the squared Fourier amplitudes.

1.1 Definition and Physical Meaning

Expanding \(\delta(\mathbf{x})\) in Fourier modes:

$$\delta(\mathbf{x}) = \int\frac{d^3k}{(2\pi)^3}\,\tilde{\delta}(\mathbf{k})\,e^{i\mathbf{k}\cdot\mathbf{x}}$$

The power spectrum \(P(k)\) is defined by:

$$\boxed{\langle\tilde{\delta}(\mathbf{k})\tilde{\delta}^*(\mathbf{k}')\rangle = (2\pi)^3\delta_D(\mathbf{k} - \mathbf{k}')P(k)}$$

where \(\delta_D\) is the Dirac delta function reflecting statistical homogeneity. \(P(k)\) has dimensions of volume and measures the variance of density fluctuations on scale \(\lambda \sim 2\pi/k\). The dimensionless power spectrum \(\Delta^2(k) = k^3 P(k)/(2\pi^2)\) gives the variance per logarithmic interval in \(k\).

1.2 The Shape of P(k)

The primordial power spectrum from inflation is nearly scale-invariant, close to the Harrison-Zel'dovich form \(n_s = 1\): \(P_{\text{prim}}(k) \propto k^{n_s}\) with \(n_s \approx 0.965\). The observed power spectrum is modified by the transfer function \(T(k)\):

$$P(k, z) = A_s\,k^{n_s}\,T^2(k)\,D^2(z)$$

Here \(A_s\) sets the primordial amplitude and \(D(z)\) is the linear growth factor. The transfer function encodes the physics of the radiation-dominated era. Modes that entered the horizon before matter-radiation equality (\(k > k_{\text{eq}} \approx 0.01\) Mpc\(^{-1}\)) experienced suppressed growth (the Mészáros effect), leading to a turnover in \(P(k)\) at the equality scale. Up to logarithmic corrections, the result is:

$$P(k) \propto \begin{cases} k^{n_s} & k \ll k_{\text{eq}} \\ k^{n_s - 4} \approx k^{-3} & k \gg k_{\text{eq}} \end{cases}$$
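The two limiting slopes can be checked numerically. The sketch below assumes the BBKS fitting formula for \(T(k)\) with a baryon-free shape parameter \(\Gamma = \Omega_m h\), and illustrative values \(\Omega_m = 0.3\), \(h = 0.7\); the function names are ours, and the amplitude is arbitrary.

```python
import math

def bbks_transfer(k_h_mpc, omega_m=0.3, h=0.7):
    """BBKS (1986) CDM transfer function; k in h/Mpc, baryons neglected."""
    gamma = omega_m * h                  # assumed baryon-free shape parameter
    q = k_h_mpc / gamma
    if q < 1e-8:                         # T -> 1 on very large scales
        return 1.0
    log_term = math.log(1.0 + 2.34 * q) / (2.34 * q)
    poly = 1.0 + 3.89 * q + (16.1 * q) ** 2 + (5.46 * q) ** 3 + (6.71 * q) ** 4
    return log_term * poly ** -0.25

def linear_power(k, n_s=0.965, amplitude=1.0, growth=1.0, **cosmo):
    """P(k, z) = A k^n_s T^2(k) D^2(z), with an arbitrary normalization."""
    return amplitude * k ** n_s * bbks_transfer(k, **cosmo) ** 2 * growth ** 2

def log_slope(k, eps=1e-3, **kwargs):
    """Centered estimate of d ln P / d ln k at wavenumber k."""
    lo, hi = k * (1.0 - eps), k * (1.0 + eps)
    d_ln_p = math.log(linear_power(hi, **kwargs)) - math.log(linear_power(lo, **kwargs))
    return d_ln_p / (math.log(hi) - math.log(lo))

# Slopes well below and well above the turnover
slope_large_scales = log_slope(1e-4)
slope_small_scales = log_slope(100.0)
print(slope_large_scales, slope_small_scales)
```

On small scales the slope sits somewhat above \(n_s - 4\) because \(T(k) \propto \ln k / k^2\), so the logarithm flattens the fall-off slightly.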

2. Baryon Acoustic Oscillations

Before recombination, baryons and photons are tightly coupled by Thomson scattering, forming a single baryon-photon fluid. Gravitational compression of overdense regions is resisted by radiation pressure, driving acoustic oscillations.

2.1 The Sound Horizon

The sound speed in the baryon-photon fluid is:

$$c_s = \frac{c}{\sqrt{3(1 + R)}}, \qquad R = \frac{3\rho_b}{4\rho_\gamma}$$

The comoving sound horizon at recombination is the maximum distance a sound wave could have traveled since the Big Bang:

$$\boxed{r_s = \int_0^{t_{\text{rec}}} \frac{c_s(t)}{a(t)}\,dt \approx 147\;\text{Mpc}}$$
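As a rough check of the quoted value, the integral can be rewritten in terms of the scale factor, \(r_s = \int_0^{a_{\text{end}}} c_s\,da/(a^2 H)\), and evaluated with Simpson's rule. The sketch below assumes Planck-like densities (\(\Omega_m h^2 = 0.143\), \(\Omega_b h^2 = 0.0224\), \(\Omega_\gamma h^2 = 2.47\times10^{-5}\), total radiation \(\Omega_r h^2 = 4.18\times10^{-5}\)) and ends the integral at \(z \approx 1060\), roughly the baryon drag epoch. None of these numbers come from the text above.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s

def sound_horizon(z_end=1060.0, omega_m_h2=0.143, omega_b_h2=0.0224,
                  omega_gamma_h2=2.47e-5, omega_r_h2=4.18e-5, n_steps=2000):
    """Comoving sound horizon in Mpc (all density values are assumptions)."""
    a_end = 1.0 / (1.0 + z_end)
    r_today = 0.75 * omega_b_h2 / omega_gamma_h2   # R(a) = r_today * a

    def integrand(a):
        sound_speed = C_KM_S / math.sqrt(3.0 * (1.0 + r_today * a))
        # a^2 H(a) in km/s/Mpc for matter plus radiation only
        a2_hubble = 100.0 * math.sqrt(omega_m_h2 * a + omega_r_h2)
        return sound_speed / a2_hubble

    # Composite Simpson rule on [0, a_end]; n_steps must be even
    step = a_end / n_steps
    total = integrand(0.0) + integrand(a_end)
    for i in range(1, n_steps):
        total += (4 if i % 2 else 2) * integrand(i * step)
    return total * step / 3.0

r_s = sound_horizon()
print(f"r_s = {r_s:.1f} Mpc")
```

Ending the integral slightly earlier, at \(z \approx 1090\), lowers the result by a few Mpc, so the quoted number depends on which epoch is taken as the end of the acoustic propagation.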

This scale is imprinted in both the CMB angular power spectrum (as the acoustic peaks) and in the matter distribution (as the BAO feature).

2.2 BAO as a Standard Ruler

At recombination, the acoustic oscillations freeze out, leaving a characteristic excess of galaxy pairs separated by \(\sim 150\) Mpc. This "BAO bump" appears as wiggles in the power spectrum and a peak in the correlation function. Since the physical scale is known from CMB physics, BAO serves as a standard ruler for measuring:

$$D_A(z) = \frac{r_s}{(1+z)\,\theta_{\text{BAO}}}, \qquad D_H(z) = \frac{c}{H(z)} = \frac{r_s}{\Delta z_{\text{BAO}}}$$

Here \(r_s\) is the comoving sound horizon, \(\theta_{\text{BAO}}\) is the angle the BAO scale subtends on the sky, and \(\Delta z_{\text{BAO}}\) is the redshift interval it spans along the line of sight. The transverse (angular) BAO measures the angular diameter distance \(D_A(z)\), while the radial (line-of-sight) BAO measures the Hubble parameter \(H(z)\). Together, they provide powerful constraints on dark energy. The DESI (Dark Energy Spectroscopic Instrument) survey is designed to measure BAO using \(\sim 40\) million galaxies and quasars across \(0 < z < 4\).
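To get a feel for the sizes involved, the sketch below predicts the two BAO observables in an assumed flat \(\Lambda\)CDM model (\(H_0 = 70\) km/s/Mpc, \(\Omega_m = 0.3\), \(r_s = 147\) Mpc). Because \(r_s\) is comoving, the angle is computed from the comoving transverse distance \((1+z)D_A\). All function names are illustrative.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s

def hubble(z, h0=70.0, omega_m=0.3):
    """H(z) in km/s/Mpc for flat LCDM (radiation neglected)."""
    return h0 * math.sqrt(omega_m * (1.0 + z) ** 3 + 1.0 - omega_m)

def comoving_distance(z, n_steps=1000, **cosmo):
    """Comoving distance c * int_0^z dz'/H(z') in Mpc, via Simpson's rule."""
    step = z / n_steps
    total = 1.0 / hubble(0.0, **cosmo) + 1.0 / hubble(z, **cosmo)
    for i in range(1, n_steps):
        total += (4 if i % 2 else 2) / hubble(i * step, **cosmo)
    return C_KM_S * total * step / 3.0

def bao_observables(z, r_s=147.0, **cosmo):
    """Return (theta_BAO in radians, Delta z_BAO) for a comoving ruler r_s."""
    theta = r_s / comoving_distance(z, **cosmo)    # transverse: angle on the sky
    delta_z = r_s * hubble(z, **cosmo) / C_KM_S    # radial: redshift extent
    return theta, delta_z

theta, delta_z = bao_observables(0.5)
print(f"theta = {math.degrees(theta):.2f} deg, delta_z = {delta_z:.4f}")
```

A survey inverts this logic: it measures the angle and the redshift interval, then solves for the distance and \(H(z)\) given \(r_s\).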

3. The Two-Point Correlation Function

The two-point correlation function \(\xi(r)\) measures the excess probability of finding a galaxy pair at separation \(r\) compared to a random distribution.

3.1 Definition and Relation to P(k)

$$\boxed{\xi(r) = \langle\delta(\mathbf{x})\delta(\mathbf{x}+\mathbf{r})\rangle = \int_0^\infty \frac{k^2}{2\pi^2}P(k)\frac{\sin kr}{kr}\,dk}$$

The power spectrum and correlation function are a Fourier transform pair, containing identical information. On scales of \(1\text{--}10\) Mpc, the galaxy correlation function follows a power law:

$$\xi(r) \approx \left(\frac{r}{r_0}\right)^{-\gamma}, \qquad r_0 \approx 5\;h^{-1}\;\text{Mpc}, \;\; \gamma \approx 1.8$$
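The integral relating \(\xi(r)\) to \(P(k)\) can be evaluated directly when \(P(k)\) falls off quickly. The sketch below uses Simpson's rule and checks it against a Gaussian toy spectrum, \(P(k) = e^{-k^2 s^2/2}\), whose transform is known in closed form: \(\xi(r) = (2\pi s^2)^{-3/2} e^{-r^2/2s^2}\). The toy spectrum and the cutoff `k_max` are our assumptions. A realistic \(P(k)\) decays slowly and makes the integrand oscillatory, so production codes typically add damping or use a dedicated transform.

```python
import math

def xi_from_pk(r, pk, k_max, n_steps=4000):
    """xi(r) = int_0^k_max k^2/(2 pi^2) P(k) sin(kr)/(kr) dk, via Simpson's rule."""
    def integrand(k):
        if k == 0.0:
            return 0.0                      # the k^2 factor kills the k = 0 term
        x = k * r
        sinc = math.sin(x) / x if x > 1e-8 else 1.0
        return k * k / (2.0 * math.pi ** 2) * pk(k) * sinc

    step = k_max / n_steps
    total = integrand(0.0) + integrand(k_max)
    for i in range(1, n_steps):
        total += (4 if i % 2 else 2) * integrand(i * step)
    return total * step / 3.0

# Toy check: Gaussian power spectrum with smoothing length s (arbitrary units)
s = 5.0
gaussian_pk = lambda k: math.exp(-0.5 * (k * s) ** 2)
r = 8.0
numeric = xi_from_pk(r, gaussian_pk, k_max=10.0 / s)
analytic = (2.0 * math.pi * s * s) ** -1.5 * math.exp(-0.5 * (r / s) ** 2)
print(numeric, analytic)
```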

3.2 Redshift-Space Distortions

Galaxy redshifts include peculiar velocity contributions, distorting the correlation function in redshift space. On large scales, coherent infall toward overdensities enhances the clustering along the line of sight (the Kaiser effect):

$$P^s(k,\mu) = (1 + \beta\mu^2)^2\,P(k), \qquad \beta = f/b$$

where \(\mu = \cos\theta_k\) is the cosine of the angle between \(\mathbf{k}\) and the line of sight, \(P(k)\) is the real-space galaxy power spectrum, \(f = d\ln D/d\ln a \approx \Omega_m(z)^{0.55}\) is the linear growth rate, and \(b\) is the galaxy bias factor. Measuring the anisotropy of the redshift-space correlation function constrains the growth rate, providing a key test of gravity on cosmological scales.
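In practice the anisotropy is usually compressed into Legendre multipoles of the Kaiser factor. Integrating \((1+\beta\mu^2)^2\) against the Legendre polynomials gives a monopole boost of \(1 + 2\beta/3 + \beta^2/5\) and a quadrupole of \(4\beta/3 + 4\beta^2/7\), both in units of \(P(k)\). The sketch below checks these numerically, assuming flat \(\Lambda\)CDM with \(\Omega_m = 0.3\) and an illustrative bias \(b = 2\).

```python
import math

def growth_rate(z, omega_m0=0.3):
    """f ~ Omega_m(z)^0.55 for flat LCDM (approximation quoted in the text)."""
    e2 = omega_m0 * (1.0 + z) ** 3 + 1.0 - omega_m0
    return (omega_m0 * (1.0 + z) ** 3 / e2) ** 0.55

def kaiser_factor(mu, beta):
    """Linear redshift-space boost (1 + beta mu^2)^2."""
    return (1.0 + beta * mu * mu) ** 2

def kaiser_multipoles(beta):
    """Analytic monopole and quadrupole of the Kaiser factor."""
    return 1.0 + 2.0 * beta / 3.0 + beta ** 2 / 5.0, 4.0 * beta / 3.0 + 4.0 * beta ** 2 / 7.0

def multipole_numeric(beta, ell, n_steps=200):
    """(2l+1) int_0^1 K(mu) L_l(mu) dmu for ell in {0, 2}, via Simpson's rule."""
    legendre = {0: lambda mu: 1.0, 2: lambda mu: 0.5 * (3.0 * mu * mu - 1.0)}[ell]
    integrand = lambda mu: kaiser_factor(mu, beta) * legendre(mu)
    step = 1.0 / n_steps
    total = integrand(0.0) + integrand(1.0)
    for i in range(1, n_steps):
        total += (4 if i % 2 else 2) * integrand(i * step)
    return (2 * ell + 1) * total * step / 3.0

f = growth_rate(0.5)
beta = f / 2.0                       # illustrative bias b = 2
monopole, quadrupole = kaiser_multipoles(beta)
print(f, beta, monopole, quadrupole)
```

The ratio of quadrupole to monopole depends only on \(\beta\), which is why the anisotropy constrains the growth rate once the bias is known.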

4. Galaxy Bias

Galaxies are biased tracers of the underlying dark matter distribution. The linear galaxy bias relates the galaxy and matter overdensities:

$$\delta_g = b\,\delta_m, \qquad P_g(k) = b^2\,P_m(k)$$

4.1 The Peak-Background Split

The bias of halos of mass \(M\) can be derived from the Press-Schechter formalism via the peak-background split argument:

$$\boxed{b(M) = 1 + \frac{\nu^2 - 1}{\delta_c}, \qquad \nu = \frac{\delta_c}{\sigma(M)}}$$

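Here \(\delta_c \approx 1.686\) is the critical linear overdensity for spherical collapse and \(\sigma(M)\) is the rms linear density fluctuation on mass scale \(M\). Low-mass halos (\(\nu < 1\)) are anti-biased (\(b < 1\)), while massive halos (\(\nu > 1\)) are biased (\(b > 1\)). Bright galaxies and quasars typically reside in massive halos and are strongly biased (\(b \sim 2\text{--}5\)).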
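A few values make the behaviour of the bias formula concrete. The sketch below evaluates it directly, taking \(\delta_c = 1.686\) (the standard spherical-collapse value); the function names are ours.

```python
DELTA_C = 1.686  # critical linear overdensity for spherical collapse

def peak_height(sigma_m, delta_c=DELTA_C):
    """nu = delta_c / sigma(M)."""
    return delta_c / sigma_m

def press_schechter_bias(nu, delta_c=DELTA_C):
    """Peak-background-split bias b = 1 + (nu^2 - 1) / delta_c."""
    return 1.0 + (nu * nu - 1.0) / delta_c

for nu in (0.5, 1.0, 2.0, 3.0):
    print(f"nu = {nu:.1f}  ->  b = {press_schechter_bias(nu):.2f}")
```

Halos with \(\nu = 1\) trace the matter field exactly (\(b = 1\)), and the bias rises quadratically with \(\nu\) for rarer, more massive halos.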

5. N-Body Simulations

Cosmological N-body simulations solve the Vlasov-Poisson system by following the trajectories of a large number of particles representing dark matter.

5.1 The Particle-Mesh Method

The gravitational potential is computed by solving the Poisson equation on a grid using Fast Fourier Transforms:

$$\nabla^2\Phi = 4\pi G\bar{\rho}\,a^2\,\delta \;\;\Rightarrow\;\; \tilde{\Phi}(\mathbf{k}) = -\frac{4\pi G\bar{\rho}\,a^2}{k^2}\tilde{\delta}(\mathbf{k})$$
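The Fourier-space solution of the Poisson equation takes only a few lines with an FFT. The sketch below is a minimal 2D periodic solver using NumPy; the constant \(4\pi G\bar{\rho}a^2\) is folded into a `prefactor` argument, and mass assignment and force interpolation (the other steps of a full PM code) are omitted. The grid size and box length in the check are arbitrary.

```python
import numpy as np

def solve_poisson_periodic(delta, box_size, prefactor=1.0):
    """Solve laplacian(phi) = prefactor * delta on a periodic square grid."""
    n = delta.shape[0]
    # Angular wavenumbers of the FFT modes
    k1d = 2.0 * np.pi * np.fft.fftfreq(n, d=box_size / n)
    kx, ky = np.meshgrid(k1d, k1d, indexing="ij")
    k_squared = kx ** 2 + ky ** 2

    delta_k = np.fft.fft2(delta)
    phi_k = np.zeros_like(delta_k)
    nonzero = k_squared > 0                 # the k = 0 mode (mean) is set to zero
    phi_k[nonzero] = -prefactor * delta_k[nonzero] / k_squared[nonzero]
    return np.fft.ifft2(phi_k).real

# Check against a single plane wave, for which phi = -delta / k0^2
n_grid, box = 32, 100.0
x = np.arange(n_grid) * box / n_grid
k0 = 2.0 * np.pi * 2 / box
delta = np.cos(k0 * x)[:, None] * np.ones((1, n_grid))
phi = solve_poisson_periodic(delta, box)
print(np.max(np.abs(phi + delta / k0 ** 2)))
```

The cost is dominated by the FFTs, which scale as \(N\log N\) in the number of grid cells, while the force resolution is limited to roughly the cell size.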

Modern simulations (Millennium, Bolshoi, AbacusSummit) use tree-PM or adaptive mesh methods to achieve high force resolution while maintaining efficiency. The largest simulations evolve \(>10^{12}\) particles in volumes exceeding \((1\;\text{Gpc})^3\). The resulting cosmic web reproduces the observed galaxy distribution with remarkable fidelity.

5.2 The Zel'dovich Approximation

Before shell-crossing, the displacement of a particle from its initial (Lagrangian) position \(\mathbf{q}\) to its Eulerian position \(\mathbf{x}\) is given by:

$$\mathbf{x}(t) = \mathbf{q} + D(t)\,\mathbf{\Psi}(\mathbf{q})$$

where \(D(t)\) is the linear growth factor and \(\mathbf{\Psi}\) is the displacement field related to the initial density through \(\nabla\cdot\mathbf{\Psi} = -\delta_0\). This first-order Lagrangian perturbation theory accurately describes the early formation of the cosmic web, including the collapse of sheets (Zel'dovich pancakes) and filaments.
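A one-dimensional toy case shows the approximation at work. In Fourier space the constraint \(\nabla\cdot\mathbf{\Psi} = -\delta_0\) becomes \(\tilde{\Psi}(k) = i\,\tilde{\delta}_0(k)/k\) in 1D. For a single cosine mode of amplitude \(A\), this gives \(\Psi = -(A/k_0)\sin(k_0 q)\), and particle orderings first cross when \(D\,A = 1\). The sketch below assumes a periodic box and uses NumPy FFTs; the amplitude and grid size are arbitrary.

```python
import numpy as np

def zeldovich_displacement_1d(delta0, box_size):
    """Displacement field Psi(q) satisfying dPsi/dq = -delta0 on a periodic grid."""
    n = delta0.size
    k = 2.0 * np.pi * np.fft.fftfreq(n, d=box_size / n)
    delta_k = np.fft.fft(delta0)
    psi_k = np.zeros_like(delta_k)
    nonzero = k != 0
    psi_k[nonzero] = 1j * delta_k[nonzero] / k[nonzero]
    return np.fft.ifft(psi_k).real

# Single-mode initial condition
n_part, box, amp = 64, 100.0, 0.1
q = np.arange(n_part) * box / n_part
k0 = 2.0 * np.pi / box
delta0 = amp * np.cos(k0 * q)
psi = zeldovich_displacement_1d(delta0, box)

x_early = q + 5.0 * psi      # D * A = 0.5: before shell-crossing
x_late = q + 20.0 * psi      # D * A = 2.0: particle orderings have crossed
print(np.all(np.diff(x_early) > 0), np.all(np.diff(x_late) > 0))
```

Once orderings cross, the mapping from \(\mathbf{q}\) to \(\mathbf{x}\) is no longer one-to-one and the approximation breaks down, which is why it is mainly used to set up initial conditions and to describe early evolution.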

Applications

Dark Energy Constraints from Galaxy Surveys

Galaxy redshift surveys (SDSS, BOSS, DESI, Euclid, Roman) measure the three-dimensional galaxy distribution to extract the power spectrum, BAO, and growth rate. Combined with CMB data, these measurements constrain the dark energy equation of state parameter \(w\) to percent-level precision and test for deviations from the cosmological constant (\(w = -1\)).

Primordial Non-Gaussianity

Deviations from Gaussian initial conditions (parameterized by \(f_{\text{NL}}\)) would be a smoking gun for inflationary physics. Large-scale structure constrains \(f_{\text{NL}}\) through the scale-dependent bias effect: non-Gaussianity modifies the bias of massive halos on large scales, producing a characteristic \(1/k^2\) scale dependence.

Cosmic Voids as Cosmological Probes

Cosmic voids — underdense regions spanning \(20\text{--}100\) Mpc — occupy the majority of the cosmic volume and provide complementary cosmological information to galaxy clusters. The void size function, analogous to the halo mass function, is sensitive to the dark energy equation of state and modified gravity. Voids are cleaner environments for testing gravity because nonlinear effects are weaker. The Alcock-Paczynski test applied to void shapes constrains the expansion rate. The integrated Sachs-Wolfe (ISW) effect produces a temperature decrement in the CMB at void locations, providing evidence for dark energy.

The Lyman-Alpha Forest

The Lyman-alpha forest, the dense series of absorption lines in quasar spectra caused by intervening neutral hydrogen, traces the cosmic web at \(z \sim 2\text{--}4\). Each absorption line corresponds to a filament or sheet in the intergalactic medium. The power spectrum of the Ly-alpha forest constrains the matter power spectrum on small scales (\(k \sim 0.1\text{--}10\) Mpc\(^{-1}\)), providing the strongest constraints on the mass of warm dark matter particles and the thermal history of the intergalactic medium. The BOSS survey measured BAO at \(z = 2.34\) using the Ly-alpha forest.

Higher-Order Statistics

The power spectrum captures only the Gaussian (two-point) information in the density field. Gravitational evolution generates non-Gaussianity, which is captured by higher-order statistics. The bispectrum (three-point function in Fourier space) is sensitive to the growth rate, galaxy bias, and primordial non-Gaussianity. The field-level inference approach, using forward models of structure formation, promises to extract all available cosmological information from galaxy surveys, going beyond traditional summary statistics.

Historical Notes

The first galaxy redshift surveys in the 1980s (CfA survey by de Lapparent, Geller, and Huchra, 1986) revealed the "bubbly" large-scale structure with voids, walls, and filaments. The "Great Wall" was one of the first recognized cosmic structures. The 2dF Galaxy Redshift Survey (2001) and the Sloan Digital Sky Survey (starting 2000) mapped hundreds of thousands of galaxy redshifts, enabling the first detection of BAO by Eisenstein et al. (2005). The theoretical framework for the power spectrum was developed by Peebles, Harrison, and Zel'dovich in the 1960s–1970s, with the CDM transfer function computed by Bardeen, Bond, Kaiser, and Szalay (BBKS, 1986). The Millennium Simulation (Springel et al., 2005) was a landmark in computational cosmology.

The detection of the BAO signal was a triumph of precision cosmology. Daniel Eisenstein and collaborators measured the correlation function of 46,748 luminous red galaxies from SDSS, identifying the acoustic peak at a comoving separation of \(100\,h^{-1}\) Mpc. Simultaneously, the 2dF team (Cole et al., 2005) detected the BAO oscillations in the power spectrum. These measurements confirmed a key prediction of the standard cosmological model and established BAO as a primary tool for measuring the cosmic expansion history.

The era of precision large-scale structure surveys has now arrived. The Dark Energy Spectroscopic Instrument (DESI), which saw first light in 2019, is measuring redshifts of roughly 40 million galaxies and quasars. The European Space Agency's Euclid mission (launched 2023) combines weak lensing imaging with slitless spectroscopy over 15,000 square degrees. The Nancy Grace Roman Space Telescope will provide deep infrared imaging and spectroscopy. Together, these surveys will measure the dark energy equation of state to percent-level precision and test general relativity on the largest scales.

Weak Gravitational Lensing

Weak gravitational lensing — the subtle distortion of background galaxy shapes by the intervening matter distribution — provides a direct measure of the total (dark plus luminous) matter power spectrum.

Cosmic Shear

The lensing convergence power spectrum is a projection of the 3D matter power spectrum along the line of sight, weighted by the lensing kernel:

$$C_\ell^{\kappa\kappa} = \int_0^{\chi_H} d\chi\,\frac{W^2(\chi)}{\chi^2}\,P\left(k = \frac{\ell}{\chi}, z(\chi)\right)$$

Cosmic shear measurements constrain the amplitude of matter fluctuations through the parameter \(S_8 = \sigma_8\sqrt{\Omega_m/0.3}\). Current surveys (DES, KiDS, HSC) measure \(S_8 \approx 0.76 \pm 0.02\), which is \(2\text{--}3\sigma\) lower than the CMB prediction of \(S_8 \approx 0.83\). This "S8 tension" may indicate new physics, systematic errors, or baryonic feedback effects on the matter power spectrum. Rubin Observatory LSST and Euclid will measure \(S_8\) to sub-percent precision, potentially resolving this tension.
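The quoted significance follows from simple arithmetic once an uncertainty is attached to the CMB value. The sketch below assumes independent Gaussian errors and takes \(\pm 0.013\) for the CMB prediction, which is our assumption and not a number given in the text.

```python
import math

def s8(sigma8, omega_m):
    """S_8 = sigma_8 * sqrt(Omega_m / 0.3)."""
    return sigma8 * math.sqrt(omega_m / 0.3)

def tension_sigma(value_a, err_a, value_b, err_b):
    """Difference in units of the combined error (independent Gaussian errors)."""
    return abs(value_a - value_b) / math.hypot(err_a, err_b)

lensing_s8, lensing_err = 0.76, 0.02     # values quoted in the text
cmb_s8, cmb_err = 0.83, 0.013            # the error bar here is an assumption
print(f"tension = {tension_sigma(lensing_s8, lensing_err, cmb_s8, cmb_err):.1f} sigma")
```

The result is sensitive to the assumed error bars and ignores parameter correlations, so it should be read only as an order-of-magnitude check on the \(2\text{--}3\sigma\) figure.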

Photometric Redshifts

Weak lensing surveys require redshift information for billions of source galaxies that are too faint for spectroscopy. Photometric redshifts estimate \(z\) from broadband photometry by fitting galaxy spectral energy distribution templates or using machine learning algorithms trained on spectroscopic samples. Typical photo-z precision is \(\sigma_z \sim 0.03(1+z)\) for well-calibrated samples, sufficient for tomographic weak lensing analysis that bins sources into redshift slices. Systematic biases in photo-z calibration are currently the dominant systematic uncertainty in weak lensing cosmology. Cross-correlation methods, where photometric samples are correlated with spectroscopic reference samples, provide the most robust calibration.

Combining galaxy clustering and weak lensing in a joint analysis breaks degeneracies between parameters. This "3x2-point" analysis (galaxy-galaxy, galaxy-shear, and shear-shear correlations) has become the standard approach for photometric surveys, and adding CMB lensing and the CMB power spectrum tightens the constraints further. The Dark Energy Survey (DES) Year 3 analysis combined these probes for 100 million galaxies, constraining \(\Omega_m\) and \(\sigma_8\) to a few percent precision. LSST will extend this analysis to billions of galaxies.

Intrinsic Alignments

A key systematic in weak lensing is intrinsic alignments (IA): galaxies that form in the same tidal field have correlated shapes that mimic the lensing signal. The IA contamination is modeled using the "nonlinear alignment model" or "tidal alignment and tidal torquing" (TATT) model. For red (elliptical) galaxies, the IA signal is strong and aligned with the local tidal field. For blue (disk) galaxies, the IA signal is weaker and primarily arises from tidal torquing during formation. Accurate IA modeling is essential for extracting unbiased cosmological constraints from Stage IV weak lensing surveys.

The Alcock-Paczynski Test

An isotropic feature (such as the BAO sphere or void stacking) will appear distorted if the assumed cosmological model differs from the true one. The Alcock-Paczynski (AP) test exploits this by comparing the transverse and radial dimensions of structures:

$$F_{\text{AP}}(z) = (1+z)\frac{D_A(z) H(z)}{c}$$

Because \(F_{\text{AP}}(z)\) depends only on the expansion history, a mismatch between its measured value and the value predicted by the assumed cosmology indicates an incorrect model. The AP test is particularly powerful when applied to the 2D correlation function or power spectrum measured in redshift space, where the BAO ring provides a known isotropic standard. The combination of the AP test with RSD measurements from the same galaxy survey simultaneously constrains the expansion rate, growth rate, and geometry of the Universe.
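For a flat universe the AP parameter reduces to \(F_{\text{AP}}(z) = E(z)\int_0^z dz'/E(z')\) with \(E = H/H_0\), so it is independent of \(H_0\). The sketch below evaluates it for two assumed values of \(\Omega_m\) in flat \(\Lambda\)CDM to show how the prediction shifts with the model; the function names and parameter values are illustrative.

```python
import math

def e_of_z(z, omega_m=0.3):
    """Dimensionless Hubble rate H(z)/H0 for flat LCDM (radiation neglected)."""
    return math.sqrt(omega_m * (1.0 + z) ** 3 + 1.0 - omega_m)

def f_ap(z, omega_m=0.3, n_steps=1000):
    """F_AP = (1+z) D_A H / c = E(z) * int_0^z dz'/E(z') in a flat universe."""
    step = z / n_steps
    total = 1.0 / e_of_z(0.0, omega_m) + 1.0 / e_of_z(z, omega_m)
    for i in range(1, n_steps):
        total += (4 if i % 2 else 2) / e_of_z(i * step, omega_m)
    return e_of_z(z, omega_m) * total * step / 3.0

for omega_m in (0.25, 0.30):
    print(f"Omega_m = {omega_m:.2f}:  F_AP(z=0.5) = {f_ap(0.5, omega_m):.4f}")
```

At low redshift \(F_{\text{AP}} \to z\) for any model, so the test gains discriminating power only at redshifts where the expansion histories diverge.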

The field-level approach to cosmological analysis — forward-modeling the full density field rather than compressing it into summary statistics — promises to extract substantially more information from galaxy surveys. Simulation-based inference methods using neural network-based density estimators are being developed to exploit the non-Gaussian information generated by gravitational evolution, which is invisible to two-point statistics. These techniques could improve cosmological constraints from LSST and Euclid by factors of 2–5 compared to traditional power spectrum analyses.

Computational Exploration

The accompanying interactive script (Python 3, script.py, 251 lines; not reproduced here), titled "Power Spectrum, BAO, Correlation Function, and Structure Formation", computes the CDM power spectrum using the BBKS transfer function, models the BAO feature, generates a two-point correlation function, and performs a simple 2D N-body simulation of structure formation using the Zel'dovich approximation.

Chapter Summary

The matter power spectrum \(P(k)\) encodes the statistics of cosmic structure. It transitions from \(P \propto k^{n_s}\) on large scales to \(P \propto k^{n_s-4}\) on small scales, with the turnover at the matter-radiation equality scale \(k_{\text{eq}} \approx 0.01\) Mpc\(^{-1}\).

Baryon acoustic oscillations imprint a characteristic scale of \(\sim 150\) Mpc in the correlation function, serving as a standard ruler for measuring the expansion history. Redshift-space distortions from peculiar velocities constrain the growth rate, testing gravity on cosmological scales.

Galaxies are biased tracers of dark matter with \(b(M) = 1 + (\nu^2-1)/\delta_c\). N-body simulations using Fourier methods and the Zel'dovich approximation reproduce the observed cosmic web of filaments, clusters, and voids.
