Module 0 · Foundations

History & Foundations

Evolutionary computation grew from four classical traditions: three developed independently in the 1960s and 1970s, and a fourth arrived in the early 1990s, before they converged during the 1990s into the unified discipline we know today. Each tradition picked a different inspiration from biology and a different niche of optimisation, and the differences still matter.

1. Holland & Genetic Algorithms (Michigan, 1960s–1975)

John Henry Holland (1929–2015) at the University of Michigan introduced the genetic algorithm in a series of papers in the 1960s, formalised in Adaptation in Natural and Artificial Systems (1975). Holland’s key conceptual moves:

  • Represent candidate solutions as fixed-length bit strings (chromosomes).
  • Apply crossover as the dominant variation operator, with mutation as a background mechanism.
  • Select parents in proportion to fitness, then mate them.
  • Prove the schema theorem: short, low-order, high-fitness schemata receive exponentially increasing samples in subsequent generations.
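The four moves above can be sketched as a minimal GA on the toy OneMax objective (maximise the number of 1-bits). All parameter values and function names here are illustrative, not Holland's:

```python
import random

def onemax(bits):
    """Toy fitness: the number of 1-bits in the chromosome."""
    return sum(bits)

def select(pop, fits, rng):
    """Fitness-proportional (roulette-wheel) parent selection."""
    total = sum(fits)
    r = rng.uniform(0, total)
    acc = 0.0
    for ind, f in zip(pop, fits):
        acc += f
        if acc >= r:
            return ind
    return pop[-1]

def ga(n_bits=20, pop_size=30, generations=80, p_mut=0.02, seed=1):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):
        fits = [onemax(ind) for ind in pop]
        new_pop = []
        while len(new_pop) < pop_size:
            p1 = select(pop, fits, rng)
            p2 = select(pop, fits, rng)
            cut = rng.randrange(1, n_bits)                # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [b ^ (rng.random() < p_mut) for b in child]  # background mutation
            new_pop.append(child)
        pop = new_pop
    return max(pop, key=onemax)

best = ga()
print(onemax(best))
```

Note that crossover does the heavy lifting, exactly as in Holland's scheme; mutation is kept at a low per-bit rate so it only reintroduces lost alleles.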

The schema theorem is the founding theoretical result of GAs, though its strength was later disputed (the Wolpert & Macready 1997 no-free-lunch theorems; Vose’s 1999 random-heuristic-search analysis). Holland’s broader programme — complex adaptive systems — is also the root of much modern agent-based modelling.

2. Rechenberg & Schwefel — Evolution Strategies (Berlin, 1960s–1981)

Independently in Berlin, Ingo Rechenberg and Hans-Paul Schwefel developed evolution strategies (ES) for continuous optimisation. Their initial application was the engineering optimisation of physical artefacts (a flexible plate in a wind tunnel, a two-phase nozzle). The characteristic features:

  • Real-valued vectors \(\mathbf{x} \in \mathbb{R}^n\), not bit strings.
  • Gaussian mutation with explicit, evolved step sizes (\(\sigma\)).
  • (1+1) and (μ,λ) selection schemes, with the 1/5 success rule for adaptive step-size control.
  • Crossover initially absent or minor; the focus is mutation + selection.
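A sketch of a (1+1)-ES with the 1/5 success rule, minimising the sphere function \(f(\mathbf{x}) = \sum_i x_i^2\). The adjustment factors 1.22 / 0.82 and the 20-iteration window are one common illustrative choice, not canonical constants:

```python
import random

def sphere(x):
    """Toy continuous objective: sum of squares, minimum at the origin."""
    return sum(v * v for v in x)

def one_plus_one_es(dim=5, sigma=1.0, iters=2000, seed=0):
    rng = random.Random(seed)
    x = [rng.uniform(-5, 5) for _ in range(dim)]
    fx = sphere(x)
    successes = 0
    for t in range(1, iters + 1):
        # Gaussian mutation with the current step size sigma
        child = [xi + sigma * rng.gauss(0, 1) for xi in x]
        fc = sphere(child)
        if fc <= fx:                      # (1+1) selection: keep the better point
            x, fx = child, fc
            successes += 1
        if t % 20 == 0:                   # 1/5 success rule, checked every 20 steps
            rate = successes / 20
            sigma *= 1.22 if rate > 0.2 else 0.82
            successes = 0
    return x, fx

x_best, f_best = one_plus_one_es()
print(f_best)
```

The step-size adaptation is the point: if more than one mutation in five succeeds, the steps are too timid and sigma grows; if fewer, sigma shrinks.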

Modern ES — the CMA-ES of Hansen & Ostermeier (Module 2) — is the direct intellectual descendant of this German tradition. ES handles continuous, non-convex, noisy optimisation problems with a robustness that surprised the GA community for decades.

3. Fogel — Evolutionary Programming (USA, 1960s)

Lawrence J. Fogel (1928–2007), working in San Diego, introduced evolutionary programming (EP) in 1962 for evolving finite-state machines that predict symbol sequences. Like ES, EP focused on mutation rather than crossover; unlike ES, it represented solutions at a higher level of abstraction (FSM topology rather than parameters of a continuous function). Fogel’s programme remained somewhat isolated from GA and ES communities until the 1990s, when the three traditions merged into “evolutionary computation.”
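The flavour of EP can be sketched as a mutate-and-select loop over finite-state predictors. The table encoding, parameters, and scoring below are illustrative, not Fogel's notation:

```python
import random

# Toy FSM: a transition table mapping (state, input_symbol) -> (next_state, predicted_symbol).

def random_fsm(n_states, symbols, rng):
    return {(s, a): (rng.randrange(n_states), rng.choice(symbols))
            for s in range(n_states) for a in symbols}

def mutate(fsm, n_states, symbols, rng):
    """EP-style mutation: perturb one transition's target state or its output."""
    child = dict(fsm)
    key = rng.choice(list(child))
    nxt, out = child[key]
    if rng.random() < 0.5:
        nxt = rng.randrange(n_states)
    else:
        out = rng.choice(symbols)
    child[key] = (nxt, out)
    return child

def predict(fsm, seq, start=0):
    """Score: how many next symbols the machine predicts correctly."""
    state, score = start, 0
    for cur, nxt in zip(seq, seq[1:]):
        state, pred = fsm[(state, cur)]
        score += (pred == nxt)
    return score

rng = random.Random(0)
seq = [0, 1, 0, 1] * 10                       # a periodic sequence to predict
best = random_fsm(n_states=2, symbols=[0, 1], rng=rng)
best_score = predict(best, seq)
for _ in range(500):                          # mutation plus greedy selection
    cand = mutate(best, 2, [0, 1], rng)
    s = predict(cand, seq)
    if s >= best_score:
        best, best_score = cand, s
print(best_score)
```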

4. Koza — Genetic Programming (Stanford, 1992)

John Koza extended the GA paradigm from fixed-length bit strings to variable-length tree structures representing computer programs. Genetic Programming (1992, 800 pages) demonstrated GP across symbolic regression, control problems, and many other domains; later work in the GP tradition extended to automatic circuit design, and evolutionary design methods produced antennas that NASA later flew. The technique is treated in detail in Module 3.
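The representation shift can be sketched as expression trees plus subtree crossover (a toy tuple encoding, not Koza's LISP S-expressions):

```python
import random
import operator

# Internal nodes are ('add'|'mul', left, right); leaves are 'x' or an integer constant.
OPS = {'add': operator.add, 'mul': operator.mul}

def evaluate(tree, x):
    """Interpret an expression tree at input x."""
    if tree == 'x':
        return x
    if isinstance(tree, (int, float)):
        return tree
    op, left, right = tree
    return OPS[op](evaluate(left, x), evaluate(right, x))

def random_tree(depth, rng):
    if depth == 0 or rng.random() < 0.3:
        return 'x' if rng.random() < 0.5 else rng.randint(-2, 2)
    return (rng.choice(['add', 'mul']),
            random_tree(depth - 1, rng),
            random_tree(depth - 1, rng))

def subtrees(t):
    """Yield every subtree in preorder."""
    yield t
    if isinstance(t, tuple):
        for child in t[1:]:
            yield from subtrees(child)

def replace(t, idx, new, count):
    """Rebuild t with preorder node number idx swapped for `new` (count is a 1-item list)."""
    if count[0] == idx:
        count[0] += 1
        return new
    count[0] += 1
    if isinstance(t, tuple):
        op, l, r = t
        return (op, replace(l, idx, new, count), replace(r, idx, new, count))
    return t

def crossover(a, b, rng):
    """Subtree crossover: graft a random subtree of b into a random position in a."""
    donor = rng.choice(list(subtrees(b)))
    n = len(list(subtrees(a)))
    return replace(a, rng.randrange(n), donor, [0])

rng = random.Random(0)
t1, t2 = random_tree(3, rng), random_tree(3, rng)
child = crossover(t1, t2, rng)
print(evaluate(child, 2.0))
```

The key departure from Holland's GA is visible in `crossover`: offspring can differ in size and shape from either parent, so the genotype space is open-ended rather than a fixed-length hypercube.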

5. The Modern Synthesis

By the late 1990s the previously distinct traditions had converged into the unified discipline of evolutionary computation. Key conceptual unifications:

  • The Wolpert–Macready no-free-lunch theorems (1997) showed that all black-box optimisation algorithms have identical performance when averaged over all possible objective functions — performance gains come from matching the algorithm to the structure of the problem.
  • The recognition that GA, ES, EP, and GP differ chiefly in representation — once representations are unified, the algorithms become facets of a single optimisation framework.
  • The integration of covariance-matrix adaptation (CMA-ES, 2001) brought ES to its modern apex as a competitive black-box optimiser.
  • The 2017 OpenAI ES paper put evolutionary methods back on the modern ML map by showing they could match deep-RL on Atari with vastly better wall-clock parallelism.
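The no-free-lunch claim can be stated precisely in Wolpert and Macready's notation, where \(d_m^y\) is the sequence of the first \(m\) objective values an algorithm observes. For any two algorithms \(a_1\) and \(a_2\) on a finite domain,

\[
\sum_{f} P(d_m^y \mid f, m, a_1) \;=\; \sum_{f} P(d_m^y \mid f, m, a_2),
\]

where the sum runs over all objective functions \(f\). Any performance measure built from \(d_m^y\) therefore has the same average over all \(f\), whichever algorithm produced it.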

The remaining modules of this course follow the modern synthesis: GAs (M1), ES & CMA-ES (M2), GP (M3), neuroevolution (M4), novelty search and Quality-Diversity (M5), evolutionary RL (M6), and the open-endedness frontier (M7).