Machine Learning for Science

From gradient descent to physics-informed networks — harnessing ML as a new tool for scientific discovery

Course Overview

Machine learning has become an indispensable tool across the sciences — from discovering new materials and accelerating molecular dynamics simulations to classifying galaxies and extracting physical laws from data. This course bridges the gap between the foundations of modern ML and its rapidly growing applications in physics, chemistry, biology, and astronomy.

We begin with classical supervised learning and build up to deep neural networks, covering the mathematical machinery — gradient descent, backpropagation, and the universal approximation theorem — that makes these models work. We then explore architectures tailored to scientific data: convolutional networks for images and spatial fields, recurrent networks and transformers for time series, and generative models for sampling complex distributions. The final part focuses on methods designed specifically for science: physics-informed neural networks (PINNs) that embed conservation laws into the loss function, symbolic regression that rediscovers interpretable equations, equivariant architectures that respect physical symmetries, and ML-driven simulations in molecular dynamics and cosmology.

Key Concepts & Equations

Gradient Descent

$$\theta_{t+1} = \theta_t - \eta \nabla_\theta \mathcal{L}(\theta_t)$$
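The update above can be sketched in a few lines of NumPy. Everything here (the function names, the quadratic toy loss, the step count) is illustrative rather than part of the course material: we minimise $\mathcal{L}(\theta) = (\theta - 3)^2$, whose gradient is $2(\theta - 3)$.

```python
import numpy as np

def gradient_descent(grad, theta0, eta=0.1, steps=100):
    """Iterate theta <- theta - eta * grad(theta) for a fixed number of steps."""
    theta = np.asarray(theta0, dtype=float)
    for _ in range(steps):
        theta = theta - eta * grad(theta)
    return theta

# Minimise L(theta) = (theta - 3)^2; the gradient is 2 * (theta - 3),
# so the iterates contract toward the minimiser theta = 3.
theta_star = gradient_descent(lambda t: 2 * (t - 3.0), theta0=0.0)
```

With learning rate $\eta = 0.1$ each step multiplies the distance to the minimum by $|1 - 2\eta| = 0.8$, so the iterates converge geometrically to $\theta = 3$.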

Backpropagation (Chain Rule)

$$\frac{\partial \mathcal{L}}{\partial w_{ij}^{(l)}} = \frac{\partial \mathcal{L}}{\partial a_j^{(l)}} \cdot \frac{\partial a_j^{(l)}}{\partial z_j^{(l)}} \cdot \frac{\partial z_j^{(l)}}{\partial w_{ij}^{(l)}}$$
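For a single neuron the three chain-rule factors can be written out by hand and checked against a finite-difference estimate. This toy setup (one weight, sigmoid activation, squared error) is chosen only to mirror the equation above term by term:

```python
import numpy as np

# One neuron: z = w*x + b, a = sigmoid(z), L = (a - y)^2.
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

x, y, w, b = 0.5, 1.0, 0.3, 0.1
z = w * x + b
a = sigmoid(z)

# Chain rule, factor by factor, matching the equation above:
dL_da = 2 * (a - y)      # dL/da      for the squared-error loss
da_dz = a * (1 - a)      # da/dz      (derivative of the sigmoid)
dz_dw = x                # dz/dw      (z is linear in w)
grad_w = dL_da * da_dz * dz_dw

# Central finite-difference check of the same derivative.
eps = 1e-6
L = lambda w_: (sigmoid(w_ * x + b) - y) ** 2
grad_fd = (L(w + eps) - L(w - eps)) / (2 * eps)
```

Backpropagation applies exactly this factorisation layer by layer, reusing the upstream factors so the full gradient costs about as much as one forward pass.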

Universal Approximation

For any continuous $f: \mathbb{R}^n \to \mathbb{R}$, any compact set $K \subset \mathbb{R}^n$, and any $\epsilon > 0$, there exists a single-hidden-layer network $g(x) = \sum_{i=1}^{N} \alpha_i \sigma(w_i^\top x + b_i)$ such that $\sup_{x \in K} |f(x) - g(x)| < \epsilon$.
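A quick numerical illustration, not a proof: draw random hidden-layer weights $w_i, b_i$, and fit only the output weights $\alpha_i$ by least squares. Even this crude scheme drives the sup-norm error on a grid well below the target function's scale. All choices below (tanh activation, 100 hidden units, the target $\sin x$) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-np.pi, np.pi, 200)
f = np.sin(x)                            # target function to approximate

# Random hidden layer: g(x) = sum_i alpha_i * tanh(w_i * x + b_i).
N = 100
w = rng.normal(scale=2.0, size=N)
b = rng.uniform(-np.pi, np.pi, size=N)
H = np.tanh(np.outer(x, w) + b)          # hidden activations, shape (200, N)

# Solve for the output weights alpha by linear least squares.
alpha, *_ = np.linalg.lstsq(H, f, rcond=None)
err = np.max(np.abs(H @ alpha - f))      # sup-norm error on the grid
```

The theorem guarantees such a $g$ exists for every $\epsilon$; actually finding one in practice is what gradient-based training is for.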

PINNs Loss Function

$$\mathcal{L}_{\text{PINN}} = \mathcal{L}_{\text{data}} + \lambda \mathcal{L}_{\text{PDE}}, \quad \mathcal{L}_{\text{PDE}} = \left\| \mathcal{N}[u_\theta](x,t) \right\|^2$$
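A minimal sketch of the composite loss, assuming a toy problem in place of a real PDE: the ODE residual $\mathcal{N}[u] = u' + u$ with initial condition $u(0) = 1$, a one-parameter "model" $u_\theta(x) = \theta e^{-x}$, and central finite differences standing in for the automatic differentiation a real PINN would use. Every name and choice here is illustrative.

```python
import numpy as np

# Toy problem: N[u] = u'(x) + u(x) = 0 on [0, 1], with data u(0) = 1.
# Every u_theta below satisfies the "PDE", so only the data term
# distinguishes theta; a real PINN balances both terms during training.
u = lambda x, theta: theta * np.exp(-x)   # stand-in for a neural network

def pinn_loss(theta, lam=1.0, eps=1e-4):
    x = np.linspace(0.0, 1.0, 50)         # collocation points
    # Data term: mismatch with the observed initial condition.
    loss_data = (u(0.0, theta) - 1.0) ** 2
    # PDE term: squared residual N[u] = u' + u, via central differences.
    du = (u(x + eps, theta) - u(x - eps, theta)) / (2 * eps)
    loss_pde = np.mean((du + u(x, theta)) ** 2)
    return loss_data + lam * loss_pde
```

At $\theta = 1$ both terms vanish (up to finite-difference error), recovering the exact solution $u = e^{-x}$; any other $\theta$ is penalised by the data term.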

Equivariant Networks

A function $f$ is equivariant under a group $G$ if $f(T_g x) = T_g' f(x)$ for all $g \in G$. Encoding symmetries (rotations, translations, permutations) into network architecture dramatically improves data efficiency and generalisation in scientific applications.
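The definition is easy to verify numerically for the simplest case, permutation symmetry: a map that applies the same function to every element of its input commutes with any reordering, i.e. $f(T_g x) = T_g f(x)$. The specific function below is an arbitrary illustrative choice.

```python
import numpy as np

# A shared per-element map is permutation equivariant:
# permuting the inputs and then applying f equals applying f
# and then permuting the outputs.
def f(x):
    return np.tanh(2.0 * x + 0.5)   # same function applied to every element

rng = np.random.default_rng(1)
x = rng.normal(size=6)
perm = rng.permutation(6)           # a group element g, acting by permutation

lhs = f(x[perm])                    # f(T_g x)
rhs = f(x)[perm]                    # T_g f(x)
```

Graph networks and equivariant interatomic potentials build this property in by construction, which is why they need far fewer training examples than unconstrained architectures.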

Part I: Foundations

The mathematical building blocks of machine learning — from linear models through multilayer networks. We develop gradient-based optimisation, regularisation strategies, and the universal approximation theorem that justifies why deep networks can learn complex scientific relationships.

Part II: Deep Learning Architectures

Specialised architectures for structured scientific data — convolutional networks that exploit spatial locality in images and fields, recurrent networks and transformers that model temporal sequences, autoencoders for nonlinear dimensionality reduction, and generative models (VAEs and GANs) that learn to sample from complex data distributions.

Part III: Scientific Applications

Machine learning methods designed specifically for science — physics-informed neural networks that embed differential equations into training, symbolic regression that rediscovers interpretable physical laws, ML-accelerated molecular dynamics with equivariant potentials, and data-driven cosmology from galaxy surveys to gravitational lensing.

Prerequisites

Mathematics

  • Linear Algebra (vectors, matrices, eigenvalues, SVD)
  • Multivariable Calculus (gradients, chain rule, Jacobians)
  • Basic Probability & Statistics (distributions, Bayes' theorem, MLE)

Programming

  • Python (NumPy, Matplotlib)
  • Familiarity with PyTorch or TensorFlow is helpful but not required
  • Jupyter notebooks for interactive exploration

References

  • Goodfellow, Bengio & Courville, Deep Learning (MIT Press, 2016). The standard graduate textbook on deep learning foundations, optimisation, and architectures.
  • Bishop, Pattern Recognition and Machine Learning (Springer, 2006). Rigorous Bayesian perspective on regression, classification, and neural networks.
  • Mehta et al., A high-bias, low-variance introduction to machine learning for physicists, Physics Reports 810, 1–124 (2019). Comprehensive review bridging ML and physics, covering supervised and unsupervised methods with examples from statistical mechanics and quantum physics.