Machine Learning for Science

From gradient descent to physics-informed networks — harnessing ML as a new tool for scientific discovery

Course Overview

Machine learning has become an indispensable tool across the sciences — from discovering new materials and accelerating molecular dynamics simulations to classifying galaxies and extracting physical laws from data. This course bridges the gap between the foundations of modern ML and its rapidly growing applications in physics, chemistry, biology, and astronomy.

We begin with classical supervised learning and build up to deep neural networks, covering the mathematical machinery — gradient descent, backpropagation, and the universal approximation theorem — that makes these models work. We then explore architectures tailored to scientific data: convolutional networks for images and spatial fields, recurrent networks and transformers for time series, and generative models for sampling complex distributions. The final part focuses on methods designed specifically for science: physics-informed neural networks (PINNs) that embed conservation laws into the loss function, symbolic regression that rediscovers interpretable equations, equivariant architectures that respect physical symmetries, and ML-driven simulations in molecular dynamics and cosmology.

Key Concepts & Equations

Gradient Descent

$$\theta_{t+1} = \theta_t - \eta \nabla_\theta \mathcal{L}(\theta_t)$$
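The update above can be sketched in a few lines of NumPy. Everything here (the function names, the quadratic toy loss, the step count) is illustrative rather than part of the course material: we minimise $\mathcal{L}(\theta) = (\theta - 3)^2$, whose gradient is $2(\theta - 3)$.

```python
import numpy as np

def gradient_descent(grad, theta0, eta=0.1, steps=100):
    """Iterate theta <- theta - eta * grad(theta) for a fixed number of steps."""
    theta = np.asarray(theta0, dtype=float)
    for _ in range(steps):
        theta = theta - eta * grad(theta)
    return theta

# Minimise L(theta) = (theta - 3)^2; the gradient is 2 * (theta - 3),
# so the iterates contract toward the minimiser theta = 3.
theta_star = gradient_descent(lambda t: 2 * (t - 3.0), theta0=0.0)
```

With learning rate $\eta = 0.1$ each step multiplies the distance to the minimum by $|1 - 2\eta| = 0.8$, so the iterates converge geometrically to $\theta = 3$.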

Backpropagation (Chain Rule)

$$\frac{\partial \mathcal{L}}{\partial w_{ij}^{(l)}} = \frac{\partial \mathcal{L}}{\partial a_j^{(l)}} \cdot \frac{\partial a_j^{(l)}}{\partial z_j^{(l)}} \cdot \frac{\partial z_j^{(l)}}{\partial w_{ij}^{(l)}}$$
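For a single neuron the three chain-rule factors can be written out by hand and checked against a finite-difference estimate. This toy setup (one weight, sigmoid activation, squared error) is chosen only to mirror the equation above term by term:

```python
import numpy as np

# One neuron: z = w*x + b, a = sigmoid(z), L = (a - y)^2.
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

x, y, w, b = 0.5, 1.0, 0.3, 0.1
z = w * x + b
a = sigmoid(z)

# Chain rule, factor by factor, matching the equation above:
dL_da = 2 * (a - y)      # dL/da      for the squared-error loss
da_dz = a * (1 - a)      # da/dz      (derivative of the sigmoid)
dz_dw = x                # dz/dw      (z is linear in w)
grad_w = dL_da * da_dz * dz_dw

# Central finite-difference check of the same derivative.
eps = 1e-6
L = lambda w_: (sigmoid(w_ * x + b) - y) ** 2
grad_fd = (L(w + eps) - L(w - eps)) / (2 * eps)
```

Backpropagation applies exactly this factorisation layer by layer, reusing the upstream factors so the full gradient costs about as much as one forward pass.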

Universal Approximation

For any continuous $f: \mathbb{R}^n \to \mathbb{R}$, any compact set $K \subset \mathbb{R}^n$, and any $\epsilon > 0$, there exists a single-hidden-layer network $g(x) = \sum_{i=1}^{N} \alpha_i \sigma(w_i^\top x + b_i)$ such that $\sup_{x \in K} |f(x) - g(x)| < \epsilon$.
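A quick numerical illustration, not a proof: draw random hidden-layer weights $w_i, b_i$, and fit only the output weights $\alpha_i$ by least squares. Even this crude scheme drives the sup-norm error on a grid well below the target function's scale. All choices below (tanh activation, 100 hidden units, the target $\sin x$) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-np.pi, np.pi, 200)
f = np.sin(x)                            # target function to approximate

# Random hidden layer: g(x) = sum_i alpha_i * tanh(w_i * x + b_i).
N = 100
w = rng.normal(scale=2.0, size=N)
b = rng.uniform(-np.pi, np.pi, size=N)
H = np.tanh(np.outer(x, w) + b)          # hidden activations, shape (200, N)

# Solve for the output weights alpha by linear least squares.
alpha, *_ = np.linalg.lstsq(H, f, rcond=None)
err = np.max(np.abs(H @ alpha - f))      # sup-norm error on the grid
```

The theorem guarantees such a $g$ exists for every $\epsilon$; actually finding one in practice is what gradient-based training is for.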

PINNs Loss Function

$$\mathcal{L}_{\text{PINN}} = \mathcal{L}_{\text{data}} + \lambda \mathcal{L}_{\text{PDE}}, \quad \mathcal{L}_{\text{PDE}} = \left\| \mathcal{N}[u_\theta](x,t) \right\|^2$$
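A minimal sketch of the composite loss, assuming a toy problem in place of a real PDE: the ODE residual $\mathcal{N}[u] = u' + u$ with initial condition $u(0) = 1$, a one-parameter "model" $u_\theta(x) = \theta e^{-x}$, and central finite differences standing in for the automatic differentiation a real PINN would use. Every name and choice here is illustrative.

```python
import numpy as np

# Toy problem: N[u] = u'(x) + u(x) = 0 on [0, 1], with data u(0) = 1.
# Every u_theta below satisfies the "PDE", so only the data term
# distinguishes theta; a real PINN balances both terms during training.
u = lambda x, theta: theta * np.exp(-x)   # stand-in for a neural network

def pinn_loss(theta, lam=1.0, eps=1e-4):
    x = np.linspace(0.0, 1.0, 50)         # collocation points
    # Data term: mismatch with the observed initial condition.
    loss_data = (u(0.0, theta) - 1.0) ** 2
    # PDE term: squared residual N[u] = u' + u, via central differences.
    du = (u(x + eps, theta) - u(x - eps, theta)) / (2 * eps)
    loss_pde = np.mean((du + u(x, theta)) ** 2)
    return loss_data + lam * loss_pde
```

At $\theta = 1$ both terms vanish (up to finite-difference error), recovering the exact solution $u = e^{-x}$; any other $\theta$ is penalised by the data term.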

Equivariant Networks

A function $f$ is equivariant under a group $G$ if $f(T_g x) = T_g' f(x)$ for all $g \in G$. Encoding symmetries (rotations, translations, permutations) into network architecture dramatically improves data efficiency and generalisation in scientific applications.
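The definition is easy to verify numerically for the simplest case, permutation symmetry: a map that applies the same function to every element of its input commutes with any reordering, i.e. $f(T_g x) = T_g f(x)$. The specific function below is an arbitrary illustrative choice.

```python
import numpy as np

# A shared per-element map is permutation equivariant:
# permuting the inputs and then applying f equals applying f
# and then permuting the outputs.
def f(x):
    return np.tanh(2.0 * x + 0.5)   # same function applied to every element

rng = np.random.default_rng(1)
x = rng.normal(size=6)
perm = rng.permutation(6)           # a group element g, acting by permutation

lhs = f(x[perm])                    # f(T_g x)
rhs = f(x)[perm]                    # T_g f(x)
```

Graph networks and equivariant interatomic potentials build this property in by construction, which is why they need far fewer training examples than unconstrained architectures.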

Part I: Foundations

The mathematical building blocks of machine learning — from linear models through multilayer networks. We develop gradient-based optimisation, regularisation strategies, and the universal approximation theorem that justifies why deep networks can learn complex scientific relationships.

Part II: Deep Learning Architectures

Specialised architectures for structured scientific data — convolutional networks that exploit spatial locality in images and fields, recurrent networks and transformers that model temporal sequences, autoencoders for nonlinear dimensionality reduction, and generative models (VAEs and GANs) that learn to sample from complex data distributions.

Part III: Scientific Applications

Machine learning methods designed specifically for science — physics-informed neural networks that embed differential equations into training, symbolic regression that rediscovers interpretable physical laws, ML-accelerated molecular dynamics with equivariant potentials, and data-driven cosmology from galaxy surveys to gravitational lensing.

Prerequisites

Mathematics

  • Linear Algebra (vectors, matrices, eigenvalues, SVD)
  • Multivariable Calculus (gradients, chain rule, Jacobians)
  • Basic Probability & Statistics (distributions, Bayes' theorem, MLE)

Programming

  • Python (NumPy, Matplotlib)
  • Familiarity with PyTorch or TensorFlow is helpful but not required
  • Jupyter notebooks for interactive exploration

References

  • Goodfellow, Bengio & Courville, Deep Learning (MIT Press, 2016). The standard graduate textbook on deep learning foundations, optimisation, and architectures.
  • Bishop, Pattern Recognition and Machine Learning (Springer, 2006). Rigorous Bayesian perspective on regression, classification, and neural networks.
  • Mehta et al., A high-bias, low-variance introduction to machine learning for physicists, Physics Reports 810, 1–124 (2019). Comprehensive review bridging ML and physics, covering supervised and unsupervised methods with examples from statistical mechanics and quantum physics.