Part IV: Unsupervised Learning
Unsupervised learning uncovers structure in data without labels. This part covers the three pillars of the field: clustering (finding groups), dimensionality reduction (finding compact representations), and generative modelling (learning the data distribution itself). Every algorithm is derived from first principles with full mathematical rigour.
Chapter 10: Clustering: K-Means & GMM
The K-means objective and Lloyd's algorithm, Gaussian mixture models, and the full EM algorithm derived from the ELBO, plus the deep connection between K-means and GMM.
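As a preview of Chapter 10, Lloyd's algorithm alternates between assigning each point to its nearest centroid and moving each centroid to the mean of its assigned points. A minimal NumPy sketch (the function name `kmeans` and the convergence check are ours, not the book's code):

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    # Initialise centroids as k distinct data points chosen at random.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assignment step: label each point with its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: move each centroid to the mean of its cluster
        # (keeping the old centroid if a cluster ends up empty).
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centroids[j] for j in range(k)])
        if np.allclose(new, centroids):
            break  # centroids stopped moving: Lloyd's algorithm has converged
        centroids = new
    return centroids, labels
```

Each step never increases the K-means objective, which is why the loop terminates; the full derivation, and the view of this loop as hard-assignment EM, is in the chapter.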
Chapter 11: Dimensionality Reduction
PCA from variance maximisation via eigendecomposition, kernel PCA, and manifold methods t-SNE and UMAP for non-linear structure discovery.
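The eigendecomposition route to PCA from Chapter 11 can be sketched directly: centre the data, eigendecompose the sample covariance, and project onto the top-k eigenvectors. A minimal sketch under those assumptions (the helper name `pca` is ours):

```python
import numpy as np

def pca(X, k):
    # Centre the data so the covariance captures variance about the mean.
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / (len(X) - 1)          # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh: ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:k]   # top-k variance directions
    W = eigvecs[:, order]
    return Xc @ W, eigvals[order]           # projected scores, explained variances
```

The eigenvalues are the variances captured along each principal direction, which is exactly the variance-maximisation view the chapter derives.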
Chapter 12: Autoencoders & VAEs
Vanilla autoencoders, denoising autoencoders, and the full VAE derivation: ELBO, reparameterisation trick, and KL divergence between Gaussians.
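Two ingredients of the VAE derivation in Chapter 12 are small enough to preview here: the reparameterisation trick, which rewrites a sample from N(mu, sigma^2) as a deterministic function of (mu, log sigma^2) plus external noise so gradients can flow through the parameters, and the closed-form KL divergence between a diagonal Gaussian and the standard normal. A minimal sketch (function names are ours, not the book's):

```python
import numpy as np

def reparameterise(mu, log_var, rng):
    # z = mu + sigma * eps with eps ~ N(0, I): the randomness lives in eps,
    # so z is differentiable with respect to mu and log_var.
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    # Closed-form KL( N(mu, diag(sigma^2)) || N(0, I) ), summed over dimensions:
    # 0.5 * sum( sigma^2 + mu^2 - 1 - log sigma^2 ).
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)
```

Note that the KL term is zero exactly when mu = 0 and log_var = 0, i.e. when the approximate posterior already equals the prior.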
Prerequisites
Parts I–III. You should be comfortable with matrix eigendecomposition, maximum likelihood estimation, Bayes' theorem, and gradient descent (all introduced in Part I). Familiarity with the Gaussian distribution is essential for Chapters 10 and 12.