Part II: Spectral Theory | Chapter 2

Diagonalization

Reducing matrices to their simplest form via eigenvector bases

Historical Context

The idea of reducing a matrix to diagonal form traces back to the work of Cauchy (1829) on quadratic forms and Jacobi (1846) on iterative diagonalization algorithms. William Rowan Hamilton proved a special case of the theorem now bearing their names for quaternions (1853), and Arthur Cayley stated the general matrix result (1858), verifying it for the $2 \times 2$ and $3 \times 3$ cases: every square matrix satisfies its own characteristic equation. This result connects the algebraic structure of eigenvalues to the analytic properties of matrix functions.

Diagonalization became a central tool in 20th-century mathematics and physics. In quantum mechanics, the measurement postulate requires diagonalizing Hermitian operators to find observable values. In dynamical systems, diagonalization decouples differential equations. The realization that not all matrices are diagonalizable led to the Jordan normal form, a deeper canonical structure explored in Part III.

2.1 Diagonalizability Criteria

Definition: Diagonalizable Matrix

A matrix $A \in \mathbb{F}^{n \times n}$ is diagonalizable if there exists an invertible matrix $P$ such that $P^{-1}AP = D$ is diagonal. Equivalently, $A = PDP^{-1}$ where the columns of $P$ are eigenvectors and $D = \text{diag}(\lambda_1, \ldots, \lambda_n)$.

A matrix is diagonalizable if and only if it has a complete set of $n$ linearly independent eigenvectors. Several sufficient conditions guarantee diagonalizability:

  • $A$ has $n$ distinct eigenvalues (sufficient, not necessary)
  • $A$ is symmetric ($A = A^T$) — the Spectral Theorem guarantees orthogonal diagonalization
  • $A$ is normal ($A^*A = AA^*$) — unitarily diagonalizable
  • For every eigenvalue, geometric multiplicity equals algebraic multiplicity

Theorem: Diagonalizability Criterion

$A$ is diagonalizable over $\mathbb{F}$ if and only if the characteristic polynomial splits over $\mathbb{F}$ and for each eigenvalue $\lambda$, the geometric multiplicity equals the algebraic multiplicity: $\dim\ker(A - \lambda I) = \text{mult}_{\chi_A}(\lambda)$.
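The criterion can be checked numerically by comparing multiplicities. A minimal sketch using NumPy; the eigenvalue-grouping tolerance `1e-6` is an arbitrary choice and assumes eigenvalues are either well separated or numerically coincident:

```python
import numpy as np

def is_diagonalizable(A, tol=1e-9):
    """Compare geometric and algebraic multiplicity for each
    distinct eigenvalue of A (over the complex numbers)."""
    A = np.asarray(A, dtype=complex)
    n = A.shape[0]
    remaining = list(np.linalg.eigvals(A))
    while remaining:
        lam = remaining[0]
        # algebraic multiplicity: count numerically close eigenvalues
        alg_mult = sum(abs(m - lam) < 1e-6 for m in remaining)
        remaining = [m for m in remaining if abs(m - lam) >= 1e-6]
        # geometric multiplicity: dim ker(A - lam I) = n - rank(A - lam I)
        geo_mult = n - np.linalg.matrix_rank(A - lam * np.eye(n), tol=tol)
        if geo_mult < alg_mult:
            return False
    return True

print(is_diagonalizable([[2, 0], [0, 3]]))   # distinct eigenvalues -> True
print(is_diagonalizable([[1, 1], [0, 1]]))   # Jordan block -> False
```

Note that working over $\mathbb{C}$ sidesteps the splitting condition; over $\mathbb{R}$, a real matrix with non-real eigenvalues is not diagonalizable even when the multiplicities agree.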

2.2 The Diagonalization Algorithm

Given $A \in \mathbb{R}^{n \times n}$, the diagonalization procedure is:

  1. Compute the characteristic polynomial $p(\lambda) = \det(A - \lambda I)$
  2. Find all roots $\lambda_1, \ldots, \lambda_k$ with multiplicities
  3. For each $\lambda_i$, solve $(A - \lambda_i I)v = 0$ to find the eigenspace
  4. Check that $\sum_i \dim E_{\lambda_i} = n$; if so, form $P$ from eigenvectors

The diagonal matrix $D$ contains the eigenvalues on its diagonal (in the order matching the columns of $P$). The transformation $P$ is not unique: any basis of each eigenspace works, and the eigenvalues in $D$ can be permuted (with corresponding reordering of columns in $P$).
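The four steps above can be carried out with `np.linalg.eig`, which performs steps 1–3 in one call and returns the eigenvalues alongside a matrix whose columns are eigenvectors. A sketch for an arbitrarily chosen example matrix; in floating point, a rank check stands in for step 4's exact linear-independence test:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])          # eigenvalues 5 and 2

# Steps 1-3: characteristic polynomial, roots, and eigenspaces
eigvals, P = np.linalg.eig(A)      # columns of P are eigenvectors
D = np.diag(eigvals)

# Step 4: P is invertible iff the eigenvectors are linearly independent
assert np.linalg.matrix_rank(P) == A.shape[0]

# Verify the factorization A = P D P^{-1}
assert np.allclose(A, P @ D @ np.linalg.inv(P))
print("eigenvalues:", eigvals)
```

Permuting the entries of `eigvals` (and the columns of `P` accordingly) yields another valid factorization, illustrating the non-uniqueness noted above.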

2.3 The Cayley-Hamilton Theorem

Theorem (Cayley-Hamilton)

Every square matrix satisfies its own characteristic equation. If $p(\lambda) = \det(\lambda I - A) = \lambda^n + c_{n-1}\lambda^{n-1} + \cdots + c_0$, then $p(A) = A^n + c_{n-1}A^{n-1} + \cdots + c_0 I = 0$.

For diagonalizable matrices, the proof is immediate: if $A = PDP^{-1}$, then $p(A) = Pp(D)P^{-1}$, and $p(D) = \text{diag}(p(\lambda_1), \ldots, p(\lambda_n)) = 0$ since each $\lambda_i$ is a root of $p$. The general proof uses the adjugate matrix.

A powerful consequence: $A^{-1}$ can be expressed as a polynomial in $A$. From $p(A) = 0$, we get $A^n = -(c_{n-1}A^{n-1} + \cdots + c_0 I)$, and if $c_0 \neq 0$ (i.e., $A$ is invertible), then $A^{-1} = -\frac{1}{c_0}(A^{n-1} + c_{n-1}A^{n-2} + \cdots + c_1 I)$.
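Both facts are easy to verify numerically. A sketch using `np.poly`, which returns the characteristic polynomial coefficients of a matrix in descending order; the example matrix is an arbitrary choice:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
n = A.shape[0]

# Coefficients of p(lambda) = lambda^n + c_{n-1} lambda^{n-1} + ... + c_0
coeffs = np.poly(A)                 # leading coefficient is 1

# Cayley-Hamilton: p(A) = 0
pA = sum(c * np.linalg.matrix_power(A, n - k) for k, c in enumerate(coeffs))
assert np.allclose(pA, np.zeros((n, n)))

# Inverse as a polynomial in A:
# A^{-1} = -(1/c_0)(A^{n-1} + c_{n-1} A^{n-2} + ... + c_1 I)
c0 = coeffs[-1]
Ainv = -(1.0 / c0) * sum(c * np.linalg.matrix_power(A, n - 1 - k)
                         for k, c in enumerate(coeffs[:-1]))
assert np.allclose(Ainv, np.linalg.inv(A))
```

Here $c_0 = \det(A) \cdot (-1)^n \cdot (-1)^n = \det(-A)(-1)^n$ up to sign conventions; the assertion against `np.linalg.inv` confirms the formula regardless.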

2.4 Matrix Powers and Exponentials

The primary computational advantage of diagonalization is that powers become trivial:

$$A^k = PD^kP^{-1} = P\begin{pmatrix} \lambda_1^k & & \\ & \ddots & \\ & & \lambda_n^k \end{pmatrix}P^{-1}$$

This extends to any analytic function $f$ via the matrix function:

$$f(A) = Pf(D)P^{-1} = P\begin{pmatrix} f(\lambda_1) & & \\ & \ddots & \\ & & f(\lambda_n) \end{pmatrix}P^{-1}$$

The matrix exponential $e^{tA}$ is particularly important for solving linear ODE systems $\dot{x} = Ax$, where the solution is $x(t) = e^{tA}x_0$. Via diagonalization:

$$e^{tA} = P\begin{pmatrix} e^{t\lambda_1} & & \\ & \ddots & \\ & & e^{t\lambda_n} \end{pmatrix}P^{-1}$$

2.5 Simultaneous Diagonalization

Two diagonalizable matrices $A$ and $B$ can be simultaneously diagonalized (by the same matrix $P$) if and only if they commute: $AB = BA$.

Theorem: Simultaneous Diagonalization

Let $A, B \in \mathbb{C}^{n \times n}$ be diagonalizable. There exists an invertible $P$ such that both $P^{-1}AP$ and $P^{-1}BP$ are diagonal if and only if $AB = BA$.

This result has profound implications in quantum mechanics: commuting observables (Hermitian operators) can be simultaneously measured, corresponding to shared eigenstates. The proof proceeds by showing that $B$ maps each eigenspace of $A$ to itself, so one can diagonalize $B$ within each eigenspace of $A$.

When real symmetric matrices commute, simultaneous diagonalization can always be achieved with a single orthogonal matrix $P$. Orthogonal diagonalization of a symmetric matrix (such as a covariance matrix) underlies principal component analysis, and joint diagonalization of commuting matrices appears in many signal processing algorithms.
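A minimal sketch of the commuting symmetric case, where $A$ is chosen with distinct eigenvalues so that its orthonormal eigenvector basis (from `np.linalg.eigh`) must also diagonalize $B$. The example matrices are arbitrary; the general algorithm, which recurses within repeated eigenspaces as in the proof outline above, is more involved:

```python
import numpy as np

# Two commuting symmetric matrices: B is a polynomial in A, so AB = BA
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])          # distinct eigenvalues 1 and 3
B = A @ A - 3.0 * A
assert np.allclose(A @ B, B @ A)

# Orthonormal eigenvectors of A (eigh returns an orthogonal Q for symmetric A)
eigvals_A, Q = np.linalg.eigh(A)

# The same Q diagonalizes B: its off-diagonal entries vanish
DB = Q.T @ B @ Q
assert np.allclose(DB, np.diag(np.diag(DB)))
```

Because $A$ has distinct eigenvalues, each eigenspace is one-dimensional and $B$ must act on it by a scalar, which is exactly why the single basis $Q$ works here.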

Computational Laboratory

This simulation demonstrates the diagonalization algorithm, verifies the Cayley-Hamilton theorem, computes matrix powers and exponentials via diagonalization, and explores simultaneous diagonalization of commuting symmetric matrices.

Diagonalization: Algorithm, Powers, and Exponentials
