Diagonalization
Reducing matrices to their simplest form via eigenvector bases
Historical Context
The idea of reducing a matrix to diagonal form traces back to the work of Cauchy (1829) on quadratic forms and Jacobi (1846) on iterative diagonalization algorithms. Arthur Cayley and William Rowan Hamilton independently discovered the theorem bearing their names in the 1850s, establishing that every matrix satisfies its own characteristic equation. This result connects the algebraic structure of eigenvalues to the analytic properties of matrix functions.
Diagonalization became a central tool in 20th-century mathematics and physics. In quantum mechanics, the measurement postulate requires diagonalizing Hermitian operators to find observable values. In dynamical systems, diagonalization decouples differential equations. The realization that not all matrices are diagonalizable led to the Jordan normal form, a deeper canonical structure explored in Part III.
2.1 Diagonalizability Criteria
Definition: Diagonalizable Matrix
A matrix $A \in \mathbb{F}^{n \times n}$ is diagonalizable if there exists an invertible matrix $P$ such that $P^{-1}AP = D$ is diagonal. Equivalently, $A = PDP^{-1}$, where the columns of $P$ are eigenvectors and $D = \text{diag}(\lambda_1, \ldots, \lambda_n)$.
A matrix is diagonalizable if and only if it has a complete set of $n$ linearly independent eigenvectors. Several sufficient conditions guarantee diagonalizability:
- $A$ has $n$ distinct eigenvalues (sufficient, not necessary)
- $A$ is symmetric ($A = A^T$) — the Spectral Theorem guarantees orthogonal diagonalization
- $A$ is normal ($A^*A = AA^*$) — unitarily diagonalizable
- For every eigenvalue, geometric multiplicity equals algebraic multiplicity
Theorem: Diagonalizability Criterion
$A$ is diagonalizable over $\mathbb{F}$ if and only if the characteristic polynomial splits over $\mathbb{F}$ and for each eigenvalue $\lambda$, the geometric multiplicity equals the algebraic multiplicity: $\dim\ker(A - \lambda I) = \text{mult}_{\chi_A}(\lambda)$.
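The criterion can also be checked numerically. A minimal sketch, assuming NumPy: testing whether the eigenvector matrix returned by `np.linalg.eig` has full rank is a practical (tolerance-dependent) proxy for having $n$ linearly independent eigenvectors.

```python
import numpy as np

def is_diagonalizable(A, tol=1e-9):
    """A is diagonalizable iff its eigenvectors span F^n, i.e. iff the
    eigenvector matrix returned by np.linalg.eig has full rank."""
    n = A.shape[0]
    _, V = np.linalg.eig(A)
    return np.linalg.matrix_rank(V, tol=tol) == n

# A 2x2 matrix with distinct eigenvalues is diagonalizable...
print(is_diagonalizable(np.array([[2.0, 1.0], [0.0, 3.0]])))  # True
# ...but a Jordan block (repeated eigenvalue, one eigenvector) is not.
print(is_diagonalizable(np.array([[2.0, 1.0], [0.0, 2.0]])))  # False
```

Note that floating-point rank is tolerance-dependent; for nearly defective matrices the answer can flip with `tol`, which mirrors the fact that diagonalizability is not a numerically stable property.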
2.2 The Diagonalization Algorithm
Given $A \in \mathbb{R}^{n \times n}$, the diagonalization procedure is:
- Compute the characteristic polynomial $p(\lambda) = \det(A - \lambda I)$
- Find all roots $\lambda_1, \ldots, \lambda_k$ with multiplicities
- For each $\lambda_i$, solve $(A - \lambda_i I)v = 0$ to find the eigenspace
- Check that $\sum_i \dim E_{\lambda_i} = n$; if so, form $P$ from eigenvectors
The diagonal matrix $D$ contains the eigenvalues on its diagonal (in the order matching the columns of $P$). The transformation $P$ is not unique: any basis of each eigenspace works, and the eigenvalues in $D$ can be permuted (with corresponding reordering of columns in $P$).
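The four steps map directly onto standard library calls. A sketch with NumPy, using a hypothetical $2 \times 2$ example matrix:

```python
import numpy as np

# Hypothetical example matrix; eigenvalues are 2 and 5.
A = np.array([[4.0, 1.0], [2.0, 3.0]])

# Steps 1-3: np.linalg.eig finds the roots of det(A - lambda*I) and
# a basis vector for each eigenspace in one call.
eigvals, P = np.linalg.eig(A)

# Step 4: if the eigenvectors span R^n, P is invertible and A = P D P^{-1}.
D = np.diag(eigvals)
print(np.allclose(P @ D @ np.linalg.inv(P), A))  # True
print(np.round(sorted(eigvals), 6))              # eigenvalues 2 and 5
```

As noted above, $P$ is not unique: `eig` returns unit-norm eigenvectors in an implementation-defined order, and any rescaling or consistent reordering of the columns (with matching permutation of $D$) is equally valid.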
2.3 The Cayley-Hamilton Theorem
Theorem (Cayley-Hamilton)
Every square matrix satisfies its own characteristic equation. If $p(\lambda) = \det(\lambda I - A) = \lambda^n + c_{n-1}\lambda^{n-1} + \cdots + c_0$, then $p(A) = A^n + c_{n-1}A^{n-1} + \cdots + c_0 I = 0$.
For diagonalizable matrices, the proof is immediate: if $A = PDP^{-1}$, then $p(A) = Pp(D)P^{-1}$, and $p(D) = \text{diag}(p(\lambda_1), \ldots, p(\lambda_n)) = 0$ since each $\lambda_i$ is a root of $p$. The general proof uses the adjugate matrix.
A powerful consequence: $A^{-1}$ can be expressed as a polynomial in $A$. From $p(A) = 0$, we get $A^n = -(c_{n-1}A^{n-1} + \cdots + c_0 I)$, and if $c_0 \neq 0$ (i.e., $A$ is invertible), then $A^{-1} = -\frac{1}{c_0}(A^{n-1} + c_{n-1}A^{n-2} + \cdots + c_1 I)$.
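Both the theorem and the inverse formula are easy to verify numerically. A sketch with NumPy (`np.poly` returns the characteristic polynomial coefficients, highest degree first), reusing a hypothetical $2 \times 2$ matrix whose characteristic polynomial is $\lambda^2 - 7\lambda + 10$:

```python
import numpy as np

A = np.array([[4.0, 1.0], [2.0, 3.0]])
n = A.shape[0]

# Coefficients of det(lambda*I - A), highest degree first: [1, c_1, c_0].
c = np.poly(A)                     # here: [1, -7, 10]

# Cayley-Hamilton: p(A) = A^2 + c_1*A + c_0*I = 0.
pA = A @ A + c[1] * A + c[2] * np.eye(n)
print(np.allclose(pA, 0))          # True

# Since c_0 = 10 != 0, the inverse formula for n = 2 reads
# A^{-1} = -(1/c_0) * (A + c_1 * I).
A_inv = -(A + c[1] * np.eye(n)) / c[2]
print(np.allclose(A_inv @ A, np.eye(n)))  # True
```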
2.4 Matrix Powers and Exponentials
The primary computational advantage of diagonalization is that powers become trivial: if $A = PDP^{-1}$, then $A^k = PD^kP^{-1}$, since the interior $P^{-1}P$ factors telescope, and $D^k = \text{diag}(\lambda_1^k, \ldots, \lambda_n^k)$ requires only scalar powers.
This extends to any analytic function $f$ via the matrix function $f(A) = P\,\text{diag}(f(\lambda_1), \ldots, f(\lambda_n))\,P^{-1}$.
The matrix exponential $e^{tA}$ is particularly important for solving linear ODE systems $\dot{x} = Ax$, where the solution is $x(t) = e^{tA}x_0$. Via diagonalization: $e^{tA} = P e^{tD} P^{-1}$ with $e^{tD} = \text{diag}(e^{\lambda_1 t}, \ldots, e^{\lambda_n t})$.
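A sketch of both computations with NumPy, checking the power against `np.linalg.matrix_power` and the exponential against a truncated power series (the example matrix is hypothetical):

```python
import math
import numpy as np

A = np.array([[4.0, 1.0], [2.0, 3.0]])
eigvals, P = np.linalg.eig(A)
P_inv = np.linalg.inv(P)

# Powers: A^k = P D^k P^{-1} needs only scalar powers of the eigenvalues.
k = 5
A_k = P @ np.diag(eigvals**k) @ P_inv
print(np.allclose(A_k, np.linalg.matrix_power(A, k)))  # True

# Exponential: e^{tA} = P diag(e^{t*lambda_i}) P^{-1}.
t = 0.1
exp_tA = P @ np.diag(np.exp(t * eigvals)) @ P_inv

# Sanity check against the power series sum_j (tA)^j / j!.
series = sum(np.linalg.matrix_power(t * A, j) / math.factorial(j)
             for j in range(20))
print(np.allclose(exp_tA, series))  # True
```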
2.5 Simultaneous Diagonalization
Two diagonalizable matrices $A$ and $B$ can be simultaneously diagonalized (by the same matrix $P$) if and only if they commute: $AB = BA$.
Theorem: Simultaneous Diagonalization
Let $A, B \in \mathbb{C}^{n \times n}$ be diagonalizable. There exists an invertible $P$ such that both $P^{-1}AP$ and $P^{-1}BP$ are diagonal if and only if $AB = BA$.
This result has profound implications in quantum mechanics: commuting observables (Hermitian operators) can be simultaneously measured, corresponding to shared eigenstates. The proof proceeds by showing that $B$ maps each eigenspace of $A$ to itself, so one can diagonalize $B$ within each eigenspace of $A$.
For real symmetric matrices, simultaneous diagonalization can always be achieved with an orthogonal matrix $P$ when the matrices commute, which is the basis of principal component analysis and many signal processing algorithms.
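A sketch with NumPy: taking $B$ to be a polynomial in $A$ guarantees $AB = BA$, and since this hypothetical symmetric $A$ has distinct eigenvalues, its orthonormal eigenvector basis (from `np.linalg.eigh`) diagonalizes $B$ as well.

```python
import numpy as np

# Hypothetical commuting pair: B is a polynomial in A, so AB = BA.
A = np.array([[2.0, 1.0], [1.0, 2.0]])   # symmetric, eigenvalues 1 and 3
B = A @ A - A                            # also symmetric, commutes with A
print(np.allclose(A @ B, B @ A))         # True

# The orthogonal eigenvector matrix of A (Q.T @ Q = I) diagonalizes
# every matrix commuting with A, since A's eigenvalues are distinct.
_, Q = np.linalg.eigh(A)

def off_diagonal_norm(M):
    """Frobenius norm of M with its diagonal zeroed out."""
    return np.linalg.norm(M - np.diag(np.diag(M)))

print(off_diagonal_norm(Q.T @ A @ Q) < 1e-10)  # True
print(off_diagonal_norm(Q.T @ B @ Q) < 1e-10)  # True
```

When $A$ has repeated eigenvalues, an arbitrary eigenbasis of $A$ no longer suffices; as the proof sketch above indicates, one must additionally diagonalize $B$ restricted to each eigenspace of $A$.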
Computational Laboratory
This simulation demonstrates the diagonalization algorithm, verifies the Cayley-Hamilton theorem, computes matrix powers and exponentials via diagonalization, and explores simultaneous diagonalization of commuting symmetric matrices.
Diagonalization: Algorithm, Powers, and Exponentials