Systems of Linear Equations
Gaussian elimination, LU decomposition, existence/uniqueness, and the Fredholm alternative
4.1 Gaussian Elimination
Gaussian elimination is the workhorse algorithm for solving systems of linear equations. It transforms the augmented matrix $[A \mid \mathbf{b}]$ into row echelon form (REF) using three elementary row operations, then extracts the solution by back substitution.
Elementary Row Operations
- Type I: Swap two rows: $R_i \leftrightarrow R_j$
- Type II: Scale a row: $R_i \to c R_i$ ($c \neq 0$)
- Type III: Add a multiple of one row to another: $R_i \to R_i + c R_j$
Each elementary row operation corresponds to left-multiplication by an elementary matrix. If $E_1, E_2, \ldots, E_k$ are the elementary matrices used, then:

$$E_k \cdots E_2 E_1 A = U$$

where $U$ is upper triangular. This is the essence of the LU decomposition: $A = (E_k \cdots E_1)^{-1} U = LU$.
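The reduction to upper triangular form can be sketched in a few lines of Python. This is a minimal illustration using only Type III operations on an example matrix chosen here (no pivoting, so it assumes nonzero pivots):

```python
def forward_eliminate(A):
    """Reduce A to upper triangular form U using Type III row operations.

    Returns U and the list of operations (i, j, c) meaning R_i -> R_i + c*R_j.
    """
    U = [row[:] for row in A]          # work on a copy
    n = len(U)
    ops = []
    for k in range(n - 1):             # pivot column k
        for i in range(k + 1, n):      # rows below the pivot
            c = -U[i][k] / U[k][k]     # multiplier for R_i -> R_i + c*R_k
            ops.append((i, k, c))
            for j in range(k, n):
                U[i][j] += c * U[k][j]
    return U, ops

A = [[2.0, 1.0, 1.0],
     [4.0, 3.0, 3.0],
     [8.0, 7.0, 9.0]]
U, ops = forward_eliminate(A)
# U is now upper triangular: every entry below the diagonal is zero
```

Each recorded operation corresponds to one elementary matrix $E_i$; composing them in order reproduces the factorization above.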
Derivation 1: Operation Count for Gaussian Elimination
We derive the computational cost of Gaussian elimination on an $n \times n$ system. At step $k$, we eliminate $n - k$ entries below the pivot in column $k$. Each elimination requires $n - k$ multiplications and additions across the remaining $n - k$ columns:

$$\sum_{k=1}^{n-1} 2(n-k)^2 = 2 \sum_{m=1}^{n-1} m^2 = \frac{2}{3}n^3 + O(n^2)$$

Back substitution costs an additional $O(n^2)$ operations. Thus, solving $A\mathbf{x} = \mathbf{b}$ requires $\frac{2}{3}n^3 + O(n^2)$ floating-point operations. This cubic scaling makes dense Gaussian elimination practical for systems with up to tens of thousands of unknowns; larger systems are typically sparse, and practical solvers exploit that sparsity.
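The operation count above can be checked directly. The sketch below tallies the exact number of multiplications and additions in the elimination loop and compares it with the $\frac{2}{3}n^3$ leading term:

```python
def elimination_flops(n):
    """Exact multiply/add count for forward elimination on an n x n matrix."""
    flops = 0
    for k in range(1, n):      # step k works on an (n-k) x (n-k) trailing block
        m = n - k
        flops += 2 * m * m     # m rows, each with m multiplies and m adds
    return flops

n = 200
exact = elimination_flops(n)
leading = 2 * n**3 / 3
ratio = exact / leading        # approaches 1 as n grows
```

For $n = 200$ the exact count already agrees with the leading term to within about one percent.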
4.2 LU Decomposition
Theorem: LU Decomposition
If $A$ can be reduced to upper triangular form without row swaps, then $A = LU$ where:
- $L$ is lower triangular with ones on the diagonal (unit lower triangular)
- $U$ is upper triangular
In general, with row pivoting: $PA = LU$ where $P$ is a permutation matrix.
Derivation 2: LU from Gaussian Elimination
The multipliers $\ell_{ij} = a_{ij}^{(j)} / a_{jj}^{(j)}$ used during elimination become the entries of $L$. Specifically:

$$L = \begin{bmatrix} 1 & & & \\ \ell_{21} & 1 & & \\ \vdots & \ddots & \ddots & \\ \ell_{n1} & \cdots & \ell_{n,n-1} & 1 \end{bmatrix}$$

This elegant result holds because the elementary elimination matrices $E_k$ are lower triangular, their inverses are obtained by negating the subdiagonal entries, and the product $E_1^{-1} E_2^{-1} \cdots E_{n-1}^{-1} = L$ simply assembles all multipliers into a single lower triangular matrix.
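The claim that the multipliers assemble into $L$ can be verified numerically. A minimal Doolittle-style sketch without pivoting, on an illustrative 3x3 matrix that needs no row swaps:

```python
def lu_nopivot(A):
    """LU factorization without pivoting; multipliers are stored in L."""
    n = len(A)
    U = [row[:] for row in A]
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for k in range(n - 1):
        for i in range(k + 1, n):
            L[i][k] = U[i][k] / U[k][k]      # multiplier becomes an entry of L
            for j in range(k, n):
                U[i][j] -= L[i][k] * U[k][j]
    return L, U

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[2.0, 1.0, 1.0],
     [4.0, 3.0, 3.0],
     [8.0, 7.0, 9.0]]
L, U = lu_nopivot(A)
LU = matmul(L, U)    # reproduces A entrywise
```

No extra work is done to build $L$: the multipliers computed during elimination are simply written into the subdiagonal slots.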
Advantage of LU
Once $A = LU$ is computed ($O(n^3)$), solving $A\mathbf{x} = \mathbf{b}$ for any new right-hand side $\mathbf{b}$ requires only two triangular solves ($O(n^2)$). This is crucial when solving many systems with the same coefficient matrix, as in time-stepping PDE solvers.
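The two triangular solves can be sketched directly. Here $L$ and $U$ are entered as literals (they factor the matrix $A = [[2,1,1],[4,3,3],[8,7,9]]$ used as a running example in this sketch), and the same factors are reused for two different right-hand sides:

```python
def forward_sub(L, b):
    """Solve L y = b with L unit lower triangular."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = b[i] - sum(L[i][j] * y[j] for j in range(i))
    return y

def back_sub(U, y):
    """Solve U x = y with U upper triangular."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(U[i][j] * x[j] for j in range(i + 1, n))) / U[i][i]
    return x

L = [[1.0, 0.0, 0.0], [2.0, 1.0, 0.0], [4.0, 3.0, 1.0]]
U = [[2.0, 1.0, 1.0], [0.0, 1.0, 1.0], [0.0, 0.0, 2.0]]
# Reuse the same factors for two right-hand sides:
x1 = back_sub(U, forward_sub(L, [4.0, 10.0, 24.0]))
x2 = back_sub(U, forward_sub(L, [1.0, 3.0, 7.0]))
```

Each extra right-hand side costs only the two $O(n^2)$ loops above; the $O(n^3)$ factorization is never repeated.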
4.3 Existence and Uniqueness
For the system $A\mathbf{x} = \mathbf{b}$ with $A \in \mathbb{F}^{m \times n}$, three cases arise:
Rouché–Capelli Theorem
- No solution: $\text{rank}(A) < \text{rank}([A \mid \mathbf{b}])$ — the system is inconsistent
- Unique solution: $\text{rank}(A) = \text{rank}([A \mid \mathbf{b}]) = n$ (number of unknowns)
- Infinite solutions: $\text{rank}(A) = \text{rank}([A \mid \mathbf{b}]) < n$ — the solution set is an affine subspace of dimension $n - \text{rank}(A)$
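The three cases can be distinguished mechanically by comparing ranks. In this sketch, rank is computed by row reduction with a small tolerance (an assumption suitable only for small dense examples), and the three systems tested are illustrative choices:

```python
def rank(M, tol=1e-10):
    """Numerical rank of M via row reduction with partial pivoting."""
    M = [row[:] for row in M]
    m, n = len(M), len(M[0])
    r = 0
    for col in range(n):
        piv = max(range(r, m), key=lambda i: abs(M[i][col]), default=None)
        if piv is None or abs(M[piv][col]) < tol:
            continue                      # no usable pivot in this column
        M[r], M[piv] = M[piv], M[r]
        for i in range(r + 1, m):
            c = M[i][col] / M[r][col]
            for j in range(col, n):
                M[i][j] -= c * M[r][j]
        r += 1
    return r

def classify(A, b):
    """Rouché–Capelli classification of the system A x = b."""
    n = len(A[0])
    aug = [row + [bi] for row, bi in zip(A, b)]
    rA, rAb = rank(A), rank(aug)
    if rA < rAb:
        return "no solution"
    return "unique" if rA == n else "infinite"

c1 = classify([[1.0, 0.0], [0.0, 1.0]], [1.0, 2.0])   # consistent, full rank
c2 = classify([[1.0, 1.0], [2.0, 2.0]], [1.0, 3.0])   # inconsistent
c3 = classify([[1.0, 1.0], [2.0, 2.0]], [1.0, 2.0])   # rank-deficient, consistent
```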
Derivation 3: Structure of the Solution Set
If $\mathbf{x}_p$ is any particular solution to $A\mathbf{x} = \mathbf{b}$, then the complete solution set is:

$$\{\mathbf{x}_p + \mathbf{h} : \mathbf{h} \in \ker(A)\} = \mathbf{x}_p + \ker(A)$$
Proof: If $A\mathbf{x} = \mathbf{b}$, then $A(\mathbf{x} - \mathbf{x}_p) = \mathbf{b} - \mathbf{b} = \mathbf{0}$, so $\mathbf{x} - \mathbf{x}_p \in \ker(A)$. Conversely, if $\mathbf{h} \in \ker(A)$, then $A(\mathbf{x}_p + \mathbf{h}) = \mathbf{b}$. The solution set is therefore an affine subspace—a translated copy of the null space.
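The affine structure of the solution set is easy to observe numerically. In this sketch the underdetermined system, the particular solution, and the kernel vector are all illustrative choices:

```python
def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

A = [[1.0, 1.0, 0.0],
     [0.0, 1.0, 1.0]]
b = [3.0, 2.0]
x_p = [1.0, 2.0, 0.0]      # one particular solution: A x_p = b
h = [1.0, -1.0, 1.0]       # spans ker(A): A h = 0

assert matvec(A, x_p) == b
assert matvec(A, h) == [0.0, 0.0]

# every x_p + t*h is again a solution: the solution set is a line in R^3
for t in (-2.0, 0.5, 7.0):
    x = [xp + t * hi for xp, hi in zip(x_p, h)]
    assert matvec(A, x) == b
```

Here $\text{rank}(A) = 2 < 3 = n$, so the solution set is an affine subspace of dimension $3 - 2 = 1$, matching the Rouché–Capelli count.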
4.4 The Fredholm Alternative
The Fredholm alternative is a powerful dichotomy theorem that gives a clean criterion for solvability of $A\mathbf{x} = \mathbf{b}$ in terms of the adjoint operator.
Theorem: Fredholm Alternative
Exactly one of the following holds:
- (i) $A\mathbf{x} = \mathbf{b}$ has a solution, OR
- (ii) There exists $\mathbf{y}$ with $A^T\mathbf{y} = \mathbf{0}$ and $\mathbf{y}^T\mathbf{b} \neq 0$.
Equivalently: $A\mathbf{x} = \mathbf{b}$ is solvable if and only if $\mathbf{b} \perp \ker(A^T)$.
Derivation 4: Proof via Fundamental Theorem of Linear Algebra
The fundamental theorem of linear algebra states that for any $A \in \mathbb{R}^{m \times n}$:

$$\mathbb{R}^m = \text{Im}(A) \oplus \ker(A^T) \quad \text{(orthogonal direct sum)}$$

This is because $\text{Im}(A)$ is the column space of $A$, and $\ker(A^T)$ is the left null space, which is the orthogonal complement of the column space: $\ker(A^T) = \text{Im}(A)^\perp$.
The Fredholm alternative follows immediately: $A\mathbf{x} = \mathbf{b}$ is solvable $\iff \mathbf{b} \in \text{Im}(A) \iff \mathbf{b} \perp \ker(A^T)$.
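The dichotomy can be checked concretely on a small overdetermined system. The 3x2 matrix and the two right-hand sides below are illustrative choices; the vector $\mathbf{y}$ spans $\ker(A^T)$ and acts as a certificate:

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

A = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]
# ker(A^T): y with A^T y = 0, i.e. y1 + y3 = 0 and y2 + y3 = 0
y = [1.0, 1.0, -1.0]
assert dot([row[0] for row in A], y) == 0.0   # first row of A^T y
assert dot([row[1] for row in A], y) == 0.0   # second row of A^T y

b_good = [1.0, 2.0, 3.0]   # equals A @ (1, 2), so it lies in Im(A)
b_bad = [1.0, 2.0, 4.0]    # not in Im(A)

solvable_good = (dot(y, b_good) == 0.0)   # b_good is orthogonal to ker(A^T)
solvable_bad = (dot(y, b_bad) == 0.0)     # y certifies insolvability
```

Case (i) holds for `b_good` and case (ii) for `b_bad`: exactly one of the alternatives applies to each right-hand side.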
The Four Fundamental Subspaces
For $A \in \mathbb{R}^{m \times n}$ with rank $r$:
- Column space: $\text{Im}(A) \subseteq \mathbb{R}^m$, dimension $r$
- Left null space: $\ker(A^T) \subseteq \mathbb{R}^m$, dimension $m - r$
- Row space: $\text{Im}(A^T) \subseteq \mathbb{R}^n$, dimension $r$
- Null space: $\ker(A) \subseteq \mathbb{R}^n$, dimension $n - r$
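These dimension counts can be confirmed on a concrete matrix. The sketch below uses a 3x4 matrix of rank 2 (its third row is the sum of the first two, an illustrative choice), with rank computed by row reduction with a tolerance:

```python
def rank(M, tol=1e-10):
    """Numerical rank of M via row reduction with partial pivoting."""
    M = [row[:] for row in M]
    m, n = len(M), len(M[0])
    r = 0
    for col in range(n):
        piv = max(range(r, m), key=lambda i: abs(M[i][col]), default=None)
        if piv is None or abs(M[piv][col]) < tol:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(r + 1, m):
            c = M[i][col] / M[r][col]
            for j in range(col, n):
                M[i][j] -= c * M[r][j]
        r += 1
    return r

def transpose(M):
    return [list(col) for col in zip(*M)]

A = [[1.0, 0.0, 1.0, 0.0],
     [0.0, 1.0, 1.0, 0.0],
     [1.0, 1.0, 2.0, 0.0]]   # row 3 = row 1 + row 2, so rank is 2
m, n = len(A), len(A[0])
r = rank(A)

dim_col = r                      # Im(A) in R^m
dim_leftnull = m - r             # ker(A^T) in R^m
dim_row = rank(transpose(A))     # Im(A^T) in R^n, equals r
dim_null = n - r                 # ker(A) in R^n
```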
4.5 Pivoting and Numerical Stability
In practice, Gaussian elimination with partial pivoting (choosing the largest element in the current column as pivot) is essential for numerical stability. Without pivoting, small pivots can cause catastrophic growth of round-off errors.
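The danger of a tiny pivot shows up already in a 2x2 system. This sketch uses the classic small-pivot example with an illustrative $\varepsilon$: without pivoting the multiplier $1/\varepsilon$ swamps the arithmetic, while swapping the rows first recovers the answer:

```python
def solve2(A, b):
    """Eliminate then back-substitute on a 2x2 system, no pivoting."""
    (a11, a12), (a21, a22) = A
    m = a21 / a11                   # multiplier; huge when a11 is tiny
    a22p = a22 - m * a12
    b2p = b[1] - m * b[0]
    x2 = b2p / a22p
    x1 = (b[0] - a12 * x2) / a11
    return [x1, x2]

eps = 1e-20
A = [[eps, 1.0], [1.0, 1.0]]
b = [1.0, 2.0]                      # true solution is very close to (1, 1)

x_nopivot = solve2(A, b)                              # x1 comes out wildly wrong
x_pivot = solve2([A[1], A[0]], [b[1], b[0]])          # partial pivoting: swap rows
```

Without the swap, the intermediate quantity $1 - 1/\varepsilon$ rounds to $-1/\varepsilon$, and all information from the first equation is lost.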
Derivation 5: Growth Factor Analysis
The growth factor for Gaussian elimination is:

$$g = \frac{\max_{i,j,k} |a_{ij}^{(k)}|}{\max_{i,j} |a_{ij}|}$$

where $a_{ij}^{(k)}$ denotes entries at step $k$ of elimination. For partial pivoting, Wilkinson showed that $g \leq 2^{n-1}$, though in practice $g$ rarely exceeds $O(n)$. The backward error analysis gives:

$$(A + \delta A)\hat{\mathbf{x}} = \mathbf{b}, \qquad \|\delta A\| \leq c(n) \, g \, u \, \|A\|$$

where $u$ is the unit roundoff. This means the computed solution $\hat{\mathbf{x}}$ is the exact solution of a slightly perturbed system. The forward error satisfies:

$$\frac{\|\hat{\mathbf{x}} - \mathbf{x}\|}{\|\mathbf{x}\|} \lesssim \kappa(A) \, \frac{\|\delta A\|}{\|A\|}$$

where $\kappa(A) = \|A\| \|A^{-1}\|$ is the condition number.
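The amplifying role of $\kappa(A)$ can be seen on a nearly singular 2x2 matrix (an illustrative choice, solved exactly by Cramer's rule so that the effect is conditioning, not the algorithm):

```python
import math

def solve2x2(A, b):
    """Cramer's rule for a 2x2 system."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    x1 = (b[0] * A[1][1] - A[0][1] * b[1]) / det
    x2 = (A[0][0] * b[1] - b[0] * A[1][0]) / det
    return [x1, x2]

def norm(v):
    return math.sqrt(sum(x * x for x in v))

A = [[1.0, 1.0], [1.0, 1.0001]]     # nearly singular: kappa(A) is about 4e4
b = [2.0, 2.0001]                   # exact solution (1, 1)
db = [0.0, 1e-4]                    # tiny perturbation of b

x = solve2x2(A, b)
x_pert = solve2x2(A, [bi + d for bi, d in zip(b, db)])

rel_b = norm(db) / norm(b)          # relative change in the data
rel_x = norm([a - c for a, c in zip(x_pert, x)]) / norm(x)
amplification = rel_x / rel_b       # of the order of kappa(A)
```

A relative data change of a few parts in $10^5$ shifts the solution by order one: the amplification factor matches the condition number, not the algorithm's accuracy.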
4.6 Historical Development
Ancient China: The Nine Chapters (c. 200 BCE)
The Jiuzhang Suanshu (Nine Chapters on the Mathematical Art) contains the earliest known systematic method for solving systems of linear equations. Chapter 8 describes a procedure essentially identical to Gaussian elimination, applied to systems arising from agricultural taxation and trade. This predates Western discoveries by nearly two millennia.
Carl Friedrich Gauss (1809)
Gauss refined the elimination method while fitting orbits to astronomical observations, particularly in his work on the orbit of the asteroid Ceres. His systematic use of the method in least-squares problems led to it bearing his name, though the core algorithm was already ancient.
Alan Turing and James Wilkinson (1940s–1960s)
The numerical stability of Gaussian elimination became crucial with the advent of digital computers. Turing analyzed round-off error in 1948, and Wilkinson developed the definitive backward error analysis and showed that partial pivoting ensures stability. Wilkinson's work earned him the Turing Award in 1970.
Ivar Fredholm (1903)
Fredholm proved his alternative theorem in the context of integral equations, providing a deep connection between solvability conditions and the adjoint operator. His work bridged finite-dimensional linear algebra and infinite-dimensional functional analysis.
4.7 Applications
Circuit Analysis
Kirchhoff's laws produce systems of linear equations: voltage law (KVL) and current law (KCL) give one equation per loop and per node. For a circuit with $n$ nodes and $b$ branches, the resulting system has $n - 1 + b - n + 1 = b$ equations. LU decomposition is used in SPICE simulators that analyze circuits with millions of components.
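A tiny nodal-analysis example makes this concrete. The sketch below uses an illustrative two-node network: a 1 A current source drives node 1, $R_1 = 1\,\Omega$ links nodes 1 and 2, and $R_2 = 2\,\Omega$ links node 2 to ground; KCL at each node gives $G\mathbf{v} = \mathbf{i}$:

```python
def solve2x2(G, i):
    """Cramer's rule for the 2x2 nodal system G v = i."""
    det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
    return [(i[0] * G[1][1] - G[0][1] * i[1]) / det,
            (G[0][0] * i[1] - i[0] * G[1][0]) / det]

R1, R2 = 1.0, 2.0
G = [[1 / R1, -1 / R1],
     [-1 / R1, 1 / R1 + 1 / R2]]    # node conductance matrix
i = [1.0, 0.0]                      # 1 A injected at node 1

v1, v2 = solve2x2(G, i)             # node voltages
i_R2 = v2 / R2                      # sanity: all source current exits via R2
```

SPICE-scale simulators assemble the same kind of conductance matrix for millions of nodes and factor it with sparse LU.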
Finite Element Method
The FEM discretizes partial differential equations into systems $K\mathbf{u} = \mathbf{f}$ where $K$ is the stiffness matrix (often sparse and symmetric positive definite). For structural analysis, $\mathbf{u}$ contains nodal displacements and $\mathbf{f}$ contains applied forces. Modern FEM codes solve systems with billions of unknowns.
Computer Graphics: Ray Tracing
Determining where a ray intersects a surface requires solving a system of equations. For a triangle mesh, each ray-triangle intersection test involves solving a 3x3 system. A single rendered frame may require billions of such solves, making efficient linear system algorithms essential for real-time rendering.
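The 3x3 system can be written out explicitly: equating the hit point expressed as $\mathbf{o} + t\mathbf{d}$ with $\mathbf{v}_0 + u\mathbf{e}_1 + v\mathbf{e}_2$ gives $[-\mathbf{d} \mid \mathbf{e}_1 \mid \mathbf{e}_2](t, u, v)^T = \mathbf{o} - \mathbf{v}_0$. A sketch solving it by Cramer's rule, with illustrative geometry (a ray fired straight down at the unit right triangle in the $z = 0$ plane):

```python
def det3(c0, c1, c2):
    """Determinant of the 3x3 matrix with columns c0, c1, c2."""
    return (c0[0] * (c1[1] * c2[2] - c1[2] * c2[1])
            - c1[0] * (c0[1] * c2[2] - c0[2] * c2[1])
            + c2[0] * (c0[1] * c1[2] - c0[2] * c1[1]))

def intersect(o, d, v0, v1, v2):
    """Solve [-d | e1 | e2] (t, u, v)^T = o - v0 by Cramer's rule."""
    e1 = [v1[k] - v0[k] for k in range(3)]
    e2 = [v2[k] - v0[k] for k in range(3)]
    nd = [-d[k] for k in range(3)]
    rhs = [o[k] - v0[k] for k in range(3)]
    D = det3(nd, e1, e2)
    t = det3(rhs, e1, e2) / D
    u = det3(nd, rhs, e2) / D
    v = det3(nd, e1, rhs) / D
    hit = t > 0 and u >= 0 and v >= 0 and u + v <= 1
    return t, u, v, hit

t, u, v, hit = intersect([0.2, 0.2, 1.0], [0.0, 0.0, -1.0],
                         [0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
```

The barycentric coordinates $(u, v)$ decide whether the intersection lies inside the triangle; production ray tracers use an equivalent but heavily optimized form of this solve.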
Economics: Input-Output Models
Leontief's input-output model represents an economy as $(I - A)\mathbf{x} = \mathbf{d}$ where $A$ is the technology matrix (inter-industry dependencies) and $\mathbf{d}$ is final demand. Solving this system determines the total output $\mathbf{x}$ each sector must produce. Leontief won the Nobel Prize in Economics (1973) for this work.
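A two-sector sketch with illustrative numbers makes the model concrete; the small system $(I - A)\mathbf{x} = \mathbf{d}$ is solved here by Cramer's rule:

```python
A = [[0.2, 0.3],
     [0.4, 0.1]]          # A[i][j]: units of good i consumed per unit of good j
d = [100.0, 50.0]         # final demand for each sector's output

M = [[1 - A[0][0], -A[0][1]],
     [-A[1][0], 1 - A[1][1]]]               # I - A
det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
x1 = (d[0] * M[1][1] - M[0][1] * d[1]) / det
x2 = (M[0][0] * d[1] - d[0] * M[1][0]) / det
# x1, x2: total outputs covering both inter-industry use and final demand
```

Each sector must produce more than its final demand (here 175 and about 133.3 units) because part of its output is consumed as input by the other sector.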
4.8 Computational Exploration
This simulation implements Gaussian elimination with partial pivoting and LU decomposition, demonstrates the three existence/uniqueness cases, verifies the Fredholm alternative, and visualizes computational cost and conditioning effects.
Systems of Equations: Gaussian Elimination, LU, and Fredholm