Systems of Linear Equations
Gaussian elimination, LU decomposition, existence/uniqueness, and the Fredholm alternative
4.1 Gaussian Elimination
Gaussian elimination is the workhorse algorithm for solving systems of linear equations. It transforms the augmented matrix $[A \mid \mathbf{b}]$ into row echelon form (REF) using three elementary row operations, then extracts the solution by back substitution.
Elementary Row Operations
- Type I: Swap two rows: $R_i \leftrightarrow R_j$
- Type II: Scale a row: $R_i \to c R_i$ ($c \neq 0$)
- Type III: Add a multiple of one row to another: $R_i \to R_i + c R_j$
Each elementary row operation corresponds to left-multiplication by an elementary matrix. If $E_1, E_2, \ldots, E_k$ are the elementary matrices used, then:

$$E_k \cdots E_2 E_1 A = U$$

where $U$ is upper triangular. This is the essence of the LU decomposition: $A = (E_k \cdots E_1)^{-1} U = LU$.
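The reduction to upper triangular form can be sketched in a few lines of Python. This is a minimal illustration using only Type III operations on an example matrix chosen here (no pivoting, so it assumes nonzero pivots):

```python
def forward_eliminate(A):
    """Reduce A to upper triangular form U using Type III row operations.

    Returns U and the list of operations (i, j, c) meaning R_i -> R_i + c*R_j.
    """
    U = [row[:] for row in A]          # work on a copy
    n = len(U)
    ops = []
    for k in range(n - 1):             # pivot column k
        for i in range(k + 1, n):      # rows below the pivot
            c = -U[i][k] / U[k][k]     # multiplier for R_i -> R_i + c*R_k
            ops.append((i, k, c))
            for j in range(k, n):
                U[i][j] += c * U[k][j]
    return U, ops

A = [[2.0, 1.0, 1.0],
     [4.0, 3.0, 3.0],
     [8.0, 7.0, 9.0]]
U, ops = forward_eliminate(A)
# U is now upper triangular: every entry below the diagonal is zero
```

Each recorded operation corresponds to one elementary matrix $E_i$; composing them in order reproduces the factorization above.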
Derivation 1: Operation Count for Gaussian Elimination
We derive the computational cost of Gaussian elimination on an $n \times n$ system. At step $k$, we eliminate $n - k$ entries below the pivot in column $k$. Each elimination requires $n - k$ multiplications and additions across the remaining $n - k$ columns:

$$\sum_{k=1}^{n-1} 2(n-k)^2 = 2 \sum_{m=1}^{n-1} m^2 = \frac{2}{3}n^3 + O(n^2)$$

Back substitution costs an additional $O(n^2)$ operations. Thus, solving $A\mathbf{x} = \mathbf{b}$ requires $\frac{2}{3}n^3 + O(n^2)$ floating-point operations. This cubic scaling makes dense Gaussian elimination practical for systems with up to tens of thousands of unknowns; larger systems are typically sparse, and practical solvers exploit that sparsity.
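The operation count above can be checked directly. The sketch below tallies the exact number of multiplications and additions in the elimination loop and compares it with the $\frac{2}{3}n^3$ leading term:

```python
def elimination_flops(n):
    """Exact multiply/add count for forward elimination on an n x n matrix."""
    flops = 0
    for k in range(1, n):      # step k works on an (n-k) x (n-k) trailing block
        m = n - k
        flops += 2 * m * m     # m rows, each with m multiplies and m adds
    return flops

n = 200
exact = elimination_flops(n)
leading = 2 * n**3 / 3
ratio = exact / leading        # approaches 1 as n grows
```

For $n = 200$ the exact count already agrees with the leading term to within about one percent.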
4.2 LU Decomposition
Theorem: LU Decomposition
If $A$ can be reduced to upper triangular form without row swaps, then $A = LU$ where:
- $L$ is lower triangular with ones on the diagonal (unit lower triangular)
- $U$ is upper triangular
In general, with row pivoting: $PA = LU$ where $P$ is a permutation matrix.
Derivation 2: LU from Gaussian Elimination
The multipliers $\ell_{ij} = a_{ij}^{(j)} / a_{jj}^{(j)}$ used during elimination become the entries of $L$. Specifically:

$$L = \begin{bmatrix} 1 & & & \\ \ell_{21} & 1 & & \\ \vdots & \ddots & \ddots & \\ \ell_{n1} & \cdots & \ell_{n,n-1} & 1 \end{bmatrix}$$

This elegant result holds because the elementary elimination matrices $E_k$ are lower triangular, their inverses are obtained by negating the subdiagonal entries, and the product $E_1^{-1} E_2^{-1} \cdots E_{n-1}^{-1} = L$ simply assembles all multipliers into a single lower triangular matrix.
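The claim that the multipliers assemble into $L$ can be verified numerically. A minimal Doolittle-style sketch without pivoting, on an illustrative 3x3 matrix that needs no row swaps:

```python
def lu_nopivot(A):
    """LU factorization without pivoting; multipliers are stored in L."""
    n = len(A)
    U = [row[:] for row in A]
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for k in range(n - 1):
        for i in range(k + 1, n):
            L[i][k] = U[i][k] / U[k][k]      # multiplier becomes an entry of L
            for j in range(k, n):
                U[i][j] -= L[i][k] * U[k][j]
    return L, U

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[2.0, 1.0, 1.0],
     [4.0, 3.0, 3.0],
     [8.0, 7.0, 9.0]]
L, U = lu_nopivot(A)
LU = matmul(L, U)    # reproduces A entrywise
```

No extra work is done to build $L$: the multipliers computed during elimination are simply written into the subdiagonal slots.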
Advantage of LU
Once $A = LU$ is computed ($O(n^3)$), solving $A\mathbf{x} = \mathbf{b}$ for any new right-hand side $\mathbf{b}$ requires only two triangular solves ($O(n^2)$). This is crucial when solving many systems with the same coefficient matrix, as in time-stepping PDE solvers.
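The two triangular solves can be sketched directly. Here $L$ and $U$ are entered as literals (they factor the matrix $A = [[2,1,1],[4,3,3],[8,7,9]]$ used as a running example in this sketch), and the same factors are reused for two different right-hand sides:

```python
def forward_sub(L, b):
    """Solve L y = b with L unit lower triangular."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = b[i] - sum(L[i][j] * y[j] for j in range(i))
    return y

def back_sub(U, y):
    """Solve U x = y with U upper triangular."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(U[i][j] * x[j] for j in range(i + 1, n))) / U[i][i]
    return x

L = [[1.0, 0.0, 0.0], [2.0, 1.0, 0.0], [4.0, 3.0, 1.0]]
U = [[2.0, 1.0, 1.0], [0.0, 1.0, 1.0], [0.0, 0.0, 2.0]]
# Reuse the same factors for two right-hand sides:
x1 = back_sub(U, forward_sub(L, [4.0, 10.0, 24.0]))
x2 = back_sub(U, forward_sub(L, [1.0, 3.0, 7.0]))
```

Each extra right-hand side costs only the two $O(n^2)$ loops above; the $O(n^3)$ factorization is never repeated.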
4.3 Existence and Uniqueness
For the system $A\mathbf{x} = \mathbf{b}$ with $A \in \mathbb{F}^{m \times n}$, three cases arise:
Rouché–Capelli Theorem
- No solution: $\text{rank}(A) < \text{rank}([A \mid \mathbf{b}])$ — the system is inconsistent
- Unique solution: $\text{rank}(A) = \text{rank}([A \mid \mathbf{b}]) = n$ (number of unknowns)
- Infinite solutions: $\text{rank}(A) = \text{rank}([A \mid \mathbf{b}]) < n$ — the solution set is an affine subspace of dimension $n - \text{rank}(A)$
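The three cases can be distinguished mechanically by comparing ranks. In this sketch, rank is computed by row reduction with a small tolerance (an assumption suitable only for small dense examples), and the three systems tested are illustrative choices:

```python
def rank(M, tol=1e-10):
    """Numerical rank of M via row reduction with partial pivoting."""
    M = [row[:] for row in M]
    m, n = len(M), len(M[0])
    r = 0
    for col in range(n):
        piv = max(range(r, m), key=lambda i: abs(M[i][col]), default=None)
        if piv is None or abs(M[piv][col]) < tol:
            continue                      # no usable pivot in this column
        M[r], M[piv] = M[piv], M[r]
        for i in range(r + 1, m):
            c = M[i][col] / M[r][col]
            for j in range(col, n):
                M[i][j] -= c * M[r][j]
        r += 1
    return r

def classify(A, b):
    """Rouché–Capelli classification of the system A x = b."""
    n = len(A[0])
    aug = [row + [bi] for row, bi in zip(A, b)]
    rA, rAb = rank(A), rank(aug)
    if rA < rAb:
        return "no solution"
    return "unique" if rA == n else "infinite"

c1 = classify([[1.0, 0.0], [0.0, 1.0]], [1.0, 2.0])   # consistent, full rank
c2 = classify([[1.0, 1.0], [2.0, 2.0]], [1.0, 3.0])   # inconsistent
c3 = classify([[1.0, 1.0], [2.0, 2.0]], [1.0, 2.0])   # rank-deficient, consistent
```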
Derivation 3: Structure of the Solution Set
If $\mathbf{x}_p$ is any particular solution to $A\mathbf{x} = \mathbf{b}$, then the complete solution set is:

$$\{\mathbf{x}_p + \mathbf{h} : \mathbf{h} \in \ker(A)\} = \mathbf{x}_p + \ker(A)$$
Proof: If $A\mathbf{x} = \mathbf{b}$, then $A(\mathbf{x} - \mathbf{x}_p) = \mathbf{b} - \mathbf{b} = \mathbf{0}$, so $\mathbf{x} - \mathbf{x}_p \in \ker(A)$. Conversely, if $\mathbf{h} \in \ker(A)$, then $A(\mathbf{x}_p + \mathbf{h}) = \mathbf{b}$. The solution set is therefore an affine subspace—a translated copy of the null space.
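The affine structure of the solution set is easy to observe numerically. In this sketch the underdetermined system, the particular solution, and the kernel vector are all illustrative choices:

```python
def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

A = [[1.0, 1.0, 0.0],
     [0.0, 1.0, 1.0]]
b = [3.0, 2.0]
x_p = [1.0, 2.0, 0.0]      # one particular solution: A x_p = b
h = [1.0, -1.0, 1.0]       # spans ker(A): A h = 0

assert matvec(A, x_p) == b
assert matvec(A, h) == [0.0, 0.0]

# every x_p + t*h is again a solution: the solution set is a line in R^3
for t in (-2.0, 0.5, 7.0):
    x = [xp + t * hi for xp, hi in zip(x_p, h)]
    assert matvec(A, x) == b
```

Here $\text{rank}(A) = 2 < 3 = n$, so the solution set is an affine subspace of dimension $3 - 2 = 1$, matching the Rouché–Capelli count.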
4.4 The Fredholm Alternative
The Fredholm alternative is a powerful dichotomy theorem that gives a clean criterion for solvability of $A\mathbf{x} = \mathbf{b}$ in terms of the adjoint operator.
Theorem: Fredholm Alternative
Exactly one of the following holds:
- (i) $A\mathbf{x} = \mathbf{b}$ has a solution, OR
- (ii) There exists $\mathbf{y}$ with $A^T\mathbf{y} = \mathbf{0}$ and $\mathbf{y}^T\mathbf{b} \neq 0$.
Equivalently: $A\mathbf{x} = \mathbf{b}$ is solvable if and only if $\mathbf{b} \perp \ker(A^T)$.
Derivation 4: Proof via Fundamental Theorem of Linear Algebra
The fundamental theorem of linear algebra states that for any $A \in \mathbb{R}^{m \times n}$:

$$\mathbb{R}^m = \text{Im}(A) \oplus \ker(A^T) \quad \text{(orthogonal direct sum)}$$

This is because $\text{Im}(A)$ is the column space of $A$, and $\ker(A^T)$ is the left null space, which is the orthogonal complement of the column space: $\ker(A^T) = \text{Im}(A)^\perp$.
The Fredholm alternative follows immediately: $A\mathbf{x} = \mathbf{b}$ is solvable $\iff \mathbf{b} \in \text{Im}(A) \iff \mathbf{b} \perp \ker(A^T)$.
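The dichotomy can be checked concretely on a small overdetermined system. The 3x2 matrix and the two right-hand sides below are illustrative choices; the vector $\mathbf{y}$ spans $\ker(A^T)$ and acts as a certificate:

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

A = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]
# ker(A^T): y with A^T y = 0, i.e. y1 + y3 = 0 and y2 + y3 = 0
y = [1.0, 1.0, -1.0]
assert dot([row[0] for row in A], y) == 0.0   # first row of A^T y
assert dot([row[1] for row in A], y) == 0.0   # second row of A^T y

b_good = [1.0, 2.0, 3.0]   # equals A @ (1, 2), so it lies in Im(A)
b_bad = [1.0, 2.0, 4.0]    # not in Im(A)

solvable_good = (dot(y, b_good) == 0.0)   # b_good is orthogonal to ker(A^T)
solvable_bad = (dot(y, b_bad) == 0.0)     # y certifies insolvability
```

Case (i) holds for `b_good` and case (ii) for `b_bad`: exactly one of the alternatives applies to each right-hand side.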
The Four Fundamental Subspaces
For $A \in \mathbb{R}^{m \times n}$ with rank $r$:
- Column space: $\text{Im}(A) \subseteq \mathbb{R}^m$, dimension $r$
- Left null space: $\ker(A^T) \subseteq \mathbb{R}^m$, dimension $m - r$
- Row space: $\text{Im}(A^T) \subseteq \mathbb{R}^n$, dimension $r$
- Null space: $\ker(A) \subseteq \mathbb{R}^n$, dimension $n - r$
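These dimension counts can be confirmed on a concrete matrix. The sketch below uses a 3x4 matrix of rank 2 (its third row is the sum of the first two, an illustrative choice), with rank computed by row reduction with a tolerance:

```python
def rank(M, tol=1e-10):
    """Numerical rank of M via row reduction with partial pivoting."""
    M = [row[:] for row in M]
    m, n = len(M), len(M[0])
    r = 0
    for col in range(n):
        piv = max(range(r, m), key=lambda i: abs(M[i][col]), default=None)
        if piv is None or abs(M[piv][col]) < tol:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(r + 1, m):
            c = M[i][col] / M[r][col]
            for j in range(col, n):
                M[i][j] -= c * M[r][j]
        r += 1
    return r

def transpose(M):
    return [list(col) for col in zip(*M)]

A = [[1.0, 0.0, 1.0, 0.0],
     [0.0, 1.0, 1.0, 0.0],
     [1.0, 1.0, 2.0, 0.0]]   # row 3 = row 1 + row 2, so rank is 2
m, n = len(A), len(A[0])
r = rank(A)

dim_col = r                      # Im(A) in R^m
dim_leftnull = m - r             # ker(A^T) in R^m
dim_row = rank(transpose(A))     # Im(A^T) in R^n, equals r
dim_null = n - r                 # ker(A) in R^n
```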
4.5 Pivoting and Numerical Stability
In practice, Gaussian elimination with partial pivoting (choosing the largest element in the current column as pivot) is essential for numerical stability. Without pivoting, small pivots can cause catastrophic growth of round-off errors.
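The danger of a tiny pivot shows up already in a 2x2 system. This sketch uses the classic small-pivot example with an illustrative $\varepsilon$: without pivoting the multiplier $1/\varepsilon$ swamps the arithmetic, while swapping the rows first recovers the answer:

```python
def solve2(A, b):
    """Eliminate then back-substitute on a 2x2 system, no pivoting."""
    (a11, a12), (a21, a22) = A
    m = a21 / a11                   # multiplier; huge when a11 is tiny
    a22p = a22 - m * a12
    b2p = b[1] - m * b[0]
    x2 = b2p / a22p
    x1 = (b[0] - a12 * x2) / a11
    return [x1, x2]

eps = 1e-20
A = [[eps, 1.0], [1.0, 1.0]]
b = [1.0, 2.0]                      # true solution is very close to (1, 1)

x_nopivot = solve2(A, b)                              # x1 comes out wildly wrong
x_pivot = solve2([A[1], A[0]], [b[1], b[0]])          # partial pivoting: swap rows
```

Without the swap, the intermediate quantity $1 - 1/\varepsilon$ rounds to $-1/\varepsilon$, and all information from the first equation is lost.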
Derivation 5: Growth Factor Analysis
The growth factor for Gaussian elimination is:

$$g = \frac{\max_{i,j,k} |a_{ij}^{(k)}|}{\max_{i,j} |a_{ij}|}$$

where $a_{ij}^{(k)}$ denotes entries at step $k$ of elimination. For partial pivoting, Wilkinson showed that $g \leq 2^{n-1}$, though in practice $g$ rarely exceeds $O(n)$. The backward error analysis gives:

$$(A + \delta A)\hat{\mathbf{x}} = \mathbf{b}, \qquad \|\delta A\| \leq c(n) \, g \, u \, \|A\|$$

where $u$ is the unit roundoff. This means the computed solution $\hat{\mathbf{x}}$ is the exact solution of a slightly perturbed system. The forward error satisfies:

$$\frac{\|\hat{\mathbf{x}} - \mathbf{x}\|}{\|\mathbf{x}\|} \lesssim \kappa(A) \, \frac{\|\delta A\|}{\|A\|}$$

where $\kappa(A) = \|A\| \|A^{-1}\|$ is the condition number.
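The amplifying role of $\kappa(A)$ can be seen on a nearly singular 2x2 matrix (an illustrative choice, solved exactly by Cramer's rule so that the effect is conditioning, not the algorithm):

```python
import math

def solve2x2(A, b):
    """Cramer's rule for a 2x2 system."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    x1 = (b[0] * A[1][1] - A[0][1] * b[1]) / det
    x2 = (A[0][0] * b[1] - b[0] * A[1][0]) / det
    return [x1, x2]

def norm(v):
    return math.sqrt(sum(x * x for x in v))

A = [[1.0, 1.0], [1.0, 1.0001]]     # nearly singular: kappa(A) is about 4e4
b = [2.0, 2.0001]                   # exact solution (1, 1)
db = [0.0, 1e-4]                    # tiny perturbation of b

x = solve2x2(A, b)
x_pert = solve2x2(A, [bi + d for bi, d in zip(b, db)])

rel_b = norm(db) / norm(b)          # relative change in the data
rel_x = norm([a - c for a, c in zip(x_pert, x)]) / norm(x)
amplification = rel_x / rel_b       # of the order of kappa(A)
```

A relative data change of a few parts in $10^5$ shifts the solution by order one: the amplification factor matches the condition number, not the algorithm's accuracy.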
4.6 Historical Development
Ancient China: The Nine Chapters (c. 200 BCE)
The Jiuzhang Suanshu (Nine Chapters on the Mathematical Art) contains the earliest known systematic method for solving systems of linear equations. Chapter 8 describes a procedure essentially identical to Gaussian elimination, applied to systems arising from agricultural taxation and trade. This predates Western discoveries by nearly two millennia.
Carl Friedrich Gauss (1809)
Gauss refined the elimination method while fitting orbits to astronomical observations, particularly in his work on the orbit of the asteroid Ceres. His systematic use of the method in least-squares problems led to it bearing his name, though the core algorithm was already ancient.
Alan Turing and James Wilkinson (1940s–1960s)
The numerical stability of Gaussian elimination became crucial with the advent of digital computers. Turing analyzed round-off error in 1948, and Wilkinson developed the definitive backward error analysis and showed that partial pivoting ensures stability. Wilkinson's work earned him the Turing Award in 1970.
Ivar Fredholm (1903)
Fredholm proved his alternative theorem in the context of integral equations, providing a deep connection between solvability conditions and the adjoint operator. His work bridged finite-dimensional linear algebra and infinite-dimensional functional analysis.
4.7 Applications
Circuit Analysis
Kirchhoff's laws produce systems of linear equations: voltage law (KVL) and current law (KCL) give one equation per loop and per node. For a circuit with $n$ nodes and $b$ branches, the resulting system has $n - 1 + b - n + 1 = b$ equations. LU decomposition is used in SPICE simulators that analyze circuits with millions of components.
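A tiny nodal-analysis example makes this concrete. The sketch below uses an illustrative two-node network: a 1 A current source drives node 1, $R_1 = 1\,\Omega$ links nodes 1 and 2, and $R_2 = 2\,\Omega$ links node 2 to ground; KCL at each node gives $G\mathbf{v} = \mathbf{i}$:

```python
def solve2x2(G, i):
    """Cramer's rule for the 2x2 nodal system G v = i."""
    det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
    return [(i[0] * G[1][1] - G[0][1] * i[1]) / det,
            (G[0][0] * i[1] - i[0] * G[1][0]) / det]

R1, R2 = 1.0, 2.0
G = [[1 / R1, -1 / R1],
     [-1 / R1, 1 / R1 + 1 / R2]]    # node conductance matrix
i = [1.0, 0.0]                      # 1 A injected at node 1

v1, v2 = solve2x2(G, i)             # node voltages
i_R2 = v2 / R2                      # sanity: all source current exits via R2
```

SPICE-scale simulators assemble the same kind of conductance matrix for millions of nodes and factor it with sparse LU.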
Finite Element Method
The FEM discretizes partial differential equations into systems $K\mathbf{u} = \mathbf{f}$ where $K$ is the stiffness matrix (often sparse and symmetric positive definite). For structural analysis, $\mathbf{u}$ contains nodal displacements and $\mathbf{f}$ contains applied forces. Modern FEM codes solve systems with billions of unknowns.
Computer Graphics: Ray Tracing
Determining where a ray intersects a surface requires solving a system of equations. For a triangle mesh, each ray-triangle intersection test involves solving a 3x3 system. A single rendered frame may require billions of such solves, making efficient linear system algorithms essential for real-time rendering.
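The 3x3 system can be written out explicitly: equating the hit point expressed as $\mathbf{o} + t\mathbf{d}$ with $\mathbf{v}_0 + u\mathbf{e}_1 + v\mathbf{e}_2$ gives $[-\mathbf{d} \mid \mathbf{e}_1 \mid \mathbf{e}_2](t, u, v)^T = \mathbf{o} - \mathbf{v}_0$. A sketch solving it by Cramer's rule, with illustrative geometry (a ray fired straight down at the unit right triangle in the $z = 0$ plane):

```python
def det3(c0, c1, c2):
    """Determinant of the 3x3 matrix with columns c0, c1, c2."""
    return (c0[0] * (c1[1] * c2[2] - c1[2] * c2[1])
            - c1[0] * (c0[1] * c2[2] - c0[2] * c2[1])
            + c2[0] * (c0[1] * c1[2] - c0[2] * c1[1]))

def intersect(o, d, v0, v1, v2):
    """Solve [-d | e1 | e2] (t, u, v)^T = o - v0 by Cramer's rule."""
    e1 = [v1[k] - v0[k] for k in range(3)]
    e2 = [v2[k] - v0[k] for k in range(3)]
    nd = [-d[k] for k in range(3)]
    rhs = [o[k] - v0[k] for k in range(3)]
    D = det3(nd, e1, e2)
    t = det3(rhs, e1, e2) / D
    u = det3(nd, rhs, e2) / D
    v = det3(nd, e1, rhs) / D
    hit = t > 0 and u >= 0 and v >= 0 and u + v <= 1
    return t, u, v, hit

t, u, v, hit = intersect([0.2, 0.2, 1.0], [0.0, 0.0, -1.0],
                         [0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
```

The barycentric coordinates $(u, v)$ decide whether the intersection lies inside the triangle; production ray tracers use an equivalent but heavily optimized form of this solve.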
Economics: Input-Output Models
Leontief's input-output model represents an economy as $(I - A)\mathbf{x} = \mathbf{d}$ where $A$ is the technology matrix (inter-industry dependencies) and $\mathbf{d}$ is final demand. Solving this system determines the total output $\mathbf{x}$ each sector must produce. Leontief won the Nobel Prize in Economics (1973) for this work.
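A two-sector sketch with illustrative numbers makes the model concrete; the small system $(I - A)\mathbf{x} = \mathbf{d}$ is solved here by Cramer's rule:

```python
A = [[0.2, 0.3],
     [0.4, 0.1]]          # A[i][j]: units of good i consumed per unit of good j
d = [100.0, 50.0]         # final demand for each sector's output

M = [[1 - A[0][0], -A[0][1]],
     [-A[1][0], 1 - A[1][1]]]               # I - A
det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
x1 = (d[0] * M[1][1] - M[0][1] * d[1]) / det
x2 = (M[0][0] * d[1] - d[0] * M[1][0]) / det
# x1, x2: total outputs covering both inter-industry use and final demand
```

Each sector must produce more than its final demand (here 175 and about 133.3 units) because part of its output is consumed as input by the other sector.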
4.8 Computational Exploration
This simulation implements Gaussian elimination with partial pivoting and LU decomposition, demonstrates the three existence/uniqueness cases, verifies the Fredholm alternative, and visualizes computational cost and conditioning effects.
Systems of Equations: Gaussian Elimination, LU, and Fredholm