Solving Linear Systems

The backbone of scientific computing: solving \(Ax = b\) efficiently, stably, and at scale. Direct methods for dense systems, iterative methods for sparse ones.

Why Linear Systems Are Everywhere

Almost every numerical method ultimately reduces to solving a linear system. Finite element methods, implicit ODE solvers, optimization algorithms, least-squares fitting -- all require solving \(Ax = b\) repeatedly. The cost of solving this system often dominates the total computation time.

Dense direct methods, \(O(N^3)\): Gaussian elimination, LU, Cholesky.

Sparse iterative methods, \(O(N \cdot k)\) for \(k\) iterations: Jacobi, Gauss-Seidel, conjugate gradient (CG).

Condition number, \(\kappa(A) = \|A\| \cdot \|A^{-1}\|\): sensitivity of the solution to perturbations in the data.

1. Gaussian Elimination

The fundamental direct method. Transform \(Ax = b\) to upper-triangular form \(Ux = c\) using row operations, then back-substitute. Cost: \(\frac{2}{3}N^3\) flops.

Partial Pivoting

Never divide by small numbers. At each step, swap the current row with the row having the largest absolute value in the pivot column. This prevents catastrophic growth of round-off errors and is essential for numerical stability. Without pivoting, even well-conditioned systems can produce garbage results.

Gaussian Elimination with Partial Pivoting

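The interactive code isn't included in this copy; below is a minimal NumPy sketch of the algorithm as described above. The function name `gauss_solve` and the 3×3 test system are illustrative, not from the original.

```python
import numpy as np

def gauss_solve(A, b):
    """Solve Ax = b by Gaussian elimination with partial pivoting."""
    A = A.astype(float).copy()
    b = b.astype(float).copy()
    n = len(b)
    # Forward elimination
    for k in range(n - 1):
        # Partial pivoting: bring the largest-magnitude entry
        # in column k (rows k..n-1) onto the diagonal
        p = k + np.argmax(np.abs(A[k:, k]))
        A[[k, p]] = A[[p, k]]
        b[[k, p]] = b[[p, k]]
        for i in range(k + 1, n):
            m = A[i, k] / A[k, k]
            A[i, k:] -= m * A[k, k:]
            b[i] -= m * b[k]
    # Back substitution on the upper-triangular system
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (b[i] - A[i, i + 1:] @ x[i + 1:]) / A[i, i]
    return x

A = np.array([[2.0, 1.0, 1.0],
              [4.0, -6.0, 0.0],
              [-2.0, 7.0, 2.0]])
b = np.array([5.0, -2.0, 9.0])
x = gauss_solve(A, b)
print(x)                          # solution of Ax = b
print(np.allclose(A @ x, b))      # residual check
```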

2. LU Decomposition

Factor \(A = LU\) where \(L\) is lower-triangular and \(U\) is upper-triangular. Then solve \(Ax = b\) in two steps: \(Ly = b\) (forward substitution) and \(Ux = y\) (back substitution). The key advantage: factorize once (\(O(N^3)\)), then solve for multiple right-hand sides cheaply (\(O(N^2)\) each).

\(PA = LU\)

\(P\) = permutation matrix (from pivoting), \(L\) = lower triangular with 1s on diagonal, \(U\) = upper triangular

LU Decomposition with SciPy

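The original example isn't reproduced here; a minimal sketch with SciPy's LU routines, using an illustrative 3×3 matrix. `lu` returns the explicit \(P, L, U\) factors, while `lu_factor`/`lu_solve` implement the factor-once, solve-many pattern from the text.

```python
import numpy as np
from scipy.linalg import lu, lu_factor, lu_solve

A = np.array([[4.0, 3.0, 0.0],
              [3.0, 4.0, -1.0],
              [0.0, -1.0, 4.0]])

# Explicit factors: A = P @ L @ U
P, L, U = lu(A)
print(np.allclose(A, P @ L @ U))

# Factor once (O(N^3)), then reuse for many right-hand sides (O(N^2) each)
lu_piv = lu_factor(A)
for b in (np.array([1.0, 0.0, 0.0]),
          np.array([0.0, 1.0, 2.0])):
    x = lu_solve(lu_piv, b)
    print(x, np.allclose(A @ x, b))
```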

3. Cholesky Decomposition

If \(A\) is symmetric positive definite (SPD), we can factor \(A = LL^T\) where \(L\) is lower-triangular with positive diagonal. This is twice as fast as LU and more numerically stable. SPD matrices arise naturally in least-squares, covariance matrices, and FEM stiffness matrices.

Cholesky Decomposition

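The interactive code isn't included here; a minimal sketch of the idea, on a randomly generated SPD matrix (the construction \(MM^T + 4I\) is illustrative). After factoring \(A = LL^T\), the solve reduces to two triangular solves.

```python
import numpy as np
from scipy.linalg import solve_triangular

rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))
A = M @ M.T + 4 * np.eye(4)      # SPD by construction

L = np.linalg.cholesky(A)        # A = L @ L.T, L lower-triangular
print(np.allclose(A, L @ L.T))

# Solve Ax = b: forward solve L y = b, then back solve L^T x = y
b = rng.standard_normal(4)
y = solve_triangular(L, b, lower=True)
x = solve_triangular(L.T, y, lower=False)
print(np.allclose(A @ x, b))
```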

4. Condition Number

How much can the solution change when the input data is slightly perturbed? The condition number quantifies this sensitivity:

\(\kappa(A) = \|A\| \cdot \|A^{-1}\| = \frac{\sigma_{\max}}{\sigma_{\min}}\)

Ratio of largest to smallest singular value. \(\kappa(A) \geq 1\) always.

Rule of thumb: You lose about \(\log_{10} \kappa(A)\) digits of accuracy when solving \(Ax = b\). With double precision (16 digits) and \(\kappa(A) = 10^{10}\), you get only ~6 reliable digits. If \(\kappa(A) \gtrsim 10^{16}\), the matrix is effectively singular in double precision.

Condition Number and Its Effects

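The original demo isn't reproduced here; a sketch of the rule of thumb above using Hilbert matrices, a standard family of ill-conditioned test matrices (the choice of sizes is illustrative). As \(\kappa(A)\) grows, the computed solution of a system with known exact answer loses digits.

```python
import numpy as np
from scipy.linalg import hilbert

for n in (4, 8, 12):
    A = hilbert(n)                    # notoriously ill-conditioned
    kappa = np.linalg.cond(A)         # sigma_max / sigma_min
    x_true = np.ones(n)
    b = A @ x_true                    # right-hand side with known solution
    x = np.linalg.solve(A, b)
    err = np.linalg.norm(x - x_true) / np.linalg.norm(x_true)
    print(f"n={n:2d}  cond={kappa:.1e}  rel. error={err:.1e}  "
          f"digits lost ~ {np.log10(kappa):.0f}")
```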

5. Iterative Methods

For large sparse systems (\(N > 10^4\)), direct methods are too expensive. Iterative methods start from an initial guess and improve it, requiring only matrix-vector products (which are cheap for sparse matrices).

Jacobi Method

Split \(A = D + (L + U)\), where \(D\) is the diagonal part of \(A\) and \(L\), \(U\) are its strict lower and upper triangular parts (not the LU factors above). Iterate:

\(x^{(k+1)} = D^{-1}(b - (L+U)x^{(k)})\)

Converges if \(A\) is strictly diagonally dominant. Embarrassingly parallel.

Gauss-Seidel

Like Jacobi but uses updated values immediately:

\(x_i^{(k+1)} = \frac{1}{a_{ii}}\left(b_i - \sum_{j<i} a_{ij} x_j^{(k+1)} - \sum_{j>i} a_{ij} x_j^{(k)}\right)\)

Typically converges about twice as fast as Jacobi, but the updates are inherently sequential.

Jacobi and Gauss-Seidel Iteration

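The interactive example isn't included in this copy; a minimal sketch of both iterations on a small strictly diagonally dominant system (the test matrix is illustrative). Note the Jacobi update uses only the previous iterate, while Gauss-Seidel overwrites entries in place.

```python
import numpy as np

def jacobi(A, b, iters=100):
    D = np.diag(A)
    R = A - np.diag(D)                # off-diagonal part, L + U
    x = np.zeros_like(b)
    for _ in range(iters):
        x = (b - R @ x) / D           # all components from the old iterate
    return x

def gauss_seidel(A, b, iters=100):
    n = len(b)
    x = np.zeros_like(b)
    for _ in range(iters):
        for i in range(n):
            # Uses already-updated x[:i] and old x[i+1:]
            s = A[i, :i] @ x[:i] + A[i, i + 1:] @ x[i + 1:]
            x[i] = (b[i] - s) / A[i, i]
    return x

# Strictly diagonally dominant, so both methods converge
A = np.array([[10.0, -1.0, 2.0],
              [-1.0, 11.0, -1.0],
              [2.0, -1.0, 10.0]])
b = np.array([6.0, 25.0, -11.0])
print(jacobi(A, b))
print(gauss_seidel(A, b))
```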

6. Conjugate Gradient Method

The conjugate gradient (CG) method is the optimal Krylov subspace method for SPD matrices. It minimizes \(\|x - x^*\|_A = \sqrt{(x-x^*)^T A (x-x^*)}\) over successive Krylov subspaces. It converges in at most \(N\) iterations in exact arithmetic, but in practice converges much faster: roughly \(O(\sqrt{\kappa(A)})\) iterations.

Conjugate Gradient for Sparse SPD Systems

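The original code isn't reproduced here; a minimal sketch using SciPy's CG on a sparse SPD system. The 1D Poisson (tridiagonal) matrix is an illustrative choice, the classic sparse SPD test problem; the solver touches the matrix only through matrix-vector products.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import cg

# 1D Poisson matrix: tridiagonal [-1, 2, -1], sparse and SPD
n = 1000
A = sp.diags([-1, 2, -1], [-1, 0, 1], shape=(n, n), format="csr")
b = np.ones(n)

# info == 0 means CG converged to the default relative tolerance
x, info = cg(A, b)
print(info)
print(np.linalg.norm(A @ x - b))   # residual norm
```

A dense factorization of this matrix would cost \(O(n^3)\) and destroy sparsity; CG needs only the \(O(n)\) nonzeros per matrix-vector product.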

7. Practical Linear Algebra with NumPy

NumPy/SciPy Linear Algebra Toolkit

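The interactive toolkit example isn't included in this copy; a sketch of the everyday calls (the random test matrices are illustrative). The recurring theme: call a solver, never form \(A^{-1}\) explicitly.

```python
import numpy as np
from scipy import linalg

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 5))
b = rng.standard_normal(5)

# General square solve (LAPACK LU under the hood)
x = np.linalg.solve(A, b)
print(np.allclose(A @ x, b))

# Diagnostics
print(np.linalg.cond(A))            # condition number (2-norm)
print(np.linalg.det(A))             # determinant, via LU

# Least squares for rectangular (overdetermined) systems
M = rng.standard_normal((8, 3))
y = rng.standard_normal(8)
coef, res, rank, sv = np.linalg.lstsq(M, y, rcond=None)
print(coef)

# Exploit structure: SPD matrix takes the Cholesky path
S = A @ A.T + 5 * np.eye(5)
x_spd = linalg.solve(S, b, assume_a="pos")
print(np.allclose(S @ x_spd, b))
```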