Part II: Coding Theory

From Shannon's Limits to Practical Codes

Part I established fundamental limits: the source coding theorem bounded compression, and the channel capacity theorem set the ceiling for reliable communication. Part II asks: how do we actually achieve these limits? The answer lies in coding theory—the design of efficient compression schemes and error-correcting codes that approach Shannon's ideals in practice.

We begin with source coding algorithms—Huffman and arithmetic codes—which construct variable-length prefix-free codes that provably compress data close to the entropy bound. We then study error-correcting codes, from the elegant Hamming(7,4) code to modern Reed-Solomon constructions. Finally, we examine turbo codes and LDPC codes, the breakthrough designs that brought us within a fraction of a decibel of the Shannon limit.

Key Question of Coding Theory

Given a channel with capacity \( C \) and a source with entropy \( H \), can we design codes that transmit reliably at rate \( R < C \) while compressing to length approaching \( H \)? Shannon said yes — this part shows how.

Chapters in This Part

The Coding Theory Landscape

Source Coding (Compression)

Goal: represent a source \( X \) using as few bits as possible. Shannon's source coding theorem lower-bounds the expected codeword length \( L^* \) of any uniquely decodable code:

\( L^* \geq H(X) \)

Huffman codes achieve \( H(X) \leq L < H(X) + 1 \) bits per symbol. Arithmetic coding over long blocks approaches \( H(X) \) arbitrarily closely.
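To make the bound concrete, here is a minimal sketch of Huffman's greedy construction that tracks only the codeword lengths (the function name and heap layout are illustrative choices, not from the text). For a dyadic source the average length meets the entropy exactly, and in general it stays within one bit of \( H(X) \):

```python
import heapq
from math import log2

def huffman_lengths(probs):
    """Greedy Huffman construction: return the codeword length of each symbol."""
    # Heap entries: (probability, tie-breaking counter, symbols in this subtree).
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    counter = len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        for s in s1 + s2:  # each merge adds one bit to every symbol below it
            lengths[s] += 1
        heapq.heappush(heap, (p1 + p2, counter, s1 + s2))
        counter += 1
    return lengths

# Dyadic source: probabilities are powers of 2, so Huffman hits H(X) exactly.
probs = [0.5, 0.25, 0.125, 0.125]
lengths = huffman_lengths(probs)          # -> [1, 2, 3, 3]
L = sum(p * l for p, l in zip(probs, lengths))
H = -sum(p * log2(p) for p in probs)
assert H <= L < H + 1                     # the Huffman guarantee
```

For non-dyadic probabilities \( L \) exceeds \( H(X) \) by up to one bit per symbol, which is exactly the gap that block-based arithmetic coding closes.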

Channel Coding (Error Correction)

Goal: transmit reliably over a noisy channel of capacity \( C \). The noisy-channel coding theorem guarantees reliable transmission for any rate \( R < C \).

\( P_e \to 0 \text{ as } n \to \infty, \; R < C \)

Turbo and LDPC codes achieve this in practice, operating at an SNR within 0.1 dB of the Shannon limit.
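Before those modern codes, the simplest illustration of error correction is Hamming(7,4), which this part develops in detail. The sketch below uses one common systematic choice of generator and parity-check matrices (several equivalent conventions exist); the syndrome picks out the flipped bit because every column of \( H \) is a distinct nonzero vector:

```python
import numpy as np

# Hamming(7,4) in systematic form (one standard convention among several).
G = np.array([[1, 0, 0, 0, 1, 1, 0],
              [0, 1, 0, 0, 1, 0, 1],
              [0, 0, 1, 0, 0, 1, 1],
              [0, 0, 0, 1, 1, 1, 1]])
H = np.array([[1, 1, 0, 1, 1, 0, 0],
              [1, 0, 1, 1, 0, 1, 0],
              [0, 1, 1, 1, 0, 0, 1]])

def encode(msg):
    """Map a 4-bit message to a 7-bit codeword: c = m G (mod 2)."""
    return (msg @ G) % 2

def decode(recv):
    """Correct up to one bit error via the syndrome s = H r (mod 2)."""
    s = (H @ recv) % 2
    if s.any():
        # Flip the bit whose column of H equals the syndrome.
        err = int(np.argmax((H.T == s).all(axis=1)))
        recv = recv.copy()
        recv[err] ^= 1
    return recv[:4]  # systematic code: first 4 bits are the message

msg = np.array([1, 0, 1, 1])
c = encode(msg)
r = c.copy()
r[2] ^= 1                                  # channel flips one bit
assert np.array_equal(decode(r), msg)      # single error corrected
```

The same syndrome idea, generalized to sparse parity-check matrices and iterative message passing, is what makes LDPC decoding work.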

Historical Timeline

1948: Shannon publishes "A Mathematical Theory of Communication", proving the existence of capacity-achieving codes
1950: Hamming introduces error-correcting codes; Hamming(7,4) corrects single-bit errors
1952: Huffman algorithm constructs provably optimal prefix-free codes via a greedy merge
1960: Reed-Solomon codes, powerful polynomial codes over finite fields, later used in CDs, QR codes, and deep-space missions
1993: Turbo codes (Berrou et al.) bring iterative decoding within 0.5 dB of the Shannon limit, a revolution in communications
1996: LDPC codes rediscovered (invented by Gallager in 1962; revived by MacKay & Neal in 1996), giving near-Shannon performance with sparse graphs
2009: Polar codes (Arıkan), the first codes proven to achieve capacity with efficient encoding and decoding

Prerequisites & Notation

From Part I

  • Shannon entropy \( H(X) = -\sum_x p(x)\log_2 p(x) \)
  • Source coding theorem and typical sequences
  • Mutual information \( I(X;Y) \)
  • Channel capacity \( C = \max_{p(x)} I(X;Y) \)
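As a quick refresher on these Part I quantities, the sketch below computes binary entropy and the capacity of a binary symmetric channel; the closed form \( C = 1 - H_2(p) \) comes from maximizing \( I(X;Y) \), which the uniform input achieves (function names here are illustrative):

```python
from math import log2

def h2(p):
    """Binary entropy H_2(p) in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

def bsc_capacity(p):
    """Capacity of a binary symmetric channel with crossover probability p:
    C = 1 - H_2(p), achieved by the uniform input distribution."""
    return 1.0 - h2(p)

assert h2(0.5) == 1.0            # a fair coin flip carries exactly 1 bit
assert bsc_capacity(0.0) == 1.0  # noiseless channel: 1 bit per use
assert bsc_capacity(0.5) == 0.0  # pure noise: nothing gets through
```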

New in Part II

  • Prefix-free (instantaneous) codes and Kraft inequality
  • Generator matrix \( G \) and parity-check matrix \( H \)
  • Hamming distance \( d(x,y) = \mathrm{wt}(x \oplus y) \)
  • Finite fields \( \mathbb{F}_q \) and polynomial codes
  • Factor graphs, sum-product algorithm
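Two of the new notions above reduce to one-line computations, sketched here with illustrative helper names: the Kraft sum decides whether a set of binary codeword lengths admits a prefix-free code, and Hamming distance counts disagreeing positions (the weight of the XOR):

```python
def kraft_sum(lengths):
    """Kraft sum sum_i 2^{-l_i}; a binary prefix-free code with these
    codeword lengths exists iff the sum is at most 1."""
    return sum(2.0 ** -l for l in lengths)

def hamming_distance(x, y):
    """d(x, y) = wt(x XOR y) for equal-length bit sequences."""
    return sum(a != b for a, b in zip(x, y))

# Lengths 1,2,3,3 meet Kraft with equality: a complete prefix code exists
# (e.g. 0, 10, 110, 111). Lengths 1,1,2 violate it: no such code.
assert kraft_sum([1, 2, 3, 3]) == 1.0
assert kraft_sum([1, 1, 2]) > 1.0
assert hamming_distance([1, 0, 1, 1], [1, 1, 1, 0]) == 2
```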