Examples of Codes

Definition

A source code $C$ for a random variable $X$ is a mapping from $\mathcal{X}$, the range of $X$, to $\mathcal{D}^*$, the set of finite-length strings of symbols from a $D$-ary alphabet $\mathcal{D}$. Let $C(x)$ denote the codeword corresponding to $x$ and let $l(x)$ denote the length of $C(x)$.

For example, $C(\text{red}) = 00$, $C(\text{blue}) = 11$ is a source code for $X \in \{\text{red}, \text{blue}\}$ with alphabet $\mathcal{D} = \{0, 1\}$.

Length

The expected length $L(C)$ of a source code $C$ for a random variable $X$ with PMF $p(x)$ is given by

$$L(C) = \sum_{x \in \mathcal{X}} p(x)\, l(x),$$

where $l(x)$ is the length of the codeword associated with $x$.
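For concreteness, here is a minimal Python sketch of this computation; the PMF and codewords below are made-up placeholders, not from the source:

```python
# Minimal sketch: expected length L(C) = sum over x of p(x) * l(x).
pmf = {"a": 0.5, "b": 0.25, "c": 0.25}    # hypothetical PMF
code = {"a": "0", "b": "10", "c": "11"}   # hypothetical binary code

expected_length = sum(p * len(code[x]) for x, p in pmf.items())
print(expected_length)   # 0.5*1 + 0.25*2 + 0.25*2 = 1.5
```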

Singularity

A code is said to be non-singular if every element of the range of $X$ maps into a different string in $\mathcal{D}^*$ (i.e., the mapping is an injection); that is,

$$x \ne x' \;\Rightarrow\; C(x) \ne C(x').$$

Extension

The extension $C^*$ of a code $C$ is the mapping from finite-length strings of $\mathcal{X}$ to finite-length strings of $\mathcal{D}$, defined by

$$C(x_1 x_2 \cdots x_n) = C(x_1)\, C(x_2) \cdots C(x_n),$$

where the right-hand side denotes the concatenation of the corresponding codewords.

Example

If $C(x_1) = 00$ and $C(x_2) = 11$, then $C(x_1 x_2) = 0011$.

Uniquely Decodable

A code is called uniquely decodable if its extension is non-singular.

Prefix Code

A code is called a prefix code or an instantaneous code if no codeword is a prefix of any other codeword.

Note

Instantaneous codes $\subset$ Uniquely decodable codes $\subset$ Non-singular codes $\subset$ All codes.

Summary

| X | Singular | Nonsingular, But Not Uniquely Decodable | Uniquely Decodable, But Not Instantaneous | Instantaneous |
|---|----------|-----------------------------------------|-------------------------------------------|---------------|
| 1 | 0 | 0 | 10 | 0 |
| 2 | 0 | 010 | 00 | 10 |
| 3 | 0 | 01 | 11 | 110 |
| 4 | 0 | 10 | 110 | 111 |
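As a sanity check on the table, here is a small Python sketch (the helper names are my own choices) that tests the non-singular and prefix-free properties of each of the four example codes:

```python
def is_nonsingular(code):
    # Non-singular: distinct symbols map to distinct codewords.
    return len(set(code.values())) == len(code)

def is_prefix_free(code):
    # Instantaneous (prefix) code: no codeword is a prefix of any other codeword.
    words = list(code.values())
    return all(
        not wj.startswith(wi)
        for i, wi in enumerate(words)
        for j, wj in enumerate(words)
        if i != j
    )

codes = {
    "singular":                      {1: "0", 2: "0", 3: "0", 4: "0"},
    "nonsingular, not UD":           {1: "0", 2: "010", 3: "01", 4: "10"},
    "uniquely decodable, not inst.": {1: "10", 2: "00", 3: "11", 4: "110"},
    "instantaneous":                 {1: "0", 2: "10", 3: "110", 4: "111"},
}
for name, code in codes.items():
    print(name, is_nonsingular(code), is_prefix_free(code))
```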

Tip

The crucial difference between non-instantaneous codes and instantaneous ones lies in the delay and complexity of decoding. While both can be decoded eventually, a non-instantaneous code incurs additional costs:

  • For a non-instantaneous (but uniquely decodable) code: every encoded bitstream has only one possible original sequence of symbols, but you might have to look ahead in the stream before you can definitively decode a symbol. This requires buffering bits and delaying the output.
  • For an instantaneous code: you can decode each symbol immediately as its codeword is read; no lookahead is needed (see the decoder sketch below).
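The following Python sketch (a toy decoder written for illustration) shows the instantaneous case: a symbol is emitted the moment the buffered bits match a codeword, which is safe precisely because no codeword is a prefix of another:

```python
def decode_prefix_code(bits, code):
    """Decode a bit string with a prefix code given as {symbol: codeword}."""
    inverse = {w: s for s, w in code.items()}
    symbols, buffer = [], ""
    for bit in bits:
        buffer += bit
        if buffer in inverse:            # a complete codeword has been read
            symbols.append(inverse[buffer])
            buffer = ""                  # emit immediately, no lookahead
    return symbols

code = {1: "0", 2: "10", 3: "110", 4: "111"}   # the instantaneous code from the table
print(decode_prefix_code("010110111", code))    # [1, 2, 3, 4]
```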

Kraft Inequality

Motivation

We want to construct instantaneous codes of minimum expected length to describe a given source.

Definition

For any instantaneous code (prefix code) over an alphabet of size $D$, the codeword lengths $l_1, l_2, \ldots, l_m$ must satisfy the inequality

$$\sum_{i=1}^{m} D^{-l_i} \le 1.$$

Conversely, given a set of codeword lengths that satisfy this inequality, there exists an instantaneous code with these word lengths.
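In code, the check is one line; the length sets below are arbitrary examples chosen here:

```python
# Sketch: evaluate the Kraft sum  sum_i D^(-l_i)  for a list of codeword lengths.
def kraft_sum(lengths, D=2):
    return sum(D ** (-l) for l in lengths)

print(kraft_sum([1, 2, 3, 3]))   # 1.0  -> a binary prefix code with these lengths exists
print(kraft_sum([1, 1, 2]))      # 1.25 -> no binary prefix code can have these lengths
```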

Intuition

For an alphabet of size $D$, the code tree is $D$-ary (each node has $D$ children). The "space" taken by a codeword of length $l_i$ is $D^{-l_i}$, since it occupies one of the $D^{l_i}$ possible leaves at depth $l_i$. The Kraft inequality ensures that the total "space" occupied by all codewords does not exceed the total available space (1).

Proof

Necessity: If a prefix code exists, then Kraft holds.

  • Consider the full $D$-ary tree of depth $l_{\max}$, the length of the longest codeword.
  • A codeword of length $l_i$ blocks $D^{\,l_{\max} - l_i}$ leaves (all of its descendants at depth $l_{\max}$).
  • Since no two codewords share a leaf (the prefix condition), the total number of blocked leaves satisfies $\sum_i D^{\,l_{\max} - l_i} \le D^{\,l_{\max}}$.
  • Dividing both sides by $D^{\,l_{\max}}$ gives the inequality $\sum_i D^{-l_i} \le 1$.

Sufficiency: If Kraft holds, a prefix code exists.

  • Sort the lengths so that $l_1 \le l_2 \le \cdots \le l_m$.
  • For each length $l_i$ in turn, pick any unused node at depth $l_i$ as the codeword and block all of its descendants.
  • The Kraft inequality ensures that there is always enough "space" left to assign the next codeword without violating the prefix-free property (see the sketch below).
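Here is a hedged Python sketch of this sufficiency construction for the binary case ($D = 2$), assuming the given lengths already satisfy the Kraft inequality; the cumulative-sum bookkeeping plays the role of picking the leftmost unused node at each depth:

```python
# Sketch: build a binary prefix code from lengths with sum(2**-l) <= 1.
def prefix_code_from_lengths(lengths):
    lengths = sorted(lengths)
    codewords, cumulative = [], 0.0
    for l in lengths:
        # Codeword = first l bits of the binary expansion of the running Kraft sum.
        codewords.append(format(int(cumulative * 2 ** l), f"0{l}b"))
        cumulative += 2 ** (-l)
    return codewords

print(prefix_code_from_lengths([2, 1, 3, 3]))   # ['0', '10', '110', '111']
```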

Extended Kraft Inequality

For any countably infinite set of codewords that form a prefix code, the codeword lengths satisfy the extended Kraft inequality

$$\sum_{i=1}^{\infty} D^{-l_i} \le 1.$$

Optimal Codes

Minimization via Lagrange multipliers

We want to minimize

$$L = \sum_i p_i\, l_i$$

over all integers $l_1, l_2, \ldots, l_m$ satisfying

$$\sum_i D^{-l_i} \le 1.$$

We neglect the integer constraint on $l_i$ and assume equality in the constraint. Hence, we can write the constrained minimization using a Lagrange multiplier as the minimization of

$$J = \sum_i p_i\, l_i + \lambda \left( \sum_i D^{-l_i} \right).$$

Setting $\partial J / \partial l_i = p_i - \lambda D^{-l_i} \ln D = 0$, we get $D^{-l_i} = p_i / (\lambda \ln D)$; substituting into the constraint $\sum_i D^{-l_i} = 1$ gives $\lambda = 1/\ln D$, hence $p_i = D^{-l_i}$ and

$$l_i^* = -\log_D p_i.$$

This non-integer choice of codeword lengths yields expected codeword length

$$L^* = \sum_i p_i\, l_i^* = -\sum_i p_i \log_D p_i = H_D(X).$$
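A small numerical sketch (with an arbitrary example distribution) of the ideal lengths $l_i^* = -\log_D p_i$ and the fact that their expectation equals the entropy:

```python
import math

pmf = [0.5, 0.25, 0.125, 0.125]   # example distribution with dyadic probabilities

ideal_lengths = [-math.log2(p) for p in pmf]            # l_i* = -log_2 p_i (D = 2)
expected_length = sum(p * l for p, l in zip(pmf, ideal_lengths))
entropy = -sum(p * math.log2(p) for p in pmf)

print(ideal_lengths)              # [1.0, 2.0, 3.0, 3.0]
print(expected_length, entropy)   # both equal 1.75 = H(X) in bits
```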

Lower bound of expected length

The expected length $L$ of any instantaneous $D$-ary code for a random variable $X$ is greater than or equal to the entropy $H_D(X)$; that is,

$$L \ge H_D(X),$$

with equality if and only if $D^{-l_i} = p_i$ (i.e., all probabilities are negative integer powers of $D$).

Bounds on the Optimal Code Length

Let $l_1^*, l_2^*, \ldots, l_m^*$ be optimal codeword lengths for a source distribution $p$ and a $D$-ary alphabet, and let $L^*$ be the associated expected length of an optimal code ($L^* = \sum_i p_i l_i^*$). Then

$$H_D(X) \le L^* < H_D(X) + 1.$$

Therefore, encoding blocks of $n$ i.i.d. symbols jointly, the minimum expected codeword length per symbol satisfies

$$H_D(X) \le L_n^* < H_D(X) + \frac{1}{n}.$$
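To see the one-extra-symbol slack numerically, here is a sketch using Shannon lengths $\lceil -\log_2 p_i \rceil$, which satisfy the Kraft inequality and therefore upper-bound the optimal expected length:

```python
import math

pmf = [0.4, 0.3, 0.2, 0.1]                           # example distribution

lengths = [math.ceil(-math.log2(p)) for p in pmf]    # Shannon code lengths
L = sum(p * l for p, l in zip(pmf, lengths))
H = -sum(p * math.log2(p) for p in pmf)

print(sum(2 ** (-l) for l in lengths) <= 1)          # True: Kraft holds, so a prefix code exists
print(H, L, H + 1)                                   # H <= L* <= L < H + 1
```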

Wrong code

We cannot always estimate the true distribution of the data precisely. Consider the Shannon code assignment $l(x) = \left\lceil \log \frac{1}{q(x)} \right\rceil$ designed for the probability mass function $q(x)$, while the true probability mass function is $p(x)$. Thus, we will not achieve expected length $L \approx H(p)$.

We now show that the increase in expected length due to the incorrect distribution is the relative entropy $D(p \| q)$:

$$H(p) + D(p \| q) \le E_p\,[l(X)] < H(p) + D(p \| q) + 1.$$
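A numerical sketch of this penalty, with two example distributions chosen here:

```python
import math

p = [0.5, 0.25, 0.125, 0.125]   # true distribution (example)
q = [0.25, 0.25, 0.25, 0.25]    # assumed distribution used to design the code (example)

lengths_q = [math.ceil(-math.log2(qi)) for qi in q]   # Shannon code designed for q
E_p_length = sum(pi * l for pi, l in zip(p, lengths_q))

H_p = -sum(pi * math.log2(pi) for pi in p)
D_pq = sum(pi * math.log2(pi / qi) for pi in p)

print(E_p_length)        # 2.0 bits when the data really follow p
print(H_p, H_p + D_pq)   # 1.75 and 2.0: the D(p||q) = 0.25 bit penalty shows up
```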

Kraft Inequality for Uniquely Decodable Codes

The codeword lengths of any uniquely decodable $D$-ary code must satisfy the Kraft inequality

$$\sum_i D^{-l_i} \le 1.$$

Conversely, given a set of codeword lengths that satisfy this inequality, it is possible to construct a uniquely decodable code with these codeword lengths.

Corollary: A uniquely decodable code for an infinite source alphabet also satisfies the Kraft inequality.

Huffman Codes

Algorithm

An optimal prefix code for a given distribution can be constructed by a simple algorithm discovered by Huffman.

  1. List the symbols and their probabilities in ascending order of probabilities (ties may be broken arbitrarily).
  2. For every iteration:
    1. Find the two symbols (or merged nodes) with the lowest probabilities.
    2. Combine these two into a new node, with a probability equal to the sum of their probabilities.
    3. Assign '0' to the branch leading to the first symbol and '1' to the branch leading to the second symbol of the combined node.
  3. Repeat until a single node with probability 1 remains; each symbol's codeword is the sequence of branch labels on the path from the root down to that symbol (a Python sketch follows below).
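A compact Python sketch of this binary Huffman procedure (the heap-based bookkeeping and variable names are illustrative choices, not from the source):

```python
import heapq

def huffman_code(pmf):
    """Build a binary Huffman code for a dict mapping symbol -> probability."""
    # Each heap entry: (probability, tie-breaker, {symbol: partial codeword}).
    heap = [(p, i, {sym: ""}) for i, (sym, p) in enumerate(pmf.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        p0, _, code0 = heapq.heappop(heap)   # lowest probability
        p1, _, code1 = heapq.heappop(heap)   # second lowest
        # Prepend '0' to one subtree's codewords and '1' to the other's.
        merged = {s: "0" + c for s, c in code0.items()}
        merged.update({s: "1" + c for s, c in code1.items()})
        heapq.heappush(heap, (p0 + p1, counter, merged))
        counter += 1
    return heap[0][2]

pmf = {1: 0.25, 2: 0.25, 3: 0.2, 4: 0.15, 5: 0.15}
print(huffman_code(pmf))   # a prefix code with codeword lengths 2, 2, 2, 3, 3
```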

Example

Consider a random variable $X$ taking values in the set $\{1, 2, 3, 4, 5\}$ with probabilities $0.25$, $0.25$, $0.2$, $0.15$, $0.15$, respectively.

Now we get

| Codeword Length | Codeword | X |
|-----------------|----------|---|
| 2 | 11 | 1 |
| 2 | 00 | 2 |
| 2 | 01 | 3 |
| 3 | 100 | 4 |
| 3 | 101 | 5 |
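Assuming the probabilities above, this code has expected length $2(0.25 + 0.25 + 0.2) + 3(0.15 + 0.15) = 2.3$ bits, close to the entropy; a quick check:

```python
import math

pmf = {1: 0.25, 2: 0.25, 3: 0.2, 4: 0.15, 5: 0.15}
code = {1: "11", 2: "00", 3: "01", 4: "100", 5: "101"}

L = sum(p * len(code[x]) for x, p in pmf.items())
H = -sum(p * math.log2(p) for p in pmf.values())

print(L, H)   # 2.3 and roughly 2.285, so H <= L < H + 1
```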

Tip

If $D > 2$, we may not have a sufficient number of symbols to combine $D$ of them at each step. In such a case, we add dummy symbols to the end of the set of symbols. The dummy symbols have probability $0$ and are inserted to fill the tree.

Since at each stage of the reduction the number of symbols is reduced by $D - 1$, we want the total number of symbols to be $1 + k(D - 1)$, where $k$ is the number of merges.
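A tiny sketch of the resulting dummy-symbol count, assuming this $1 + k(D - 1)$ condition:

```python
# Sketch: how many probability-0 dummy symbols to add before D-ary Huffman coding,
# so that the padded number of symbols has the form 1 + k * (D - 1).
def num_dummies(m, D):
    remainder = (m - 1) % (D - 1)
    return 0 if remainder == 0 else (D - 1) - remainder

print(num_dummies(6, 3))   # 1: pad 6 symbols to 7 = 1 + 3 * (3 - 1)
print(num_dummies(5, 2))   # 0: binary Huffman never needs dummy symbols
```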