Question 4 – Source Coding for Discrete Memoryless Source (DMS)

A DMS X has live symbols \(\{x_l,x_2,x_3,x_4,x_5\}\) with \(P(x_1) = 0.4\), \(P(x_2)=0.19\), \(P(x_3) =0.16\), \(P(x_4) = 0.15\), and \(P(x_5) = 0.1\).

(a) Construct the Shannon-Fano code for X, and calculate the efficiency of the code. (b) Repeat for the Huffman code and compare the results.


Solution

Let \(X = \{x_1, x_2, x_3, x_4, x_5\}\) be a source with probabilities:

Symbol Probability
\(x_1\) 0.40
\(x_2\) 0.19
\(x_3\) 0.16
\(x_4\) 0.15
\(x_5\) 0.10

(a) Shannon-Fano Coding

Steps:

  1. Arrange symbols in decreasing probability.
  2. Partition the list into two groups with approximately equal total probabilities.
  3. Assign binary digits: 0 for one group, 1 for the other.
  4. Repeat recursively for each group until all symbols are assigned unique codes.

Shannon-Fano Code:

Symbol Probability Code
\(x_1\) 0.40 0
\(x_2\) 0.19 10
\(x_3\) 0.16 110
\(x_4\) 0.15 1110
\(x_5\) 0.10 1111

Average Code Length:

\[ L = \sum P(x_i) \cdot \ell(x_i) = (0.4)(1) + (0.19)(2) + (0.16)(3) + (0.15)(4) + (0.10)(4) = 2.21 \text{ bits} \]

Entropy of Source:

\[ H(X) = - \sum P(x_i) \log_2 P(x_i) \approx 2.146 \text{ bits} \]

Efficiency:

\[ \text{Efficiency} = \frac{H(X)}{L} = \frac{2.146}{2.21} \approx 97.1\% \]


(b) Huffman Coding

Steps:

  1. Pair the least probable symbols and combine into a node.
  2. Repeat until a single root node remains.
  3. Assign binary digits to branches recursively from the root.

Huffman Code:

Symbol Probability Code
\(x_1\) 0.40 0
\(x_2\) 0.19 10
\(x_3\) 0.16 110
\(x_4\) 0.15 1110
\(x_5\) 0.10 1111

Note: In this case, Huffman and Shannon-Fano yield identical codes due to the distribution.

Average Code Length:
Same as before: \(L = 2.21 \text{ bits}\)

Efficiency:
Same as before: \(\approx 97.1\%\)


Comparison

  • Efficiency: Both methods give similar results in this case.
  • Code Optimality: Huffman coding guarantees the minimum average code length.
  • Code Structure: Shannon-Fano is easier to construct but may not always yield the optimal code.