Question 4 – Source Coding for Discrete Memoryless Source (DMS)
A DMS X has live symbols \(\{x_l,x_2,x_3,x_4,x_5\}\) with \(P(x_1) = 0.4\), \(P(x_2)=0.19\), \(P(x_3) =0.16\), \(P(x_4) = 0.15\), and \(P(x_5) = 0.1\).
(a) Construct the Shannon-Fano code for X, and calculate the efficiency of the code. (b) Repeat for the Huffman code and compare the results.
Solution
Let \(X = \{x_1, x_2, x_3, x_4, x_5\}\) be a source with probabilities:
Symbol | Probability |
---|---|
\(x_1\) | 0.40 |
\(x_2\) | 0.19 |
\(x_3\) | 0.16 |
\(x_4\) | 0.15 |
\(x_5\) | 0.10 |
(a) Shannon-Fano Coding
Steps:
- Arrange symbols in decreasing probability.
- Partition the list into two groups with approximately equal total probabilities.
- Assign binary digits: 0 for one group, 1 for the other.
- Repeat recursively for each group until all symbols are assigned unique codes.
Shannon-Fano Code:
Symbol | Probability | Code |
---|---|---|
\(x_1\) | 0.40 | 0 |
\(x_2\) | 0.19 | 10 |
\(x_3\) | 0.16 | 110 |
\(x_4\) | 0.15 | 1110 |
\(x_5\) | 0.10 | 1111 |
Average Code Length:
\[ L = \sum P(x_i) \cdot \ell(x_i) = (0.4)(1) + (0.19)(2) + (0.16)(3) + (0.15)(4) + (0.10)(4) = 2.21 \text{ bits} \]
Entropy of Source:
\[ H(X) = - \sum P(x_i) \log_2 P(x_i) \approx 2.146 \text{ bits} \]
Efficiency:
\[ \text{Efficiency} = \frac{H(X)}{L} = \frac{2.146}{2.21} \approx 97.1\% \]
(b) Huffman Coding
Steps:
- Pair the least probable symbols and combine into a node.
- Repeat until a single root node remains.
- Assign binary digits to branches recursively from the root.
Huffman Code:
Symbol | Probability | Code |
---|---|---|
\(x_1\) | 0.40 | 0 |
\(x_2\) | 0.19 | 10 |
\(x_3\) | 0.16 | 110 |
\(x_4\) | 0.15 | 1110 |
\(x_5\) | 0.10 | 1111 |
Note: In this case, Huffman and Shannon-Fano yield identical codes due to the distribution.
Average Code Length:
Same as before: \(L = 2.21 \text{
bits}\)
Efficiency:
Same as before: \(\approx 97.1\%\)
Comparison
- Efficiency: Both methods give similar results in this case.
- Code Optimality: Huffman coding guarantees the minimum average code length.
- Code Structure: Shannon-Fano is easier to construct but may not always yield the optimal code.