Mutual Information
Mutual information measures how much information one random variable provides about another: it reflects the reduction in uncertainty about one variable gained by observing the other. Although dimensionless in the physical sense, it is conventionally expressed in bits when base-2 logarithms are used.
- High mutual information → large reduction in uncertainty.
- Low mutual information → small reduction in uncertainty.
- Zero mutual information → the variables are independent.
Efficient communication systems aim for high mutual information.
Joint Entropies
Using:
- Input probabilities \(P(x_i)\)
- Output probabilities \(P(y_j)\)
- Transition probabilities \(P(y_j|x_i)\)
- Joint probabilities \(P(x_i, y_j)\)
We define the following entropies (a short numerical sketch follows the list):
- \(H(X) = - \sum_{i=1}^{m} P(x_i) \log_2 P(x_i)\)
- \(H(Y) = - \sum_{j=1}^{n} P(y_j) \log_2 P(y_j)\)
- \(H(X, Y) = - \sum_{j=1}^{n} \sum_{i=1}^{m} P(x_i, y_j) \log_2 P(x_i, y_j)\)
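As a quick numerical illustration, the Python sketch below evaluates these three entropies for an arbitrary 2×2 joint distribution; the matrix `P_xy` and the helper `entropy` are assumptions chosen purely for the example, not part of any particular channel model.

```python
import numpy as np

# Hypothetical joint distribution P(x_i, y_j) for a binary input/output pair
# (values chosen only for illustration; rows index x, columns index y).
P_xy = np.array([[0.40, 0.10],
                 [0.05, 0.45]])

P_x = P_xy.sum(axis=1)   # marginal P(x_i)
P_y = P_xy.sum(axis=0)   # marginal P(y_j)

def entropy(p):
    """H(p) = -sum p log2 p, skipping zero-probability terms."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

H_X  = entropy(P_x)            # input entropy H(X)
H_Y  = entropy(P_y)            # output entropy H(Y)
H_XY = entropy(P_xy.ravel())   # joint entropy H(X, Y)

print(f"H(X)   = {H_X:.4f} bits")
print(f"H(Y)   = {H_Y:.4f} bits")
print(f"H(X,Y) = {H_XY:.4f} bits")
```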
Interpretation of Joint Entropy
- \(H(X)\): average uncertainty of the input.
- \(H(Y)\): average uncertainty of the output.
- \(H(X, Y)\): average joint uncertainty of the input and output, i.e. of the communication channel as a whole.
Conditional Entropy
- \(H(X|Y)\): average uncertainty remaining about the input given the output (also known as equivocation).
- \(H(Y|X)\): average uncertainty of the output given the transmitted input.
Formulas (illustrated in the sketch after the list):
- \(H(X|Y) = - \sum_{j=1}^{n} \sum_{i=1}^{m} P(x_i, y_j) \log_2 P(x_i|y_j)\)
- \(H(Y|X) = - \sum_{j=1}^{n} \sum_{i=1}^{m} P(x_i, y_j) \log_2 P(y_j|x_i)\)
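Continuing the same illustrative sketch (reusing `P_xy`, `P_x`, and `P_y` defined above), the conditional entropies follow as double sums over the joint distribution:

```python
# Conditional distributions from the joint matrix (all entries are positive
# in this example, so the divisions and logarithms are well defined).
P_x_given_y = P_xy / P_y[np.newaxis, :]   # P(x_i | y_j): columns normalized by P(y_j)
P_y_given_x = P_xy / P_x[:, np.newaxis]   # P(y_j | x_i): rows normalized by P(x_i)

H_X_given_Y = -np.sum(P_xy * np.log2(P_x_given_y))   # equivocation H(X|Y)
H_Y_given_X = -np.sum(P_xy * np.log2(P_y_given_x))   # H(Y|X)

print(f"H(X|Y) = {H_X_given_Y:.4f} bits")
print(f"H(Y|X) = {H_Y_given_X:.4f} bits")
```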
Useful Identities
- \(H(X, Y) = H(X|Y) + H(Y)\)
- \(H(X, Y) = H(Y|X) + H(X)\)
These identities mirror the chain rule of probability, \(P(x_i, y_j) = P(x_i|y_j) P(y_j) = P(y_j|x_i) P(x_i)\), taken in logarithms and averaged.
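With the quantities from the sketches above, both identities can be checked numerically for the example distribution:

```python
# Chain-rule identities, verified up to floating-point tolerance.
assert np.isclose(H_XY, H_X_given_Y + H_Y)   # H(X,Y) = H(X|Y) + H(Y)
assert np.isclose(H_XY, H_Y_given_X + H_X)   # H(X,Y) = H(Y|X) + H(X)
print("Both chain-rule identities hold for the example distribution.")
```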
Mutual Information
Mutual information \(I(X; Y)\) quantifies the reduction in the entropy of \(X\) obtained by knowing \(Y\). It can be expressed in several equivalent ways:
- \(I(X; Y) = H(X) - H(X|Y)\)
- \(I(X; Y) = H(Y) - H(Y|X)\)
- \(I(X; Y) = H(X) + H(Y) - H(X, Y)\)
All expressions are measured in bits per symbol.
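Closing out the illustrative example, the three expressions can be evaluated with the quantities computed above and should agree up to floating-point error:

```python
# Three equivalent expressions for the mutual information of the example channel.
I_1 = H_X - H_X_given_Y          # I(X;Y) = H(X) - H(X|Y)
I_2 = H_Y - H_Y_given_X          # I(X;Y) = H(Y) - H(Y|X)
I_3 = H_X + H_Y - H_XY           # I(X;Y) = H(X) + H(Y) - H(X,Y)

print(f"I(X;Y) = {I_1:.4f} = {I_2:.4f} = {I_3:.4f} bits per symbol")
```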