Why the Cross Product Term Is Zero in Squared Error Loss

1. Problem Setup

We consider the squared error loss function:

\[L(a, \theta) = (a - \theta)^2.\]

Let \(b\) be the posterior mean of \(\theta\) given data \(y_1, \dots, y_n\):

\[b = \mathbb{E}_{\pi(\theta|y_1,\dots,y_n)}(\theta) = \int_{-\infty}^{\infty} \theta \,\pi(\theta|y_1,\dots,y_n)\, d\theta.\]

Here, \(\pi(\theta|y_1,\dots,y_n)\) is the posterior probability density function (pdf) of \(\theta\), so it is nonnegative and integrates to 1.

2. Expanding the Expected Loss

The posterior expected loss for an action \(a\) is:

\[\mathbb{E}[L(a, \theta)] = \int_{-\infty}^{\infty} (a - \theta)^2 \, \pi(\theta|y_1,\dots,y_n) \, d\theta.\]

Write \(a - \theta = (a - b) + (b - \theta)\). Then:

\[(a - \theta)^2 = (a - b)^2 + 2(a-b)(b - \theta) + (b - \theta)^2.\]

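Taking the posterior expectation term by term, and writing \(\pi\) as shorthand for \(\pi(\theta|y_1,\dots,y_n)\) with every integral taken over \((-\infty, \infty)\), the factor \(a - b\) does not depend on \(\theta\) and can be pulled outside the integral: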

\[\mathbb{E}[L(a, \theta)] = \int (a - b)^2 \pi \, d\theta \;+\; 2(a-b) \int (b - \theta) \pi \, d\theta \;+\; \int (b - \theta)^2 \pi \, d\theta.\]

3. Why the Cross Term Vanishes

The cross term is:

\[C = 2(a-b) \int_{-\infty}^{\infty} (b - \theta) \, \pi(\theta|y_1,\dots,y_n) \, d\theta.\]

Compute the inner integral:

\[\int_{-\infty}^{\infty} (b - \theta) \pi \, d\theta = b \underbrace{\int_{-\infty}^{\infty} \pi \, d\theta}_{=1} - \underbrace{\int_{-\infty}^{\infty} \theta \pi \, d\theta}_{=b} = b - b = 0.\]

Therefore \(C = 2(a-b) \cdot 0 = 0\).
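As a numerical sanity check, the sketch below discretizes a hypothetical posterior on a grid and evaluates the inner integral as a weighted sum. The Gamma-shaped density, the grid, and the names `posterior_weights`, `inner`, and `cross_term` are illustrative assumptions and not part of the derivation.

```python
import math


def posterior_weights(grid, unnormalized_density):
    """Normalize density values on a grid so they sum to 1.

    This is a discrete stand-in for a pdf that integrates to 1.
    """
    values = [unnormalized_density(t) for t in grid]
    total = sum(values)
    return [v / total for v in values]


# Hypothetical posterior (an assumption): a skewed density proportional to
# theta^2 * exp(-2 * theta), i.e. Gamma-shaped with shape 3 and rate 2.
grid = [i * 0.001 for i in range(1, 20001)]
weights = posterior_weights(grid, lambda t: t ** 2 * math.exp(-2.0 * t))

# b: posterior mean, the discrete analogue of the integral of theta * pi.
b = sum(w * t for w, t in zip(weights, grid))

# inner: discrete analogue of the integral of (b - theta) * pi.
inner = sum((b - t) * w for w, t in zip(weights, grid))


def cross_term(a):
    """Cross term C = 2 (a - b) * inner for a candidate action a."""
    return 2.0 * (a - b) * inner
```

Up to floating-point rounding, `inner` is zero, so `cross_term(a)` is zero for any choice of `a`. The posterior here is deliberately asymmetric to show that the result relies only on \(b\) being the mean, not on symmetry.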

4. Key Insight

The cross term is zero because \(b\) is the posterior mean. For any probability distribution with finite mean:

\[\int (\text{mean} - \theta) \, \text{pdf} \, d\theta = 0.\]

So, regardless of the action \(a\), the posterior expected deviation of \(\theta\) from its mean \(b\) is zero, and the cross term disappears.

5. Final Result

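Because \(\pi\) integrates to 1, the first integral in the expansion reduces to \((a-b)^2\), and the cross term is zero. Hence: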

\[\mathbb{E}[L(a, \theta)] = (a - b)^2 + \int_{-\infty}^{\infty} (b - \theta)^2 \, \pi(\theta|y_1,\dots,y_n) \, d\theta.\]

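The second term is the posterior variance, which does not depend on \(a\) (we assume it is finite, so that the expected loss is finite).

The first term \((a-b)^2\) is nonnegative and equals zero only when \(a = b\). The posterior expected loss is therefore minimized uniquely at \(a = b\), confirming that the posterior mean is the optimal point estimate under squared error loss.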
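The decomposition can also be checked by simulation. The sketch below is a minimal illustration under assumed inputs: the posterior is represented by hypothetical Gamma draws, and the names `draws`, `expected_loss`, `candidates`, and `best` are made up for this example. For the empirical distribution of the draws, the identity holds exactly up to floating-point rounding.

```python
import random

random.seed(0)

# Hypothetical posterior draws (an assumption): Gamma with shape 3, scale 0.5.
draws = [random.gammavariate(3.0, 0.5) for _ in range(5000)]

# Sample analogues of the posterior mean b and the posterior variance.
b = sum(draws) / len(draws)
variance = sum((b - t) ** 2 for t in draws) / len(draws)


def expected_loss(a):
    """Monte Carlo estimate of E[(a - theta)^2] over the draws."""
    return sum((a - t) ** 2 for t in draws) / len(draws)


# Search a grid of candidate actions a in [0, 4] for the smallest loss.
candidates = [i * 0.01 for i in range(401)]
best = min(candidates, key=expected_loss)

for a in (0.0, 1.0, 3.7):
    direct = expected_loss(a)
    decomposed = (a - b) ** 2 + variance
    print(f"a={a}: direct={direct:.6f}, decomposed={decomposed:.6f}")
print(f"grid minimizer={best:.2f}, sample mean b={b:.4f}")
```

The two columns should agree for every `a`, and the grid minimizer should be the candidate closest to `b`, since only the \((a-b)^2\) term changes with the action.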