We consider the squared error loss function:
\[L(a, \theta) = (a - \theta)^2.\]
Let \(b\) be the posterior mean of \(\theta\) given data \(y_1, \dots, y_n\):
\[b = \mathbb{E}_{\pi(\theta|y_1,\dots,y_n)}(\theta) = \int_{-\infty}^{\infty} \theta \,\pi(\theta|y_1,\dots,y_n)\, d\theta.\]
Here, \(\pi(\theta|y_1,\dots,y_n)\) is the posterior probability density function (pdf) of \(\theta\). We assume throughout that the posterior mean and variance are finite.
The posterior expected loss for an action \(a\) is:
\[\mathbb{E}[L(a, \theta)] = \int_{-\infty}^{\infty} (a - \theta)^2 \, \pi(\theta|y_1,\dots,y_n) \, d\theta.\]
Write \(a - \theta = (a - b) + (b - \theta)\). Then:
\[(a - \theta)^2 = (a - b)^2 + 2(a-b)(b - \theta) + (b - \theta)^2.\]
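Taking the posterior expectation term by term, and abbreviating \(\pi(\theta|y_1,\dots,y_n)\) as \(\pi\):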
\[\mathbb{E}[L(a, \theta)] = \int (a - b)^2 \pi \, d\theta \;+\; 2(a-b) \int (b - \theta) \pi \, d\theta \;+\; \int (b - \theta)^2 \pi \, d\theta.\]
The cross term is:
\[C = 2(a-b) \int_{-\infty}^{\infty} (b - \theta) \, \pi(\theta|y_1,\dots,y_n) \, d\theta.\]
Compute the inner integral:
\[\int_{-\infty}^{\infty} (b - \theta) \pi \, d\theta = b \underbrace{\int_{-\infty}^{\infty} \pi \, d\theta}_{=1} - \underbrace{\int_{-\infty}^{\infty} \theta \pi \, d\theta}_{=b} = b - b = 0.\]
Therefore \(C = 2(a-b) \cdot 0 = 0\).
The cross term is zero because \(b\) is the posterior mean. For any probability distribution with finite mean:
\[\int (\text{mean} - \theta) \, \text{pdf} \, d\theta = 0.\]
So, regardless of \(a\), the expected deviation of \(\theta\) from its posterior mean is zero.
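As a sanity check on the cross-term argument, here is a minimal sketch that evaluates \(C\) exactly for a small discrete posterior. The support points, probabilities, and the helper names `posterior_mean` and `cross_term` are hypothetical, chosen only for illustration; sums stand in for the integrals, and `fractions.Fraction` is used so the arithmetic is exact.

```python
from fractions import Fraction

# Hypothetical discrete posterior (illustrative values, not from the text):
# support points of theta and their posterior probabilities.
thetas = [Fraction(0), Fraction(1), Fraction(2), Fraction(5)]
probs = [Fraction(1, 8), Fraction(3, 8), Fraction(3, 8), Fraction(1, 8)]
assert sum(probs) == 1  # a valid probability distribution


def posterior_mean(thetas, probs):
    """Discrete analogue of b = integral of theta * pi(theta | y) d(theta)."""
    return sum(t * p for t, p in zip(thetas, probs))


def cross_term(a, thetas, probs):
    """Discrete analogue of C = 2 (a - b) * integral of (b - theta) * pi d(theta)."""
    b = posterior_mean(thetas, probs)
    inner = sum((b - t) * p for t, p in zip(thetas, probs))
    return 2 * (a - b) * inner


b = posterior_mean(thetas, probs)
print(b)  # 7/4

# The cross term vanishes for every action a, not only for a = b.
for a in [Fraction(-3), Fraction(0), Fraction(7, 4), Fraction(10)]:
    print(a, cross_term(a, thetas, probs))
```

Each printed cross term should be exactly zero, since the inner sum reduces to \(b - b\) for any choice of \(a\).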
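Since \((a-b)^2\) does not depend on \(\theta\) and the posterior density integrates to 1, the first term equals \((a-b)^2\). Hence: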
\[\mathbb{E}[L(a, \theta)] = (a - b)^2 + \int_{-\infty}^{\infty} (b - \theta)^2 \, \pi(\theta|y_1,\dots,y_n) \, d\theta.\]
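The second term is the posterior variance of \(\theta\), which does not depend on \(a\).
The first term \((a-b)^2\) is nonnegative and equals zero only when \(a = b\), so the posterior expected loss is uniquely minimized at \(a = b\). The posterior mean is therefore the optimal point estimate under squared error loss, and the minimum expected loss equals the posterior variance.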
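The decomposition can also be checked numerically. The following is a minimal sketch under stated assumptions: the discrete posterior, the candidate grid, and the helper name `expected_loss` are hypothetical and serve only as an illustration, with sums replacing the integrals and exact rational arithmetic via `fractions.Fraction`.

```python
from fractions import Fraction

# Hypothetical discrete posterior (illustrative values, not from the text).
thetas = [Fraction(0), Fraction(1), Fraction(2), Fraction(5)]
probs = [Fraction(1, 8), Fraction(3, 8), Fraction(3, 8), Fraction(1, 8)]


def expected_loss(a, thetas, probs):
    """Posterior expected squared error loss of action a (discrete analogue)."""
    return sum((a - t) ** 2 * p for t, p in zip(thetas, probs))


# Posterior mean b and posterior variance (the expected loss evaluated at a = b).
b = sum(t * p for t, p in zip(thetas, probs))
variance = expected_loss(b, thetas, probs)

# Candidate actions on a grid of quarter steps from 0 to 5; the grid contains b.
candidates = [Fraction(k, 4) for k in range(0, 21)]

# Decomposition: expected loss = (a - b)^2 + posterior variance, for every a.
for a in candidates:
    assert expected_loss(a, thetas, probs) == (a - b) ** 2 + variance

# The candidate with the smallest expected loss is the posterior mean.
best = min(candidates, key=lambda a: expected_loss(a, thetas, probs))
print(best, b, variance)
```

Because the grid happens to contain \(b\), the minimizer found by the search coincides with the posterior mean; on a grid that excludes \(b\), the search would return the candidate closest to it.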