This document explains the derivation in Section 7.2.3 of a spatio-temporal statistics text, where the author simplifies the expressions \(\Sigma_{12} H^{-1} a\) and \(\Sigma_{12} H^{-1} \Sigma_{21}\) for prediction at location \((s_0, t_0)\) using separable covariance matrices.
| Symbol | Meaning |
|---|---|
| \(\Sigma_s\) | \(n \times n\) spatial covariance matrix, \((\Sigma_s)_{ij} = \sigma_s(s_i - s_j)\) |
| \(\Sigma_t\) | \(T \times T\) temporal covariance matrix, \((\Sigma_t)_{km} = \sigma_t(t_k - t_m)\) |
| \(H\) | \(\Sigma_s \otimes \Sigma_t\) (Kronecker product), the covariance of observed data |
| \(\Sigma_{21}\) | \(nT \times 1\) column vector of covariances between observed data and prediction point |
| \(\Sigma_{12}\) | \(1 \times nT\) row vector, equal to \(\Sigma_{21}^\top\) |
| \(s_0, t_0\) | Prediction spatial location and time |
| \(t'\) or \(t_0\) | Prediction time (used interchangeably) |
The full joint covariance (observed data + prediction point) is:
\[ \begin{bmatrix} 1 & \Sigma_{12} \\ \Sigma_{21} & H \end{bmatrix} \]
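where the upper-left entry \(1\) is the unit marginal variance of \(y(s_0, t_0)\), and \(\Sigma_{21}\) has entries \(\sigma_s(s_j - s_0)\, \sigma_t(t_k - t_0)\), indexed by site \(j = 1, \dots, n\) and time \(k = 1, \dots, T\).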
Since \(H = \Sigma_s \otimes \Sigma_t\), its inverse is:
\[ H^{-1} = \Sigma_s^{-1} \otimes \Sigma_t^{-1} \]
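The Kronecker inverse identity can be checked numerically. The following is a minimal sketch using random symmetric positive-definite matrices as stand-ins for \(\Sigma_s\) and \(\Sigma_t\); the helper `random_spd` and the sizes \(n = 3\), \(T = 4\) are illustrative assumptions, not taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_spd(size):
    """Return a random symmetric positive-definite matrix (illustrative only)."""
    m = rng.standard_normal((size, size))
    return m @ m.T + size * np.eye(size)

sigma_s = random_spd(3)  # stand-in for the n x n spatial covariance, n = 3
sigma_t = random_spd(4)  # stand-in for the T x T temporal covariance, T = 4

h = np.kron(sigma_s, sigma_t)            # nT x nT covariance of the observed data
h_inv_direct = np.linalg.inv(h)          # inverts the full nT x nT matrix

# Inverts only the n x n and T x T factors, then forms their Kronecker product.
h_inv_kron = np.kron(np.linalg.inv(sigma_s), np.linalg.inv(sigma_t))

kron_inverse_matches = np.allclose(h_inv_direct, h_inv_kron)
print(kron_inverse_matches)
```

The practical point is that only the two small factors ever need to be inverted.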
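Let \(b_{jk}(s_0, t_0)\) be the entry of the row vector \(\Sigma_{12} H^{-1}\) indexed by site \(j = 1, \dots, n\) and time \(k = 1, \dots, T\). The entry of \(\Sigma_{12}\) at site \(i\) and time \(m\) is \(\sigma_s(s_i - s_0)\, \sigma_t(t_m - t_0)\), and the corresponding entry of \(H^{-1}\) is \((\Sigma_s^{-1})_{ij} (\Sigma_t^{-1})_{mk}\). Then: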
\[ b_{jk}(s_0, t_0) = \sum_{i=1}^n \sum_{m=1}^T \sigma_s(s_i - s_0) \sigma_t(t_m - t_0) (\Sigma_s^{-1})_{ij} (\Sigma_t^{-1})_{mk} \]
Factor the sums:
\[ b_{jk}(s_0, t_0) = \underbrace{\left[ \sum_{i=1}^n \sigma_s(s_i - s_0) (\Sigma_s^{-1})_{ij} \right]}_{=: b_s(j, s_0)} \cdot \underbrace{\left[ \sum_{m=1}^T \sigma_t(t_m - t_0) (\Sigma_t^{-1})_{mk} \right]}_{=: b_t(k, t_0)} \]
Thus:
\[ \boxed{b_{jk}(s_0, t_0) = b_s(j, s_0) \cdot b_t(k, t_0)} \]
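The factorization above can be compared against a brute-force computation of \(\Sigma_{12} H^{-1}\). This is a sketch under stated assumptions: one-dimensional sites, exponential covariances in both space and time, and made-up values for `s`, `s0`, `t0`, `phi_s` and `phi_t`. None of these specifics come from the text; the factorization itself only requires separability.

```python
import numpy as np

def exp_cov(h, phi):
    """Exponential covariance exp(-|h| / phi) with unit variance."""
    return np.exp(-np.abs(h) / phi)

s = np.array([0.0, 0.7, 1.5, 2.9])   # hypothetical 1-D spatial sites, n = 4
t = np.arange(1, 6, dtype=float)     # observed times 1..T, T = 5
s0, t0 = 1.1, 6.5                    # hypothetical prediction point
phi_s, phi_t = 1.3, 2.0              # hypothetical range parameters
n, T = len(s), len(t)

sigma_s = exp_cov(s[:, None] - s[None, :], phi_s)
sigma_t = exp_cov(t[:, None] - t[None, :], phi_t)
c_s = exp_cov(s - s0, phi_s)         # sigma_s(s_i - s_0), i = 1..n
c_t = exp_cov(t - t0, phi_t)         # sigma_t(t_m - t_0), m = 1..T

# Brute force: entry (j, k) sits at flat index j * T + k.
sigma_12 = np.kron(c_s, c_t)
b_direct = sigma_12 @ np.linalg.inv(np.kron(sigma_s, sigma_t))

# Factored form: b_s and b_t come from two small solves
# (valid because sigma_s and sigma_t are symmetric).
b_s = np.linalg.solve(sigma_s, c_s)
b_t = np.linalg.solve(sigma_t, c_t)
b_factored = np.outer(b_s, b_t).ravel()

print(np.allclose(b_direct, b_factored))
```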
We now focus on:
\[ b_t(k, t_0) = \sum_{m=1}^T \sigma_t(t_m - t_0) (\Sigma_t^{-1})_{mk} \]
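When \(t_0\) is one of the observed times \(1, \dots, T\), the covariances \(\sigma_t(t_m - t_0) = (\Sigma_t)_{m, t_0}\) form column \(t_0\) of \(\Sigma_t\). Therefore:
\[ b_t(k, t_0) = \sum_{m=1}^T (\Sigma_t)_{m, t_0} (\Sigma_t^{-1})_{mk} = \delta_{k, t_0} \]
where \(\delta_{i,j} = 1\) if \(i=j\), else \(0\). Because \(\Sigma_t\) is symmetric, the sum is the \((t_0, k)\) entry of \(\Sigma_t \Sigma_t^{-1} = I\).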
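A quick numerical illustration of this case, assuming an exponential \(\sigma_t\) with a hypothetical range `phi = 2.0`, \(T = 6\) and \(t_0 = 3\). The result depends only on the definition of the inverse, so any valid covariance function would serve equally well here.

```python
import numpy as np

phi = 2.0                            # hypothetical range parameter
t = np.arange(1, 7, dtype=float)     # observed times 1..T with T = 6
t0 = 3                               # prediction time equal to an observed time

def sigma_t_fn(tau):
    """Temporal covariance function (exponential, assumed for illustration)."""
    return np.exp(-np.abs(tau) / phi)

sigma_t = sigma_t_fn(t[:, None] - t[None, :])
c_t = sigma_t_fn(t - t0)             # coincides with column t0 of sigma_t
b_t = np.linalg.solve(sigma_t, c_t)  # b_t(k, t0) for k = 1..T

print(np.round(b_t, 12))             # all weight falls on the entry k = t0
```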
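Now suppose \(t_0\) lies beyond all observed times (\(t_0 > T\)). The covariances \(\sigma_t(t_m - t_0)\) are then no longer a column of \(\Sigma_t\), so the inverse identity cannot be applied directly. Instead, the author uses a key property of the exponential covariance function.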
For exponential covariance:
\[ \sigma_t(\tau) = \exp\left(-\frac{|\tau|}{\phi}\right) \]
For \(t_0 > T\) and \(m = 1,\dots,T\):
\[ \sigma_t(t_m - t_0) = \sigma_t(t_0 - T) \cdot \sigma_t(T - t_m) \]
Verification: since \(t_m \le T < t_0\), we have \(|t_m - t_0| = t_0 - t_m = (t_0 - T) + (T - t_m)\), with both terms nonnegative. Hence:
\[ \begin{aligned} \sigma_t(t_m - t_0) &= \exp\left(-\frac{t_0 - t_m}{\phi}\right) \\ &= \exp\left(-\frac{t_0 - T}{\phi}\right) \cdot \exp\left(-\frac{T - t_m}{\phi}\right) \\ &= \sigma_t(t_0 - T) \cdot \sigma_t(T - t_m) \end{aligned} \]
Therefore:
\[ \begin{aligned} b_t(k, t_0) &= \sum_{m=1}^T \left[ \sigma_t(t_0 - T) \cdot \sigma_t(T - t_m) \right] (\Sigma_t^{-1})_{mk} \\ &= \sigma_t(t_0 - T) \sum_{m=1}^T \sigma_t(T - t_m) (\Sigma_t^{-1})_{mk} \end{aligned} \]
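The multiplicative property used above is specific to the exponential form. As a rough check, the sketch below tests it numerically for the exponential covariance and, for contrast, for a Gaussian (squared-exponential) covariance. The Gaussian case is not part of the source derivation and is included only as a comparison; `phi`, `T` and `t0` are made-up values.

```python
import numpy as np

phi = 2.0                             # hypothetical range parameter
T = 5
t = np.arange(1, T + 1, dtype=float)  # observed times 1..T
t0 = 7.5                              # hypothetical future time, t0 > T

def exp_cov(tau):
    return np.exp(-np.abs(tau) / phi)

def gauss_cov(tau):
    # Comparison only: not used anywhere in the derivation.
    return np.exp(-(tau / phi) ** 2)

# sigma_t(t_m - t0) versus sigma_t(t0 - T) * sigma_t(T - t_m), for all m at once.
exp_factorizes = np.allclose(exp_cov(t - t0), exp_cov(t0 - T) * exp_cov(T - t))
gauss_factorizes = np.allclose(gauss_cov(t - t0), gauss_cov(t0 - T) * gauss_cov(T - t))

print(exp_factorizes, gauss_factorizes)
```

The steps that follow therefore should not be expected to carry over to other temporal covariance families without modification.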
Assuming \(t_T = T\) (observed times are \(1, 2, \dots, T\)):
\[ (\Sigma_t)_{T, m} = \sigma_t(t_T - t_m) = \sigma_t(T - t_m) \]
Thus:
\[ b_t(k, t_0) = \sigma_t(t_0 - T) \sum_{m=1}^T (\Sigma_t)_{T, m} (\Sigma_t^{-1})_{mk} \]
By definition of the inverse matrix:
\[ \sum_{m=1}^T (\Sigma_t)_{T, m} (\Sigma_t^{-1})_{mk} = \delta_{T, k} \]
This equals \(1\) if \(k = T\), and \(0\) otherwise, so:
\[ b_t(k, t_0) = \sigma_t(t_0 - T) \cdot \delta_{k, T} \]
Combining both cases:
\[ \boxed{b_t(k, t_0) = \begin{cases} \delta_{k, t_0}, & t_0 \le T \\[6pt] \delta_{k, T} \cdot \sigma_t(t_0 - T), & t_0 > T \end{cases}} \]
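The \(t_0 > T\) branch can be checked in the same way. A minimal sketch, assuming observed times \(1, \dots, T\) with \(T = 6\), a hypothetical range `phi = 2.0`, and a hypothetical forecast time `t0 = 8.5`:

```python
import numpy as np

phi = 2.0                             # hypothetical range parameter
T = 6
t = np.arange(1, T + 1, dtype=float)  # observed times 1..T
t0 = 8.5                              # hypothetical forecast time, t0 > T

def sigma_t_fn(tau):
    """Exponential temporal covariance exp(-|tau| / phi)."""
    return np.exp(-np.abs(tau) / phi)

sigma_t = sigma_t_fn(t[:, None] - t[None, :])
c_t = sigma_t_fn(t - t0)

# Numerical b_t: solve against the full T x T temporal covariance.
b_t_numeric = np.linalg.solve(sigma_t, c_t)

# Closed form: zero everywhere except the last observed time k = T.
b_t_closed = np.zeros(T)
b_t_closed[-1] = sigma_t_fn(t0 - T)

print(np.allclose(b_t_numeric, b_t_closed))
```

Only the most recent observation time receives nonzero weight, scaled down by the covariance over the forecast horizon \(t_0 - T\).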
Let \(a\) be any \(nT \times 1\) vector with entries \(a_{jk}\), indexed by site \(j\) and time \(k\) in the same order as \(\Sigma_{21}\). Then:
\[ \Sigma_{12} H^{-1} a = \sum_{j=1}^n \sum_{k=1}^T b_{jk}(s_0, t_0) a_{jk} = \sum_{j=1}^n b_s(j, s_0) \sum_{k=1}^T a_{jk} b_t(k, t_0) \]
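Using the \(b_t\) result, the inner sum over \(k\) collapses to a single term: \(a_{j, t_0}\) when \(t_0 \le T\), and \(\sigma_t(t_0 - T)\, a_{j, T}\) when \(t_0 > T\). Therefore: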
\[ \boxed{\Sigma_{12} H^{-1} a = \begin{cases} \sum_{j=1}^n b_s(j, s_0) a_{j, t_0}, & t_0 \le T \\[6pt] \sigma_t(t_0 - T) \sum_{j=1}^n b_s(j, s_0) a_{j, T}, & t_0 > T \end{cases}} \]
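The boxed formula replaces an \(nT \times nT\) solve with an \(n \times n\) one. The sketch below compares the two routes for an arbitrary vector \(a\), stored as an \(n \times T\) array with `a[j, k]` holding \(a_{jk}\). The function names, the one-dimensional sites, the exponential spatial covariance and all numeric values are assumptions for illustration.

```python
import numpy as np

def exp_cov(h, phi):
    """Exponential covariance exp(-|h| / phi) with unit variance."""
    return np.exp(-np.abs(h) / phi)

def weighted_sum_direct(a, s, t, s0, t0, phi_s, phi_t):
    """Sigma_12 H^{-1} a by brute force on the full nT x nT system."""
    sigma_s = exp_cov(s[:, None] - s[None, :], phi_s)
    sigma_t = exp_cov(t[:, None] - t[None, :], phi_t)
    sigma_12 = np.kron(exp_cov(s - s0, phi_s), exp_cov(t - t0, phi_t))
    return sigma_12 @ np.linalg.solve(np.kron(sigma_s, sigma_t), a.ravel())

def weighted_sum_shortcut(a, s, t, s0, t0, phi_s, phi_t):
    """Same quantity from the closed form; needs only an n x n solve."""
    T = len(t)
    sigma_s = exp_cov(s[:, None] - s[None, :], phi_s)
    b_s = np.linalg.solve(sigma_s, exp_cov(s - s0, phi_s))
    if t0 <= T:
        # t0 is assumed to be one of the observed times 1..T.
        return b_s @ a[:, int(t0) - 1]
    return exp_cov(t0 - T, phi_t) * (b_s @ a[:, -1])

s = np.array([0.0, 0.7, 1.5, 2.9])   # hypothetical 1-D sites
t = np.arange(1, 6, dtype=float)     # observed times 1..5
a = np.random.default_rng(1).standard_normal((len(s), len(t)))

for t0 in (2, 5, 6.5, 9.0):
    d = weighted_sum_direct(a, s, t, 1.1, t0, 1.3, 2.0)
    q = weighted_sum_shortcut(a, s, t, 1.1, t0, 1.3, 2.0)
    print(t0, np.isclose(d, q))
```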
Now set \(a = \Sigma_{21}\), so \(a_{jk} = \sigma_s(s_j - s_0) \sigma_t(t_k - t_0)\). For \(t_0 \le T\), the relevant entries are \(a_{j, t_0}\):
\[ \begin{aligned} \Sigma_{12} H^{-1} \Sigma_{21} &= \sum_{j=1}^n b_s(j, s_0) \cdot \sigma_s(s_j - s_0) \cdot \sigma_t(t_0 - t_0) \\ &= \sum_{j=1}^n b_s(j, s_0) \sigma_s(s_j - s_0) \quad (\text{since } \sigma_t(0)=1) \end{aligned} \]
Define:
\[ a_s(s_0) := \sum_{i=1}^n \sum_{j=1}^n \sigma_s(s_i - s_0) (\Sigma_s^{-1})_{ij} \sigma_s(s_j - s_0) \]
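Then, for \(t_0 \le T\):
\[ \Sigma_{12} H^{-1} \Sigma_{21} = a_s(s_0) \]
For \(t_0 > T\), the relevant entries are \(a_{j, T} = \sigma_s(s_j - s_0)\, \sigma_t(T - t_0)\), so: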
\[ \begin{aligned} \Sigma_{12} H^{-1} \Sigma_{21} &= \sigma_t(t_0 - T) \sum_{j=1}^n b_s(j, s_0) \cdot \sigma_s(s_j - s_0) \cdot \sigma_t(T - t_0) \\ &= \sigma_t(t_0 - T) \cdot \sigma_t(T - t_0) \cdot \sum_{j=1}^n b_s(j, s_0) \sigma_s(s_j - s_0) \end{aligned} \]
Because a covariance function is even in its lag, \(\sigma_t(T - t_0) = \sigma_t(t_0 - T)\), and the remaining sum over \(j\) is \(a_s(s_0)\). Thus:
\[ \Sigma_{12} H^{-1} \Sigma_{21} = a_s(s_0) \cdot [\sigma_t(t_0 - T)]^2 \]
The conditional variance of \(y(s_0, t_0)\) given observed data is:
\[ \delta^2(s_0, t_0) = 1 - \Sigma_{12} H^{-1} \Sigma_{21} \]
Therefore:
\[ \boxed{\delta^2(s_0, t_0) = 1 - a_s(s_0) \cdot a_t(t_0)} \]
where:
\[ a_t(t_0) = \begin{cases} 1, & t_0 \le T \\[4pt] [\sigma_t(t_0 - T)]^2, & t_0 > T \end{cases} \]
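To see the final result in action, the sketch below computes the conditional variance both from the full system and from the closed form \(1 - a_s(s_0)\, a_t(t_0)\). As before, the one-dimensional sites, the exponential spatial covariance, the function names and the parameter values are illustrative assumptions rather than anything prescribed by the text.

```python
import numpy as np

def exp_cov(h, phi):
    """Exponential covariance exp(-|h| / phi) with unit variance."""
    return np.exp(-np.abs(h) / phi)

def cond_var_direct(s, t, s0, t0, phi_s, phi_t):
    """1 - Sigma_12 H^{-1} Sigma_21 using the full nT x nT matrix H."""
    sigma_s = exp_cov(s[:, None] - s[None, :], phi_s)
    sigma_t = exp_cov(t[:, None] - t[None, :], phi_t)
    sigma_21 = np.kron(exp_cov(s - s0, phi_s), exp_cov(t - t0, phi_t))
    h = np.kron(sigma_s, sigma_t)
    return 1.0 - sigma_21 @ np.linalg.solve(h, sigma_21)

def cond_var_shortcut(s, t, s0, t0, phi_s, phi_t):
    """1 - a_s(s0) * a_t(t0); t0 <= T is assumed to be an observed time."""
    T = len(t)
    sigma_s = exp_cov(s[:, None] - s[None, :], phi_s)
    c_s = exp_cov(s - s0, phi_s)
    a_s = c_s @ np.linalg.solve(sigma_s, c_s)
    a_t = 1.0 if t0 <= T else exp_cov(t0 - T, phi_t) ** 2
    return 1.0 - a_s * a_t

s = np.array([0.0, 0.7, 1.5, 2.9])   # hypothetical 1-D sites
t = np.arange(1, 6, dtype=float)     # observed times 1..5
horizons = [3, 5, 6.0, 7.5, 10.0]    # two observed times, three forecasts

variances = [cond_var_shortcut(s, t, 1.1, t0, 1.3, 2.0) for t0 in horizons]
for t0, v in zip(horizons, variances):
    print(t0, v, np.isclose(v, cond_var_direct(s, t, 1.1, t0, 1.3, 2.0)))
```

In this setup the variance is constant across observed times and grows toward the marginal variance \(1\) as the forecast horizon \(t_0 - T\) increases, since \(a_t(t_0)\) decays to zero.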
| Quantity | Expression |
|---|---|
| \(b_{jk}(s_0, t_0)\) | \(b_s(j, s_0) \cdot b_t(k, t_0)\) |
| \(b_t(k, t_0)\) ( \(t_0 \le T\) ) | \(\delta_{k, t_0}\) |
| \(b_t(k, t_0)\) ( \(t_0 > T\) ) | \(\delta_{k, T} \cdot \sigma_t(t_0 - T)\) |
| \(\Sigma_{12} H^{-1} \Sigma_{21}\) ( \(t_0 \le T\) ) | \(a_s(s_0)\) |
| \(\Sigma_{12} H^{-1} \Sigma_{21}\) ( \(t_0 > T\) ) | \(a_s(s_0) \cdot [\sigma_t(t_0 - T)]^2\) |
| Conditional variance | \(1 - a_s(s_0) a_t(t_0)\) |