Error covariance

Introduction

This file show simulation example of using the method of effective sample size to approximate correlated covariance struction of the measurement error.

Simulate data

First we simulate some data that we will use in the next two examples. Assume that the true process of interest is a zero mean Gaussian process with Matern covariance function over a square. The Matern parameters are set as in the following code.

## regular grids on [0,5]
loc_grid <- as.matrix(expand.grid(seq(0.05, 5, 0.1), seq(0.05, 5, 0.1)))

## Calculate the Euclidean distance matrix
locdist <- as.matrix(dist(loc_grid))

## Calculate the Matern covariance function for these grid points
Matern_cov <- fields::Matern(d =locdist, range = 0.5, smoothness = 1, phi = 2) 
## -- range = rho, the correlation length
## -- smoothness is nu as in the SPDE approach, nu = alpha - d/2, 
## -- alpha = 2 by default in INLA, d = 2 for 2D, so nu = 1 here
## -- phi is the marginal variance

## Simulate a realization of this process on the grid
set.seed(10)
data0 <- rnorm(nrow(locdist))
Matern_chol<- chol(Matern_cov) 
X <- Matern_chol %*% data0
Xdf <- data.frame(x = loc_grid[,1], y = loc_grid[,2], Xproc = X)

## Convert to a spatial grid data
Xsp <- Xdf
coordinates(Xsp) <- ~x+y
spplot(Xsp)

Next we generate some observed data at the same resolution. Suppose the measurement errors are spatially correlated with a Matern covariance function due to the processing mechanism.

## First we generate correlated measurement error for each point of the high resolution grid
## Also Assuming a Matern Correlation
## The correlation lenght is set to be 0.2
## The variance set to be 0.2
err_matern <- fields::Matern(d =locdist, range = 0.2, smoothness = 1, phi = 0.2) 

## Simulate a realization of this process on the grid
set.seed(12)
errsiid <- rnorm(nrow(locdist))
err_maternChol<- chol(err_matern) 
errs <- as.vector(err_maternChol %*% errsiid)
Xsp$data <- Xsp$Xproc + errs
Xsp$err <- errs
spplot(Xsp, "err")

We can create lower resolution observations.

## Create the lower resolution grid and aggregate data
data_loc <- as.matrix(expand.grid(seq(0.25, 5, 0.5), seq(0.25, 5, 0.5)))
data_grid <- SpatialPoints(coords = data_loc)
data_grid <- points2grid(data_grid)
data_grid <- as.SpatialPolygons.GridTopology(data_grid)
data_mean <- over(data_grid, Xsp, fn = mean)
data_sp <- SpatialPixelsDataFrame(data_grid, data=data.frame(proc_mean = data_mean$Xproc, obs = data_mean$data))
spplot(data_sp)

## The grid average can be viewd as a linear combination of the high res data Y = A%*%Xobs
## So the corresponding error covariance is given by A %*% Cov(Xobs) %*%
## Next we construct A by finding the grouping index
ids <- over(Xsp, data_grid)
A <- matrix(0, nrow = 100, ncol = 2500)
for (i in 1:100){
  A[i, which(ids == i)] <- 1/25 
}

## The covariance matrix for this aggregated data 
data_cov <- A %*% err_matern %*% t(A)

## The correlation has dropped below 0.05, so we can treat these error as almost unrelated.
data_cor <- data_cov
diag(data_cor) <- 0
max(abs(data_cor))

## [1] 0.04076291

Downscaling

In the downscaling problem, we know that the errors are uncorrelated at a lower resolution, but is given a high resolution data with correlated errors. The error structure is not given, although we can impose a parameteric correlation structure by comparing the two resolutions.

However, we do not want to incoorporate this correlation structure into the observation equation in the model. We only want to find a independent covariance structure that can best approximate the average variability of the true covariance.

We find this independent covariance structure by inflating the diagonal elements of the true covariance matrix based on the correlation structure. The inflation factor is found by letting variances of the mean of the variables to be equal under the two covariance assumptions.

## The variance of the a = mean(x) under the true covariance structure
v1 <- as.numeric(rep(1, 2500) %*% err_matern %*% rep(1, 2500)/(2500^2))

## The variance of the a = mean(x) assuming iid
v2 <- sum(diag(err_matern))/(2500^2) 

## The inflation should be 
alpha1 <- v1/v2

## The effective sample size is
n_eff1 <- sqrt(2500^2/alpha1)

## The approximate variance we should use is
v3 <- diag(2500)*0.2*alpha1

print(paste0("Inflation factor is: ", alpha1))

## [1] "Inflation factor is: 44.4907088216411"

print(paste0("Effective sample size is: ", n_eff1))

## [1] "Effective sample size is: 374.8049744042"

print(paste0("Marginal variance for correlated error is ", err_matern[1,1], "; for independent error is ", v3[1,1]))

## [1] "Marginal variance for correlated error is 0.2; for independent error is 8.89814176432822"

Upscaling

The upscaling problem is the opposite. We want to aggregate the data to higher resolution. This is very similar to what have shown in the data simulation section. We specify the covariance structure for high resolution data, and find the covariance for the low resolution one.

When the spatial unit of the low resolution one is larger than the correlation length, then the diagnal of the covariance matrix can be used approximately as the independent covariance matrix. Otherwise, we use the same infaltion method to find an independent covariance approximation.

Below we compare the diagonal of the covariance matrix of the low res grid data and the inflated covariance matrix.

v4 <- as.numeric(rep(1, 100) %*% data_cov %*% rep(1, 100)/(100^2))

v5 <- sum(diag(data_cov))/(100^2)

alpha2 <- v4/v5

n_eff2 <- sqrt(100^2/alpha2)

v6 <- diag(100)*data_cov[1,1]*alpha2

print(paste0("Inflation factor is: ", alpha2))

## [1] "Inflation factor is: 3.39162846733119"

print(paste0("Effective sample size is: ", n_eff2))

## [1] "Effective sample size is: 54.2995041944646"

print(paste0("Marginal variance for correlated error is ", data_cov[1,1], "; for independent error is ", v6[1,1]))

## [1] "Marginal variance for correlated error is 0.10494241158826; for independent error is 0.355925670573129"

The correlation for the aggregated data is already below 0.05, the difference is till quite big between the diagonal of the correlated covariance function and the independent one.

Compare in Bayesian inference

Now we try to see the effect of the error covariance matrix on the BHM prediction.

The Downscaled samples

We use the simulated downscaling samples to reconstruct the true process and compare the results from using uninflated and infalted errors.

## Generate the mesh
mesh <- inla.mesh.2d(loc = as.matrix(expand.grid(seq(0, 5, 0.3), seq(0, 5, 0.3))), 
                      offset = c(0.6, 0.6), max.edge = c(0.5, 0.5))

## set prior values for the hyper parameters, delibrately set to be different from the truth.
sigma0 <- 1
range0 <- 1
lkappa0 <- log(8)/2 - log(range0)
ltau0 <- 0.5*log(1/(4*pi)) - log(sigma0) - lkappa0

## build the spde model
spde <- inla.spde2.matern(mesh, B.tau = matrix(c(ltau0, -1, 1),1,3), B.kappa = matrix(c(lkappa0, 0, -1), 1,3), 
                          theta.prior.mean = c(0,0), theta.prior.prec = c(0.01, 0.01))

## build the data stack
data1_loc <- as.matrix(loc_grid)
data1_obs <- Xsp$data
Ay <- inla.spde.make.A(mesh = mesh, loc = data1_loc)
st.est <- inla.stack(data = list(y=data1_obs), A = list(Ay),
                     effects = list(process = 1:spde$n.spde), tag = "est")

hyper <- list(prec = list(fixed = TRUE, initial = log(1)))
formula = y ~ -1 + f(process, model = spde)
prec_scale1 <- 1/diag(err_matern)
prec_scale2 <- 1/diag(v3)

## INLA inference 
res_inla1 <- inla(formula, data = inla.stack.data(st.est, spde = spde), family = "gaussian",
                 scale = prec_scale1, control.family = list(hyper = hyper),
                 control.predictor=list(A=inla.stack.A(st.est), compute =TRUE))

res_inla2 <- inla(formula, data = inla.stack.data(st.est, spde = spde), family = "gaussian",
                 scale = prec_scale2, control.family = list(hyper = hyper),
                 control.predictor=list(A=inla.stack.A(st.est), compute =TRUE))

## the parameters
res_pars1 <- inla.spde2.result(res_inla1, "process", spde, do.transf=TRUE)
res_pars2 <- inla.spde2.result(res_inla2, "process", spde, do.transf=TRUE)
plot(res_pars1$marginals.range.nominal[[1]], type = "l", main = "range")
lines(res_pars2$marginals.range.nominal[[1]], col = 2)

plot(res_pars1$marginals.variance.nominal[[1]], type = "l", main = "variance")
lines(res_pars2$marginals.variance.nominal[[1]], col = 2)

## the predictions
proj <- inla.mesh.projector(mesh,dims = c(500,500))
mean_post1 <- res_inla1$summary.random$process$mean
sd_post1 <- res_inla1$summary.random$process$sd
mean_post2 <- res_inla2$summary.random$process$mean
sd_post2 <- res_inla2$summary.random$process$sd

xygrid <- expand.grid(proj$x, proj$y)
predmean1 <- as.vector(inla.mesh.project(proj, as.vector(mean_post1)))
predu1 <- as.vector(inla.mesh.project(proj, as.vector(sd_post1)))
predmean2 <- as.vector(inla.mesh.project(proj, as.vector(mean_post2)))
predu2 <- as.vector(inla.mesh.project(proj, as.vector(sd_post2)))

trueprocess <- data.frame(x = loc_grid[,1], y = loc_grid[,2], mean = Xsp$Xproc)
observation <- data.frame(x = loc_grid[,1], y = loc_grid[,2], mean = Xsp$data)

predata1 <- data.frame(x = xygrid[,1], y = xygrid[,2], mean = predmean1, u = predu1, model = "uninflated")
predata2 <- data.frame(x = xygrid[,1], y = xygrid[,2], mean = predmean2, u = predu2, model = "inflated")

ggplot(trueprocess) + coord_fixed() + scale_x_continuous(limits=c(0,5),  expand = c(0.02, 0.02)) + 
    scale_y_continuous(limits=c(0,5),  expand = c(0.02, 0.02)) + geom_raster(aes(x = x, y = y, fill = mean))  + scale_fill_gradientn(colours = terrain.colors(12), limit = c(-5,5))+ ggtitle("true process value")

ggplot(observation) + coord_fixed() + scale_x_continuous(limits=c(0,5),  expand = c(0.02, 0.02)) + 
    scale_y_continuous(limits=c(0,5),  expand = c(0.02, 0.02)) + geom_raster(aes(x = x, y = y, fill = mean))  + scale_fill_gradientn(colours = terrain.colors(12), limit = c(-5,5))+ ggtitle("observed values")

ggplot(predata1) + coord_fixed() + scale_x_continuous(limits=c(0,5),  expand = c(0.02, 0.02)) + 
    scale_y_continuous(limits=c(0,5),  expand = c(0.02, 0.02)) + geom_raster(aes(x = x, y = y, fill = mean))  + scale_fill_gradientn(colours = terrain.colors(12), limit = c(-5,5))+ ggtitle("predicted mean")

ggplot(predata2) + coord_fixed() + scale_x_continuous(limits=c(0,5),  expand = c(0.02, 0.02)) + 
    scale_y_continuous(limits=c(0,5),  expand = c(0.02, 0.02)) + geom_raster(aes(x = x, y = y, fill = mean))  + scale_fill_gradientn(colours = terrain.colors(12), limit = c(-5,5))+ ggtitle("predicted mean (inflated error)")

ggplot(predata1) + coord_fixed() + scale_x_continuous(limits=c(0,5),  expand = c(0.02, 0.02)) + 
    scale_y_continuous(limits=c(0,5),  expand = c(0.02, 0.02)) + geom_raster(aes(x = x, y = y, fill = u))  + scale_fill_gradientn(colours = terrain.colors(8), limit = c(0,0.8))+ ggtitle("predicted uncertainty")

ggplot(predata2) + coord_fixed() + scale_x_continuous(limits=c(0,5),  expand = c(0.02, 0.02)) + 
    scale_y_continuous(limits=c(0,5),  expand = c(0.02, 0.02)) + geom_raster(aes(x = x, y = y, fill = u))  + scale_fill_gradientn(colours = terrain.colors(12), limit = c(0,0.8))+ ggtitle("predicted uncertainty (inflated error)")

The upscaled samples

Same procedure but we use the upscaled samples here.

## build the data stack
data2_loc <- as.matrix(data_sp@coords)
data2_obs <- data_sp$obs
Ay <- inla.spde.make.A(mesh = mesh, loc = data2_loc)
st.est <- inla.stack(data = list(y=data2_obs), A = list(Ay),
                     effects = list(process = 1:spde$n.spde), tag = "est")

hyper <- list(prec = list(fixed = TRUE, initial = log(1)))
formula = y ~ -1 + f(process, model = spde)
prec_scale1 <- 1/diag(data_cov)
prec_scale2 <- 1/diag(v6)

## INLA inference 
res_inla1 <- inla(formula, data = inla.stack.data(st.est, spde = spde), family = "gaussian",
                 scale = prec_scale1, control.family = list(hyper = hyper),
                 control.predictor=list(A=inla.stack.A(st.est), compute =TRUE))

res_inla2 <- inla(formula, data = inla.stack.data(st.est, spde = spde), family = "gaussian",
                 scale = prec_scale2, control.family = list(hyper = hyper),
                 control.predictor=list(A=inla.stack.A(st.est), compute =TRUE))

## the parameters
res_pars1 <- inla.spde2.result(res_inla1, "process", spde, do.transf=TRUE)
res_pars2 <- inla.spde2.result(res_inla2, "process", spde, do.transf=TRUE)
plot(res_pars1$marginals.range.nominal[[1]], type = "l", main = "range")
lines(res_pars2$marginals.range.nominal[[1]], col = 2)

plot(res_pars1$marginals.variance.nominal[[1]], type = "l", main = "variance")
lines(res_pars2$marginals.variance.nominal[[1]], col = 2)

## the predictions
proj <- inla.mesh.projector(mesh,dims = c(500,500))
mean_post1 <- res_inla1$summary.random$process$mean
sd_post1 <- res_inla1$summary.random$process$sd
mean_post2 <- res_inla2$summary.random$process$mean
sd_post2 <- res_inla2$summary.random$process$sd

xygrid <- expand.grid(proj$x, proj$y)
predmean1 <- as.vector(inla.mesh.project(proj, as.vector(mean_post1)))
predu1 <- as.vector(inla.mesh.project(proj, as.vector(sd_post1)))
predmean2 <- as.vector(inla.mesh.project(proj, as.vector(mean_post2)))
predu2 <- as.vector(inla.mesh.project(proj, as.vector(sd_post2)))

trueprocess <- data.frame(x = loc_grid[,1], y = loc_grid[,2], mean = Xsp$Xproc)
observation <- data.frame(x = data2_loc[,1], y = data2_loc[,2], mean = data2_obs)

predata1 <- data.frame(x = xygrid[,1], y = xygrid[,2], mean = predmean1, u = predu1, model = "uninflated")
predata2 <- data.frame(x = xygrid[,1], y = xygrid[,2], mean = predmean2, u = predu2, model = "inflated")

ggplot(trueprocess) + coord_fixed() + scale_x_continuous(limits=c(0,5),  expand = c(0.02, 0.02)) + 
    scale_y_continuous(limits=c(0,5),  expand = c(0.02, 0.02)) + geom_raster(aes(x = x, y = y, fill = mean))  + scale_fill_gradientn(colours = terrain.colors(12), limit = c(-5,5))+ ggtitle("true process value")

ggplot(observation) + coord_fixed() + scale_x_continuous(limits=c(0,5),  expand = c(0.02, 0.02)) + 
    scale_y_continuous(limits=c(0,5),  expand = c(0.02, 0.02)) + geom_raster(aes(x = x, y = y, fill = mean))  + scale_fill_gradientn(colours = terrain.colors(12), limit = c(-5,5))+ ggtitle("observed value")

ggplot(predata1) + coord_fixed() + scale_x_continuous(limits=c(0,5),  expand = c(0.02, 0.02)) + 
    scale_y_continuous(limits=c(0,5),  expand = c(0.02, 0.02)) + geom_raster(aes(x = x, y = y, fill = mean))  + scale_fill_gradientn(colours = terrain.colors(12), limit = c(-5,5))+ ggtitle("predicted mean")

ggplot(predata2) + coord_fixed() + scale_x_continuous(limits=c(0,5),  expand = c(0.02, 0.02)) + 
    scale_y_continuous(limits=c(0,5),  expand = c(0.02, 0.02)) + geom_raster(aes(x = x, y = y, fill = mean))  + scale_fill_gradientn(colours = terrain.colors(12), limit = c(-5,5))+ ggtitle("predicted mean (inflated error)")

ggplot(predata1) + coord_fixed() + scale_x_continuous(limits=c(0,5),  expand = c(0.02, 0.02)) + 
    scale_y_continuous(limits=c(0,5),  expand = c(0.02, 0.02)) + geom_raster(aes(x = x, y = y, fill = u))  + scale_fill_gradientn(colours = terrain.colors(8), limit = c(0,0.8))+ ggtitle("predicted uncertainty")

ggplot(predata2) + coord_fixed() + scale_x_continuous(limits=c(0,5),  expand = c(0.02, 0.02)) + 
    scale_y_continuous(limits=c(0,5),  expand = c(0.02, 0.02)) + geom_raster(aes(x = x, y = y, fill = u))  + scale_fill_gradientn(colours = terrain.colors(12), limit = c(0,0.8))+ ggtitle("predicted uncertainty (inflated error)")