INLA Mesh test

We test the mesh effect on the marginal variances of Gaussian processes on \(R^2\) and \(S^2\) using the INLA package. The reason is that for a stationary Gaussian process, we would expect the marginal variances to be constant (at least within the boundary) however, we have observed non-constant variances: for irregular mesh, the differences are random and huge; while for regular mesh, the differences show certain pattern at the boundary or where the triangles change shapes.

Based on results (need references) of the finite element methods for SPDEs, we have the following statements

The non-constant variance is caused by the shape change of the triangles.
For regular mesh, when the triangles get small enough, the mesh effect should dis sapper.
- How small is enough?
- Relationship between maximum triangle edge length and the process correlation length.
- Maximum edge < < correlation length? Or the other way round?
For different high resolution meshes, the inferences on the parameters should be similar.

Therefore, in the following we tests the above statements first on \(R^2\) then \(S^2\). The first statement is obvious and we focus on the other two. Without loss of generality, we choose to test on a the square \([0,1] \times [0,1] \subset R^2\) and to reduce the boundary effects, the square is wrapped by a convex hull.

1. Marginal variances for a stationary process

1.1 Regular grid on \(R^2\)

We first test on \(R^2\). The correlation length is fixed to be \(\rho = 0.1\) and variance to be \(\sigma^2 = 1\). The regular grid is generated by using equally spaced grid points and we increase the number of points in each dimension to have smaller triangles. To make sure the triangles are all the same within the boundary, we choose the maximum edge length to be \(\sqrt{2}/l\), where \(l\) is the number of segments in \(x\) (or \(y\)) direction.

library(INLA)
if(Sys.info()['sysname'] == "Linux"){INLA:::inla.dynload.workaround()}
library(lattice)
library(fields)
library(GEOmap)
library(ggplot2)
library(gridExtra)
library(geoR)

## a wrapper for generating mesh, evaluating the variance and plot the result
plot.mvar <- function(range0, locres, max.edge, plot.mesh = TRUE, xylim = FALSE){
  sigma0 <- 1
  loc <- as.matrix(expand.grid(seq(0, 1, locres), seq(0, 1, locres)))
  mesh <- inla.mesh.2d(loc = loc, offset = c(0.1, 0.4), max.edge = max.edge )
  lkappa0 <- log(8)/2 - log(range0)
  ltau0 <- 0.5*log(1/(4*pi)) - log(sigma0) - lkappa0
  
  spde <- inla.spde2.matern(mesh, B.tau = matrix(c(ltau0, -1, 1),1,3),
                            B.kappa = matrix(c(lkappa0, 0, -1), 1,3), 
                            theta.prior.mean = c(0,0), theta.prior.prec = c(0.1, 1))
  Q <- inla.spde.precision(spde, theta = c(0,0))
  mvar <- sqrt(diag(inla.qinv(Q)))
  proj <- inla.mesh.projector(mesh, dims = c(100,100))
  
  if(xylim){
    image.plot(proj$x, proj$y, inla.mesh.project(proj, as.vector(mvar)), 
             col = topo.colors(24),
             xlim = c(0,1), ylim = c(0,1), xlab = "", ylab="",
             main = paste("edge length = ", locres))
  }else{
      image.plot(proj$x, proj$y, inla.mesh.project(proj, as.vector(mvar)), 
                 col = topo.colors(24),
                 xlab = "", ylab="",
                 main = paste("edge length = ", locres))
    }
  
  if(plot.mesh){
    plot(mesh, add = TRUE)}
}

### fix range and vary grid size
par(mfrow = c(2,3))
plot.mvar(range0=0.1, locres = 0.3, max.edge = c(0.45, 0.45))
plot.mvar(range0=0.1, locres = 0.2, max.edge = c(0.3, 0.3))
plot.mvar(range0=0.1, locres = 0.1, max.edge = c(0.15, 0.15))
plot.mvar(range0=0.1, locres = 0.08, max.edge = c(0.12, 0.12), plot.mesh = FALSE)
plot.mvar(range0=0.1, locres = 0.04, max.edge = c(0.06, 0.06), plot.mesh = FALSE)
plot.mvar(range0=0.1, locres = 0.02, max.edge = c(0.03, 0.03), plot.mesh = FALSE)

Remarks

The true marginal variance should be equal to 1 and the SPDE approximation gets more accurate when the maximum edge lenghth is close to and smaller than the correlation length of the process.

We may conclude that the best resolution of the mesh would be the one with triangle length approximately equal to or no larger than the process correlation length.

1.2 Semi-regular grid on \(S^2\)

Next we do the same test on \(S^2\). Note that on \(S^2\) it is not possible to generate a mesh that is entirely regular. We used the INLA function to generate a semi-regular mesh based on evenly segmenting the edges of an icosahedron.

## a wrapper for generating mesh on the sphere, evaluating the variance and plot the result
plot.mvarS2 <- function(range0, nseg){
  sigma0 <- 1
  mesh <- inla.mesh.create(globe = nseg)
  
  lkappa0 <- log(8)/2 - log(range0)
  ltau0 <- 0.5*log(1/(4*pi)) - log(sigma0) - lkappa0
  
  spde <- inla.spde2.matern(mesh, B.tau = matrix(c(ltau0, -1, 1),1,3),
                            B.kappa = matrix(c(lkappa0, 0, -1), 1,3), 
                            theta.prior.mean = c(0,0), theta.prior.prec = c(0.1, 0.1))
  Q <- inla.spde.precision(spde, theta = c(0,0))
  mvar <- diag(inla.qinv(Q))
  proj <- inla.mesh.projector(mesh, dims = c(360,181))

  image.plot(proj$x, proj$y, inla.mesh.project(proj, as.vector(mvar)), 
             col = topo.colors(20),
             xlab = "", ylab="",
             main = paste("edge length = ", 1/nseg))
}

### fix range and vary grid size
par(mfrow = c(2,3))
plot.mvarS2(range0=0.1, nseg = 2)
plot.mvarS2(range0=0.1, nseg = 5)
plot.mvarS2(range0=0.1, nseg = 10)
plot.mvarS2(range0=0.1, nseg = 20)
plot.mvarS2(range0=0.1, nseg = 50)
plot.mvarS2(range0=0.1, nseg = 100)

Remarks

Same as the previous test. The mesh effect due to the shape change of triangles can not be eliminated but the variation becomes negligible when the edge length of the triangles gets smaller.

2. Inference on the parameters

Now we test the mesh effect on parameter inference.

2.1 On \(R^2\)

Here we compare posterior marginal distributions of the hyper parameters, \(\rho\) and \(\sigma^2\) from four different meshes using simulated data. The four meshes include three regular meshes with maximum edge length larger, approximately equal and smaller than the correlation length and an irregular mesh with maximum edge length approximately equal to the correlation length.

2.1.1 Compare the GMRF approximation and the Matern samples.

First we compare the simulated data from a Matern covarance matrix and a GMRF approximation. The Matern covariance is computed by using functions from two different R packages, fields::Matern and geoR::matern. Not that the parameterization in the matern function is different from the that in INLA!!

With the same parameter values, the GMRF approximation is slightly different from the one given by Matern covariance.

## Simulate the true process on a fine grid
set.seed(12)
loc_grid <- as.matrix(expand.grid(seq(0, 1, 0.05), seq(0, 1, 0.05)))
mesh <- inla.mesh.2d(loc = loc_grid, offset = c(0.1, 0.4), max.edge = c(0.072, 0.072) )
locdist <- as.matrix(dist(mesh$loc))
data0 <- rnorm(nrow(locdist))

## Matern covariance # to be compared with inla: smoothness(nu) = alpha - d/2, d = 2 here and default in inla alpha = 2.
Matern_cov1 <- Matern(d =locdist, range = 0.033, smoothness = 1, phi = 1) 
Matern_cov2 <- matern(u =locdist, phi = 0.033, kappa = 1) 
## GMRF precision Matrix
lkappa0 <- log(8)/2 - log(0.1)
ltau0 <- 0.5*log(1/(4*pi)) - log(1) - lkappa0
spde <- inla.spde2.matern(mesh, B.tau = matrix(c(ltau0, -1, 1),1,3),
                            B.kappa = matrix(c(lkappa0, 0, -1), 1,3), 
                            theta.prior.mean = c(0,0), theta.prior.prec = c(0.1, 0.1))
Q <- inla.spde.precision(spde, theta = c(0,0))
## Matern and GMRF samples
Matern_chol1<- chol(Matern_cov1) 
Matern_chol2<- chol(Matern_cov2)
Matern_data1 <- Matern_chol1 %*% data0
Matern_data2 <- Matern_chol2 %*% data0
Q_chol <- chol(Q)
GMRF_data <- as.vector(solve(t(Q_chol), data0))
## assemble for plotting
proj <- inla.mesh.projector(mesh, dims = c(100,100))
xy <- expand.grid(proj$x, proj$y)
Mdata1 <- as.vector(inla.mesh.project(proj, Matern_data1))
Mdata2 <- as.vector(inla.mesh.project(proj, Matern_data2))
Gdata <-  as.vector(inla.mesh.project(proj, GMRF_data))
rndata <-  as.vector(inla.mesh.project(proj, data0))
simdata <- data.frame(x =xy[,1] , y = xy[,2], rndaa = rndata, Mdata1 = Mdata1, Mdata2 = Mdata2, Gdata = Gdata)

This is because:

In the Matern functions, \(\rho\) is the length scale parameter and it need to be adjusted so that the practical range is the distance at which the correlation drops to about 0.05; that is for \(\nu = 1\), \(\mbox{practical range} = 0.1\), \(\rho\) should be adjusted to 0.025.
The range parameter in INLA GMRF is the distance at which the correlation drops to about 0.13 (NOT 0.05 as usually defined in the Matern!)
The GMRF approximation cuts all long-range correlations.
We adjust \(rho\) in Matern so that correlation also drops to about 0.1 at the practical range; that is for \(\nu = 1\), \(\mbox{practical range} = 0.1\), we use \(\rho = 0.033\).
Using setting in 4, the GMRF approximation and Matern are almost the same.

randnorm_p <- ggplot(simdata) + coord_fixed() + scale_x_continuous(limits=c(0,1),  expand = c(0.02, 0.02)) + 
    scale_y_continuous(limits=c(0,1),  expand = c(0.02, 0.02)) + 
  geom_raster(aes(x = x, y = y, fill = rndata)) + 
  scale_fill_gradientn(colours = terrain.colors(12), limit = c(-4,4)) +
  ggtitle("normal random") 
Matern_p1 <- ggplot(simdata) + coord_fixed() + scale_x_continuous(limits=c(0,1),  expand = c(0.02, 0.02)) + 
    scale_y_continuous(limits=c(0,1),  expand = c(0.02, 0.02)) + 
  geom_raster(aes(x = x, y = y, fill = Mdata1)) + 
  scale_fill_gradientn(colours = terrain.colors(12), limit = c(-4,4)) +
  ggtitle("fields:Matern") 
Matern_p2 <- ggplot(simdata) + coord_fixed() + scale_x_continuous(limits=c(0,1),  expand = c(0.02, 0.02)) + 
    scale_y_continuous(limits=c(0,1),  expand = c(0.02, 0.02)) + 
  geom_raster(aes(x = x, y = y, fill = Mdata2)) + 
  scale_fill_gradientn(colours = terrain.colors(12), limit = c(-4,4)) +
  ggtitle("geoR::matern") 
GMRF_p <- ggplot(simdata) + coord_fixed() + scale_x_continuous(limits=c(0,1),  expand = c(0.02, 0.02)) + 
    scale_y_continuous(limits=c(0,1),  expand = c(0.02, 0.02)) + 
  geom_raster(aes(x = x, y = y, fill = Gdata)) + 
  scale_fill_gradientn(colours = terrain.colors(12), limit = c(-4,4)) +
  ggtitle("GMRF") 

grid.arrange(randnorm_p, GMRF_p, Matern_p1, Matern_p2, ncol = 2 )

2.1.2 Inference test

Since there is slight difference between the parameter definiation of the GMRF representation and the Matern covarance. We simulate data directly using the GMRF representation on a fine grid. In the following experiment, we assume the true process has GMRF parameters \(\rho = 0.1, \sigma^2 = 1\) and we again study the region \([0,1] \times [0,1] \subset R^2\).

set.seed(21)
loc_grid <- as.matrix(expand.grid(seq(0, 1, 0.02), seq(0, 1, 0.02)))
mesh <- inla.mesh.2d(loc = loc_grid, offset = c(0.1, 0.4), max.edge = c(0.03, 0.03) )
lkappa0 <- log(8)/2 - log(0.1)
ltau0 <- 0.5*log(1/(4*pi)) - log(1) - lkappa0
spde <- inla.spde2.matern(mesh, B.tau = matrix(c(ltau0, -1, 1),1,3),
                            B.kappa = matrix(c(lkappa0, 0, -1), 1,3), 
                            theta.prior.mean = c(0,0), theta.prior.prec = c(0.1, 0.1))
Q <- inla.spde.precision(spde, theta = c(0,0))
Qsamp <- inla.qsample(n=1, Q)
proj <- inla.mesh.projector(mesh, dims = c(200,200))
xy <- expand.grid(proj$x, proj$y)
Gdata <-  as.vector(inla.mesh.project(proj, Qsamp))
simdata <- data.frame(x =xy[,1] , y = xy[,2], Gdata = Gdata)

From this grid we sample 1000 data points as our observations.

keep <- which(xy[,1] >=0.25 & xy[,1] <=0.75 & xy[,2] >=0.25 & xy[,2] <=0.75)
obs_ind <- sample(keep, 1000)
errs <- rnorm(1000)*0.1
obs_data <- simdata[obs_ind,]
obs_data$obs <- obs_data$Gdata + errs

GMRF_p <- ggplot(simdata) + coord_fixed() + scale_x_continuous(limits=c(0,1),  expand = c(0.02, 0.02)) + 
    scale_y_continuous(limits=c(0,1),  expand = c(0.02, 0.02)) + 
  geom_raster(aes(x = x, y = y, fill = Gdata)) + 
  scale_fill_gradientn(colours = terrain.colors(12), limit = c(-4,4)) +
  ggtitle("GMRF") 
sample_data <- GMRF_p + geom_point(data=obs_data, aes(x, y), pch = 20, size = abs(errs) *10) + ggtitle("Sampled observations") 
print(sample_data)

Then we generate four different meshes: 3 regular meshes with increasing resolution and 1 irregular grid with resolution similar to the range.

## generate 4 different meshes
## Regular
mesh1 <- inla.mesh.2d(loc = as.matrix(expand.grid(seq(0, 1, 0.2), seq(0, 1, 0.2))), 
                      offset = c(0.1, 0.4), max.edge = c(0.3, 0.3))
mesh2 <- inla.mesh.2d(loc = as.matrix(expand.grid(seq(0, 1, 0.07), seq(0, 1, 0.07))), 
                      offset = c(0.1, 0.4), max.edge = c(0.1, 0.1))
mesh3 <- inla.mesh.2d(loc = as.matrix(expand.grid(seq(0, 1, 0.02), seq(0, 1, 0.02))), 
                      offset = c(0.1, 0.4), max.edge = c(0.03, 0.03))
## Irregular
data_loc <- matrix(runif(2000), ncol = 2)
mesh4 <- inla.mesh.2d(loc = data_loc, cutoff = 0.05, offset = c(0.1, 0.4), max.edge = c(0.1, 0.1))
mesh5 <- inla.mesh.2d(loc = data_loc, cutoff = 0.02, offset = c(0.1, 0.4), max.edge = c(0.03, 0.03))
mesh_names <- c("r - 0.3", "r - 0.1", "r - 0.03", "ir - 0.1", "ir - 0.03")

Now with the data and four meshes, we use INLA to do the parameter inference and prediction.

## A wrapper to do the INLA inference and prediction
INLA_infer <- function(mesh, data, S2 = FALSE){
  ## set prior values for the hyper parameters, delibrately set to be different from the truth.
  sigma0 <- 0.5
  range0 <- 0.5
  lkappa0 <- log(8)/2 - log(range0)
  ltau0 <- 0.5*log(1/(4*pi)) - log(sigma0) - lkappa0

  spde <- inla.spde2.matern(mesh, B.tau = matrix(c(ltau0, -1, 1),1,3), B.kappa = matrix(c(lkappa0, 0, -1), 1,3), 
                          theta.prior.mean = c(0,0), theta.prior.prec = c(0.01, 0.01))
  
  data_loc <- as.matrix(data[,1:3])
  obs <- data$obs
  
  
  Ay <- inla.spde.make.A(mesh = mesh, loc = data_loc)
  st.est <- inla.stack(data = list(y=obs), A = list(Ay),
                     effects = list(process = 1:spde$n.spde), tag = "est")
  
  hyper <- list(prec = list(fixed = TRUE, initial = log(1)))
  formula = y ~ -1 + f(process, model = spde)
  prec_scale <- 1/(0.1^2)
  
  res_inla <- inla(formula, data = inla.stack.data(st.est, spde = spde), family = "gaussian",
                  scale = prec_scale, control.family = list(hyper = hyper),
                   control.predictor=list(A=inla.stack.A(st.est), compute =TRUE))
  ## the parameters
  res_pars <- inla.spde2.result(res_inla, "process", spde, do.transf=TRUE)
  
  ## the predictions
  if(S2){
    proj <- inla.mesh.projector(mesh, projection = "longlat", dims = c(360,180), xlim = c(0,360))
  }else{
    proj <- inla.mesh.projector(mesh,dims = c(500,500))
  }
  mean_post <- res_inla$summary.random$process$mean
  sd_post <- res_inla$summary.random$process$sd
  
  return(list(pars = res_pars, proj = proj, mean = mean_post, sd = sd_post))
}

res1 <- INLA_infer(mesh1, obs_data)
res2 <- INLA_infer(mesh2, obs_data)
res3 <- INLA_infer(mesh3, obs_data)
res4 <- INLA_infer(mesh4, obs_data)
res5 <- INLA_infer(mesh5, obs_data)

Compare the posteriors of the parameters from the four meshes.The inference gets accurate as the mesh resolution gets higher. To get reliable estimates, the maximum edge length of the mesh should be no larger than the correlation length of the process.

pars1 <- res1$pars
pars2 <- res2$pars
pars3 <- res3$pars
pars4 <- res4$pars
pars5 <- res5$pars
## Plot the range 
 plot(pars1$marginals.range.nominal[[1]], type = "l",
      main = "range", xlim = c(0,0.35), ylim = c(0, 50))
 lines(pars2$marginals.range.nominal[[1]], col = 2)
 lines(pars3$marginals.range.nominal[[1]], col = 4)
 lines(pars4$marginals.range.nominal[[1]], col = 2, lty = 2)
 lines(pars5$marginals.range.nominal[[1]], col = 4, lty = 2)
 legend("topright", legend = mesh_names, lwd = rep(1,5), 
        lty = c(1,1,1,2,2), col = c(1, 2, 4, 2, 4))

## Plot the marginal variance
plot(pars1$marginals.variance.nominal[[1]], type = "l",
     main = "marginal variance", xlim = c(0, 4), ylim = c(0,4.5))
lines(pars2$marginals.variance.nominal[[1]], col = 2)
lines(pars3$marginals.variance.nominal[[1]], col = 4)
lines(pars4$marginals.variance.nominal[[1]], col = 2, lty = 2)
lines(pars5$marginals.variance.nominal[[1]], col = 4, lty = 2)
legend("topright", legend = mesh_names, lwd = rep(1,5), 
       lty = c(1,1,1,2,2), col = c(1, 2, 4, 2, 4))

Finally we compare the predicted mean field. Higher resolution mesh also show better performance.

plot_mean <- function(res, title, S2 = FALSE, bound=TRUE){
  if(bound){
    lims = c(0,1)
  }else{
    lims = NULL
  }
  xygrid <- expand.grid(res$proj$x, res$proj$y)
  z <- as.vector(inla.mesh.project(res$proj, as.vector(res$mean)))
  preddata <- data.frame(x=xygrid[,1], y = xygrid[,2], z = z)
  if(S2){
    ggplot(preddata) + coord_fixed() + scale_x_continuous(limits=c(0,360),  expand = c(0.02, 0.02)) + 
    scale_y_continuous(limits=c(-90,90),  expand = c(0.02, 0.02)) +
    geom_raster(aes(x = x, y = y, fill = z)) + 
    scale_fill_gradientn(colours = terrain.colors(12)) +
    ggtitle(title)
  }else{
  ggplot(preddata) + coord_fixed() + scale_x_continuous(limits=lims,  expand = c(0.02, 0.02)) + 
    scale_y_continuous(limits=lims,  expand = c(0.02, 0.02)) +
    geom_raster(aes(x = x, y = y, fill = z)) + 
    scale_fill_gradientn(colours = terrain.colors(12), limit = c(-5,5)) +
    ggtitle(title)
  }
}


m1 <- plot_mean(res1, title = mesh_names[1])
m2 <- plot_mean(res2, title = mesh_names[2])
m3 <- plot_mean(res3, title = mesh_names[3])
m4 <- plot_mean(res4, title = mesh_names[4])
m5 <- plot_mean(res5, title = mesh_names[5])

grid.arrange(GMRF_p, m1, m2, m3, m4, m5, ncol = 2)

And also compare the predicted uncertainties. The uncertainty get dominated by the measurement error of the observations in the middle.

plot_sd <- function(res, title, S2=FALSE, bound = TRUE){
  if(bound){
    lims = c(0,1)
  }else{
    lims = NULL
  }
  xygrid <- expand.grid(res$proj$x, res$proj$y)
  sd <- as.vector(inla.mesh.project(res$proj, as.vector(res$sd)))
  preddata <- data.frame(x=xygrid[,1], y = xygrid[,2], sd = sd)
  if(S2){
    ggplot(preddata) + coord_fixed() + scale_x_continuous(limits=c(0,360),  expand = c(0.02, 0.02)) + 
    scale_y_continuous(limits=c(-90, 90),  expand = c(0.02, 0.02)) +
    geom_raster(aes(x = x, y = y, fill = sd)) + 
    scale_fill_gradientn(colours = terrain.colors(12)) +
    ggtitle(title)
  }else{
    ggplot(preddata) + coord_fixed() + scale_x_continuous(limits=lims,  expand = c(0.02, 0.02)) + 
    scale_y_continuous(limits=lims,  expand = c(0.02, 0.02)) +
    geom_raster(aes(x = x, y = y, fill = sd)) + 
    scale_fill_gradientn(colours = terrain.colors(12), limits = c(0.001,2)) +
    ggtitle(title)
  }
}

sd1 <- plot_sd(res1, title = mesh_names[1])
sd2 <- plot_sd(res2, title = mesh_names[2])
sd3 <- plot_sd(res3, title = mesh_names[3])
sd4 <- plot_sd(res4, title = mesh_names[4])
sd5 <- plot_sd(res5, title = mesh_names[5])
print(sd1)

grid.arrange(sd2, sd4, ncol = 2)

grid.arrange(sd3, sd5, ncol = 2)

2.1.3 Test boundary effects

We do the same test as above but use uniform random samples over the entire region rather than only in the middle. We compare the result with the previous test by using the regular mesh with highest resolution.

keep <- which(xy[,1] >=0 & xy[,1] <=1 & xy[,2] >=0 & xy[,2] <=1)
obs_ind2 <- sample(keep, 1000)
errs2 <- rnorm(1000)*0.1
obs_data2 <- simdata[obs_ind2,]
obs_data2$obs <- obs_data2$Gdata + errs2

GMRF_p <- ggplot(simdata) + coord_fixed() + scale_x_continuous(limits=c(0,1),  expand = c(0.02, 0.02)) + 
    scale_y_continuous(limits=c(0,1),  expand = c(0.02, 0.02)) + 
  geom_raster(aes(x = x, y = y, fill = Gdata)) + 
  scale_fill_gradientn(colours = terrain.colors(12), limit = c(-4,4)) +
  ggtitle("GMRF") 
sample_data2 <- GMRF_p + geom_point(data=obs_data2, aes(x, y), pch = 20, size = abs(errs2) *10) + ggtitle("Sampled observations") 
print(sample_data2)

Do the inference and plot the posteriors of the parameters.

res3.2 <- INLA_infer(mesh3, obs_data2)
pars3.2 <- res3.2$pars
par(mfrow = c(1,2))
 plot(pars3$marginals.range.nominal[[1]], type = "l",
      main = "range", xlim = c(0,0.25), ylim = c(0, 70))
 lines(pars3.2$marginals.range.nominal[[1]], col = 2)
 legend("topright", legend = c("within boundary", "all over"), lwd = rep(1,2), 
        lty = c(1,2), col = c(1, 2))

## Plot the marginal variance
plot(pars3$marginals.variance.nominal[[1]], type = "l",
     main = "marginal variance", xlim = c(0, 3), ylim = c(0,4.5))
lines(pars3.2$marginals.variance.nominal[[1]], col = 2)
legend("topright", legend = c("within boundary", "all over"), lwd = rep(1,2), 
       lty = c(1,2), col = c(1, 2))

Plot the prediction and uncertainties.

m3.2 <- plot_mean(res3.2, title = mesh_names[3])
sd3.2 <- plot_sd(res3.2, title = mesh_names[3])
grid.arrange(m3, m3.2, sd3, sd3.2, ncol = 2)

The results are the almost identical. By using points all over the study region, the posteriors get mor concentrated. In fact the boundary effects are taken care of when generating the meshes. See the plots of the entire mesh region below.

m3b <- plot_mean(res3, title = mesh_names[3], bound = FALSE)
sd3b <- plot_sd(res3, title = mesh_names[3], bound = FALSE)
m3.2b <- plot_mean(res3.2, title = mesh_names[3], bound = FALSE)
sd3.2b <- plot_sd(res3.2, title = mesh_names[3], bound = FALSE)
grid.arrange(m3b, m3.2b, sd3b, sd3.2b, ncol = 2)

2.2 On \(S^2\)

Finally, we do a similar test for mesh on the sphere. This time for various reasons (more details about stationary GP on the sphere and distances), we simulate the data directly using the SPDE approximation. We assume the correlation length is about 400km and this is approximately 0.06 (\(400/6371\)) for a unit ball representation. Hence the truth value is \(\rho = 0.06\) and \(\sigma^2 = 1\). As shown in the above experiments, the best approximation of such a process would be on a regular mesh with triangle edge length close to \(0.06\), so we generate a semi-regular mesh with \(nseg = 17\).

sigma_t <- 1
rho_t <- 0.06
lkappa_t <- log(8)/2 - log(rho_t)
ltau_t <- 0.5*log(1/(4*pi)) - log(sigma_t) - lkappa_t
mesh_s <- inla.mesh.create(globe = 17)
spde_s <- inla.spde2.matern(mesh_s, B.tau = matrix(c(ltau_t, -1, 1),1,3),
                          B.kappa = matrix(c(lkappa_t, 0, -1), 1,3), 
                          theta.prior.mean = c(0,0), theta.prior.prec = c(0.1, 0.1))
Q_s <- inla.spde.precision(spde_s, theta = c(0,0))
data_s <- inla.qsample(n=1,Q_s)
proj_s <- inla.mesh.projector(mesh_s, dims = c(360,181), xlim = c(0, 360))
xy_s <- expand.grid(proj_s$x, proj_s$y)
data_sproj <- as.vector(inla.mesh.project(proj_s, as.vector(data_s)))
simdataS2 <- data.frame(lon = xy_s[,1], lat = xy_s[,2], samp = data_sproj)

## 1000 uniform sample all over the vertices 
data_ind1 <- sample(1: nrow(Q_s), 1000)
samp1 <- data_s[data_ind1]
samp1_loc <- mesh_s$loc[data_ind1,]
samp1_coords <- Lxyz2ll(list(x=samp1_loc[,1], y = samp1_loc[,2], z = samp1_loc[,3]))
samp1_coords$lon <- ifelse(samp1_coords$lon < 0, samp1_coords$lon + 360, samp1_coords$lon)
data_samp1 <- data.frame(x = samp1_loc[,1], y = samp1_loc[, 2], z = samp1_loc[,3],
                         lon=samp1_coords$lon, lat = samp1_coords$lat, samp=samp1)
## 200 samples from a region
keep_s <- which(xy_s[,1] > 20 & xy_s[,1] < 60 & xy_s[,2] > 40 & xy_s[,2] < 60)
data_ind2 <- sample(keep_s, 200)
samp2 <- data_sproj[data_ind2]
samp2_coords <- xy_s[data_ind2,]
loc2 <- do.call(cbind, Lll2xyz(lat = samp2_coords[,2], lon = samp2_coords[,1]))
data_samp2 <- data.frame(x = loc2[, 1], y = loc2[, 2], z = loc2[,3], 
                         lon = samp2_coords[,1], lat = samp2_coords[,2], samp=samp2)

## Assemble the observations
errs_s <- rnorm(1200)*0.1
obs_dataS2 <- rbind(data_samp1, data_samp2)
obs_dataS2$obs <- obs_dataS2$samp + errs_s

sampS2_p <- ggplot(data=simdataS2) + geom_raster(aes(x = lon, y = lat, fill = samp)) + 
  coord_fixed() + xlab("Longitude") + ylab("Latitude") + 
  scale_x_continuous(limits=c(0,360),  expand = c(0, 0)) + scale_y_continuous(limits=c(-90,90),  expand = c(0, 0)) + 
  scale_fill_gradientn(colors = terrain.colors(12), guide = guide_colorbar(barwidth = 2, barheight = 10, label.position = "right", title.position = "bottom")) 

sampS2obs_p <- sampS2_p + geom_point(data=obs_dataS2, aes(lon, lat), pch = 20, size = abs(errs_s) *10) + ggtitle("Sampled observations")
print(sampS2obs_p)

Then do the inference using different a semi-regular mesh and a irregular mesh

## semi-regular
mesh1_s <- mesh_s
mesh2_s <- inla.mesh.create(globe = 33)

## irregular mesh
coords <- as.matrix(expand.grid(seq(-86, 86, 2),seq(-176, 176, 2)))
coords <- coords + matrix(rnorm(nrow(coords)*2)*0.3, ncol =2)
s2_loc <- inla.mesh.map(loc = coords, projection=("longlat"))
mesh3_s <- inla.mesh.2d(loc = s2_loc, cutoff = 0.06, max.edge = 0.1)
mesh4_s <- inla.mesh.2d(loc = s2_loc, cutoff = 0.03, max.edge = 0.05)

mesh_names2 <- c("r - 0.06 - S2", "r - 0.03 - S2", "ir - 0.06 - S2", "ir - 0.03 - S2")
res1s <- INLA_infer(mesh1_s, obs_dataS2, S2 = TRUE)
res2s <- INLA_infer(mesh2_s, obs_dataS2, S2 = TRUE)
res3s <- INLA_infer(mesh3_s, obs_dataS2, S2 = TRUE)
res4s <- INLA_infer(mesh4_s, obs_dataS2, S2 = TRUE)

Plot the posteriors of the parameters. Regular meshes are better than the irregular at the same resolution and high resolution meshes have more concentrated posteriors.

pars1s <- res1s$pars 
pars2s <- res2s$pars 
pars3s <- res3s$pars
pars4s <- res4s$pars

## Plot the range 
par(mfrow = c(1,2))
 plot(pars1s$marginals.range.nominal[[1]], type = "l",
      main = "range", xlim= c(0,0.15), ylim = c(0,200))
 lines(pars2s$marginals.range.nominal[[1]], col = 2)
 lines(pars3s$marginals.range.nominal[[1]], col = 1, lty = 2, lwd = 3)
 lines(pars4s$marginals.range.nominal[[1]], col = 2, lty = 2)
 legend("topright", legend = mesh_names2, 
        lty = c(1,1,2,2), col = c(1, 2, 1, 2))

## Plot the marginal variance
plot(pars1s$marginals.variance.nominal[[1]], type = "l",
     main = "marginal variance", xlim = c(0,3), ylim = c(0,10))
lines(pars2s$marginals.variance.nominal[[1]], col = 2)
lines(pars3s$marginals.variance.nominal[[1]], col = 1, lty = 2)
lines(pars4s$marginals.variance.nominal[[1]], col = 2, lty = 2)
legend("topright", legend = mesh_names2, 
       lty = c(1,1,2,2), col = c(1, 2, 1, 2))

Plot the prediction and uncertainities. The regular higher resolution mesh are better. Uncertainties get much lower at regions with more observations.

m1 <- plot_mean(res1s, title = mesh_names2[1], S2=TRUE)
m2 <- plot_mean(res2s, title = mesh_names2[2], S2=TRUE)
m3 <- plot_mean(res3s, title = mesh_names2[3], S2=TRUE)
m4 <- plot_mean(res4s, title = mesh_names2[4], S2=TRUE)
grid.arrange(m1, m2, m3, m4, ncol = 2)

sd1 <- plot_sd(res1s, title = mesh_names2[1], S2=TRUE)
sd2 <- plot_sd(res2s, title = mesh_names2[2], S2=TRUE)
sd3 <- plot_sd(res3s, title = mesh_names2[3], S2=TRUE)
sd4 <- plot_sd(res4s, title = mesh_names2[4], S2=TRUE)
grid.arrange(sd1, sd2, sd3, sd4, ncol = 2)

2.3 Test stabilities

As an illustration, we use 10 replications of the above experiment to show the stabilities of estimation results from different meshes. The code chunk below uses parallel computing for the replications and would be better to run on a cluster with more than 24 cores since the INLA precedure is also parallelled. The number of CPUs used by a single INLA procedure is not fixed and the dense mesh takes longer time and more cores.

library(INLA)
if(Sys.info()['sysname'] == "Linux"){INLA:::inla.dynload.workaround()}
library(parallel)

loc_grid <- as.matrix(expand.grid(seq(0, 1, 0.02), seq(0, 1, 0.02)))
mesh <- inla.mesh.2d(loc = loc_grid, offset = c(0.1, 0.4), max.edge = c(0.03, 0.03) )
lkappa0 <- log(8)/2 - log(0.1)
ltau0 <- 0.5*log(1/(4*pi)) - log(1) - lkappa0
spde <- inla.spde2.matern(mesh, B.tau = matrix(c(ltau0, -1, 1),1,3),
                            B.kappa = matrix(c(lkappa0, 0, -1), 1,3), 
                            theta.prior.mean = c(0,0), theta.prior.prec = c(0.1, 0.1))
Q <- inla.spde.precision(spde, theta = c(0,0))
proj <- inla.mesh.projector(mesh, dims = c(200,200))
xy <- expand.grid(proj$x, proj$y)
keep <- which(xy[,1] >=0.25 & xy[,1] <=0.75 & xy[,2] >=0.25 & xy[,2] <=0.75)

INLA_infer <- function(mesh, data, S2 = FALSE){
  ## set prior values for the hyper parameters, delibrately set to be different from the truth.
  sigma0 <- 0.5
  range0 <- 0.5
  lkappa0 <- log(8)/2 - log(range0)
  ltau0 <- 0.5*log(1/(4*pi)) - log(sigma0) - lkappa0

  spde <- inla.spde2.matern(mesh, B.tau = matrix(c(ltau0, -1, 1),1,3), B.kappa = matrix(c(lkappa0, 0, -1), 1,3), 
                          theta.prior.mean = c(0,0), theta.prior.prec = c(0.01, 0.01))
  
  data_loc <- as.matrix(data[,1:3])
  obs <- data$obs
  
  
  Ay <- inla.spde.make.A(mesh = mesh, loc = data_loc)
  st.est <- inla.stack(data = list(y=obs), A = list(Ay),
                     effects = list(process = 1:spde$n.spde), tag = "est")
  
  hyper <- list(prec = list(fixed = TRUE, initial = log(1)))
  formula = y ~ -1 + f(process, model = spde)
  prec_scale <- 1/(0.1^2)
  
  res_inla <- inla(formula, data = inla.stack.data(st.est, spde = spde), family = "gaussian",
                  scale = prec_scale, control.family = list(hyper = hyper),
                   control.predictor=list(A=inla.stack.A(st.est), compute =TRUE))
  ## the parameters
  res_pars <- inla.spde2.result(res_inla, "process", spde, do.transf=TRUE)
  
  ## the predictions
  if(S2){
    proj <- inla.mesh.projector(mesh, projection = "longlat", dims = c(360,180), xlim = c(0,360))
  }else{
    proj <- inla.mesh.projector(mesh,dims = c(500,500))
  }
  mean_post <- res_inla$summary.random$process$mean
  sd_post <- res_inla$summary.random$process$sd
  
  return(list(pars = res_pars, proj = proj, mean = mean_post, sd = sd_post))
}


repFun <- function(mesh){
Qsamp <- inla.qsample(n=1, Q)
Gdata <-  as.vector(inla.mesh.project(proj, Qsamp))
simdata <- data.frame(x =xy[,1] , y = xy[,2], Gdata = Gdata)
obs_ind <- sample(keep, 1000)
errs <- rnorm(1000)*0.1
obs_data <- simdata[obs_ind,]
obs_data$obs <- obs_data$Gdata + errs
res <- INLA_infer(mesh, obs_data)
pars <- res$pars
return(pars)
}

## generate 4 different meshes
## Regular
mesh1 <- inla.mesh.2d(loc = as.matrix(expand.grid(seq(0, 1, 0.2), seq(0, 1, 0.2))), 
                      offset = c(0.1, 0.4), max.edge = c(0.3, 0.3))
mesh2 <- inla.mesh.2d(loc = as.matrix(expand.grid(seq(0, 1, 0.07), seq(0, 1, 0.07))), 
                      offset = c(0.1, 0.4), max.edge = c(0.1, 0.1))
mesh3 <- inla.mesh.2d(loc = as.matrix(expand.grid(seq(0, 1, 0.02), seq(0, 1, 0.02))), 
                      offset = c(0.1, 0.4), max.edge = c(0.03, 0.03))
## Irregular
data_loc <- matrix(runif(2000), ncol =2)
mesh4 <- inla.mesh.2d(loc = data_loc, cutoff = 0.05, offset = c(0.1, 0.4), max.edge = c(0.1, 0.1))
mesh5 <- inla.mesh.2d(loc = data_loc, cutoff = 0.02, offset = c(0.1, 0.4), max.edge = c(0.03, 0.03))
mesh_names <- c("r - 0.3", "r - 0.1", "r - 0.03", "ir - 0.1", "ir - 0.03")

set.seed(51)
reg2_pars <- mclapply(1:10, function(x) return(repFun(mesh=mesh2)), mc.cores = 10)
reg3_pars <- mclapply(1:10, function(x) return(repFun(mesh=mesh3)), mc.cores = 5)
reg4_pars <- mclapply(1:10, function(x) return(repFun(mesh=mesh4)), mc.cores = 10)
reg5_pars <- mclapply(1:10, function(x) return(repFun(mesh=mesh5)), mc.cores = 5)

Now we plot the results for the two hyper-parameters.

par(mfrow = c(2,2))
plot(reg2_pars[[1]]$marginals.range.nominal[[1]], type = "l", col = "grey",
      main = paste(mesh_names[2], " -- range"), xlim = c(0,0.3), ylim = c(0, 50))

for(i in 2:10){
 lines(reg2_pars[[i]]$marginals.range.nominal[[1]], col = "grey")
}

plot(reg3_pars[[1]]$marginals.range.nominal[[1]], type = "l", col = "grey",
      main = paste(mesh_names[3], " -- range"), xlim = c(0,0.3), ylim = c(0, 50))
for(i in 2:10){
 lines(reg3_pars[[i]]$marginals.range.nominal[[1]], col = "grey")
}

plot(reg4_pars[[1]]$marginals.range.nominal[[1]], type = "l", col = "grey",
      main = paste(mesh_names[4], " -- range"), xlim = c(0,0.3), ylim = c(0, 50))
for(i in 2:10){
 lines(reg4_pars[[i]]$marginals.range.nominal[[1]], col = "grey")
}

plot(reg5_pars[[1]]$marginals.range.nominal[[1]], type = "l", col = "grey",
      main = paste(mesh_names[5], " -- range"), xlim = c(0,0.3), ylim = c(0, 50))
for(i in 2:10){
 lines(reg5_pars[[i]]$marginals.range.nominal[[1]], col = "grey")
}

par(mfrow = c(2,2))
plot(reg2_pars[[1]]$marginals.variance.nominal[[1]], type = "l", col = "grey", 
      main = paste(mesh_names[2], " -- variance"), xlim = c(0,5), ylim = c(0, 5))
for(i in 2:10){
 lines(reg2_pars[[i]]$marginals.variance.nominal[[1]], col = "grey")
}

plot(reg3_pars[[1]]$marginals.variance.nominal[[1]], type = "l", col = "grey", 
      main = paste(mesh_names[3], " -- variance"), xlim = c(0,5), ylim = c(0, 5))
for(i in 2:10){
 lines(reg3_pars[[i]]$marginals.variance.nominal[[1]], col = "grey")
}

plot(reg4_pars[[1]]$marginals.variance.nominal[[1]], type = "l", col = "grey", 
      main = paste(mesh_names[4], " -- variance"), xlim = c(0,5), ylim = c(0, 5))
for(i in 2:10){
 lines(reg4_pars[[i]]$marginals.variance.nominal[[1]], col = "grey")
}

plot(reg5_pars[[1]]$marginals.variance.nominal[[1]], type = "l", col = "grey", 
     main = paste(mesh_names[5], " -- variance"), xlim = c(0,5), ylim = c(0, 5))
for(i in 2:10){
 lines(reg5_pars[[i]]$marginals.variance.nominal[[1]], col = "grey")
}

INLA Mesh test

Zhe Sha (zhe.sha@bristol.ac.uk)

21 August 2017

1. Marginal variances for a stationary process

1.1 Regular grid on \(R^2\)

Remarks

1.2 Semi-regular grid on \(S^2\)

Remarks

2. Inference on the parameters

2.1 On \(R^2\)

2.1.1 Compare the GMRF approximation and the Matern samples.

2.1.2 Inference test

2.1.3 Test boundary effects

2.2 On \(S^2\)

2.3 Test stabilities