1 Introduction

In the Experiment 2, our target is to separate the sea surface height change due to mass and steric based on the budget equation \[\Delta SSH = \Delta Mass + \Delta Steric + VLM\] and then produce a one degree prediction map for the mass change and steric change.

The equation is only conceptual. Each components need to be converted in to the same unit: mm/yr; and the relationship is not linear additive when written in exact physcis formula. However, we have derived that the non-linear and interaction parts can be absorbed into the residuals for the purpose of this experiment and the physics equation for the pixel wise budget is simplified to \[\Delta h \approx \frac{1}{a\rho_0} \Delta m + \frac{m_1}{a} (\frac{1}{\rho_2} - \frac{1}{\rho_1}) + vlm \] where \(a\) is area of the pixel, \(\rho_0 = 1000kg/m^3\) is the density of water, \(\rho_2\) and \(\rho_1\) are the densities for the start and end point of the period of interest, and \(m_1\) is the mass of the pixel at the beginning. Thus, the first term on the right hand side (rhs) of the above equation corresponds to the SSH change due mass and we denote it by \(\Delta h_m\). The second term corresponds to change due to steric and denote it by \(\Delta h_s\). The last term is a constant value across the ocean, since the change of ocean bottom shape has effect on the entire sea level. Details can be found in the documentations for experiment 2 – both ZS’s and RB’s. Then we can write \[\Delta h = \Delta h_m + \Delta h_s + vlm\] For our experiment 2b, we assume \(\Delta h_m\) and \(\Delta h_s\) are two independent latent Gaussian spatial processes a priori. They can be approximated by GMRF through the SPDE approach. Then the sea surface height change \(\Delta h\) can be reconstruct from the the above equation.

For the observations, we have altimetry directly measuring the sea surface height and GRACE observing the mass change, therefore we can write \[ y_{alt} = \Delta h_m + \Delta h_s + vlm + error_{alt} \\ y_{grace-gia} = \Delta h_m + error_{grace}\]

The vlm will be calculated as GIA averaged over the ocean and the GIA values are given by the ICE6G-VM5 solution.

Thus the two latent processes can be separated from the above observation equations and the posterior of the two processes will be correlated due to the sum constraint. The posterior uncertainty of \(\Delta h\) will be no larger than the sum of the two processes, since they will be negative correlated. (Find out how to calculated the marginal variance of this linear combination in INLA!)

The GRACE data will be used to learn the mass change; but to better separate the mass and steric components, we need to impose some priors on the hyper-parameters for these two processes. For the mass change process, we assume the following prior: \(\rho \sim \log\mbox{Norm}(1600/6371, 1), \sigma^2 \sim \log\mbox{Norm}(40, 80^2)\). For the steric process, we assume \(\rho \sim \log\mbox{Norm}(2000/6371, 1), \sigma^2 \sim \log\mbox{Norm}(15, 30^2)\). (Need more discussion later for setting the prior.)

In the following we do the inference in INLA step by step.

2 Data

Load the pre-processed altimetry data, GRACE data, the ocean land mask and the ICE6G-VM5 data. The data are the linear year trend estimates over the period 2005-2015 with their corresponding estimated standard errors.

The Altimetry data used here is the CCI given by WL at 1 degree long-lat grid.

The GRACE is the GIA-adjusted mascons given by RB. More input will be need on this data set since

(1) There is no error estimates for RB's data. We resampled it from MS's super mascons. The errors are highly correlated and we need to put an artifical inflation factor in the model.
(2) GRACE is best at 2.5 degree resoltuion according to BDV so the 1 degree map is only interpolation. BDV should provide better GRACE solution later and justify the choice.

The ocean mask is downloaded from Nature Earth at 110m resolution. This will be used for calculating the Ocean areas separating the triangulations in the ocean and land.

The ICE5G-VM5 data are trend of vertical land motions processed by MS and they are at 1 degree resolution. The data is used to calculated the effect of vlm averaged over the ocean.

3 Set up priors

Now we do some exploratory analysis which will be helpful in setting up the prior for hyper-parameters \(\rho\) and \(\sigma^2\) for the mass and steric processes.

For the mass process, We learn the initial values from the variogram of the the GRACE data.

For the steric process, we do not have direct observations, but we can coarsely learn the parameters from the residuals of altimetry minus GRACE.

The following chunk plot the smoothed variograms. The variograms are calculated from 5000 random samples of the entire dataset and then replicated 5 times; using the entire dataset will take too much time for a standard office desktop or laptop.

From the above variogram analysis, we set the prior means for the ranges and variances of the mass and steric processes.

## Priors for mass
mu_r1 <- 1600/6371 # mean of rho
v_r1 <- 1 # vague variace for rho
mu_s1 <- 40 # mean of sigma
v_s1 <- 80^2 # vaue variance for sigma

## Priors for steric
mu_r2 <- 2000/6371 # mean of rho
v_r2 <- 1 # vague variace for rho
mu_s2 <- 15 # mean of sigma
v_s2 <- 30^2 # vaue variance for sigma

## Transform the parameters for a lognormal distribution
## -- mass
trho1 <- Tlognorm(mu_r1, v_r1)
tsigma1 <- Tlognorm(mu_s1, v_s1)

lsigma1 <- tsigma1[1]
theta1_ss <- tsigma1[2]
lrho1 <- trho1[1]
theta1_rs <- trho1[2]
lkappa1 <- log(8)/2 - lrho1
ltau1 <- 0.5*log(1/(4*pi)) - lsigma1 - lkappa1

## -- steric
trho2 <- Tlognorm(mu_r2, v_r2)
tsigma2 <- Tlognorm(mu_s2, v_s2)

lsigma2 <- tsigma2[1]
theta2_ss <- tsigma2[2]
lrho2 <- trho2[1]
theta2_rs <- trho2[2]
lkappa2 <- log(8)/2 - lrho2
ltau2 <- 0.5*log(1/(4*pi)) - lsigma2 - lkappa2

4 Generate triangulations for the SPDE approximation

To build the spde model used for approximating the process, we need to generate a triangular mesh for the process. In general, the triangles in the mesh should have similar size and shape and the size of the triangles should be no larger than the spatial correlation length.

In our exploratory analysis, the correlation length is quite long, therefore if the purpose is estimating the hyper-parameters, then we can choose to have a sparse mesh with resolution no large than the correlation length. Here we could use say a triangle size to have approximately 5 degree resolution so that the edge length of the triangles are smaller than 500km on average which is much smaller than 2000km.

However, Since our purpose is to produce a map of the process at high resolution, we must have a mesh with at least the desired prediction resolution. For a 1 degree resolution map, we will need about \(360 \times 180 = 64800\) triangles corresponding to number of vertices of \(v \approx (f+2)*2/5= 25921\). When using the Fibonacci grid points in long lat coordinates as the starting points for generating the mesh, the number of points is about half of the triangle vertices.

## Genereate Fibonacci points on the sphere
fibo_points <- fiboSphere(N = 3e4, LL = FALSE)
mesh0 <- inla.mesh.2d(loc = fibo_points, cutoff = 0.01, max.edge = 1)
summary(mesh0)
## 
## Manifold:    S2
## CRS: N/A
## Vertices:    30000
## Triangles:   59996
## Boundary segm.:  0
## Interior segm.:  0
## xlim:    -0.999934 0.999934
## ylim:    -0.9999457 0.9998971
## zlim:    -0.9999667 0.9999667

The above triangulation can be used for the mass process; however since the steric process is only defined on the ocean, we need to remove the triangles in the land regions.

mesh0 <- dt.mesh.addon.posTri(mesh = mesh0, globe = TRUE) 
Tlonlat <- Lxyz2ll(list(x = mesh0$posTri[,1], y = mesh0$posTri[,2], z = mesh0$posTri[,3]))
mesh0$Trill <- cbind(lon = Tlonlat$lon, lat =Tlonlat$lat)
TinOcean <- unlist(over(Ocean, SpatialPoints(coords=mesh0$Trill, proj4string = CRS(proj4string(Ocean))), returnList=T))
TAll <- 1:mesh0$t
ToutOcean <- TAll[-TinOcean]
Omega = dt.Omega(list(TinOcean, 1:mesh0$t), mesh0)

mesh_ocean <- mesh.sub(mesh0, Omega, 1)
summary(mesh_ocean)
## 
## Manifold:    S2
## CRS: N/A
## Vertices:    22615
## Triangles:   42695
## Boundary segm.:  0
## Interior segm.:  0
## xlim:    -0.999934 0.999934
## ylim:    -0.9999457 0.9998971
## zlim:    -0.9968333 0.9999667

5 Build SPDE approximation for the processes

Now we can build the SPDE approximations for the two processes.

mass_spde <- inla.spde2.matern(mesh0, B.tau = matrix(c(ltau1, -1, 1),1,3), B.kappa = matrix(c(lkappa1, 0, -1), 1,3),
                              theta.prior.mean = c(0,0), theta.prior.prec = c(sqrt(1/theta1_ss), sqrt(1/theta1_rs)))

steric_spde <- inla.spde2.matern(mesh_ocean, B.tau = matrix(c(ltau2, -1, 1),1,3), B.kappa = matrix(c(lkappa2, 0, -1), 1,3),
                              theta.prior.mean = c(0,0), theta.prior.prec = c(sqrt(1/theta2_ss), sqrt(1/theta2_rs)))

7 INLA inference

Now we can run INLA for the Bayesian inference. Do not run on a desktop the process may use up to 64GB memory at peak. We ran this on a server with enough memory.

## Fix altimetry errors as they are known
hyper <- list(prec = list(fixed = TRUE, initial = 0))

## The formular -- we add the constraint that mass change sum to zero
formula = y ~ -1 + f(mass, model = mass_spde, extraconstr = list(A = matrix(1, nrow = 1, ncol = mesh0$n), e = 0)) + 
  f(steric, model = steric_spde)

## Run INLA
res_inla <- inla(formula, data = inla.stack.data(stkall), family = "gaussian",
                 scale =prec_scale, control.family = list(hyper = hyper),
                 control.predictor=list(A=inla.stack.A(stkall), compute =TRUE))

Next we add a local constraint that the signal in South America should sum to zero.

## Create the A matrix for the constraint
## Find which vertices fall in to South America
continents <- readOGR(dsn = paste0(dd,"WP1-BHM/maps/Continents"), layer = "continent")
## OGR data source with driver: ESRI Shapefile 
## Source: "Z:/WP1-BHM/maps/Continents", layer: "continent"
## with 8 features
## It has 1 fields
SA <- continents[continents$CONTINENT == "South America",]
SAareas <- sapply(SA@polygons[[1]]@Polygons, function(x) x@area)
SA <- SA@polygons[[1]]@Polygons[SAareas > 10]
SA <- SpatialPolygons(list(Polygons(SA, ID = 1)))
proj4string(SA) <- CRS(proj4string(continents))
## mesh vertices in longlat
meshLL <- Lxyz2ll(list(x = mesh0$loc[,1], y = mesh0$loc[,2], z = mesh0$loc[,3]))
sp_mll <- SpatialPoints(coords = cbind(meshLL$lon, meshLL$lat), proj4string = CRS(proj4string(SA)))
TinSA <- unlist(over(SA, sp_mll, returnList=T))
A2 <- matrix(0, nrow = 2, ncol = mesh0$n)
A2[1, -TinSA] <- 1
A2[2, TinSA] <- 1
formula2 = y ~ -1 + f(mass, model = mass_spde, extraconstr = list(A = A2, e = c(0,0))) + 
  f(steric, model = steric_spde)

## Run INLA
res_inla <- inla(formula2, data = inla.stack.data(stkall), family = "gaussian",
                 scale =prec_scale, control.family = list(hyper = hyper),
                 control.predictor=list(A=inla.stack.A(stkall), compute =TRUE))

8 Results

8.1 Assemble and save results

Now assemble the inla inference and prediction results for the result from .

8.2 Plot the posteriors of the hyper parameters

## [1] "The estimated correlation length for mass is: 946.312882334319 km"
## [1] "The estimated marginal variance for mass is: 20.2517258400418 mm/yr"
## [1] "The estimated correlation length for steric is:  528.261033233394  km"
## [1] "The estimated marginal variance for steric is:  9.41046784639059  mm/yr"

8.3 Plot the predictions

8.4 Sanity checks on the updated mass

To check whether the updated mass is reasonable, we calculate the mass trend averaged over Earth sphere, ocean, and land.

## GRACE predicted trend mean over the corresponding regions: 
##  
##  Regions       | SA sum to zero    |  no local constr 
##  ---------------------------------------------------- 
##  Global        | -0.01087403       | -0.01090516 
##  Ocean         |  1.493796         | 1.559184 
##  Land          | -3.711426         | -3.872348 
##  South America | -0.01846513       | -3.890801

9 Save the output as ncdf files