Prediction using a hierarchical model and displaying as a raster

Load some needed packages.

library(arm)
library(raster)

Imagine 1500 datapoints have been collected in 10 regions where there are two environmental gradients that vary across the some landscape. At each point the presence or absence of something has been measured.

region <- factor(rep(1:10, each=150))
envvar1 <- rnorm(1500)
envvar2 <- rnorm(1500)

presence <- rbinom(1500, 1, rep(rbeta(10, 1, 1), each=150))

train.data <- data.frame(presence=presence, region=region, envvar1=envvar1, envvar2=envvar2)

We can build a model (hierarchical logitistic regression) and make simulations based on the point estimates and standard errors to obtain an approximate joint posterior distribution.

model <- lmer(presence ~ envvar1 + envvar2 + (1 | region), data=train.data, family=binomial(link=logit))

sim.model <- sim(model)

The hypothetical landscape with ten regions and two environmental gradients can be represented by a data frame.

landscape <- data.frame(region=factor(rep(1:10, each=1000)), envvar1=rnorm(10000), envvar2=rnorm(10000))

The regions look like this.

plot(raster(matrix(as.numeric(landscape$region), ncol = 100, byrow = T)))

plot of chunk unnamed-chunk-5

The environmental gradient looks like this.

plot(raster(matrix(landscape$envvar1, ncol = 100, byrow = T)))

plot of chunk unnamed-chunk-6

The second environmental gradient looks like this.

plot(raster(matrix(landscape$envvar2, ncol = 100, byrow = T)))

plot of chunk unnamed-chunk-7


We can predict the probability of presence for each cell in the landscape in four steps.

predict.presence <- matrix(sim.model@fixef[, 1], nrow = 10000, ncol = 100, byrow = T)
predict.presence <- predict.presence + as.matrix(landscape[, 2:3]) %*% t(sim.model@fixef[, 2:3])
predict.presence <- predict.presence + t(sim.model@ranef$region[, , 1])[landscape[, 'region'], ]
predict.presence <- plogis(predict.presence)

We can use the posterior predictive distribution to calculate the mean and upper and lower 95% credible bounds of probability of presence across the landscape.

predict.mean <- rowMeans(predict.presence)
predict.975 <- apply(predict.presence, 1, function(x) quantile(x, 0.975))
predict.025 <- apply(predict.presence, 1, function(x) quantile(x, 0.025))

Mapping the mean and upper and lower bounds looks like this.

plot(raster(matrix(predict.mean, ncol = 100, byrow = T)))

plot of chunk unnamed-chunk-13

plot(raster(matrix(predict.975, ncol = 100, byrow = T)))

plot of chunk unnamed-chunk-13

plot(raster(matrix(predict.025, ncol = 100, byrow = T)))

plot of chunk unnamed-chunk-13