Boral notes

These notes show how to create simple boral output and how to modify plots produced from boral objects. 1) First, I modify the output of the boral::lvsplot function.
2) Then I use ggbiplot to make biplots with boral objects.

Prelminaries

rm(list = ls())
library(boral)
## Loading required package: coda
## If you recently updated boral, please check news(package = "boral") for the updates in the latest version.
library(mvabund)

Set up data

species x sample matrix

boral seems to like data to be in vectors and matrices, not dataframes.

data(spider)
y <- spider$abun

dim(y)
## [1] 28 12

Setting up control options

For testing code its best to make boral arrive at a solution faster than normal. This can be done by setting options for the mcmc.control arguement within the boral() command. Here are setting that cut down on the time to fit the model, but the solution provided should not be used because the algorithim won’t have time to converge

mcmc.control. <- list(n.burnin = 10, 
                    n.iteration = 400, 
                    n.thin = 30, 
                    seed = 123)

Run boral

Run boral with two latent variables (num.lv = 2) so a biplot can be made. mcmc.control = is used to set the algorith to run faster. For presence/absence data family= would b set to “binomial”.

fit.lvmbinom <- boral(y = y, 
                  family = "negative.binomial", 
                  num.lv = 2, 
                  mcmc.control = mcmc.control.,
                  row.eff = "fixed")
## module glm loaded
## Compiling model graph
##    Resolving undeclared variables
##    Allocating nodes
## Graph information:
##    Observed stochastic nodes: 336
##    Unobserved stochastic nodes: 467
##    Total graph size: 3473
## 
## Initializing model
## Calculating Information criteria...

Deafult Plot boral output

The default biplot in boral is made with fit.lvmbinom(). The default is plot both the samples in ordination space (the rows of the data matrix, locations in many datasets) as black numbers. The species names (columns of data matrix) are plotted in red.

par(mfrow = c(1,1))
lvsplot(fit.lvmbinom,cex = 0.8)
## All latent variable coefficients included in biplot.

Removing species names

A by setting biplot = F only the sites/samples/rows of data are plotted, not the species names

lvsplot(fit.lvmbinom,cex = 0.8,
        col = c(1), biplot = F)

Using colors to code the ID of points

The spider data has habitat data assocaited with the data but no discrete categories. Let’s make some grouping variables using the continous habitat data for the sake of illustration

# The herb.layer enviornmental data
summary(spider$x[,"herb.layer"])
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.6931  3.0440  3.4340  3.2550  4.4680  4.6150
herb.dat <- spider$x[,"herb.layer"]
groupings <- cut(herb.dat,breaks = c(3), labels = F)

Use these groupings to define colors in the biplot. Again, throws an error but seems to get the job done. These groups could be any category assocaited with the original data, such as location, species that the diet item came from, etc.

lvsplot(fit.lvmbinom,cex = 0.8,
        col = groupings,biplot = F)

Boral output

The main boral output is in these components of the boral object

  • $lv.coefs.median
  • $lv.coefs.iqr
  • $lv.coefs.mean
  • $lv.median
  • $lv.mean

The biplot can be approximated by plotting the lv.meadian (ore lv.mean)

par(mfrow = c(1,2), mar = c(4,4,3,1))
lvsplot(fit.lvmbinom,cex = 0.8,
        col = groupings,biplot = F,
        main = "real biplot")
plot(fit.lvmbinom$lv.median[,1] ~  
       fit.lvmbinom$lv.median[,2],
     ylab = "",xlab = "",
     main= "un-rotated biplot")

A real biplot has had a particlar scaling and rotation applied to it. The math to do this is in the lvsplot() function. THe core components are replicated below

x <- fit.lvmbinom

#alpha is an arguement in lvsplot(); relates to scaling
alpha <- 0.5 

testcov <- x$lv.median %*% t(x$lv.coefs.median[, 2:3])

#singular value decom
do.svd <- svd(testcov, x$num.lv, x$num.lv)

#svd has components d, u and v

#some math related to rescaling
choose.lvs <- scale(do.svd$u *      matrix(do.svd$d[1:x$num.lv]^alpha, 
nrow = x$n, ncol = 2, byrow = T), center = T, scale = F)

#some more math related to rescaling
choose.lv.coefs <- scale(do.svd$v * matrix(do.svd$d[1:x$num.lv]^(1 - 
alpha), nrow = x$p, ncol = 2, byrow = T), center = T, 
                             scale = F)

Compare lvsplot to my hack

par(mfrow = c(1,2), mar = c(4,4,3,1))
lvsplot(fit.lvmbinom,cex = 0.8,biplot = F,
        col = groupings,
        main = "real biplot")
plot(choose.lvs, main = "hacked biplot")