These notes show how to create simple boral output and how to modify plots produced from boral objects. 1) First, I modify the output of the boral::lvsplot function.
2) Then I use ggbiplot to make biplots with boral objects.
rm(list = ls())
library(boral)
## Loading required package: coda
## If you recently updated boral, please check news(package = "boral") for the updates in the latest version.
library(mvabund)
boral seems to like data to be in vectors and matrices, not dataframes.
data(spider)
y <- spider$abun
dim(y)
## [1] 28 12
For testing code its best to make boral arrive at a solution faster than normal. This can be done by setting options for the mcmc.control arguement within the boral() command. Here are setting that cut down on the time to fit the model, but the solution provided should not be used because the algorithim won’t have time to converge
mcmc.control. <- list(n.burnin = 10,
n.iteration = 400,
n.thin = 30,
seed = 123)
Run boral with two latent variables (num.lv = 2) so a biplot can be made. mcmc.control = is used to set the algorith to run faster. For presence/absence data family= would b set to “binomial”.
fit.lvmbinom <- boral(y = y,
family = "negative.binomial",
num.lv = 2,
mcmc.control = mcmc.control.,
row.eff = "fixed")
## module glm loaded
## Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph information:
## Observed stochastic nodes: 336
## Unobserved stochastic nodes: 467
## Total graph size: 3473
##
## Initializing model
## Calculating Information criteria...
The default biplot in boral is made with fit.lvmbinom(). The default is plot both the samples in ordination space (the rows of the data matrix, locations in many datasets) as black numbers. The species names (columns of data matrix) are plotted in red.
par(mfrow = c(1,1))
lvsplot(fit.lvmbinom,cex = 0.8)
## All latent variable coefficients included in biplot.
A by setting biplot = F only the sites/samples/rows of data are plotted, not the species names
lvsplot(fit.lvmbinom,cex = 0.8,
col = c(1), biplot = F)
The spider data has habitat data assocaited with the data but no discrete categories. Let’s make some grouping variables using the continous habitat data for the sake of illustration
# The herb.layer enviornmental data
summary(spider$x[,"herb.layer"])
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.6931 3.0440 3.4340 3.2550 4.4680 4.6150
herb.dat <- spider$x[,"herb.layer"]
groupings <- cut(herb.dat,breaks = c(3), labels = F)
Use these groupings to define colors in the biplot. Again, throws an error but seems to get the job done. These groups could be any category assocaited with the original data, such as location, species that the diet item came from, etc.
lvsplot(fit.lvmbinom,cex = 0.8,
col = groupings,biplot = F)
The main boral output is in these components of the boral object
The biplot can be approximated by plotting the lv.meadian (ore lv.mean)
par(mfrow = c(1,2), mar = c(4,4,3,1))
lvsplot(fit.lvmbinom,cex = 0.8,
col = groupings,biplot = F,
main = "real biplot")
plot(fit.lvmbinom$lv.median[,1] ~
fit.lvmbinom$lv.median[,2],
ylab = "",xlab = "",
main= "un-rotated biplot")
A real biplot has had a particlar scaling and rotation applied to it. The math to do this is in the lvsplot() function. THe core components are replicated below
x <- fit.lvmbinom
#alpha is an arguement in lvsplot(); relates to scaling
alpha <- 0.5
testcov <- x$lv.median %*% t(x$lv.coefs.median[, 2:3])
#singular value decom
do.svd <- svd(testcov, x$num.lv, x$num.lv)
#svd has components d, u and v
#some math related to rescaling
choose.lvs <- scale(do.svd$u * matrix(do.svd$d[1:x$num.lv]^alpha,
nrow = x$n, ncol = 2, byrow = T), center = T, scale = F)
#some more math related to rescaling
choose.lv.coefs <- scale(do.svd$v * matrix(do.svd$d[1:x$num.lv]^(1 -
alpha), nrow = x$p, ncol = 2, byrow = T), center = T,
scale = F)
Compare lvsplot to my hack
par(mfrow = c(1,2), mar = c(4,4,3,1))
lvsplot(fit.lvmbinom,cex = 0.8,biplot = F,
col = groupings,
main = "real biplot")
plot(choose.lvs, main = "hacked biplot")