A generalized linear mixed effects model (GLMM, also known as generalized linear mixed model) is, in essence, a linear mixed effects model (LMM) that uses a non-parametric approach. Comparable to the relationship between linear models (LMs) and generalized linear models (GLMs), GLMMs do not assume a normal distribution of response variables that is assumed when using LMMs. GLMMs use both fixed effects and random effects, the latter of which accounts for groupings in the data. Therefore, GLMMs are essentially GLMs (just fixed effects) plus random effects. GLMMs will likely be the most useful to me in my research. I will assign block as a random effect term in my models because we expect each of the blocks to have its own natural variation. Using block as a random effect will reflect the variances of each block in the model. I will have count data, in which case I will use a Poisson distribution, and I will have binary data, in which case I will use a Bernoulli distribution. Here, I am analyzing one day of DENU flowering phenology data in blocks 2 and 7 between treatments, collected by myself this season. I am using the glmm package, which calculates Monte Carlo maximum likelihood estimates using Monte Carlo likelihood approximation. (I got most of my information here from Knudson 2017)
Load packages
library(ggplot2)
library(tidyverse)
library(knitr)
library(glmm)
Load data
denu.data <- read.csv("denu_flowering.csv")
denu.data$BLOCK <- as.factor(denu.data$BLOCK)
Visualize data
ggplot(denu.data, aes(TREATMENT, FLOWER_NUM)) +
geom_boxplot() +
theme_classic()
Figure 1. Number of flowers per DENU plant on one day early this summer in the two experimental treatments. C stands for control, S stands for early snowmelt.
Visualize the distribution of the response variable (definitely not a normal distribution)
hist(denu.data$FLOWER_NUM, data = denu.data)
## Warning in plot.window(xlim, ylim, "", ...): "data" is not a graphical parameter
## Warning in title(main = main, sub = sub, xlab = xlab, ylab = ylab, ...): "data"
## is not a graphical parameter
## Warning in axis(1, ...): "data" is not a graphical parameter
## Warning in axis(2, ...): "data" is not a graphical parameter
Figure 2. Distribution of the response variable. This definitely does not look like a normal distribution, so I will use a Poisson distribution in the GLMM.
Generate model
[code adapted from Knudson 2017]
model <- glmm(FLOWER_NUM ~ 0 + TREATMENT, random = list(~ 0 + BLOCK),
varcomps.names = "Block", data = denu.data,
family.glmm = poisson.glmm, m = 10^4, debug = TRUE)
View model summary
summary(model)
##
## Call:
## glmm(fixed = FLOWER_NUM ~ 0 + TREATMENT, random = list(~0 + BLOCK),
## varcomps.names = "Block", data = denu.data, family.glmm = poisson.glmm,
## m = 10^4, debug = TRUE)
##
##
## Link is: "log"
##
## Fixed Effects:
## Estimate Std. Error z value Pr(>|z|)
## TREATMENTC 0.485 0.161 3.007 0.00264 **
## TREATMENTS 0.975 0.135 7.246 4.28e-13 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##
## Variance Components for Random Effects (P-values are one-tailed):
## Estimate Std. Error z value Pr(>|z|)/2
## Block 0.0109 0.0265 0.411 0.341
The variance component, block, is not significantly different from zero, so it could potentially be removed from this model (and, I suppose, I could just use a GLM). However, it is interesting to find that there were significantly more flowers in the early snowmelt plots than in the control! Also, this was good practice and very useful to learn about.
Knudson, C. 2017. An Introduction to Model-Fitting with the R package glmm:17.