Now that you have learned some of the most basic linear models we are going to add some complexity by introducing different error structures, mixed effects, and some basic Bayes. The Bayes stuff can be a pain to run on your computer so please reach out to me if you are having any trouble.

Like last week, you should read through this markdown and try the code. Then you should answer the “Your turn” in a fresh script and upload this to your folder. I should be able to run it all so remember to include libraries!

Libraries

library(lme4)
library(BEST) #email me if this gives you trouble
library(rjags) #email me if this gives you trouble
library(tidyverse)
#library(jagsUI) #I do not reccomend importing the full library

Set working directory.

setwd("c:/current/")

Import data.

dat <- read.csv("Complex_models_data.csv")

Examine data.

colnames(dat) #Returns column names

## [1] "elevation" "genus"     "area"      "herbivory" "diversity" "plotID"   
## [7] "surv"

head(dat) #Returns first 6 entries for each column

##   elevation genus     area herbivory diversity plotID surv
## 1      high Piper 400.0000        76         4      1    0
## 2      high Piper 133.3000         5         2      2    1
## 3      high Piper 171.5278         3         2      3    1
## 4      high Piper  38.0000         0         0      4    1
## 5      high Piper 825.0000         1         2      5    1
## 6      high Piper 900.0000         5         3      6    1

Summarize data.

summary(dat) #Returns summary of each column.

##  elevation        genus         area          herbivory       diversity    
##  high:89   Miconia   :61   Min.   :  12.5   Min.   : 0.00   Min.   :0.000  
##  low :90   Piper     :59   1st Qu.: 105.5   1st Qu.: 1.00   1st Qu.:1.000  
##            Psychotria:59   Median : 192.0   Median : 5.00   Median :2.000  
##                            Mean   : 247.4   Mean   :12.16   Mean   :2.095  
##                            3rd Qu.: 355.6   3rd Qu.:15.00   3rd Qu.:3.000  
##                            Max.   :1093.8   Max.   :87.00   Max.   :6.000  
##      plotID           surv       
##  Min.   : 1.00   Min.   :0.0000  
##  1st Qu.: 8.00   1st Qu.:0.0000  
##  Median :15.00   Median :1.0000  
##  Mean   :15.42   Mean   :0.6872  
##  3rd Qu.:23.00   3rd Qu.:1.0000  
##  Max.   :31.00   Max.   :1.0000

You should notice that these data are very similar to last week’s data. In fact they are the same except for one additional column: “surv.” This column describes the survival of these plants measured at the end of the season. Another fun fact: I simulated it. This means the values are probably complete nonsense biologically, because I am not good at real biology. Regardless, a value of 1 indicates if a plant was alive at the end of the season and 0 means it was toast. We will use these data to consider the glm() and mixed models.

The generalized linear model

As mentioned in class the GlzM is a generalization of the ordinary linear model for response variables with error distributions other than normal. Other common error distributions include the binomial, Poisson, and gamma distributions. Today we will focus on the binomial distribution.

Before going too far, first let’s take a step back and think about what an error distribution is. Last week a common assumption in all of the basic linear models was that the residuals were normally distributed. Here is what this looks like in an equation:

\[y = \alpha + \beta X' + E\] \[E \sim Normal(0,\sigma^2)\] Here the response variable (\(y\)) equals the intercept (\(\alpha\)) + coefficients (\(\beta\)) * the predictors (\(X'\)) + the residuals. The residuals are the error that remains in your model after you have accounted for the effects of the predictor variables. As you can see the ordinary linear model expects these errors (\(E\)) to be normally distributed and on average be 0. This is a problem for many types of ecological data, like survival and counts, because the nature of the response variables creates wacky error distributions.

Enter the GLzM. This generalization of the linear model allows different types of error structures to be applied through the use of “link” functions. The math behind many of these functions is beyond the scope of this lab, however I recommend that you read about some as they are important for properly interpreting the coefficients your models produce. The wiki page is pretty good. We will consider the binomial error distribution (i.e. logistic regression) in some detail below.

So let’s get on with the lab. For this first section we are interested in the effect of herbivory on plant survival at the end of the year. Thinking about what you did last week, what would you do?

Probably something like this

mod1 <- lm(surv ~ herbivory, data = dat)
summary(mod1)

## 
## Call:
## lm(formula = surv ~ herbivory, data = dat)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.85489  0.00212  0.10850  0.18173  0.68298 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  0.909813   0.030013   30.31   <2e-16 ***
## herbivory   -0.018308   0.001392  -13.15   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.3316 on 177 degrees of freedom
## Multiple R-squared:  0.4943, Adjusted R-squared:  0.4914 
## F-statistic:   173 on 1 and 177 DF,  p-value: < 2.2e-16

Let’s examine the residuals

They look a little wonky. Let’s also examine what the model thinks using predict(). Predict() simply takes a value for your predictor variables (your real predictors by default) and produces the value that your model predicts. This is usually a great gut check to perform on models. You do not want your model predicting crazy things.

That’s real weird. Our model is predicting negative values. This is a problem because you cannot have a less that 0 chance of survival. Something is wrong here. Let’s try a generalized linear model using the glm() function. The glm function has the argument “family” that we use to specify the error distribution. We would use “Poisson” for count data and will use “binomial” for survival data.

mod2 <- glm(surv ~ herbivory, family = "binomial", data = dat)
summary(mod2)

## 
## Call:
## glm(formula = surv ~ herbivory, family = "binomial", data = dat)
## 
## Deviance Residuals: 
##      Min        1Q    Median        3Q       Max  
## -2.60833  -0.00373   0.18643   0.30723   2.22146  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept)  4.38193    0.64283   6.817 9.32e-12 ***
## herbivory   -0.33804    0.05422  -6.234 4.55e-10 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 222.447  on 178  degrees of freedom
## Residual deviance:  77.003  on 177  degrees of freedom
## AIC: 81.003
## 
## Number of Fisher Scoring iterations: 7

Let’s examine the residuals

and the predictions

This is MUCH better. The residuals are improved AND the predictions are all between 0 and 1. But how do we interpret our results? What does this mean?

Before moving on to interpretation, let’s take a quick look at what these model look like. Model 1 is the line and model 2 is the sigmoid shape. Which looks better?

Interpreting glm()

Recall that the coefficient for a continuous variable in an ordinary linear model should be interpreted as the predicted change in the response variable for every unit of change in the predictor. However this becomes more complicated when using link functions. The first (and worse) model can be straight forwardly interpreted because we do not use a link.

coef(mod1)

## (Intercept)   herbivory 
##    0.909813   -0.018308

In this case model 1 is saying that for every increase in 1 unit of herbivory the model predicts a 1.8% decrease in survival. The intercept indicates that if there is no herbivory (i.e. the slope = 0) then we expect 90% of the plants to survive. BUT remember that this model is not a good model.

This is the better model, but unfortunately the logit link makes these coefficients more difficult to understand.

coef(mod2)

## (Intercept)   herbivory 
##   4.3819318  -0.3380374

To interpret them we need to consider what the logit link is doing. Here is the equation for a logistic regression:

\[P = \frac{e^{\alpha + \beta X'}}{1 + e^{\alpha + \beta X'}}\]

This equation is what gives model 2 its sigmoid shape, but you can see that the coefficients associated with the model (the alphas and betas) are also going through a transformation. This means that the coefficients the model produces are no longer on the easy linear scale we used before. In a binomial regression, coefficients are on the log odds scale.

So looking at the coefficients for model 2 we know that every unit of herbivory causes a log odds decrease in survival of 0.33. WTF does that mean? We can get this to odds using exp()

exp(0.33)

## [1] 1.390968

This tells us that for every 1 unit increase of herbivory that plant is 1.39 times less likely to survive. That’s more useful. But this is still not great because “more likely” is a relative term. You can extract the real probability from the model by using the predict() function. What is the probability of survival if you receive 20 units of herbivory?

predict(mod2, newdata = data.frame(herbivory = 20), type = "response")

##          1 
## 0.08480241

Under the hood, this function is just plugging in the predicted value into the logistic regression equation (with the coefficients filled in). Here is what that looks like:

1/(1 + exp(-(4.3819318 + -0.3380374*20)))

## [1] 0.0848024

We can generate the full curve by predicting the full range of values and plotting them. Here This is the full relationship between herbivory and survival.

plot(0:87,predict(mod2, newdata = data.frame(herbivory= 0:87), type = "response"), xlab = "Herbivory", ylab = "Probability of survival")

This might be too much detail, but the overall point is that when you use links like the logit or log, a transformation is occurring under the hood that makes the data more difficult (but not impossible) to interpret.

Your turn: Look at the diversity variable in these data. Do you think a regular lm is appropriate for these data? Can diversity be less than zero? Try to find a more suitable distribution and run a glm(). Take a shot at it’s proper interpretation! (Don’t feel the need to go too crazy, I chose logistic to demonstrate because it is the easiest).

Mixed models

Mixed models are very useful tools in ecology because they help us account for many of the pitfalls that belie ecological data. I will not get into pseudoreplication and non-independence among observations in detail here, but these are issues that should always be considered. Many datasets are nested and mixed models can help us out with this. The lme4() package and its functions lmer() and glmer() are commonly used to construct random effects models.

One of the most crucial considerations when building mixed models is deciding what is a random effect and what isn’t. Often people treat fixed effects as “what we care about” and random effects as “other potential data tom foolery to account for” but this isn’t quite right. Consider an example that I stole from Harrison et al 2018 (which it a SUPER useful reading for mixed models and glms). Here we are interested in the difference in mass between 5 measured groups. Lets say they are ents (that part wasn’t in Harrison et al). We measured 25 ents in 5 groups for 125 total individuals. Let’s look at a couple of ways to model this:

mod1 <- lm(mass ~ ent_group, data = data_mines_of_moria)
mod2 <- lmer(mass ~ 1 + (1|ent_group), data = data_mines_of_moria)

Mod 1 treats ent groups as a fixed effect. It assumes that they were sampled independently. Mod 2 treats ent groups as a random effect. It says that we sampled 5 ent groups drawn from a larger population. If we did it again we might get different groups.

Let’s add another layer of complexity. Let’s now say we measured 5 ent groups in 3 forests (for 15 total ent groups). Yes I know ents are only in Fangorn forest. But let’s say that we found them in 3 forests. Are the 5 groups (125 individuals) we measured in each forest truly independent? It seems likely that ents from the same forest will be more similar than ents from other forests, so while we have 15 groups, it is really more like 3. This is one type of pseudoreplication and we need to account for it. In mixed models we can specify how data are nested. A model like this is appropriate.

mod3 <- lmer(mass ~ 1 + (1|forest/ent_group), data = data_mines_of_moria)

It says that we want to examine the differences in mass between individuals who were measured in groups that are nested in forests.

Both of the mixed models presented thus far have been a little unusual in that are only intercept models (ANOVAs essentially). Let’s get back to our dataset.

I want to answer this question: how does herbivory impact plant survival at the end of the growing season? Remember that we measured three genera of plants in plots at two different elevations. So what do you think is fixed and what is random?

mod4 <- glmer(surv ~ herbivory + (1|genus), family = "binomial", data = dat)

## boundary (singular) fit: see ?isSingular

summary(mod4)

## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: surv ~ herbivory + (1 | genus)
##    Data: dat
## 
##      AIC      BIC   logLik deviance df.resid 
##     83.0     92.6    -38.5     77.0      176 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -5.3866 -0.0026  0.1324  0.2198  3.2851 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  genus  (Intercept) 0        0       
## Number of obs: 179, groups:  genus, 3
## 
## Fixed effects:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept)  4.38193    0.64284   6.817 9.33e-12 ***
## herbivory   -0.33804    0.05423  -6.234 4.55e-10 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##           (Intr)
## herbivory -0.885
## convergence code: 0
## boundary (singular) fit: see ?isSingular

This model treats genera as a random effect. A debate for sure. One could argue that this is “something we care about” while another says that if our question is about plants broadly these three genera are just samples of a larger plant species pool. If our question was how does survival vary among Miconia, Piper, and Psychotria it would be fixed.

Another consideration of mixed effect models is whether a random effect should be a random slope, random intercept, or random slope and a random intercept. For the sake of laziness I will reproduce a figure from Harrison et al. Essentially the difference is if we believe only the intercept changes between groups or if the slope (which is the effect of a predictor variable) also varies between groups. The random slope and random intercept model is the most conservative but also takes the most data to fit.

Here we see that the random intercept model fixes the slope to be the same between groups, while the random slope and random intercept model does not. There is no answer for all situations and which to use should be a decision made by thinking about your data and question. In R code it looks like this:

#Random intercept
mod4 <- glmer(surv ~ herbivory + (1|genus), family = "binomial", data = dat)
#Random slope & random intercept
mod5 <- glmer(surv ~ herbivory + (herbivory|genus), family = "binomial", data = dat)

Your turn: Now that I have rambled for what seems like an eternity you can play with the data. Considering the models of the herbivory data and the nesting structure of the Ent example, consider if there are any pseudoreplication issues in the herbivory data. Write a model to account for this and interpret the results.

Bayes

I know this has been a hell of lab so far so this last look at Bayes will be very cursory. The Bayesians among you will probably be a little disappointed. This section will just demonstrate that frequentist and Bayesian approaches often give the same answer (especially with a fair amount of data) and show how the interpretation of their outputs varies. I use Bayes frequently in my own work and would happily expand on Bayesian models in my office hours if some of you want to try hierarchical models when analyzing each others data.

Recall that Bayesian and frequentist analyses vary in a couple of ways. Bayesian results give us the probability of a hypothesis given the data, while frequency based statistics give us the probability of the data given a hypothesis. This has some interesting implications when thinking about interpreting p-values and confidence/credible intervals. Bayesian analysis also uses priors, which is a common (and in my opinion a bit overblown) criticism. Finally, Bayesian analysis treats all variables as random samples drawn from a larger distribution. While frequentist analyses return single coefficient estimates, Bayesian analyses return probability distributions. We will see all of these below.

Bayesian t-test

Let’s return to the t-test we performed last week. We were interested in the difference in herbivory between high and low elevation plots. We also used a sqrt() transformation.

mod6 <- t.test(sqrt(herbivory) ~ elevation, data = dat)
mod6

## 
##  Welch Two Sample t-test
## 
## data:  sqrt(herbivory) by elevation
## t = 0.91792, df = 169.95, p-value = 0.36
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.3515798  0.9627352
## sample estimates:
## mean in group high  mean in group low 
##           2.845379           2.539801

Now let’s do a Bayesian t-test. You will notice right away that you need to include priors. Here we are saying that based on our prior information, we think the mean of each group is somewhere in the range of this distribution.

plot(density(rnorm(1000, 10, 10)), main ="")

This is actually a pretty informative prior. Often Bayesian analyses will reduce the influence of their prior by setting “vaguely informative” or uniform priors. Vaguely informative priors are when the prior is something like normal(0, 10000). Essentially we are saying that any value is possible, however less extreme values are slightly more likely. Uniform priors state that all values are equally probable. Priors are important and should be carefully considered in any analysis. Personally I think they are a strength of Bayesian analyses.

splt_herb <- split(dat$herbivory, dat$elevation)
priors <- list(muM = 10, muSD = 10)
BESTout <- BESTmcmc(y1 = sqrt(splt_herb$high), 
                    y2 = sqrt(splt_herb$low),
                    priors = priors,
                    parallel = F,
                    verbose = F)

You will notice that this takes much longer to run. One of the actual downsides of Bayesian models. This is because these models are taking thousands of random samples to build probability distributions. Lets take a look at one of these.

plot(BESTout, which="mean")

This shows a distribution of the estimated difference in herbivory between high and low elevations. So while t.test() gives us one value, the bayes analysis gives thousands of values. The first thing to notice is that this mean of this distribution (0.32) is very close to the mean difference from the frequentist t-test (0.31). The best thing about this is that we can say so much more! For instance since 83% of the distribution is above 0 we can say that there is an 83& chance that herbivory (well sqrt herbivory) is higher in high elevations than low elevations. These continuous statements about probability are one of the best things about about bayes.

A frequentist p-value of 0.05 is often misinterpreted as a 5% chance that the null hypothesis is true and thus a 95% chance that it is false (and that the alternate is true). There are a couple of problems with this. In this example, the p-value of 0.36 means that assuming that the null hypothesis is true (that there is no difference between high and low), we would expect results this extreme 36% of the time. Frequentist p-values are NOT a probability statement about the “truth” of the alternate hypothesis. In Bayes land, we can use these distributions to make probability statements. If 83% of the mean differences distribution are above 0, we can say that, given these data, there is an 83% chance there is a difference between groups.

A Bayesian analysis produces a distribution for each variable in the model. Here you can see the summary statistics for each distribution that was calculated. Mu1 is the mean of the high group and sigma 1 is the variance around that mean. So yes we have a distribution (with variance) for the estimated variance.

summary(BESTout)

##             mean median  mode HDI%    HDIlo  HDIup compVal %>compVal
## mu1        2.717  2.716 2.713   95  2.18218  3.254                  
## mu2        2.400  2.398 2.394   95  1.96750  2.821                  
## muDiff     0.317  0.316 0.311   95 -0.34161  0.961       0      83.1
## sigma1     2.298  2.293 2.281   95  1.86942  2.732                  
## sigma2     1.797  1.797 1.827   95  1.39762  2.188                  
## sigmaDiff  0.501  0.497 0.478   95  0.00843  0.995       0      97.9
## nu        21.561 13.703 7.283   95  2.59244 65.816                  
## log10nu    1.180  1.137 1.003   95  0.58308  1.874                  
## effSz      0.154  0.154 0.135   95 -0.15959  0.472       0      83.1

Above we used a specific wrapper function, but often Bayesian models are more complex. For example, Bayesian hierarchical models are super useful tools to account for inherent nested structure within our data, but models like these are typically implemented using jags or stan (although you can code your own sampler). These tools require you to write out the full model. This can be difficult to do at first, BUT when you do this the model becomes MUCH more understandable, because you wrote it. Nothing is happening under the hood (at least when it comes to the model).

Here is the same model above, but coded in jags. All jags models follow this format. You make a list of the data then write everything in a for loop. Here we say that sqrt(herb) is drawn from a normal distribution, with a mean (mu) and a variance (tau). We then say that this mean is predicted by our linear model (which is the same as the t-test above). Pay attention to how I coded the data. I said that high elevation = 1 and low elevation = 0. So, alpha represents the mean of low elevation. This is because when an observations is “low” in the data, the elevation = 0, so the model is mu[i] <- alpha + beta1 * 0, so beta1 goes away. When an observation is “high” in the data, the elevation = 1, so the model is mu[i] <- alpha + beta1 * 1, so the beta1 here is difference between low and high elevations. Recall that is the exact same interpretation that aov or glm gives us.

model_data <- list(herb = sqrt(dat$herbivory), #sqrt herbivory
                   elevation = ifelse(dat$elevation == "high", 1, 0), #elevation transformed to 1 or 0
                   N = length(dat$elevation)) #Total number of observations

sim_model <- "model {

          for(i in 1:N) {  
              herb[i] ~ dnorm(mu[i], tau)
              mu[i] <- alpha + beta1 * elevation[i]
          }   

          #Uninformative priors
          
          alpha ~  dnorm(0, 0.001)
          beta1 ~ dnorm(0, 0.001)
          tau ~ dgamma(2, 0.1)
          
}   
"

writeLines(sim_model, con="t_model.txt")

mod <- jagsUI::jags(data = model_data,
                    n.adapt = 2000,
                    n.burnin = 1000,
                    n.iter = 10000,
                    n.chains = 4,
                    modules = "glm",
                    model.file = "t_model.txt",
                    parameters.to.save = c("alpha", "beta1", "tau"),
                    verbose = FALSE)

Before doing any else, it is best to make sure the Bayesian model actually worked. Recall that these models are not producing a single coefficient estimate. They are producing thousands of estimates and we want to make sure that all of these estimates converged on one answer. We can do this by examine the traceplots, which should look like fuzzy caterpillars. WE can also examine the gelman rubin diagnostic (rhat in jagsUI). We want this to be very close to 1 and I mean VERY close. Values like 0.99 or 1.01 might not be terrible, but they indicate something is probably wrong. Values like 0.95 are bad. Here I show the traceplot and rhat for alpha only. You should examine these for all of the variables.

plot(mod$sims.list$alpha, type = "l")

mod$Rhat$alpha

## [1] 1.000159

Lets plot the both posterior distributions (low is black, high is red). Remember that these posteriors are one of the best things about Bayesian analyses. They are scaled so that the integral of each of curve sums to 1 and thus can be interpreted in probabilistic terms. For instance, if an entire posterior distribution is above 1, as both are here, we can that given the data, there is a ~100% probability that the true mean sqrt(herbivory) of both elevations is above 1.

plot(density(mod$sims.list$alpha), xlim = c(0,5), ylim = c(0,2))
lines(density((mod$sims.list$alpha +
              mod$sims.list$beta1)), col = "red")

We can also make relative statements. For instance, we can say that there is a 82% chance that sqrt herbivory is greater at high elevations. We can do this by taking all the samples for low elevation (alpha) and comparing them will of of the samples of high elevation (alpha + beta). The below code makes instances where high > low = 1 and instance where low > high = 0, then adds them all up, and divides it my the total samples. This returns the probability that high is greater than low.

low_samples <- mod$sims.list$alpha
high_samples <- mod$sims.list$alpha + mod$sims.list$beta1

sum(ifelse(low_samples < high_samples, 1, 0))/length(low_samples)

## [1] 0.8226944

Bayesian ANOVA

We can make minor changes to the above model to make this a Bayesian ANOVA. I will write a model below to examine the difference in sqrt herbivory between the three different genera of plants we measured.

Unlike the previous functions we have used for frequentist models (aov, glm, etc.), where we can simply code genus as a factor and write “sqrt(herb) ~ genus”, when we write our own model we have to dummy code our data. This is what functions like glm are actually doing under the hood when we give them a categorical variable with more than 2 levels. We need to make a new column for every level in our categorical variable. In each column a 1 means that species is present and a 0 means it is absent. This is probably easier to demonstrate. Take a look at the data before and after running the following code.

aov_dat <- dat %>% 
  mutate(ones = 1, fill = 0) %>% 
  spread(key = genus, value = ones, fill = 0) %>%  
  select(herbivory, Miconia, Piper, Psychotria)

Now that we have reformatted our data, let’s put it in the form that jags wants.

model_data <- list(herb = sqrt(aov_dat$herbivory), #sqrt herbivory
                   Miconia = aov_dat$Miconia,
                   Piper = aov_dat$Piper,
                   Psychotria = aov_dat$Psychotria,
                   N = length(aov_dat$herbivory)) #Total number of observations

Now let’s write an ANOVA. Like before we can that sqrt(herbivory) is drawn from a normal distribution, with a mean (mu) and variance (sigma). We then predict the mean using the three plant species. Beta1 corresponds to Miconia, beta2 corresponds to Piper, and beta3 corresponds to Psychotria. So if an observation in the data is of Miconia the model is mu <- beta1 * 1 + beta2 * 0 + beta3 * 0, thus beta1 is the coefficient for the mean herbivory of Miconia. The other two plants work the same way.

sim_model <- "model {

          for (i in 1:N) {  
              herb[i] ~ dnorm(mu[i], sigma)
              mu[i] <- beta1 * Miconia[i] + beta2 * Piper[i] + beta3 * Psychotria[i]
          }   

          #Uninformative priors
          
          beta1 ~ dnorm(0, 0.001)
          beta2 ~ dnorm(0, 0.001)
          beta3 ~ dnorm(0, 0.001)
          sigma ~ dgamma(2, 0.1)
          
}   
"

writeLines(sim_model, con="aov_model.txt")

mod <- jagsUI::jags(data = model_data,
                    n.adapt = 2000,
                    n.burnin = 1000,
                    n.iter = 10000,
                    n.chains = 4,
                    modules = "glm",
                    model.file = "aov_model.txt",
                    parameters.to.save = c("beta1", "beta2", "beta3", "sigma"),
                    verbose = FALSE)

Sometimes plotting more than two posteriors together can be overwhelming to look at, so we can represent posteriors like this as well. Here we see the median and 95% credible intervals. Credible intervals are not the same thing as confidence intervals! Often if 95% conductance intervals do not overlap people say that there is a greater than 95% chance that those groups are different BUT THIS IS WRONG. We can only make probability statements like this using Bayesian analysis. For example, we can say that There is a greater than 95% probability that Piper and Psychotria are different because the credible intervals do not overlap.

plot_dat <- data.frame(genus = c("Miconia", "Piper", "Psychotria"),
                       median = c(mod$q50[[1]], mod$q50[[2]], mod$q50[[3]]),
                       q2.5 = c(mod$q2.5[[1]], mod$q2.5[[2]], mod$q2.5[[3]]),
                       q97.5 = c(mod$q97.5[[1]], mod$q97.5[[2]], mod$q97.5[[3]]))

ggplot(data=plot_dat) +
  geom_errorbar(aes(genus, ymin = q2.5, ymax = q97.5), width = 0) +
  geom_point(aes(genus, median), pch = 21, fill = "gray65") +
  labs(y="Sqrt herbivory") +
  theme_classic()

This was very brief introduction to priors and posterior distributions, but it is important for both performing Bayesian analyses and reading papers that use these methods. I am always happy to talk about implementing these models. I have added a fairly thorough demonstration of a hierarchical model in this lab’s folder. I think that Bayesian hierarchical models are rad and a great way of accounting for many of the problems in ecological data.

Your turn: Change the priors on one of these models, plot the prior distribution, and then re-run the model. How does this change the results and inference?

More complex models