Case study 5.3: Fifty shades of Random Effects (or an introduction to mixed models)

Introduction

As promised, we continue our topic of Generalised linear models. In the two previous tutorials we introduced linear modelling in its most basic form, the simple linear model, and then discovered the close relationship between linear models and correlation analysis. This time we will take a longer journey through more interesting aspects of linear regression analysis, from the analysis of variance to mixed modelling.

This tutorial is part of the ESMS project. Our objective is to help Vietnamese medical students develop their skills in statistics and to encourage them to use R, a powerful tool for both basic and advanced data analysis.

Study problem

Our study problem consists of predicting oxygen uptake from workrate in 50 athletes tested in preparation for a 119-kilometre human-powered flight from the island of Crete. The data come from the Daedalus project (Bussolari, 1987, 1988). Measurements were recorded for five subgroups of athletes: female hockey players, male amateur triathletes, female amateur triathletes, male wrestlers and male cyclists. However, only 4 of these groups are included in our analysis.

Data preparation

library(tidyverse)

#Load the two humanpower datasets
df1=read.csv("http://vincentarelbundock.github.io/Rdatasets/csv/DAAG/humanpower1.csv")

df2=read.csv("http://vincentarelbundock.github.io/Rdatasets/csv/DAAG/humanpower2.csv")

#Merge them, drop the 5th subgroup, then rename the columns
df=rbind(df1,df2)%>%as_tibble()%>%subset(.,id!=5)

names(df)=c("Id","Workrate","O2_Uptake","Group")
df$Group=as.factor(df$Group)

df%>%knitr::kable(caption = "Table 1: Oxygen uptake versus mechanical power")
Table 1: Oxygen uptake versus mechanical power
Id Workrate O2_Uptake Group
1 2.02 28.56 1
2 2.45 36.22 1
3 2.83 46.04 1
4 3.24 45.90 1
5 3.64 55.14 1
6 4.04 62.21 1
7 4.23 63.42 1
8 4.44 66.08 1
9 2.12 30.81 2
10 2.64 35.45 2
11 3.18 46.17 2
12 3.45 51.98 2
13 3.71 52.88 2
14 3.96 58.38 2
15 4.23 55.59 2
16 2.51 35.18 3
17 3.06 45.00 3
18 3.62 57.84 3
19 3.89 59.95 3
22 2.14 31.80 4
23 2.49 38.38 4
24 2.85 44.32 4
25 3.21 44.64 4
26 3.38 47.48 4
1 2.72 41.09 1
2 3.20 46.61 1
3 3.66 52.22 1
4 4.10 59.51 1
5 4.56 67.32 1
6 5.03 69.05 1
7 2.91 33.75 2
8 3.48 41.22 2
9 4.07 48.86 2
10 4.65 53.91 2
11 5.23 59.86 2
12 5.86 66.12 2
13 6.43 69.48 2
14 3.18 41.83 3
21 3.65 45.79 3
31 4.12 50.62 3
41 4.56 55.48 3
51 5.02 59.96 3
61 5.47 67.04 3
71 5.93 68.17 3
15 2.59 37.27 4
22 3.10 45.18 4
32 3.62 51.08 4
42 4.14 57.40 4
52 4.67 64.15 4
62 4.81 67.63 4
Hmisc::describe(df)
## df 
## 
##  4  Variables      50  Observations
## ---------------------------------------------------------------------------
## Id 
##        n  missing distinct     Info     Mean      Gmd      .05      .10 
##       50        0       34    0.999    18.32     17.4     2.00     3.00 
##      .25      .50      .75      .90      .95 
##     7.00    13.00    22.75    42.90    56.95 
## 
## lowest :  1  2  3  4  5, highest: 51 52 61 62 71
## ---------------------------------------------------------------------------
## Workrate 
##        n  missing distinct     Info     Mean      Gmd      .05      .10 
##       50        0       46        1    3.762    1.175    2.280    2.508 
##      .25      .50      .75      .90      .95 
##    3.070    3.645    4.388    5.050    5.684 
## 
## lowest : 2.02 2.12 2.14 2.45 2.49, highest: 5.23 5.47 5.86 5.93 6.43
## ---------------------------------------------------------------------------
## O2_Uptake 
##        n  missing distinct     Info     Mean      Gmd      .05      .10 
##       50        0       50        1     51.2    13.26    32.68    35.42 
##      .25      .50      .75      .90      .95 
##    44.40    51.53    59.93    67.07    67.93 
## 
## lowest : 28.56 30.81 31.80 33.75 35.18, highest: 67.32 67.63 68.17 69.05 69.48
## ---------------------------------------------------------------------------
## Group 
##        n  missing distinct 
##       50        0        4 
##                               
## Value         1    2    3    4
## Frequency    14   14   11   11
## Proportion 0.28 0.28 0.22 0.22
## ---------------------------------------------------------------------------

Data visualisation

library(ggExtra)

#Scatter plot of O2 uptake versus workrate, with marginal histograms
p=ggplot(data=df,aes(x=Workrate,y=O2_Uptake))+geom_point(color="red",size=3,show.legend = F,alpha=0.5)+theme_bw()

ggExtra::ggMarginal(p, type = "histogram",bins=40,xparams=list(fill="red"),yparams = list(fill="red"),alpha=0.5)

#O2 uptake by group, with the overall mean as a dashed line
ggplot(df,aes(x=Group,y=O2_Uptake,fill=Group))+geom_jitter(shape=21,size=3,color="black")+geom_hline(yintercept=mean(df$O2_Uptake),linetype=2,color="blue3",size=1)+theme_bw()

#Boxplots of O2 uptake by group
ggplot(df,aes(x=Group,y=O2_Uptake,fill=Group))+geom_boxplot(alpha=0.6)+theme_bw()

#Density of workrate, by group
ggplot(df,aes(x=Workrate,fill=Group))+geom_density(alpha=0.6)+theme_bw()

#Density of O2 uptake, by group
ggplot(df,aes(x=O2_Uptake,fill=Group))+geom_density(alpha=0.6)+theme_bw()

From Simple linear regression analysis to ANOVA

First, we will fit a simple linear model to our data. In plain language, the study question would be: “Is there a linear relationship between oxygen uptake and workrate during the experiment in these 50 athletes?” or “What is the effect of workrate on the variation of oxygen uptake during physical effort?”

This question could be translated into algebraic model language as follows:

\[ Y_{ij}= \beta _{0} + \beta _{1}\ X _{i}+\varepsilon _{ij} \]

Where:

  1. Xi is the Workrate value of the i-th subject

  2. Yij is the measured oxygen uptake of the i-th subject within the j-th group

  3. B0 and B1 are two (constant) parameters to be estimated by the regression analysis; they define the association between the predictor X and the response Y

  4. Eij is the residual (error), i.e. the deviation of the response Y from the value predicted from Xi, B0 and B1

We assume that the residuals E are mutually independent and normally distributed with zero mean and constant variance.
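
To make these assumptions concrete, here is a minimal sketch that simulates data under such a model; the parameter values (B0 = 13, B1 = 10, sigma = 5) are hypothetical and chosen only for illustration:

#Simulating data under the simple linear model (hypothetical B0 = 13, B1 = 10, sigma = 5)

set.seed(1)
x=runif(50,2,6.5)                   # plausible workrate values
y=13+10*x+rnorm(50,mean=0,sd=5)     # E ~ N(0, sigma^2), independent across subjects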

#Simple lm

mod1=lm(O2_Uptake~Workrate,data=df)

mod1%>%summary()
## 
## Call:
## lm(formula = O2_Uptake ~ Workrate, data = df)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -8.8958 -3.8956  0.0331  3.8904  8.2157 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   13.421      2.612   5.138 5.02e-06 ***
## Workrate      10.043      0.670  14.990  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.845 on 48 degrees of freedom
## Multiple R-squared:  0.824,  Adjusted R-squared:  0.8203 
## F-statistic: 224.7 on 1 and 48 DF,  p-value: < 2.2e-16

The model is presented graphically as below:

#Scatter plot with the fitted regression line and the residual segments
plot(y=df$O2_Uptake,x=df$Workrate,pch=16)
abline(mod1,col="red4")
segments(x0=df$Workrate,y0=df$O2_Uptake,x1=df$Workrate,y1=predict(mod1),col="red",lty=3)

As described in the previous tutorials, a linear model consists of a sloped straight line that represents the linear relationship between Workrate (X) and O2 uptake (Y). The two parameters B0 and B1 define the intercept and slope of this line, respectively. From the data we can determine the optimal estimates of B0 and B1, and those values allow us to draw the best-fit line through our data.

A third quantity is estimated for the residual (Eij), the vertical deviation (red dotted lines in the figure) of the i-th observation (black dots) from the best-fit line. The Ordinary Least Squares algorithm determines the B0 and B1 values that minimize the sum of squares (SS) of the residuals E.

The output shows the fitted terms of the model: any linear model includes not only the predictor (B1 for Workrate) but also an intercept (B0).

Next, we have the ANOVA table (analysis of variance), which decomposes the variance in O2 uptake among the terms in the model, i.e. the effect of each term, including the main effect of Workrate and the residual effect.

mod1%>%anova()
## Analysis of Variance Table
## 
## Response: O2_Uptake
##           Df Sum Sq Mean Sq F value    Pr(>F)    
## Workrate   1 5274.6  5274.6  224.69 < 2.2e-16 ***
## Residuals 48 1126.8    23.5                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The first column of the ANOVA table (Df) gives the degrees of freedom. DF refers to the quantity of information carried by each term. As Workrate is a single continuous predictor, its effect consists of one piece of information (one slope), so its DF is 1. There are 50 observations in the dataset, each contributing to the residual error, so the residual term would carry 50 pieces of information; however, the model has already used up 2 of them for the Intercept and the Workrate effect, so the residual DF is reduced by 2 (DF residual = 50 - 2 = 48).

The next two columns show the Sum of Squares and Mean Square of each term. The Sum of Squares (SS) was explained in tutorial 5.2. The Mean Square is defined as SS/Df for each term and indicates the average variation accounted for by that term.
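
As a sketch, these quantities can be reproduced by hand from the fitted model mod1:

#Reproducing the ANOVA decomposition by hand (a sketch)

SS_res=sum(residuals(mod1)^2)                      # residual sum of squares
SS_tot=sum((df$O2_Uptake-mean(df$O2_Uptake))^2)    # total sum of squares
SS_workrate=SS_tot-SS_res                          # SS explained by Workrate
c(MS_Workrate=SS_workrate/1,MS_Residual=SS_res/48)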

Now imagine there were no effect of Workrate on O2 uptake (no relationship between the two variables): the expected values of MS Workrate and MS Residual would then be the same, so the expected ratio MS Workrate / MS Residual (the variance ratio) would be 1. This forms the null hypothesis (H0) for our statistical inference.

As the residual error is assumed to be normally distributed, the variance ratio follows an F distribution under H0 and therefore serves as the F statistic. Given a sufficiently large F value we can reject our null hypothesis. The shape of the F distribution depends on the DF of the numerator (main effect) and the denominator (residual effect) of the ratio; that is why those two DF values must always be reported with an F test. In our case, F(1,48) = 224.69.

The F distribution allows us to estimate the probability P of obtaining, by chance alone, an F value larger than the one observed. In this case the p value was very small, indicating a significant relationship between O2 uptake and Workrate.
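
As a sketch, this p value can be reproduced directly from the F(1, 48) distribution:

#Upper-tail probability of F(1,48) at the observed statistic

pf(224.69,df1=1,df2=48,lower.tail=FALSE)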

The summary() function displays the main output of the linear regression analysis: the intercept and slope of the best-fit line (the regression coefficients of the model), together with their standard errors, t statistics and significance (p values).

The regression coefficients could be interpreted as follows:

The slope (Workrate coefficient) is positive, indicating a positive correlation between Workrate and O2 uptake: O2 uptake increases with Workrate. Its estimated value of 10.043 means that O2 uptake increases by about 10 units (mL/min/kg in this dataset) for each one-unit increase in Workrate (W/kg).

The Std. Error indicates how precise the estimate is; a lower SE means higher precision. An SE larger than the estimate itself would suggest that the estimate is not reliable.

The t statistic is defined as the ratio between the estimate and its standard error (t for Workrate = 10.043/0.67 = 14.99). Note that t^2 equals the F value exactly: (10.043/0.67)^2 = 224.69, so the F test on the variance ratio and the t test on the estimated coefficient address the same question (their p values are also equivalent).
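
We can verify this identity directly on the fitted model (a quick sketch):

#t squared should equal F (a sketch using mod1)

tval=coef(summary(mod1))["Workrate","t value"]
Fval=anova(mod1)["Workrate","F value"]
c(t_squared=tval^2,F=Fval)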

Why do we need to add a Random Effect term?

Although the simple linear model fitted our data fairly well, there is an obvious problem. Looking at the scatter plot with the fitted line, we can see that observations belonging to the same group tend to lie on the same side of the line, indicating that the residuals within each group are not mutually independent. For example, the points of Groups 2 and 3 lie systematically below the regression line, while the points of Group 1 lie above it.

Each subgroup (1, 2, 3, 4) can be considered a sample drawn at random from a larger population of athlete types, so the between-group variation should also be analysed as part of the unexplained variance in O2 uptake.
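
A small numerical sketch makes this visible: the mean residual of mod1 within each group should be far from zero if a grouping effect remains.

#Mean residual of the simple model within each group (a sketch)

tapply(residuals(mod1),df$Group,mean)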

#Scatter plot coloured by group, with the overall regression line
ggplot(data=df,aes(x=Workrate,y=O2_Uptake))+geom_point(aes(color=Group),size=3.5,show.legend = T)+geom_smooth(method="lm",se=T,color="red4",alpha=0.1,fill="red")+theme_bw()

#Same plot with vertical guides at selected workrate values
ggplot(data=df,aes(x=Workrate,y=O2_Uptake))+geom_point(color="black",size=3,show.legend = F,alpha=0.5)+geom_smooth(method="lm",se=T,color="red4",alpha=0.1,fill="red")+theme_bw()+geom_vline(xintercept=c(3.2,3.65,4.1),linetype=2)

Another question is why there are so many different Y values at a single value of X: for example, at Workrate ~ 3.65 we can observe 4 different values of O2 uptake. It seems that natural groups have formed within the data.

Extending the basic model with a Group effect

We can add the athlete Group as a term beside Workrate in a new linear model. This model accounts for the variation among groups that was not explained by Workrate.

\[ Y_{ij}= \beta _{0} + \beta _{1}\ X _{i}+ G_{j}+\varepsilon _{ij} \]

where Gj is the mean deviation from the regression line of the data points belonging to the j-th group.

Note that the simple linear regression model (mod1) also includes a random term: the residual variation, i.e. the deviation of the Y values from the regression line. This term is not explicitly written in the model formula but is taken into account in the ANOVA. We have now added Group as a second random term in our model.

mod1b=lm(O2_Uptake~Workrate+Group,data=df)

mod1b%>%summary()
## 
## Call:
## lm(formula = O2_Uptake ~ Workrate + Group, data = df)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -7.6717 -2.9908 -0.6577  2.4892  9.4946 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  14.8013     2.4200   6.116 2.10e-07 ***
## Workrate     10.6091     0.5999  17.686  < 2e-16 ***
## Group2       -6.8585     1.5925  -4.307 8.86e-05 ***
## Group3       -4.8608     1.7046  -2.852  0.00655 ** 
## Group4       -2.3655     1.6822  -1.406  0.16654    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.162 on 45 degrees of freedom
## Multiple R-squared:  0.8782, Adjusted R-squared:  0.8674 
## F-statistic: 81.12 on 4 and 45 DF,  p-value: < 2.2e-16
mod1b%>%anova()
## Analysis of Variance Table
## 
## Response: O2_Uptake
##           Df Sum Sq Mean Sq  F value    Pr(>F)    
## Workrate   1 5274.6  5274.6 304.4426 < 2.2e-16 ***
## Group      3  347.2   115.7   6.6793 0.0007953 ***
## Residuals 45  779.7    17.3                       
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Note that the Group term cannot be fully described by model 1b: the Intercept already “absorbs” the first Group level, which serves as the reference. Parameters are therefore estimated for Workrate as the main covariate and for Groups 2, 3 and 4 as contrasts with the reference Group 1.

The estimated O2 uptake could be expressed as:

Mean(O2 uptake) = 14.80 + 10.61*Workrate + Group effect

The ANOVA table shows that:

The Mean Square of Workrate is the same in both models (5274.6), so the amount of variation explained by Workrate did not change from model 1 to model 1B. The total Sum of Squares is also identical between the two models:

Model 1: 5274.6+1126.8 = 6401.4

Model 1B: 5274.6+347.2+779.7 = 6401.4

We could say that the new model 1B (with the Group term) represents a partitioning of model 1’s residual term: in fact, the Group term took 3 pieces of information (DF) from model 1’s residual term, so only 45 remain in the residual term of model 1B.

We also see that the residual MS is reduced from 23.5 to 17.3. By adding more terms to a model, we can explain more of the variation in the response than with a single explanatory variable: a part of the old residual variation is now explained by Group.
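
A quick check of this partitioning in R:

#The total SS is identical in both models; Group only partitions the residual SS

sum(anova(mod1)[["Sum Sq"]])
sum(anova(mod1b)[["Sum Sq"]])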

The MS of both Group and Workrate are tested against the new residual MS:

F group = MS group / MS new residual = 115.7/17.3 = 6.68

The Group effect is also significant (p = 0.00079), indicating that a part of the O2 uptake variation can be explained by the variation among groups, beside the main effect of Workrate.

When tested against the MS of the residual term, the F value for Workrate is now much larger than the old one (304.44 vs 224.69). The F statistic for Workrate in the new model is:

F workrate (1,45) = MS workrate / MS new residual = 5274.6/17.3 = 304.4

Alternatively, a new F test for Workrate can be formed using the Group MS as the error term:

new F workrate (1,3) = new MS Workrate / MS Group = 5274.6/115.7 = 45.59

The p value of this new F test for Workrate can also be estimated; it is still highly significant (result not shown).

When doing such a decomposition, we are treating Group as a random term beside the residual random term. The Gj term is thus similar to the E (residual) term: both have zero mean and are normally distributed, differing only in their variance (sigma^2).
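
As a generative sketch, with hypothetical parameter values chosen only for illustration, such a model could be simulated as follows: one deviation G per group, plus an individual residual per observation.

#Simulating the two random terms (hypothetical values, for illustration only)

set.seed(1)
G=rnorm(4,mean=0,sd=2.8)             # one random deviation per group, G ~ N(0, sigma_G^2)
eps=rnorm(nrow(df),mean=0,sd=4.2)    # residual errors, E ~ N(0, sigma^2)
y_sim=13+10*df$Workrate+G[df$Group]+eps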

The random variation of G affects the estimated O2 uptake based on Workrate, since any change in the grouping structure would modify the regression coefficient (the slope of the regression line). Such a random effect is larger than the one due to resampling within the same group (the between-group variation is higher than the within-group variation, as indicated by the significant F test for Group).

We call the effect of each group in our model a Random Effect (Group is thus a second random term, beside the Residual term, which is also random).

The effect of Workrate is not random, so we call it a “Fixed effect”.

A model that includes both fixed and random effects is called a “Mixed model”.

Application: data structures with natural groups, nested or crossed blocks, split-plot designs, and longitudinal studies (panel data) are some situations in which a mixed model must be considered.

Mixed modelling using the lme4 package

The mixed model analysis can also be performed with a more robust method based on the REML criterion (Restricted, or Residual, Maximum Likelihood). This algorithm is supported by several R packages, such as nlme (Pinheiro and Bates, 2000) and lme4 (Bates, 2005).

Here we demonstrate a mixed model analysis of these data using the lme4 package. Note that we limit our mixed model to a random intercept; we will introduce the random slope model, which is a little more complicated, later.

library(lme4)

#Random-intercept model: Workrate is fixed, Group is random
mod3a=lmer(O2_Uptake~Workrate+(1|Group),data=df)
summary(mod3a)
## Linear mixed model fit by REML ['lmerMod']
## Formula: O2_Uptake ~ Workrate + (1 | Group)
##    Data: df
## 
## REML criterion at convergence: 286.5
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.7607 -0.7470 -0.1848  0.5569  2.2178 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  Group    (Intercept)  7.645   2.765   
##  Residual             17.313   4.161   
## Number of obs: 50, groups:  Group, 4
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  11.6233     2.6967    4.31
## Workrate     10.5182     0.5955   17.66
## 
## Correlation of Fixed Effects:
##          (Intr)
## Workrate -0.830
anova(mod3a)
## Analysis of Variance Table
##          Df Sum Sq Mean Sq F value
## Workrate  1 5401.1  5401.1  311.97

About the syntax: our model contains both fixed and random terms. The fixed term is Workrate; the random effect is introduced as (1|Group), indicating that the data are grouped by the Group factor, with 1 indicating that the random effect is constant within each group (a random intercept model). The parentheses are mandatory for correct operator precedence.

By default, the fitting method is REML. Maximum likelihood estimation can be requested by setting REML=FALSE. The estimated standard deviations are lower under maximum likelihood, as this method biases the variance estimates towards zero; the fixed effect estimates are the same under REML and ML.
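
A minimal comparison of the two criteria (a sketch; mod3a_ml is a name introduced here for the ML refit):

#REML vs ML variance components (a sketch)

mod3a_ml=lmer(O2_Uptake~Workrate+(1|Group),data=df,REML=FALSE)
VarCorr(mod3a)     # REML estimates
VarCorr(mod3a_ml)  # ML estimates: standard deviations slightly smaller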

As our data have an unbalanced structure (different numbers of observations among the groups), the REML-based estimation and the ANOVA decomposition are not equivalent.

The standard deviations are defined as the square roots of the variances.

Statistical inference

For statistical inference, we can set a null hypothesis and test it on either the fixed or the random effects. We fit two models: a null (H0) model that does not contain our target component(s), and an alternative (H1) model that includes the target component beside the other terms.

The F test is the first solution for testing hypotheses on fixed effects; the F statistic is based on the residual sums of squares and DF as described above. The nlme package systematically uses this method. Unfortunately, there are problems when applying it to mixed effect models: the DF change when random terms are added, and the statistic is not exactly F-distributed.

The likelihood ratio test (LRT) is an alternative solution. Using standard likelihood theory, we can compare H0 and H1 through the LR test statistic. Proposed by Welham and Thompson (1997), the LRT is considered more conservative than the F test and Wald statistics in preventing over-estimation of the significance of fixed-effect terms. However, the LRT method cannot be used with REML estimation.

Hypothesis testing can also be done using the sums of squares and mean squares given by ANOVA. This method is sometimes more powerful than the LR test; however, it only works for simple models and balanced data.

Another reliable method, introduced by Kenward and Roger (1997), adjusts the degrees of freedom for the mixed model. However, it is only relevant for testing fixed effects.

For all methods, a powerful approach to statistical inference is bootstrapping. The usual bootstrap is nonparametric, with no assumed distribution; here, however, we must assume normality for the errors and the random effects, so we apply a parametric bootstrap. This method consists of simulating data under the null model, using the fitted parameter estimates, and then computing the likelihood ratio statistic for the generated data. By repeating this calculation many times, we can determine a confidence interval for the target statistic and even compute a bootstrapped p value.

Testing the random effect

Now we apply the likelihood ratio method to test the null hypothesis that there is no difference between groups in the data (the between-group variance is zero).

First, we fit a null model using the lm() function, as it contains no random effect. Since lm() and lmer() models do not belong to the same class, we must compute the likelihood ratio test manually:

mod0=lm(O2_Uptake~1,data=df)   # null model: intercept only, no random effect

mod3b=lmer(O2_Uptake~Workrate+(1|Group),data=df,REML=F)

#Likelihood ratio test statistic and its chi-squared p value

lrtstat=as.numeric(2*(logLik(mod3b)-logLik(mod0)))

pvalue=pchisq(lrtstat,1,lower=FALSE)

cbind(lrtstat, pvalue)
##      lrtstat       pvalue
## [1,] 94.7267 2.185712e-22

The p value is highly significant; however, we might doubt this result, as the test is simply based on a chi-squared approximation. A parametric bootstrap approach is more robust:

We want to estimate the probability, assuming the null hypothesis is true, of obtaining an LRT statistic equal to or higher than the observed value of 94.7267.

#Bootstrapping the LRT

lrstat=numeric(1000)
set.seed(123)
for(i in 1:1000){
  y=unlist(simulate(mod0))                         # simulate a response under H0
  H0=lm(y ~ Workrate,df)                           # null model: no random effect
  H1=lmer(y ~ Workrate + (1|Group), df, REML=FALSE)
  lrstat[i] <- as.numeric(2*(logLik(H1)-logLik(H0)))
}

mean(lrstat < 0.001)
## [1] 0.737
mean(lrstat > 94.7267)
## [1] 0

We simulate data from the null model (H0), refit the null and alternative models, then compute the LRT statistic. This process is replicated 1000 times, giving a large sample of possible LRT statistics. The proportion of bootstrapped LRTs that exceed the observed value gives the bootstrapped p value. Note that a seed should be set for reproducibility.

Examining the distribution of bootstrapped LRTs, we see that in 73.7% of replicates the likelihoods of the H0 and H1 models were practically equal, giving an LRT statistic very close to zero. The p value was estimated to be zero, since no bootstrapped LRT exceeded the observed 94.73.

If the result is not conclusive, we can simply run more replications (10,000, for example) to be confident about the significance of our random effect.
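
Note that with zero exceedances among the 1000 replicates, a common conservative convention estimates the bootstrapped p value as (k + 1)/(B + 1):

#Conservative bootstrap p value: k = 0 exceedances among B = 1000 replicates

(0+1)/(1000+1)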

We can also use the RLRsim package (Scheipl, 2008) to test the random effects. Here we perform two tests: one for the REML-based model (mod3a) and one for the H1 vs H0 comparison (mod3b vs mod0). This method is easier and faster than our manual bootstrap. As indicated by the output, the random effect is significant, based on 10,000 bootstrap iterations.

#The RLRsim package of Scheipl et al. (2008) for REML method

library(RLRsim)
exactLRT(mod3b,mod0)
## 
##  simulated finite sample distribution of LRT. (p-value based on
##  10000 simulated values)
## 
## data:  
## LRT = 94.727, p-value < 2.2e-16
exactRLRT(mod3a)
## 
##  simulated finite sample distribution of RLRT.
##  
##  (p-value based on 10000 simulated values)
## 
## data:  
## RLRT = 9.0838, p-value = 9e-04

Estimating the Fixed and Random Effects

For interpreting the mixed model, we may also want confidence intervals for the parameters (regression coefficients) of both the fixed and random effects. This is easily done with the confint() function, setting method = "boot".

confint(mod3a, method="boot")
##                2.5 %    97.5 %
## .sig01      0.000000  5.396374
## .sigma      3.321125  4.971109
## (Intercept) 6.194923 17.186530
## Workrate    9.403491 11.644600

Here, .sig01 is the standard deviation of the between-group random effect, .sigma is the residual standard deviation, and (Intercept) and Workrate are the fixed-effect parameters. Note that these CIs were constructed by bootstrapping rather than from a normality assumption.

Unlike the fixed effects, which are directly estimated as parameters, the random effects are random variables rather than parameters. Since we assume these variables are normally distributed with zero mean and a given variance, their expected value is zero, so there is nothing to estimate in the classical sense. The estimation of random effects does make sense from a Bayesian point of view, however: it becomes the determination of the mean of the posterior density of a random variable (with our distributional assumption as the prior). This computation is done with the ranef() function:

#Estimating random effect

ranef(mod3a)
## $Group
##   (Intercept)
## 1   3.0157215
## 2  -2.8556660
## 3  -1.0872708
## 4   0.9272153
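
These deviations combine with the fixed intercept (11.62) to give each group its own intercept; the coef() function shows the combined per-group coefficients directly:

#Per-group coefficients: intercept = fixed intercept + random deviation;
#the Workrate slope is shared by all groups in this random intercept model

coef(mod3a)$Group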

The 95% CIs of the posterior distributions of the random effects can be displayed graphically as below:

lattice::dotplot(ranef(mod3a,condVar=TRUE))
## $Group

Testing the fixed effect

The pbkrtest package (Halekoh and Højsgaard, 2014) supports the F test with an adjustment of the degrees of freedom (the Kenward-Roger method):

We fit two models, an H0 model without the fixed effect and an H1 model with the full effects, then apply the KRmodcomp() function:

library(pbkrtest)

#Null model: intercept and random Group effect, but no fixed Workrate effect
nullmod=lmer(O2_Uptake~1+(1|Group),data=df,REML=F)
KRmodcomp(mod3b,nullmod)
## F-test with Kenward-Roger approximation; computing time: 0.19 sec.
## large : O2_Uptake ~ Workrate + (1 | Group)
## small : O2_Uptake ~ 1 + (1 | Group)
##          stat     ndf     ddf F.scaling   p.value    
## Ftest 306.137   1.000  46.123         1 < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The fixed effect can also be tested using the ML method. We fit two models, with and without the fixed effect term, then compute the LRT statistic and use the chi-squared approximation to calculate the p value.

#Likelihood ratio test for the fixed effect

as.numeric(2*(logLik(mod3b)-logLik(nullmod)))
## [1] 94.7267
1-pchisq(94.7267,1)   # the two models differ by one fixed-effect parameter
## [1] 0

The accuracy of LRT-based inference can be improved by bootstrapping: first we simulate the response from the H0 model and use it to estimate the LRT; then, by replicating this process many times, we obtain a large random sample of LRT values from which the p value can be computed.

lrstat <- numeric(1000)
for(i in 1:1000){
  ds=unlist(simulate(nullmod))    # simulate a response under H0
  nmodr <- refit(nullmod,ds)      # refit the null model on the simulated data
  modr <- refit(mod3b,ds)         # refit the alternative model
  lrstat[i] <- 2*(logLik(modr)-logLik(nmodr))
}

mean(lrstat > 94.7267)
## [1] 0
quantile(lrstat, c(0.025, 0.975))
##        2.5%       97.5% 
## 0.001675655 4.881886353

The parametric bootstrap can also be performed with the pbkrtest package, whose output provides more information:

pbkrtest::PBmodcomp(mod3b, nullmod)%>%summary()
## Parametric bootstrap test; time: 24.57 sec; samples: 1000 extremes: 0;
## large : O2_Uptake ~ Workrate + (1 | Group)
## small : O2_Uptake ~ 1 + (1 | Group)
##            stat     df    ddf   p.value    
## PBtest   94.727                0.000999 ***
## Gamma    94.727               < 2.2e-16 ***
## Bartlett 82.052  1.000        < 2.2e-16 ***
## F        94.727  1.000 14.947 7.354e-08 ***
## LRT      94.727  1.000        < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

In brief, the analysis showed that the fixed effect of Workrate is significant.

Model selection

It has been recommended that LR testing not be used for model selection, because of the limitations of p values under multiple testing. It is better to select the model based on information criteria such as the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC), or on performance criteria such as RMSE, MSE or R2. The estimation of these criteria is demonstrated below:

#Fitted values and residuals from the REML mixed model
r=predict(mod3a,newdata=df)%>%as.vector()

pd=cbind((df$O2_Uptake-r),r,df$O2_Uptake,df$Workrate,df$Group)%>%as_tibble()

names(pd)=c("Residual","Fitted","Y","X","Group")
pd$Group=as.factor(pd$Group)

true=pd$Y
predicted=pd$Fitted
error=(true-predicted)
rmse=sqrt(mean(error^2))

mse=mean((predicted-true)^2)

R2=1-(sum((true-predicted)^2)/sum((true-mean(true))^2))  # denominator uses the overall mean

cbind(R2,rmse,mse)
##            R2     rmse      mse
## [1,] 0.877008 3.968168 15.74636
#Model selection

AIC(mod3a,mod3b,nullmod,mod0)
##         df      AIC
## mod3a    4 294.4890
## mod3b    4 297.7801
## nullmod  3 390.5068
## mod0     2 388.5068
BIC(mod3a,mod3b,nullmod,mod0)
##         df      BIC
## mod3a    4 302.1371
## mod3b    4 305.4282
## nullmod  3 396.2429
## mod0     2 392.3308

The mixed model fitted by REML has a high R2 of about 0.877, an RMSE of 3.968 and an MSE of 15.746.

Based on the AIC and BIC criteria, this is the best-fitting model, compared with the intercept-only linear model (mod0), the null mixed model with the Workrate effect omitted (nullmod) and the mixed model fitted without REML (mod3b).

Note that fitting the mixed model by REML is better for testing the random effect; however, REML does not allow testing the fixed effect.
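
Related to this point, when nested lmer fits are compared with anova(), models fitted by REML are automatically refitted with ML before the comparison; a quick sketch:

#anova() on lmer fits refits REML models with ML before comparing

anova(mod3a,nullmod)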

Bonus: Fitting the mixed model with the nlme package

library(nlme)

#Equivalent random-intercept model in nlme syntax
mod2=lme(O2_Uptake~Workrate, random = ~ 1|Group, data=df)
summary(mod2)
## Linear mixed-effects model fit by REML
##  Data: df 
##       AIC      BIC    logLik
##   294.489 301.9738 -143.2445
## 
## Random effects:
##  Formula: ~1 | Group
##         (Intercept) Residual
## StdDev:    2.765007 4.160836
## 
## Fixed effects: O2_Uptake ~ Workrate 
##                Value Std.Error DF  t-value p-value
## (Intercept) 11.62328 2.6966534 45  4.31026   1e-04
## Workrate    10.51824 0.5955025 45 17.66280   0e+00
##  Correlation: 
##          (Intr)
## Workrate -0.83 
## 
## Standardized Within-Group Residuals:
##        Min         Q1        Med         Q3        Max 
## -1.7606667 -0.7469620 -0.1848076  0.5569110  2.2178140 
## 
## Number of Observations: 50
## Number of Groups: 4
anova(mod2)
##             numDF denDF   F-value p-value
## (Intercept)     1    45 1157.1007  <.0001
## Workrate        1    45  311.9745  <.0001

Diagnostics

Like any other linear regression, mixed modelling also needs a diagnostic step. The most important diagnostics consist of verifying the assumptions on the residuals, which are still the differences between the observed and predicted values. The QQ plot allows us to check the normality of the residuals and to detect outliers, which can unduly influence the estimation of the random effects. The residuals vs fitted values plot checks our assumption of constant error variance.

#Density of the residuals
pd%>%ggplot(aes(x=Residual))+geom_density(fill="gold",alpha=0.8)+theme_bw()+ggtitle("Residual error distribution")

#Residuals vs fitted values
pd%>%ggplot(aes(y=Residual,x=Fitted))+geom_point(shape=21,alpha=0.8,size=3,aes(fill=Residual))+geom_hline(yintercept=0,linetype=2,color="blue4",size=1)+theme_bw()+scale_fill_gradient2(low="green",mid="gold",high="orangered")+ggtitle("Residuals vs fitted values")

#Residuals vs fitted values, by group
pd%>%ggplot(aes(y=Residual,x=Fitted,fill=Group))+geom_point(shape=21,alpha=0.8,size=3)+geom_hline(yintercept=0,linetype=2,color="blue4",size=1)+theme_bw()+ggtitle("Residuals vs fitted values, by group")

#Normal QQ plot of the residuals
pd%>%ggplot(aes(sample=Residual))+stat_qq(color="orangered",size=2,alpha=0.7)+geom_vline(xintercept=0)+geom_hline(yintercept=0)+scale_y_continuous("Residuals")+scale_x_continuous("Expected normal quantiles")+theme_bw()+ggtitle("Normal plot")

#Half-normal plot of the absolute residuals
pd%>%ggplot(aes(sample=abs(Residual)))+stat_qq(color="orangered")+scale_y_continuous("Absolute Residuals")+scale_x_continuous("Expected normal quantiles")+theme_bw()+ggtitle("Half Normal plot")

#Observed data with the fitted values
pd%>%ggplot()+geom_point(shape=21,alpha=0.5,size=3,aes(x=X,y=Y),fill="black")+geom_smooth(aes(x=X,y=Fitted),color="red3",se=T,fill="red",alpha=0.3,linetype=2)+theme_bw()+ggtitle("Fitted values")
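
Similar checks are also available out of the box (a sketch): lme4 provides a residuals-versus-fitted plot through plot(), and base R gives a quick QQ plot of the residuals.

#Built-in diagnostics for the lmer fit

plot(mod3a)                                  # residuals vs fitted values
qqnorm(resid(mod3a)); qqline(resid(mod3a))   # normal QQ plot of the residuals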

Reference:

J.J. Faraway. Extending the Linear Model with R: Generalized Linear, Mixed Effects and Nonparametric Regression Models, Second Edition. CRC Press (2017)

See you next time, and thank you for joining us!