Using RMarkdown, we explore, review, and analyze the Rail data from the first chapter of Mixed Effects Models in S and S-Plus by Pinheiro and Bates (2000). Following Pinheiro and Bates, the Rail data, which are distributed with the nlme package, are modeled with a mixed-effects model, that is, a model containing both fixed and random factors.
This document reflects initial attempts at using RMarkdown for Pinheiro and Bates’ Rail example. Much of the text is copied directly from Pinheiro and Bates (2000), designated as PB or P&B throughout. Commentary is provided in connection with certain RMarkdown commands and related details. The RMarkdown (.Rmd) source for this example, together with the rendered HTML and PDF output, is available on request from W. Greg Alvord, greg.alvord@nih.gov.
We examine the first mixed models example discussed by Pinheiro and Bates (2000). ‘Six rails were chosen at random; each rail was tested three times by measuring the time it took for a certain type of ultrasonic wave to travel the length of the rail. … The engineers were interested in: (1) the average travel time for a ’typical’ rail (the expected travel time), (2) the variation in average travel time among the rails (between-rail variability), and (3) the variation in the observed travel times for a single rail (the within-rail variability).’ (P&B, p. 4) We see in Figure 1.1 of P&B that there is considerable variability in the mean travel times among the different rails.
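The Rail data and the lme() function used throughout are provided by the nlme package, so it must be loaded before any of the code below is run. A minimal setup chunk (assuming only that nlme is installed):
library(nlme)  # provides the Rail groupedData object, lme(), fixef(), ranef(), etc.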
plot(Rail, xlab = 'travel (nanoseconds)', main = 'PB Figure 1.1 - Rail groupedData data frame')
Figure 1.1 in Pinheiro and Bates. Travel time in nanoseconds for ultrasonic head-waves in a sample of six railroad rails. The times shown are the result of subtracting 36,100 nanoseconds from the original observation.
Here we list the Rail data frame, which is a groupedData data frame. (Option ‘echo = TRUE’ is used in the code chunk.)
Rail
## Grouped Data: travel ~ 1 | Rail
## Rail travel
## 1 1 55
## 2 1 53
## 3 1 54
## 4 2 26
## 5 2 37
## 6 2 32
## 7 3 78
## 8 3 91
## 9 3 85
## 10 4 92
## 11 4 100
## 12 4 96
## 13 5 49
## 14 5 51
## 15 5 50
## 16 6 80
## 17 6 85
## 18 6 83
A fixed-effects model for the one-way classification may be written
\[ \hspace{40 mm}y_{ij} = \beta_{i} + \epsilon_{ij}, \hspace{5 mm} i = 1, ..., M, \hspace{5 mm} j = 1, ...,n_{i} \hspace{20 mm} \mbox{PB (1.2)} \]
where \(\beta_{i}\) represents the mean travel time of rail \(i\) and the errors \(\epsilon_{ij}\) are assumed to be independently distributed as \(\mathcal{N} (0, \sigma^{2})\).
Here is the lm() fit corresponding to P&B equation (1.2).
fm2Rail.lm <- lm(travel ~ as.factor(Rail) - 1, data = Rail)
summary(fm2Rail.lm)
##
## Call:
## lm(formula = travel ~ as.factor(Rail) - 1, data = Rail)
##
## Residuals:
## Min 1Q Median 3Q Max
## -6.6667 -1.0000 0.1667 1.0000 6.3333
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## as.factor(Rail)2 31.667 2.321 13.64 1.15e-08
## as.factor(Rail)5 50.000 2.321 21.54 5.86e-11
## as.factor(Rail)1 54.000 2.321 23.26 2.37e-11
## as.factor(Rail)6 82.667 2.321 35.61 1.54e-13
## as.factor(Rail)3 84.667 2.321 36.47 1.16e-13
## as.factor(Rail)4 96.000 2.321 41.35 2.59e-14
##
## Residual standard error: 4.021 on 12 degrees of freedom
## Multiple R-squared: 0.9978, Adjusted R-squared: 0.9967
## F-statistic: 916.6 on 6 and 12 DF, p-value: 2.971e-15
The coefficient estimates here are the mean travel times for each of the rails. The means could also be obtained with the command
round(with(Rail, tapply(travel, Rail, mean)), 2)
## 2 5 1 6 3 4
## 31.67 50.00 54.00 82.67 84.67 96.00
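Note that the rails appear in the order 2, 5, 1, 6, 3, 4 rather than 1 through 6. Rail is a groupedData object, and its grouping factor is ordered by increasing mean travel time; this ordering (used again below when we index the random effects) can be confirmed directly with a quick check that is not in P&B:
levels(Rail$Rail)  # grouping-factor levels, ordered by increasing mean travel time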
‘Even though the fixed-effects model (1.2) accounts for the rail effects, it does not provide a useful representation of the rails data.’ (P&B, p. 7) It models only the specific sample of rails used in this particular experiment (italics mine), while the main interest is in the population of rails from which the sample was drawn. In particular, fm2Rail.lm does not provide an estimate of the between-rail variability, which is one of the central quantities of interest in the rails experiment. Another drawback of this fixed-effects model is that the number of parameters in the model increases linearly with the number of rails.
A random-effects model circumvents these problems by treating the rail effects as random variations around a population mean. The following reparameterization of model (1.2) helps motivate the random-effects model for the rails data. P&B write
\[ \hspace{51 mm} y_{ij} = \bar{\beta} + (\beta_{i} - \bar{\beta}) + \epsilon_{ij} \hspace{51 mm} \mbox{PB (1.3)} \]
where \(\bar{\beta} = \sum_{i = 1}^6\beta_{i}/6\) represents the average travel time for the rails in the experiment. The random-effects model replaces \(\bar{\beta}\) with the mean travel time over the population of rails and replaces the deviations \(\beta_{i} - \bar{\beta}\) by random variables whose distribution is to be estimated.
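Because the design is balanced (three observations on each of the six rails), \(\bar{\beta}\) is simply the average of the six coefficients from the fixed-effects fit fm2Rail.lm, which for these data equals the grand mean travel time of 66.5. A quick check, not in P&B:
mean(coef(fm2Rail.lm))  # average of the six estimated rail means; 66.5 for the Rail data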
P&B write ‘the random effects model for the one-way classification used in the rails experiment as
\[ \hspace{55 mm} y_{ij} = \beta + b_{i} + \epsilon_{ij}, \hspace{55 mm} \mbox{PB (1.4)} \]
where \(\beta\) is the mean travel time across the population of rails being sampled, \(b_{i}\) is a random variable representing the deviation from the population mean of the mean travel time for the \(i\)th rail, and \(\epsilon_{ij}\) is a random variable representing the deviation in travel time for observation \(j\) on rail \(i\) from the mean travel time for rail \(i\).’ (P&B, p. 7)
P&B (p. 8): ‘To complete the statistical model, we must specify the distribution of the random variables \(b_{i}, i = 1, ..., M\) and \(\epsilon_{ij}, i = 1, ..., M; j = 1, ..., n_{i}\). We begin by modeling both of these as independent, constant-variance, normally distributed random variables with mean zero. The variances are denoted by \(\sigma^{2}_{b}\) for the \(b_{i}\), the between-rail variability, and \(\sigma^{2}\) for the \(\epsilon_{ij}\), the within-rail variability. That is
\[ \hspace{40 mm} b_{i} \sim \mathcal{N}(0, \sigma^{2}_{b}), \hspace{4 mm} \epsilon_{ij} \sim \mathcal{N} (0, \sigma^{2}). \hspace{40 mm} \mbox{PB (1.5)} \]
This model may be modified if it does not seem appropriate.’ As described in Chapter 4, P&B encourage using graphical and numerical diagnostic tools to assess the validity of the model and to suggest ways in which it could be modified.
We fit the lme model for the Rail data. Here is the code chunk with option ‘echo = TRUE’.
fm1Rail.lme <- lme(travel ~ 1, data = Rail, random = ~ 1 | Rail)
Here we show the print() output for the object fm1Rail.lme.
fm1Rail.lme
## Linear mixed-effects model fit by REML
## Data: Rail
## Log-restricted-likelihood: -61.0885
## Fixed: travel ~ 1
## (Intercept)
## 66.5
##
## Random effects:
## Formula: ~1 | Rail
## (Intercept) Residual
## StdDev: 24.80547 4.020779
##
## Number of Observations: 18
## Number of Groups: 6
Now we show the summary() output for the same object.
summary(fm1Rail.lme)
## Linear mixed-effects model fit by REML
## Data: Rail
## AIC BIC logLik
## 128.177 130.6766 -61.0885
##
## Random effects:
## Formula: ~1 | Rail
## (Intercept) Residual
## StdDev: 24.80547 4.020779
##
## Fixed effects: travel ~ 1
## Value Std.Error DF t-value p-value
## (Intercept) 66.5 10.17104 12 6.538173 0
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -1.61882658 -0.28217671 0.03569328 0.21955784 1.61437744
##
## Number of Observations: 18
## Number of Groups: 6
‘This model, which has two sources of random variation, \(b_{i}\) and \(\epsilon_{ij}\), is sometimes called a hierarchical model (references, etc.) or a multilevel model. The \(b_{i}\) are called random effects because they are associated with the particular experimental units - rails in this case - that are selected at random from the population of interest. They are effects because they represent a deviation from an overall mean. That is, the “effect” of choosing rail \(i\) is to shift the mean travel time from \(\beta\) to \(\beta + b_{i}\). Because observations made on the same rail share the same random effect \(b_{i}\), they are correlated. The covariance between observations on the same rail is \(\sigma^{2}_{b}\) corresponding to a correlation of \(\sigma^{2}_{b}/(\sigma^{2}_{b} + \sigma^{2})\).’ (P&B, p. 8)
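As a small illustration (not shown in P&B at this point), we can plug the REML estimates from the summary() output above into this correlation formula:
sigma.b <- 24.80547  # estimated between-rail standard deviation (from the summary above)
sigma <- 4.020779    # estimated within-rail standard deviation (from the summary above)
sigma.b^2 / (sigma.b^2 + sigma^2)  # estimated correlation between observations on the same rail, about 0.97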
‘The parameters of the statistical model created by combining (P&B (1.4)) and (P&B (1.5)) are \(\beta\), \(\sigma^{2}_{b}\), and \(\sigma^{2}\). Note that the number of parameters for this particular problem will always be three, irrespective of the number of rails in the experiment. Although the random effects, \(b_{i}, i = 1, ..., M\) may behave like parameters, formally, they just represent another level of random variation in the model - so we do not “estimate” them as such. We will, however, form predictions \(\hat{b}_{i}\) of the values of these random variables, given the data we observed.’ (P&B, p. 8)
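Although the \(b_{i}\) are not estimated as model parameters, their predicted values (the BLUPs, or best linear unbiased predictions) can be extracted from the fitted model with ranef(); we make use of this below when reconstructing the data for rail 1. This call is not shown by P&B at this point:
ranef(fm1Rail.lme)  # predicted random effects b_i, one per rail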
The REML estimates for the parameters are calculated as
\[ \hat{\beta} = 66.5, \hspace{5 mm} \hat{\sigma_{b}} = 24.805, \hspace{5 mm} \hat{\sigma} = 4.0208 \]
representing the mean travel time, \(\hat{\beta}\), the estimated between-rail standard deviation, \(\hat{\sigma_{b}}\), and the within-rail standard deviation, \(\hat{\sigma}\).
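These quantities can also be extracted programmatically from the fitted object; a short sketch using standard nlme accessors:
fixef(fm1Rail.lme)    # beta-hat: estimated population mean travel time
VarCorr(fm1Rail.lme)  # estimated variance components (between-rail and within-rail)
fm1Rail.lme$sigma     # sigma-hat: estimated within-rail standard deviation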
From the estimates we have obtained from the fm1Rail.lme fit, we will now check that the components indeed add up to the original data. [We jump ahead in the text to re-write the model in matrix form.]
In matrix form, the model for the rails example can be written as
\[ \boldsymbol{y_{i}} = \boldsymbol{X_{i}\beta} + \boldsymbol{Z_{i}}b_{i} + \boldsymbol{\epsilon_{i}} \]
We want to reproduce the observed travel data for rail No. 1, re-displayed here
truth.vector.for.Rail.1 <- with(Rail, Rail == '1')
Rail[truth.vector.for.Rail.1,]
## Grouped Data: travel ~ 1 | Rail
## Rail travel
## 1 1 55
## 2 1 53
## 3 1 54
showing values of 55, 53, and 54.
We create the design matrix for the fixed effects, \(\boldsymbol{X_{1}}\) (X1); the estimate \(\hat{\beta}\) (beta.hat); the predicted (‘estimated’) random effect \(\hat{b}_{1}\) (b1); the design matrix for the random effects, \(\boldsymbol{Z_{1}}\) (Z1); and the residuals \(\epsilon_{1j}\) (epsilon1); and display them.
X1 <- matrix(1, nrow = 3, ncol = 1)  # column of ones: one row per observation on rail 1
X1
## [,1]
## [1,] 1
## [2,] 1
## [3,] 1
beta.hat <- fixef(fm1Rail.lme)  # estimated fixed effect (population mean travel time)
beta.hat
## (Intercept)
## 66.5
b1 <- ranef(fm1Rail.lme)[3,1]  # rail 1 is the third level of the mean-ordered Rail factor, hence row 3
b1
## [1] -12.39148
Z1 <- X1  # for a single random intercept, Z1 is also a column of ones
epsilon1 <- resid(fm1Rail.lme)[1:3]  # within-group residuals for the three observations on rail 1
epsilon1
## 1 1 1
## 0.8914756 -1.1085244 -0.1085244
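Note that \(\hat{\beta} + \hat{b}_{1} \approx 54.11\) is the within-group fitted value for each observation on rail 1, so the residuals above are simply the observed travel times minus fitted(fm1Rail.lme). A side check, not part of P&B’s presentation:
fitted(fm1Rail.lme)[1:3]  # beta-hat + b1-hat, approximately 54.11 for each observation on rail 1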
We re-display \(\boldsymbol{y_{1}}\) (that is, Rail$travel[1:3]), compute \(\boldsymbol{y_{1}}\) from the matrix equation, and check agreement to 12 decimal places to show that the original data have been reproduced (up to floating-point rounding).
## display y1
y1 <- Rail$travel[1:3]
y1
## [1] 55 53 54
##
## Compute y1 as predicted from lme model
y1.computed <- X1 %*% beta.hat + Z1 %*% b1 + epsilon1
y1.computed
## [,1]
## [1,] 55
## [2,] 53
## [3,] 54
##
round(y1, 12) == round(y1.computed, 12)
## [,1]
## [1,] TRUE
## [2,] TRUE
## [3,] TRUE
We have successfully reconstructed the travel data for rail No. 1 using the estimates obtained from the fm1Rail.lme fit.
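The same decomposition holds for every rail: the observed travel times equal the within-group fitted values plus the residuals. As a final check (not in P&B), we can verify this for all 18 observations at once:
all.equal(as.numeric(Rail$travel), as.numeric(fitted(fm1Rail.lme) + resid(fm1Rail.lme)))  # should return TRUE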
P&B state: ‘The fitted model can, and should, be examined using graphical and numerical summaries. One graphical summary that should be examined routinely is a plot of the residuals versus the fitted responses from the model. This plot is used to assess the assumption of constant variance of the \(\epsilon_{ij}\). Because this plot is a common diagnostic, it is the default plot method for a fitted lme model.’ (p. 11)
This next plot reproduces Figure 1.4 in P&B, (p. 11).
plot( fm1Rail.lme, main = 'PB Figure 1.4')
Figure 1.4 in Pinheiro and Bates. Standardized residuals versus the fitted values for the REML fit of a random-effects model to the data from the rail experiment.
‘The standardized residuals shown on the vertical axis … are the raw residuals, \(e_{ij} = y_{ij} - \hat{\beta} - \hat{b_{i}}\), divided by the estimated standard deviation, \(\hat{\sigma}\), of the \(\epsilon_{ij}\).’ (P&B, p. 11) The plot does not show any systematic increase or decrease in the variance of the \(e_{ij}\). The \(e_{ij}\) appear to be homogeneously distributed, indicating a good fit of the model.
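P&B also call for numerical summaries of the fit. One such summary (not reproduced above) is the set of approximate confidence intervals on \(\beta\), \(\sigma_{b}\), and \(\sigma\) returned by nlme’s intervals() function:
intervals(fm1Rail.lme)  # approximate confidence intervals for the fixed effect and the variance components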
Pinheiro, J. C., & Bates, D. M. (2000). Mixed-Effects Models in S and S-PLUS. New York: Springer.