Many repeated measures studies sample data over fixed time points. Under this setting, the time variable could be treated as a factor when the interest is to compare time points to each other. Visually, it is helpful to plot the means of each time group along with additional predictor information. For plotting purposes, it is helpful to create a data set just with summary statistics first.
The following code computes the mean IFN level for each time point and status (symptomatic and asymptomatic). Additional columns include the SE for the mean and a rough CI for each group.
library(tidyverse)
data<-read.csv(file="FluStudy.csv",stringsAsFactors = T)
#View(data)
data<-na.omit(data)
#Cleaning up variable names
names(data)[6]<-"Status"
#Summary stats by time and status
statsIFN<-data %>%
group_by(Status,Hours) %>%
summarise(
count=n(),
IFIT=mean(IFITM3),
sdIFIT=sd(IFITM3)/sqrt(count),
lower=IFIT-2*sdIFIT,
upper=IFIT+2*sdIFIT
)
statsIFN
## # A tibble: 16 × 7
## # Groups: Status [2]
## Status Hours count IFIT sdIFIT lower upper
## <fct> <int> <int> <dbl> <dbl> <dbl> <dbl>
## 1 Asymp 0 8 11.4 0.208 11.0 11.9
## 2 Asymp 12 8 11.3 0.183 11.0 11.7
## 3 Asymp 29 8 11.3 0.231 10.9 11.8
## 4 Asymp 45 8 11.6 0.198 11.2 12.0
## 5 Asymp 53 8 11.4 0.218 11.0 11.9
## 6 Asymp 60 8 11.6 0.338 11.0 12.3
## 7 Asymp 84 8 11.3 0.231 10.8 11.8
## 8 Asymp 108 8 11.1 0.254 10.6 11.6
## 9 Symp 0 9 11.3 0.323 10.6 11.9
## 10 Symp 12 9 11.3 0.307 10.7 11.9
## 11 Symp 29 9 11.5 0.334 10.9 12.2
## 12 Symp 45 9 12.6 0.387 11.8 13.4
## 13 Symp 53 9 13.1 0.313 12.5 13.8
## 14 Symp 60 9 13.4 0.250 12.9 13.9
## 15 Symp 84 9 13.6 0.118 13.3 13.8
## 16 Symp 108 9 13.4 0.111 13.2 13.6
With the mean table, we can use ggplot
to get a sense of
the trend over time by status. The second graph includes plots of
individual measurements that highlight repeated measures on the same
subject.
library(ggplot2)
prof.mean<-ggplot(statsIFN,aes(x=Hours,y=IFIT,color=Status,shape=Status))+
geom_line()+
geom_point(size=2.5)+
labs(y="Mean IFITM3 Expression Values")+
scale_color_manual(values=c("black","skyblue"))+
scale_shape_manual(values=c(15,16))+
ylim(9,15.5)+
guides(shape="none")+
theme_bw()
prof.mean+geom_errorbar(aes(ymin=lower,ymax=upper,color=Status),width=2)
ggplot(statsIFN, aes(x=Hours, y=IFIT, color=Status, shape=Status)) +
geom_line(aes(x=Hours, y=IFITM3, group=Subject, color=" Asymp Patients"), data=data[data$Status == 'Asymp',]) +
geom_line(aes(x=Hours, y=IFITM3, group=Subject, color=" Symp Patients"), data=data[data$Status == 'Symp',]) +
geom_line() +
geom_errorbar(aes(ymin = lower, ymax = upper), width = 0.5) +
geom_point(size = 2.5) +
labs(color="Status", x="Hours", y = "Mean IFITM3 Expression Values") +
ylim(9, 15.5) +
scale_color_manual(values = c("grey90", "grey70", "black", "skyblue")) +
scale_shape_manual(values=c(15,16)) +
guides(shape="none") +
facet_wrap(~Status, nrow = 1) +
theme_bw()
To help get a sense of the importance of repeated measures, I want you to compare the results of performing an analysis on this data set using traditional MLR (ignoring that the independence assumption is broken) versus a model that takes into account the fact that residuals are correlated. The two code chunks below provides two analyses and report some specific time comparisons as well as make residual plots. Compare the output and write a short summary of any differences in the results that exist. You will not see many, but the few that are there are important.
Repeated Measures Model
library(emmeans)
library(nlme)
data$Hours.Fac<-factor(data$Hours)
UN<-gls(IFITM3~Hours.Fac*Status,data=data,correlation=corSymm(form=~1|Subject))
#Comparing changes from time point to baseline
results<-emmeans(UN,~Hours.Fac|Status,mode="df.error")
comparisons<-contrast(results, "trt.vs.ctrl",ref=1,combine = TRUE, adjust = "bonferroni",interaction=T)
#Comparing changes between status at each time point
results2<-emmeans(UN,~Status|Hours.Fac,mode="df.error")
comparisons2<-contrast(results2, "pairwise", combine = TRUE, adjust = "bonferroni")
#Aggregating results
comparisons2.update<-update(comparisons2[1]+comparisons2[2]+comparisons2[3]+comparisons2[4]+
comparisons2[5]+comparisons2[6]+comparisons2[7] )
comparisons
## Status = Asymp:
## Hours.Fac_trt.vs.ctrl estimate SE df t.ratio p.value
## 12 - 0 -0.09083 0.119 92 -0.762 1.0000
## 29 - 0 -0.11191 0.168 92 -0.666 1.0000
## 45 - 0 0.17295 0.298 92 0.580 1.0000
## 53 - 0 -0.01926 0.333 92 -0.058 1.0000
## 60 - 0 0.20063 0.373 92 0.537 1.0000
## 84 - 0 -0.12700 0.332 92 -0.382 1.0000
## 108 - 0 -0.36878 0.328 92 -1.126 1.0000
##
## Status = Symp:
## Hours.Fac_trt.vs.ctrl estimate SE df t.ratio p.value
## 12 - 0 0.00446 0.112 92 0.040 1.0000
## 29 - 0 0.26819 0.158 92 1.692 0.6580
## 45 - 0 1.34953 0.281 92 4.804 <.0001
## 53 - 0 1.87027 0.314 92 5.949 <.0001
## 60 - 0 2.10672 0.352 92 5.986 <.0001
## 84 - 0 2.29410 0.313 92 7.327 <.0001
## 108 - 0 2.16169 0.309 92 6.998 <.0001
##
## Degrees-of-freedom method: df.error
## P value adjustment: bonferroni method for 7 tests
comparisons2.update
## contrast Hours.Fac estimate SE df t.ratio p.value
## Asymp - Symp 0 0.1790 0.382 92 0.469 1.0000
## Asymp - Symp 12 0.0837 0.382 92 0.219 1.0000
## Asymp - Symp 29 -0.2011 0.382 92 -0.526 1.0000
## Asymp - Symp 45 -0.9976 0.382 92 -2.611 0.0737
## Asymp - Symp 53 -1.7106 0.382 92 -4.478 0.0002
## Asymp - Symp 60 -1.7271 0.382 92 -4.521 0.0001
## Asymp - Symp 84 -2.2421 0.382 92 -5.869 <.0001
##
## Degrees-of-freedom method: df.error
## P value adjustment: bonferroni method for 7 tests
MLR Model
MLR<-lm(IFITM3~Hours.Fac*Status,data=data)
#Comparing changes from time point to baseline
results<-emmeans(MLR,~Hours.Fac|Status,mode="df.error")
comparisons<-contrast(results, "trt.vs.ctrl",ref=1,combine = TRUE, adjust = "bonferroni",interaction=T)
#Comparing changes between status at each time point
results2<-emmeans(MLR,~Status|Hours.Fac,mode="df.error")
comparisons2<-contrast(results2, "pairwise", combine = TRUE, adjust = "bonferroni")
#Aggregating results
comparisons2.update<-update(comparisons2[1]+comparisons2[2]+comparisons2[3]+comparisons2[4]+
comparisons2[5]+comparisons2[6]+comparisons2[7] )
comparisons
## Status = Asymp:
## Hours.Fac_trt.vs.ctrl estimate SE df t.ratio p.value
## 12 - 0 -0.09083 0.386 120 -0.235 1.0000
## 29 - 0 -0.11191 0.386 120 -0.290 1.0000
## 45 - 0 0.17295 0.386 120 0.448 1.0000
## 53 - 0 -0.01926 0.386 120 -0.050 1.0000
## 60 - 0 0.20063 0.386 120 0.519 1.0000
## 84 - 0 -0.12700 0.386 120 -0.329 1.0000
## 108 - 0 -0.36878 0.386 120 -0.954 1.0000
##
## Status = Symp:
## Hours.Fac_trt.vs.ctrl estimate SE df t.ratio p.value
## 12 - 0 0.00446 0.364 120 0.012 1.0000
## 29 - 0 0.26819 0.364 120 0.736 1.0000
## 45 - 0 1.34953 0.364 120 3.704 0.0023
## 53 - 0 1.87027 0.364 120 5.133 <.0001
## 60 - 0 2.10672 0.364 120 5.782 <.0001
## 84 - 0 2.29410 0.364 120 6.296 <.0001
## 108 - 0 2.16169 0.364 120 5.933 <.0001
##
## P value adjustment: bonferroni method for 7 tests
comparisons2.update
## contrast Hours.Fac estimate SE df t.ratio p.value
## Asymp - Symp 0 0.1790 0.376 120 0.477 1.0000
## Asymp - Symp 12 0.0837 0.376 120 0.223 1.0000
## Asymp - Symp 29 -0.2011 0.376 120 -0.535 1.0000
## Asymp - Symp 45 -0.9976 0.376 120 -2.656 0.0629
## Asymp - Symp 53 -1.7106 0.376 120 -4.554 0.0001
## Asymp - Symp 60 -1.7271 0.376 120 -4.598 0.0001
## Asymp - Symp 84 -2.2421 0.376 120 -5.970 <.0001
##
## P value adjustment: bonferroni method for 7 tests
Plot of residuals are included below:
ANSWER - Discussion 1:
Comparison of Changes from Time Point to Baseline (within each status group):
Both the Generalized Least Squares (GLS) model and the Multiple Linear Regression (MLR) model yielded identical estimates for changes from the baseline within each status group, indicating agreement on the magnitude of time effects. However, a notable difference was observed in the precision of these estimates. The GLS model, which accounts for within-subject correlations, provided more precise estimates as reflected by smaller standard errors compared to the MLR model. This difference underscores the impact of accounting for the repeated measures aspect in statistical analysis.
Comparison of Changes Between Status at Each Time Point:
Estimates for the differences between symptomatic and asymptomatic groups at each time point remained consistent across both models. However, in contrast to the expected outcome, both models reported identical standard errors and degrees of freedom. This similarity could prompt further investigation into the correlation structure assumed by the GLS model and how the data is structured.
Significance Levels (p-values):
A significant divergence between the models emerged when examining the p-values, especially after applying the Bonferroni adjustment for multiple comparisons. The GLS model demonstrated a more conservative approach, leading to higher p-values for the symptomatic group when compared to the MLR model. Nonetheless, significant effects were still observed in the symptomatic group from the 45-hour time point onwards in both models, indicating a strong and consistent time effect.
Implications of Model Selection:
The presence of significant p-values in the symptomatic group across both models suggests notable changes in IFITM3 expression levels over time. However, the differences in p-values and standard errors highlight the crucial role of model selection in repeated measures data. The GLS model is more appropriate for such data due to its consideration of within-subject correlations, which leads to more reliable inferences. In contrast, the MLR model’s ignorance of this correlation may result in optimistic p-values and underestimation of standard errors.
Consideration of Residuals:
Although not shown, the distribution of residuals from both models would provide additional insights into model fit. The residuals from the GLS model are expected to show less pattern, indicating a better fit due to the consideration of correlation. In contrast, the MLR model may exhibit patterns in the residuals, hinting at potential violations of independence. QQ plots would further reveal deviations from normality, particularly in the MLR model, due to the neglect of the repeated measures structure.
Conclusion:
The analysis confirms the importance of using appropriate statistical methods for repeated measures data. The GLS model’s ability to account for within-subject correlation provides more reliable standard error estimates and p-values, leading to more accurate inferences. The significant differences detected by both models from the 45-hour time point onward affirm the presence of differences in IFITM3 expression over time, particularly between symptomatic and asymptomatic subjects. These results emphasize the necessity of methodological rigor in statistical analyses to ensure the validity of conclusions drawn from the data.
Plot interpretation
Summary
In summary, the Repeated Measures GLS model is more suitable for this type of data. It accounts for within-subject correlation, providing more reliable and valid statistical inferences. The MLR model’s shortcomings are evident in the patterns observed in the residuals plot, which suggests it is not the best fit for repeated measures data.
As mentioned during the videos, repeated measures often gets lumped
into ANOVA type settings where all of the predictors are categorical.
Repeated measures can exist in almost any setting a normal MLR model
could be utilized. The following data set is the results of an
experiment studying the weight gain of rats on different diets. It is
actually available in the nlme package where the
gls
function lives.
rats<-data.frame(BodyWeight)
head(rats)
## weight Time Rat Diet
## 1 240 1 1 1
## 2 250 8 1 1
## 3 255 15 1 1
## 4 260 22 1 1
## 5 262 29 1 1
## 6 258 36 1 1
For this data set, it is expected that the weight gain (or loss) should behave in a linear way over time. The graph below provides a plot weight over time with an MLR fit for each diet group
ggplot(rats, aes(x=Time, y=weight,color=Diet)) +geom_point(size=.5)+
geom_smooth(method="lm",formula="y~x")
## Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
## ℹ Please use `linewidth` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
#Compound Symmetry
CS<-gls(weight~Time*Diet,data=rats,correlation=corCompSymm(form=~1|Rat))
#Autoregressive
AR1<-gls(weight~Time*Diet,data=rats,correlation=corExp(form=~Time|Rat))
AIC(CS)
## [1] 1248.245
AIC(AR1)
## [1] 1168.088
lm
function and obtain a summary of the coefficients,
t-stats, and p-values. Compare the two results and comment on the
differences you see.coef(summary(AR1))
## Value Std.Error t-value p-value
## (Intercept) 250.3364321 13.3425214 18.762303 2.850308e-43
## Time 0.3670501 0.1042628 3.520431 5.530914e-04
## Diet2 202.2653798 23.1099250 8.752317 2.008209e-15
## Diet3 257.4918656 23.1099250 11.142047 5.375365e-22
## Time:Diet2 0.6606475 0.1805885 3.658303 3.383995e-04
## Time:Diet3 0.2917069 0.1805885 1.615312 1.080974e-01
ANSWER Discussion 2:
A. Independence of Residuals in MLR Model: The first plot provided is a correlogram of the residuals from an MLR model that includes both diet and time as predictors for rat weight. The correlogram shows the correlation between residuals as a function of the time distance between observations. We see a clear decay in the correlation as the distance increases, which suggests that residuals closer in time are more correlated with each other than residuals further apart. This is a violation of the independence assumption of ordinary least squares regression, which assumes that the residuals are independent of each other.
B. Appropriate Correlation Structure: Given that the correlogram suggests a decay in correlation as time increases, an autoregressive correlation structure might be more appropriate than compound symmetry. Compound symmetry assumes constant correlation between all pairs of observations within a subject, which does not seem to fit the observed pattern of decay in correlation over time.
Indeed, when comparing the AIC (Akaike Information Criterion) values for the two correlation structures (CS and AR1), the AR1 model has a significantly lower AIC value (1168.088) compared to the CS model (1248.245). A lower AIC value indicates a better model fit when comparing models fitted to the same data. Therefore, the AR1 model is a better fit for the data, confirming the initial intuition based on the correlogram.
C. Model Comparison: Comparing the results of the gls model with the correlation structure and the traditional lm model, we can look at the coefficients, t-statistics, and p-values. The lm model results would not account for the correlation structure and thus may give different standard errors and potentially different significance levels for the predictors.
From the summary output of the AR1 model (which includes the autoregressive correlation structure), we see that the coefficients for the interaction terms (Time:Diet2 and Time:Diet3) are significant for Diet2 (p < 0.001) but not significant for Diet3 (p > 0.1). This suggests that the effect of time on weight is different for rats on Diet2 compared to those on Diet1, but there is not a statistically significant difference in the time effect between rats on Diet3 and Diet1.
In summary, the correlogram indicates that the MLR model’s assumption of independent residuals is violated, and an AR correlation structure is more appropriate for this data. When using gls with an AR structure, the results suggest that the rate of weight gain over time differs by diet, specifically between Diet1 and Diet2. The difference in AIC values between the CS and AR models further confirms that the AR model is a better fit for this data set.
Points of Consideration:
1.(Intercept): The baseline weight is estimated at 250.34 when time is zero and the rat is on diet 1.
2.Time: There is an average weight gain of 0.367 units per time unit for rats on diet 1.
3. Diet2 and Diet3: Rats on diets 2 and 3 have a higher baseline weight compared to diet 1 by 202.27 and 257.49 units, respectively.
4. Time:Diet2 and Time:Diet3: The interaction terms suggest that the rate of weight gain over time differs between diets. For diet 2, weight gain increases by an additional 0.661 units per time unit over the baseline rate, while for diet 3, the increase is an additional 0.292 units per time unit. However, the interaction term for diet 3 is not statistically significant (p > 0.05), suggesting that the rate of weight gain over time for diet 3 is not significantly different from diet 1.