| id | group | time |
|---|---|---|
| 1 | P | 6546 |
| 1 | CR | 5264 |
| 1 | CHO | 6370 |
| 1 | CR-CHO | 6240 |
| 2 | P | 4515 |
| 2 | CR | 7320 |
Crossover experiment
From client:
Our study looks at the effects of creatine, carbohydrates and creatine-carbohydrate mix on endurance performance. We have 10 participants take a placebo, and the four treatments listed above with a 3 week washout [period] between each trial. Each trial was a ride to exhaustion, meaning a fixed workload of 65% of Vo2 max until they couldn’t ride any more.
Data provided:
Request:
- Assistance with model specification to test for a treatment effect
- Review provided analysis codes for accuracy
Assessment
The treatments have factorial structure, so can be encoded in terms of two factors (instead of a single “treatment” variable). This will facilitate estimating effects of CR and CHO separately as well as their interaction. I’d also suggest recording time in minutes instead of seconds for ease of interpretation.
| subj.id | cr | cho | time.min |
|---|---|---|---|
| 1 | N | N | 109.1 |
| 1 | Y | N | 87.73 |
| 1 | N | Y | 106.2 |
| 1 | Y | Y | 104 |
| 2 | N | N | 75.25 |
| 2 | Y | N | 122 |
Visually there’s no obvious treatment effect – most rides last around 120 minutes regardless of treatment group.
Ignoring crossover
The usual model without considering the crossover design would be:
\[ \text{time}_{ijk} = \mu + \text{cr}_i + \text{cho}_j + (\text{cr}\times\text{cho})_{ij} + \epsilon_{ijk} \quad \begin{cases} \text{treatments } i, j = 1, 2 \\ \text{subjects } k = 1, \dots, 10 \\ \epsilon_{ijk} \stackrel{\text{iid}}{\sim} N(0, \sigma^2) \end{cases} \] It’s often helpful to fit this for assessment purposes.
Code
# fit
fit.lm <- lm(time.min ~ cr*cho, data = clean)
fit.lm
Call:
lm(formula = time.min ~ cr * cho, data = clean)
Coefficients:
(Intercept) crY choY crY:choY
116.028 -9.208 2.905 10.220
The coefficient estimates give comparisons with the control:
(Intercept): mean ride time for the control groupcrY: the difference in mean ride time on the Creatine treatment relative to the controlchoY: the difference in mean ride time on the Carbohydrate treatment relative to the controlcrY:choY: the difference in mean ride time on the combination treatment relative to the control
Coding the treatments in a two-way structure allows one to check factorial contrasts, which are the average effects of each treatment.
cr: estimated effect of CR averaged over levels of CHOcho: estimated effect of CHO, averaged over levels of CRcr.cho: estimated difference in the estimated effect of CHO between levels of CR
Code
# factorial contrasts
fac.eff.emmc <- function(levels, ...){
data_grid(clean, cr, cho) |>
bind_cols(cr.contr = 0.5*rep(c(-1, 1), each = 2),
cho.contr = 0.5*rep(c(-1, 1), 2)) |>
mutate(cr.cho.contr = (2*cr.contr)*(2*cho.contr)) |>
arrange(cho, cr) |>
select(ends_with('contr')) |>
rename_with(~str_remove_all(.x, '.contr'))
}
fit.lm |>
emmeans(spec = ~ cr*cho) |>
contrast('fac.eff') |>
broom::tidy() |>
select(contrast, estimate, std.error, df, statistic, p.value) |>
pander()| contrast | estimate | std.error | df | statistic | p.value |
|---|---|---|---|---|---|
| cr | -4.098 | 13.19 | 36 | -0.3107 | 0.7578 |
| cho | 8.015 | 13.19 | 36 | 0.6077 | 0.5472 |
| cr.cho | 10.22 | 26.38 | 36 | 0.3875 | 0.7007 |
The standard errors don’t account for the crossover design so these tests aren’t appropriate, but they give a preliminary indication that there are likely no treatment effects. Creatine is associated with a slight decrease in mean ride time (estimate: -4m) and carbohydrate with a slight decrease (estimate: +8m). The interaction effect suggests that if carbohydrate is added to the creatine treatment, then the effect on mean ride time changes from -11m to -1m.
Adjusting for within-subject correlations
As a result of the crossover design, one would expect that trials on the same subject would be correlated. We can check this by looking at residuals by subject using the assessment model above. It’s fairly clear that there is some within-subject correlation, since the residuals for each subject are fairly consistent across treatments – e.g., subjects 4 and 5 are consistently riding about 50min faster than average regardless of treatment.
The usual approach here is to add subject as a random effect:
\[ \text{time}_{ijk} = \mu + \text{cr}_i + \text{cho}_j + (\text{cr}\times\text{cho})_{ij} + \gamma_k + \epsilon_{ijk} \quad \begin{cases} \text{treatments } i, j = 1, 2 \\ \text{subjects } k = 1, \dots, 10 \\ \gamma_k \stackrel{\text{iid}}{\sim} N(0, \sigma^2_\gamma) \\ \epsilon_{ijk} \stackrel{\text{iid}}{\sim} N(0, \sigma^2) \\ \gamma_k \perp \epsilon_{ijk} \end{cases} \]
This is the model that the GPT-generated code provided by the client fits. In this model the variance component \(\sigma^2_\gamma\) captures within-subject correlations.
Code
# client-provided GPT-generated code
fit.lme <- lmer(time.min ~ cr*cho + (1|subj.id), data = clean)
summary(fit.lme)Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: time.min ~ cr * cho + (1 | subj.id)
Data: clean
REML criterion at convergence: 346.1
Scaled residuals:
Min 1Q Median 3Q Max
-1.7197 -0.4575 -0.0550 0.6184 1.9747
Random effects:
Groups Name Variance Std.Dev.
subj.id (Intercept) 1411.0 37.56
Residual 328.4 18.12
Number of obs: 40, groups: subj.id, 10
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 116.028 13.189 12.104 8.798 1.32e-06 ***
crY -9.208 8.104 27.000 -1.136 0.266
choY 2.905 8.104 27.000 0.358 0.723
crY:choY 10.220 11.461 27.000 0.892 0.380
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Correlation of Fixed Effects:
(Intr) crY choY
crY -0.307
choY -0.307 0.500
crY:choY 0.217 -0.707 -0.707
What you see under Fixed effects in the output provides model estimates that match lm but with different standard errors that have now been adjusted for the within-subject correlation. What you see under Random effects are the variance component estimates \(\hat{\sigma}^2_\gamma\) (subj.id) and \(\hat{\sigma}^2\) (Residual). The estimated within-subject correlation is:
\[ \frac{\hat{\sigma}^2_\gamma}{\hat{\sigma}^2_\gamma + \hat{\sigma}^2} = 0.81 \]
Notice that without adjusting for within-subject correlations the estimated error standard deviation is \(\hat{\sigma} = 41.71\) but that after adjustment using the mixed-effects model the estimate is \(\hat{\sigma} = 18.12\). Since these estimates form the denominators for \(F\) tests in the ANOVA, including the random effect makes a considerable difference for the statistical inference.
Nonetheless, the proper analysis still does not identify any treatment effects:
Code
# anova using mixed effects model
anova(fit.lme) |>
pander()| Sum Sq | Mean Sq | NumDF | DenDF | F value | Pr(>F) | |
|---|---|---|---|---|---|---|
| cr | 168 | 168 | 1 | 27 | 0.5115 | 0.4806 |
| cho | 642.4 | 642.4 | 1 | 27 | 1.956 | 0.1733 |
| cr:cho | 261.1 | 261.1 | 1 | 27 | 0.7952 | 0.3804 |
The factorial effects are similarly insignificant:
| contrast | estimate | SE | df | t.ratio | p.value |
|---|---|---|---|---|---|
| cr | -4.098 | 5.73 | 27 | -0.7152 | 0.4806 |
| cho | 8.015 | 5.73 | 27 | 1.399 | 0.1733 |
| cr.cho | 10.22 | 11.46 | 27 | 0.8917 | 0.3804 |
The only thing that comes out significant in this analysis is the effect of subject in the crossover design – data indicate a strong within-subject correlation (\(\chi^2_1 = 33.91\), \(p < 0.0001\)).
Code
# test for effect of subject
ranova(fit.lme) ANOVA-like table for random-effects: Single term deletions
Model:
time.min ~ cr + cho + (1 | subj.id) + cr:cho
npar logLik AIC LRT Df Pr(>Chisq)
<none> 6 -173.03 358.07
(1 | subj.id) 5 -189.99 389.98 33.91 1 5.772e-09 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Summary
The codes provided fit a reasonable model that should indeed be used instead of a standard ANOVA assuming independent observations. The specification is explained in detail above. None of the treatment effects are significant, but there is a strong effect of subject in the crossover design.