Exercise 1: Conceptual questions

  1. What is the primary impact of running a regression model while ignoring the fact that the data contain repeated measures?

Answer A:

  1. Inefficient estimates: When the mean structure is specified correctly, the coefficient estimates usually remain unbiased, but they are less precise than estimates from a model that accounts for the correlation within each subject.
  2. Incorrect standard errors: This is the primary impact. The standard errors, which indicate the precision of the estimates, are computed as if every observation were independent, so they are incorrect. This leads to:
    • Invalid p-values: These reflect the significance of the results, but they are unreliable when the standard errors are incorrect.
    • Unreliable confidence intervals: These represent the range within which the true effect is likely to lie, but they are inaccurate when the standard errors are wrong.
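
As a rough illustration of how far off the standard errors can be, the sketch below uses the standard design-effect approximation: with m measurements per subject and a constant within-subject correlation rho, the variance of a between-subject comparison is inflated by about 1 + (m - 1) * rho relative to what an independence model reports. The function name `design_effect` and the value rho = 0.5 are illustrative assumptions, not quantities estimated from the monkey data.

```r
# Approximate variance inflation for a between-subject comparison
# when m correlated measurements per subject are treated as independent.
design_effect <- function(m, rho) {
  1 + (m - 1) * rho
}

m <- 5      # measurements per subject (five training weeks in this exercise)
rho <- 0.5  # assumed within-subject correlation (hypothetical value)

deff <- design_effect(m, rho)
se_ratio <- sqrt(deff)  # factor by which the naive standard error is too small

print(deff)
print(round(se_ratio, 3))
```

Under these assumed values, the naive standard error for a group comparison would be too small by a factor of about the square root of the design effect, which is why p-values and confidence intervals from the independence model cannot be trusted.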

  2. If the correlogram of your residuals shows that the correlation stays relatively constant across differences in time, what correlation structure would be appropriate to model that behavior?

Answer B:

  1. If the correlogram shows roughly constant correlation across time lags for the residuals, the appropriate structure is compound symmetry (corCompSymm in nlme), also called uniform or exchangeable correlation.

  2. This structure assumes that the correlation (ρ) between any two residuals is constant regardless of the time difference (lag) between them.
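
To make the contrast concrete, here is a minimal sketch that builds a compound symmetry correlation matrix next to an AR1 matrix for five time points. The helper names `cs_matrix` and `ar1_matrix` and the value rho = 0.4 are assumptions chosen for illustration.

```r
# Compound symmetry: every pair of time points shares the same correlation.
cs_matrix <- function(n, rho) {
  m <- matrix(rho, nrow = n, ncol = n)
  diag(m) <- 1
  m
}

# AR1: correlation decays as rho^lag, where lag is the gap between time points.
ar1_matrix <- function(n, rho) {
  rho^abs(outer(seq_len(n), seq_len(n), "-"))
}

cs <- cs_matrix(5, 0.4)
ar1 <- ar1_matrix(5, 0.4)

# First row of each matrix: correlation of time point 1 with time points 1 to 5.
print(cs[1, ])
print(round(ar1[1, ], 3))
```

A flat correlogram resembles the first row of `cs`, while a correlogram that decays with lag resembles the first row of `ar1`.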

  3. Load the nlme package and type ?corStruct. Provide a list of the correlation structure names. Keep in mind that while we’ve explored AR1 (corExp) and CS (corCompSymm), we can always try others to get a better fit.
# Load the nlme package
library(nlme)

# Open the help page for the corStruct classes
?corStruct

# Manually list the commonly used correlation structures
correlation_structures <- c("corAR1", "corARMA", "corCAR1", "corCompSymm", "corExp",
                            "corGaus", "corLin", "corRatio", "corSpher", "corSymm")

# Print the list
print(correlation_structures)
##  [1] "corAR1"      "corARMA"     "corCAR1"     "corCompSymm" "corExp"     
##  [6] "corGaus"     "corLin"      "corRatio"    "corSpher"    "corSymm"
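
As a small sanity check on the manually typed list, the sketch below confirms that each name is a function exported by nlme. It assumes nlme is installed (it ships with standard R distributions); the variable name `is_exported` is illustrative.

```r
library(nlme)

correlation_structures <- c("corAR1", "corARMA", "corCAR1", "corCompSymm", "corExp",
                            "corGaus", "corLin", "corRatio", "corSpher", "corSymm")

# TRUE for every name that nlme exports on the search path
is_exported <- correlation_structures %in% ls("package:nlme")
names(is_exported) <- correlation_structures
print(is_exported)
```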

Exercise 2: Brain Sites of Short and Long Term Memory

In the late 1980s and early 1990s, it was hypothesized that the hippocampal region of the brain is a potential site for short-term memory but not long-term memory. To test this, 18 monkeys were trained to discriminate 100 pairs of objects. After their training, 11 monkeys were randomly selected and underwent a procedure that blocked access to their hippocampal formations. The remaining 7 were left untreated.

The interesting bit here is that not all 100 pairs of objects were trained at the same time. All the monkeys were trained on 20 pairs 2 weeks prior to the blockage, 20 pairs 4 weeks prior, and 20 pairs each at 8, 12, and 16 weeks prior. After the blockage to the hippocampus, all monkeys were given a test to see how many pairs of objects they could discriminate. If the hypothesis were true, there should be a difference between the two groups of monkeys in their performance on objects trained 2 and 4 weeks prior (short-term memory), while there should not be much of a difference for objects trained longer before the blockage (long-term memory).

A plot of the data set is below. The response is the NewPercentCorrect variable, which is the percentage of the 20 pairs from each training week that were discriminated correctly. Use the data set to answer the following questions.

  1. Based on the graph, the trend does not appear linear enough to model Week as a numeric variable. Additionally, the question of interest is how treatment and control compare at each training week (prior to hippocampal blockage), so treating Week as a factor will be helpful. Based on the graph, do you think an interaction term is needed between Treatment and Week? Explain your viewpoint.

  2. Use the modeling code from the Prelive Assignment to obtain a correlogram of the residuals from a basic MLR model. Be sure to include Week as a factor variable and not numeric. Comment on whether the correlogram clearly points to a specific correlation structure.

  3. Fit two repeated measures models, one compound symmetry and one AR1. Compare the AIC values. You can optionally try the unstructured one if it will run.

  4. Based on the best AIC fit, use code similar to that in the Prelive Assignment to compare the treatment groups at each Week using a Bonferroni correction. Do the data suggest that the researchers’ hypothesis is reasonable? Point to which tests support your answer.

Answers Exercise 2:

A. I analyzed the plot and observed the following:

  • Non-linear trend: Both groups’ performance slopes change, indicating a non-linear relationship between weeks and scores.
  • Treated group: Performance drops significantly around Week 2 and fluctuates afterwards.
  • Control group: Performance stays relatively stable.

These contrasting patterns suggest the treatment’s effect varies across weeks. The difference in slopes, especially the treated group’s drop, highlights a week-dependent relationship with performance. Therefore, including an interaction term between Treatment and Week in the statistical model is crucial to capture this complex relationship.
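
To show what the interaction term adds to the model, the sketch below compares the design matrices of an additive model and an interaction model on a small grid of the two groups and five weeks. The grid is a stand-in for illustration, not the actual monkey data.

```r
# One row per Treatment-by-Week combination (illustrative grid, not the real data)
design <- expand.grid(
  Treatment = factor(c("Control", "Treated")),
  Week = factor(c(2, 4, 8, 12, 16))
)

X_additive <- model.matrix(~ Treatment + Week, data = design)
X_interaction <- model.matrix(~ Treatment * Week, data = design)

# Number of mean parameters in each model
print(ncol(X_additive))
print(ncol(X_interaction))

# Extra columns contributed by the interaction
print(setdiff(colnames(X_interaction), colnames(X_additive)))
```

The additive model forces the same treatment gap at every week, while the interaction model has one extra parameter for each week after the first, which lets the gap between the groups differ from week to week.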

B: Comparison of repeated measures and MLR models:

While estimates are similar, the GLS model provides more precise estimates due to smaller standard errors for within-group changes. Notably, accounting for repeated measures can potentially alter conclusions about statistical significance in specific cases. Therefore, using the appropriate model like GLS is crucial for reliable and valid statistical inferences in repeated measures data.
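
For the correlogram asked for in question 2, one possible approach is sketched below on simulated data, since the Prelive Assignment code is not reproduced here. The data frame `sim`, its column names, and the simulation settings are all hypothetical. The sketch fits a basic MLR model with Week as a factor, reshapes the residuals to one row per subject, and averages the residual correlations by lag, where lag counts measurement occasions rather than weeks.

```r
set.seed(1)

weeks <- c(2, 4, 8, 12, 16)
n_subj <- 18

# Simulated stand-in data: Week varies fastest within each Subject
sim <- expand.grid(Week = factor(weeks), Subject = factor(seq_len(n_subj)))
sim$Group <- factor(ifelse(as.integer(sim$Subject) <= 11, "Treated", "Control"))

# A subject-level shift induces roughly constant within-subject correlation
subject_shift <- rnorm(n_subj, sd = 8)
sim$y <- 70 + subject_shift[as.integer(sim$Subject)] + rnorm(nrow(sim), sd = 6)

# Basic MLR model that ignores the repeated measures
fit <- lm(y ~ Group * Week, data = sim)
sim$res <- resid(fit)

# One row per subject, one residual column per week
wide <- reshape(sim[, c("Subject", "Week", "res")],
                idvar = "Subject", timevar = "Week", direction = "wide")
res_cor <- cor(wide[, -1])

# Average correlation at each lag (difference in occasion index)
lag_index <- abs(outer(seq_along(weeks), seq_along(weeks), "-"))
upper <- upper.tri(res_cor)
lag_means <- tapply(res_cor[upper], lag_index[upper], mean)
print(round(lag_means, 2))
```

Plotting `lag_means` against lag gives the correlogram: a roughly flat pattern would point toward compound symmetry, and a pattern that decays with lag would point toward AR1.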

C:

```{r}
library(nlme)

# Read the data and inspect the column names
monkey <- read.csv("monkey.csv", stringsAsFactors = TRUE)
print(names(monkey))

# Treat Week as a factor rather than a numeric variable
monkey$Week <- factor(monkey$Week)

# Compound symmetry (CS) correlation within each monkey
cs_model <- lme(NewPerCorrect ~ Treatment * Week, random = ~1 | Monkey, data = monkey,
                correlation = corCompSymm(form = ~1 | Monkey))
summary(cs_model)

# AR1 correlation within each monkey (uses the row order within Monkey as the time order)
ar1_model <- lme(NewPerCorrect ~ Treatment * Week, random = ~1 | Monkey, data = monkey,
                 correlation = corAR1(form = ~1 | Monkey))
summary(ar1_model)

# Compare the AIC values of the two fits
aic_cs <- AIC(cs_model)
aic_ar1 <- AIC(ar1_model)
print(paste("AIC for CS model:", aic_cs))
print(paste("AIC for AR1 model:", aic_ar1))

# Optionally try the unstructured correlation; try() lets the script continue if the fit fails
try({
  us_model <- lme(NewPerCorrect ~ Treatment * Week, random = ~1 | Monkey, data = monkey,
                  correlation = corSymm(form = ~1 | Monkey))
  print(summary(us_model))
  aic_us <- AIC(us_model)
  print(paste("AIC for Unstructured model:", aic_us))
}, silent = TRUE)
```

Repeated measures vs. MLR:

Similarities:

  • Similar estimates for within-group changes and group differences.
  • Mostly identical p-values after correction.

Differences:

The GLS model has smaller standard errors for within-group changes.

Takeaway:

GLS provides more precise estimates and might influence significance in some cases. Using the appropriate model (like GLS) is crucial for reliable inferences in repeated measures data.

Week = 4:
 contrast          estimate  SE df t.ratio p.value
 Control - Treated    19.08 6.4 16   2.981  0.0088

Week = 8:
 contrast          estimate  SE df t.ratio p.value
 Control - Treated     6.29 6.4 16   0.982  0.3407

Week = 12:
 contrast          estimate  SE df t.ratio p.value
 Control - Treated    -9.10 6.4 16  -1.423  0.1740

Week = 16:
 contrast          estimate  SE df t.ratio p.value
 Control - Treated     3.75 6.4 16   0.586  0.5658

Degrees-of-freedom method: containment

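INTERPRETATION: Pairwise comparisons of Control versus Treated at each week show:

  1. Week 2: The control group outperformed the treated group by 14.26 percentage points (p = 0.0406).
  2. Week 4: The control group’s lead increased to 19.08 percentage points (p = 0.0088).
  3. Weeks 8, 12, 16: No significant differences (p-values: 0.3407, 0.1740, 0.5658, respectively).

The listed p-values match unadjusted two-sided t-tests (for example, t = 2.981 on 16 df gives p ≈ 0.0088), which is what would be expected if the adjustment was applied separately within each week, where there is only one contrast. Applying the Bonferroni correction across the five weekly comparisons means comparing each p-value with 0.05/5 = 0.01. Against that threshold, Week 4 remains significant, while Week 2 (p = 0.0406) does not.

  • The significant Week 4 difference, together with a Week 2 difference in the same direction, is consistent with the idea that the hippocampal region influences short-term memory. The treated group, with impaired hippocampal function, performed worse on the most recently trained objects.
  • The lack of significant differences for objects trained 8, 12, and 16 weeks prior is consistent with a reduced role of the hippocampus in long-term memory, although a non-significant test does not by itself show that there is no difference.

Conclusion: These findings lend support to the hypothesis that the hippocampus is more crucial for short-term than long-term memory. The Week 4 test supports it most directly, the Week 2 difference points the same way without surviving the correction, and the tests at Weeks 8, 12, and 16 show no significant group differences.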
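
As a rough check on the multiplicity correction, the sketch below applies a Bonferroni adjustment across the five weekly comparisons with base R's `p.adjust`. It assumes the five p-values quoted above are unadjusted, which the last two lines probe by recomputing the Week 4 p-value from its t ratio and degrees of freedom. The variable names are illustrative.

```r
weeks <- c(2, 4, 8, 12, 16)

# p-values as quoted in the interpretation (assumed to be unadjusted)
p_raw <- c(0.0406, 0.0088, 0.3407, 0.1740, 0.5658)
names(p_raw) <- paste("Week", weeks)

# Bonferroni across the family of five comparisons: multiply by 5, cap at 1
p_bonf <- p.adjust(p_raw, method = "bonferroni")
print(p_bonf)
print(p_bonf < 0.05)

# Recompute the unadjusted two-sided p-value for Week 4 from t = 2.981 on 16 df
p_week4 <- 2 * pt(-abs(2.981), df = 16)
print(round(p_week4, 4))
```

Comparing the adjusted p-values with 0.05 is equivalent to comparing the unadjusted ones with 0.05/5 = 0.01.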