Designing Experiments and Analyzing Data: A Model Comparison Perspective (3rd edition)

R Code and Instructions to Accompany Chapter 11

These notes build on the prior chapters’ R Code and Instructions, where we discuss some operational aspects of R and detail the use of many functions and how to install packages. Here, we skip details that have already been discussed in the prior R Code and Instructions.

Load the AMCP package.

library(AMCP)
data(chapter_11_table_1)
chapter_11_table_1

##   YCondition1 YCondition2
## 1           8          10
## 2           3           6
## 3          12          13
## 4           5           9
## 5           7           8
## 6          13          14

str(chapter_11_table_1)

## 'data.frame':    6 obs. of  2 variables:
##  $ YCondition1: int  8 3 12 5 7 13
##  $ YCondition2: int  10 6 13 9 8 14

Paired samples \(t\)-test. Note that the square of the resulting t-value below is equal to the F-value shown just after Table 11.3.

t.test(x=chapter_11_table_1$YCondition1, paired=TRUE, y=chapter_11_table_1$YCondition2)

## 
##  Paired t-test
## 
## data:  chapter_11_table_1$YCondition1 and chapter_11_table_1$YCondition2
## t = -3.873, df = 5, p-value = 0.01172
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -3.3274428 -0.6725572
## sample estimates:
## mean of the differences 
##                      -2

data(chapter_11_table_5)
chapter_11_table_5

##    Months30 Months36 Months42 Months48
## 1       108       96      110      122
## 2       103      117      127      133
## 3        96      107      106      107
## 4        84       85       92       99
## 5       118      125      125      116
## 6       110      107       96       91
## 7       129      128      123      128
## 8        90       84      101      113
## 9        84      104      100       88
## 10       96      100      103      105
## 11      105      114      105      112
## 12      113      117      132      130

str(chapter_11_table_5)

## 'data.frame':    12 obs. of  4 variables:
##  $ Months30: int  108 103 96 84 118 110 129 90 84 96 ...
##  $ Months36: int  96 117 107 85 125 107 128 84 104 100 ...
##  $ Months42: int  110 127 106 92 125 96 123 101 100 103 ...
##  $ Months48: int  122 133 107 99 116 91 128 113 88 105 ...

Marginal means

# Means by column across rows. 
apply(chapter_11_table_5, MARGIN=1, FUN=mean)

##  [1] 109 120 104  90 121 101 127  97  94 101 109 123

# Means by row across columns. 
apply(chapter_11_table_5, MARGIN=2, FUN=mean)

## Months30 Months36 Months42 Months48 
##      103      107      110      112

First, we need to change the structure of the data. We use PP here to denote “person-period”, which means that the data will be in the “univariate” (or “long”) form from the “multivariate” form it is currently in. We find the easiest way to do this is using the tidyr and dplyr packages.

library(tidyr)
library(dplyr)

## 
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

# Add an ID variable (by row number)
C11_T5_PP <- chapter_11_table_5
C11_T5_PP$ID <- as.factor(1:nrow(C11_T5_PP))

# Note that the "sep=-3" partitions between the 3rd and 2nd value from the end. 
# The "-ID" here means to leave ID as it is (i.e., without "gathering" it)
C11_T5_PP = C11_T5_PP %>%
gather(key=Unit.Month, value=Values, -ID) %>%
separate(col=Unit.Month, into=c("Unit", "Month"), sep=-3) %>%
arrange(ID, Month)

# Display the data
C11_T5_PP

##    ID   Unit Month Values
## 1   1 Months    30    108
## 2   1 Months    36     96
## 3   1 Months    42    110
## 4   1 Months    48    122
## 5   2 Months    30    103
## 6   2 Months    36    117
## 7   2 Months    42    127
## 8   2 Months    48    133
## 9   3 Months    30     96
## 10  3 Months    36    107
## 11  3 Months    42    106
## 12  3 Months    48    107
## 13  4 Months    30     84
## 14  4 Months    36     85
## 15  4 Months    42     92
## 16  4 Months    48     99
## 17  5 Months    30    118
## 18  5 Months    36    125
## 19  5 Months    42    125
## 20  5 Months    48    116
## 21  6 Months    30    110
## 22  6 Months    36    107
## 23  6 Months    42     96
## 24  6 Months    48     91
## 25  7 Months    30    129
## 26  7 Months    36    128
## 27  7 Months    42    123
## 28  7 Months    48    128
## 29  8 Months    30     90
## 30  8 Months    36     84
## 31  8 Months    42    101
## 32  8 Months    48    113
## 33  9 Months    30     84
## 34  9 Months    36    104
## 35  9 Months    42    100
## 36  9 Months    48     88
## 37 10 Months    30     96
## 38 10 Months    36    100
## 39 10 Months    42    103
## 40 10 Months    48    105
## 41 11 Months    30    105
## 42 11 Months    36    114
## 43 11 Months    42    105
## 44 11 Months    48    112
## 45 12 Months    30    113
## 46 12 Months    36    117
## 47 12 Months    42    132
## 48 12 Months    48    130

#Note that we do not need "Unit" variable, so we drop it.  
C11_T5_PP$Unit <- NULL

# Examining the structure of the data.
str(C11_T5_PP)

## 'data.frame':    48 obs. of  3 variables:
##  $ ID    : Factor w/ 12 levels "1","2","3","4",..: 1 1 1 1 2 2 2 2 3 3 ...
##  $ Month : chr  "30" "36" "42" "48" ...
##  $ Values: int  108 96 110 122 103 117 127 133 96 107 ...

# Examining the structure of the data we saw that "Month" is being treated as a character variable,
# yet we want it to be a factor:
C11_T5_PP$Month <- as.factor(C11_T5_PP$Month)
C11_T5_PP$ID <- as.factor(C11_T5_PP$ID)

str(C11_T5_PP)

## 'data.frame':    48 obs. of  3 variables:
##  $ ID    : Factor w/ 12 levels "1","2","3","4",..: 1 1 1 1 2 2 2 2 3 3 ...
##  $ Month : Factor w/ 4 levels "30","36","42",..: 1 2 3 4 1 2 3 4 1 2 ...
##  $ Values: int  108 96 110 122 103 117 127 133 96 107 ...

Importantly, ID and Month must both be treated as factors. If not, the incorrect model will be fitted. This gives a basic (uncorrected for sphericity) repeated measures ANOVA. The car package provides methods, using the Anova() function for obtaining corrected output.

Repeated.Measures.ANOVA <- aov(Values ~ Month + Error(ID/Month), data=C11_T5_PP)
summary(Repeated.Measures.ANOVA)

## 
## Error: ID
##           Df Sum Sq Mean Sq F value Pr(>F)
## Residuals 11   6624   602.2               
## 
## Error: ID:Month
##           Df Sum Sq Mean Sq F value Pr(>F)  
## Month      3    552  184.00   3.027 0.0432 *
## Residuals 33   2006   60.79                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1