These notes build on the prior chapters’ R Code and Instructions, where we discuss some operational aspects of R, detail the use of many functions, and explain how to install packages. Here we skip details that have already been covered there.
Load the AMCP package and the data from Chapter 9, Table 1.
library(AMCP)
data(chapter_9_table_1)
chapter_9_table_1
## Group X Y
## 1 1 1 4
## 2 1 2 9
## 3 1 3 8
## 4 2 3 12
## 5 2 4 11
## 6 2 5 16
chapter_9_table_1$Group <- as.factor(chapter_9_table_1$Group)
str(chapter_9_table_1)
## 'data.frame': 6 obs. of 3 variables:
## $ Group: Factor w/ 2 levels "1","2": 1 1 1 2 2 2
## $ X : int 1 2 3 3 4 5
## $ Y : int 4 9 8 12 11 16
Here we partition the data and fit two separate regressions, one for Group=1 and one for Group=2. Compare this output to Table 9.4.
lm.1 <- lm(Y ~ X, data=chapter_9_table_1[chapter_9_table_1$Group=="1",])
lm.2 <- lm(Y ~ X, data=chapter_9_table_1[chapter_9_table_1$Group=="2",])
summary(lm.1)
##
## Call:
## lm(formula = Y ~ X, data = chapter_9_table_1[chapter_9_table_1$Group ==
## "1", ])
##
## Residuals:
## 1 2 3
## -1 2 -1
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.000 3.742 0.802 0.570
## X 2.000 1.732 1.155 0.454
##
## Residual standard error: 2.449 on 1 degrees of freedom
## Multiple R-squared: 0.5714, Adjusted R-squared: 0.1429
## F-statistic: 1.333 on 1 and 1 DF, p-value: 0.4544
summary(lm.2)
##
## Call:
## lm(formula = Y ~ X, data = chapter_9_table_1[chapter_9_table_1$Group ==
## "2", ])
##
## Residuals:
## 4 5 6
## 1 -2 1
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 5.000 7.071 0.707 0.608
## X 2.000 1.732 1.155 0.454
##
## Residual standard error: 2.449 on 1 degrees of freedom
## Multiple R-squared: 0.5714, Adjusted R-squared: 0.1429
## F-statistic: 1.333 on 1 and 1 DF, p-value: 0.4544
Here we fit the ANOVA and ANCOVA models and request both the ANOVA-style output (via anova()) and the regression-style output (via summary()). Per the R default, anova() yields Type I (sequential) sums of squares.
ANOVA.9.1 <- lm(Y ~ Group, data=chapter_9_table_1)
ANCOVA.9.1 <- lm(Y ~ X+Group, data=chapter_9_table_1)
# ANOVA
anova(ANOVA.9.1)
## Analysis of Variance Table
##
## Response: Y
## Df Sum Sq Mean Sq F value Pr(>F)
## Group 1 54 54 7.7143 0.04995 *
## Residuals 4 28 7
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(ANOVA.9.1)
##
## Call:
## lm(formula = Y ~ Group, data = chapter_9_table_1)
##
## Residuals:
## 1 2 3 4 5 6
## -3 2 1 -1 -2 3
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 7.000 1.528 4.583 0.0102 *
## Group2 6.000 2.160 2.777 0.0499 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.646 on 4 degrees of freedom
## Multiple R-squared: 0.6585, Adjusted R-squared: 0.5732
## F-statistic: 7.714 on 1 and 4 DF, p-value: 0.04995
# ANCOVA
anova(ANCOVA.9.1)
## Analysis of Variance Table
##
## Response: Y
## Df Sum Sq Mean Sq F value Pr(>F)
## X 1 67.6 67.6 16.9 0.02607 *
## Group 1 2.4 2.4 0.6 0.49503
## Residuals 3 12.0 4.0
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(ANCOVA.9.1)
##
## Call:
## lm(formula = Y ~ X + Group, data = chapter_9_table_1)
##
## Residuals:
## 1 2 3 4 5 6
## -1 2 -1 1 -2 1
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.000 2.309 1.299 0.285
## X 2.000 1.000 2.000 0.139
## Group2 2.000 2.582 0.775 0.495
##
## Residual standard error: 2 on 3 degrees of freedom
## Multiple R-squared: 0.8537, Adjusted R-squared: 0.7561
## F-statistic: 8.75 on 2 and 3 DF, p-value: 0.05598
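Because the two within-group slopes in this small example happen to be identical (both 2.0), the residual sum of squares from the common-slope ANCOVA (12) equals the sum of the residual sums of squares from the two separate regressions (6 + 6). A quick check with deviance(), which returns the residual sum of squares of an lm fit:
# Residual SS from the two separate regressions versus the common-slope ANCOVA
deviance(lm.1) + deviance(lm.2)
deviance(ANCOVA.9.1)
# Both should equal 12, matching the Residuals row of the ANCOVA table above.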
Using the Anova() function from the car package (see the Chapter 7 R code), Type III sums of squares can be obtained.
library(car)
Anova(ANCOVA.9.1, type="III")
## Anova Table (Type III tests)
##
## Response: Y
## Sum Sq Df F value Pr(>F)
## (Intercept) 6.75 1 1.6875 0.2848
## X 16.00 1 4.0000 0.1393
## Group 2.40 1 0.6000 0.4950
## Residuals 12.00 3
Because the ANOVA is nested within the ANCOVA model, a model comparison can be carried out with the anova() function by supplying the two nested linear models as input. This output evaluates whether the extra term in the ANCOVA (i.e., the slope) has a statistically significant effect.
anova(ANOVA.9.1, ANCOVA.9.1)
## Analysis of Variance Table
##
## Model 1: Y ~ Group
## Model 2: Y ~ X + Group
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 4 28
## 2 3 12 1 16 4 0.1393
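The same F can be reproduced by hand from the residual sums of squares reported above; a minimal check (deviance() and df.residual() return the residual SS and residual degrees of freedom of an lm fit):
# F = [(RSS_reduced - RSS_full)/difference in df] / (RSS_full/df_full)
((deviance(ANOVA.9.1) - deviance(ANCOVA.9.1)) / 1) /
  (deviance(ANCOVA.9.1) / df.residual(ANCOVA.9.1))
# Should reproduce the F of 4 shown in the model-comparison table above.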
It is useful to find the adjusted means, which we compute using the full model shown in Equation 9.25. Here we use the predict() function, demonstrated in a step-by-step fashion, although this is only one way to obtain the adjusted means.
# Compute the grand mean of the covariate across groups; below it is duplicated once per group (here two).
X.Grand.Mean <- mean(chapter_9_table_1$X)
X.Grand.Mean
## [1] 3
# Create factor with levels equal to group number
Group.Predict <- factor(c(1,2), levels=c(1,2))
Group.Predict
## [1] 1 2
## Levels: 1 2
# Create a data frame with same factor and predictor variable structure as used in ANCOVA model previously
Group.Predict <- data.frame(Group = Group.Predict, X = rep(X.Grand.Mean, 2))
Group.Predict
## Group X
## 1 1 3
## 2 2 3
# Use the ANCOVA model solution to make predictions of `Y` at the grand mean of `X`.
Adj.Means <- predict(ANCOVA.9.1, Group.Predict)
Adj.Means
## 1 2
## 9 11
# Observed means on X (for comparison purposes)
Obs.Mean.X <- c(mean(chapter_9_table_1[chapter_9_table_1$Group==1, "X"]),
mean(chapter_9_table_1[chapter_9_table_1$Group==2, "X"]))
Obs.Mean.X
## [1] 2 4
# Observed means on DV (for comparison purposes)
Obs.Mean.Y <- c(mean(chapter_9_table_1[chapter_9_table_1$Group==1, "Y"]),
mean(chapter_9_table_1[chapter_9_table_1$Group==2, "Y"]))
Obs.Mean.Y
## [1] 7 13
Summary.of.Means = data.frame(Obs.Mean.X=Obs.Mean.X, X.Grand.Mean=X.Grand.Mean,
Adjusted.Group.Means.Y=Adj.Means, Obs.Mean.Y=Obs.Mean.Y)
Summary.of.Means
## Obs.Mean.X X.Grand.Mean Adjusted.Group.Means.Y Obs.Mean.Y
## 1 2 3 9 7
## 2 4 3 11 13
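As a cross-check, the adjusted means can also be computed directly from the ANCOVA coefficients under the default treatment coding: each group’s intercept (plus, for Group 2, its dummy coefficient) evaluated at the grand mean of X. A minimal sketch:
# Adjusted means from the coefficients of the ANCOVA model
b <- coef(ANCOVA.9.1)
Adj.Mean.1 <- b["(Intercept)"] + b["X"]*X.Grand.Mean
Adj.Mean.2 <- b["(Intercept)"] + b["Group2"] + b["X"]*X.Grand.Mean
c(Group1 = unname(Adj.Mean.1), Group2 = unname(Adj.Mean.2))
# Should reproduce the adjusted means of 9 and 11 obtained from predict() above.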
Here we work through the numerical example with the Beck Depression data from Table 9.7.
data(chapter_9_table_7)
chapter_9_table_7
## Condition Pre Post
## 1 1 18 12
## 2 1 16 0
## 3 1 16 10
## 4 1 15 9
## 5 1 14 0
## 6 1 20 11
## 7 1 14 2
## 8 1 21 4
## 9 1 25 15
## 10 1 11 10
## 11 2 18 11
## 12 2 16 4
## 13 2 15 19
## 14 2 14 15
## 15 2 20 3
## 16 2 25 14
## 17 2 11 10
## 18 2 25 16
## 19 2 11 10
## 20 2 22 20
## 21 3 15 17
## 22 3 19 25
## 23 3 10 10
## 24 3 29 22
## 25 3 24 23
## 26 3 15 10
## 27 3 9 2
## 28 3 18 10
## 29 3 22 14
## 30 3 13 7
str(chapter_9_table_7)
## 'data.frame': 30 obs. of 3 variables:
## $ Condition: int 1 1 1 1 1 1 1 1 1 1 ...
## $ Pre : int 18 16 16 15 14 20 14 21 25 11 ...
## $ Post : int 12 0 10 9 0 11 2 4 15 10 ...
chapter_9_table_7$Group <- chapter_9_table_7$Condition
chapter_9_table_7$Group[chapter_9_table_7$Group==1] <- "SSRI"
chapter_9_table_7$Group[chapter_9_table_7$Group==2] <- "Placebo"
chapter_9_table_7$Group[chapter_9_table_7$Group==3] <- "Wait List"
# Make Group a factor
chapter_9_table_7$Group <- as.factor(chapter_9_table_7$Group)
chapter_9_table_7
## Condition Pre Post Group
## 1 1 18 12 SSRI
## 2 1 16 0 SSRI
## 3 1 16 10 SSRI
## 4 1 15 9 SSRI
## 5 1 14 0 SSRI
## 6 1 20 11 SSRI
## 7 1 14 2 SSRI
## 8 1 21 4 SSRI
## 9 1 25 15 SSRI
## 10 1 11 10 SSRI
## 11 2 18 11 Placebo
## 12 2 16 4 Placebo
## 13 2 15 19 Placebo
## 14 2 14 15 Placebo
## 15 2 20 3 Placebo
## 16 2 25 14 Placebo
## 17 2 11 10 Placebo
## 18 2 25 16 Placebo
## 19 2 11 10 Placebo
## 20 2 22 20 Placebo
## 21 3 15 17 Wait List
## 22 3 19 25 Wait List
## 23 3 10 10 Wait List
## 24 3 29 22 Wait List
## 25 3 24 23 Wait List
## 26 3 15 10 Wait List
## 27 3 9 2 Wait List
## 28 3 18 10 Wait List
## 29 3 22 14 Wait List
## 30 3 13 7 Wait List
str(chapter_9_table_7)
## 'data.frame': 30 obs. of 4 variables:
## $ Condition: int 1 1 1 1 1 1 1 1 1 1 ...
## $ Pre : int 18 16 16 15 14 20 14 21 25 11 ...
## $ Post : int 12 0 10 9 0 11 2 4 15 10 ...
## $ Group : Factor w/ 3 levels "Placebo","SSRI",..: 2 2 2 2 2 2 2 2 2 2 ...
Descriptives.
library(dplyr)
##
## Attaching package: 'dplyr'
## The following object is masked from 'package:car':
##
## recode
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
group_by(chapter_9_table_7, Group) %>%
summarise(mean=mean(Pre), sd=sd(Pre), n=n())
## # A tibble: 3 × 4
## Group mean sd n
## <fctr> <dbl> <dbl> <int>
## 1 Placebo 17.7 5.207900 10
## 2 SSRI 17.0 4.082483 10
## 3 Wait List 17.4 6.310485 10
group_by(chapter_9_table_7, Group) %>%
summarise(mean=mean(Post), sd=sd(Post), n=n())
## # A tibble: 3 × 4
## Group mean sd n
## <fctr> <dbl> <dbl> <int>
## 1 Placebo 12.2 5.731007 10
## 2 SSRI 7.3 5.355164 10
## 3 Wait List 14.0 7.571878 10
Here we fit the ANOVA and ANCOVA models and request the ANOVA-style output (via anova()); the regression-style output (via summary()) appears further below. Per the R default, anova() yields Type I sums of squares. See the first part of Table 9.9.
Beck_ANOVA.Model <- lm(Post ~ Group, data=chapter_9_table_7)
Beck_ANCOVA.Model <- lm(Post~ Group + Pre, data=chapter_9_table_7)
# ANOVA
anova(Beck_ANOVA.Model)
## Analysis of Variance Table
##
## Response: Post
## Df Sum Sq Mean Sq F value Pr(>F)
## Group 2 240.47 120.233 3.0348 0.06473 .
## Residuals 27 1069.70 39.619
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# ANCOVA
anova(Beck_ANCOVA.Model)
## Analysis of Variance Table
##
## Response: Post
## Df Sum Sq Mean Sq F value Pr(>F)
## Group 2 240.47 120.23 4.1332 0.027629 *
## Pre 1 313.37 313.37 10.7723 0.002937 **
## Residuals 26 756.33 29.09
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Now we obtain the ANOVA table for the ANCOVA model using Type III sums of squares. See the second part of Table 9.9. Note that the coefficient output from summary() differs from the book because R and SAS code the factors differently (R treats the first factor level as the reference; SAS treats the last).
Anova(Beck_ANCOVA.Model, type="III")
## Anova Table (Type III tests)
##
## Response: Post
## Sum Sq Df F value Pr(>F)
## (Intercept) 1.17 1 0.0403 0.842477
## Group 217.15 2 3.7324 0.037584 *
## Pre 313.37 1 10.7723 0.002937 **
## Residuals 756.33 26
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(Beck_ANCOVA.Model)
##
## Call:
## lm(formula = Post ~ Group + Pre, data = chapter_9_table_7)
##
## Residuals:
## Min 1Q Median 3Q Max
## -10.6842 -3.9615 0.6448 3.8773 9.9675
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.7779 3.8756 0.201 0.84248
## GroupSSRI -4.4483 2.4160 -1.841 0.07703 .
## GroupWait List 1.9936 2.4128 0.826 0.41617
## Pre 0.6453 0.1966 3.282 0.00294 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 5.393 on 26 degrees of freedom
## Multiple R-squared: 0.4227, Adjusted R-squared: 0.3561
## F-statistic: 6.346 on 3 and 26 DF, p-value: 0.002252
Note that the reference level can be changed using the relevel() function, which we demonstrate here.
chapter_9_table_7.B <- chapter_9_table_7
chapter_9_table_7.B$Group <- relevel(chapter_9_table_7.B$Group, ref= "Wait List")
Beck_ANCOVA.Model2 <- lm(Post~ Group + Pre, data=chapter_9_table_7.B)
summary(Beck_ANCOVA.Model2)
##
## Call:
## lm(formula = Post ~ Group + Pre, data = chapter_9_table_7.B)
##
## Residuals:
## Min 1Q Median 3Q Max
## -10.6842 -3.9615 0.6448 3.8773 9.9675
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.7715 3.8227 0.725 0.47492
## GroupPlacebo -1.9936 2.4128 -0.826 0.41617
## GroupSSRI -6.4419 2.4133 -2.669 0.01292 *
## Pre 0.6453 0.1966 3.282 0.00294 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 5.393 on 26 degrees of freedom
## Multiple R-squared: 0.4227, Adjusted R-squared: 0.3561
## F-statistic: 6.346 on 3 and 26 DF, p-value: 0.002252
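The predict()-based approach used earlier can likewise be applied to obtain the adjusted means for the Beck data, evaluating each group at the grand mean of Pre. A sketch (Beck.Predict is simply an illustrative name); the output can be compared with the adjusted means reported in the chapter:
# Predict Post for each group at the grand mean of the covariate Pre
Beck.Predict <- data.frame(Group = factor(c("Placebo", "SSRI", "Wait List"),
                                          levels = levels(chapter_9_table_7$Group)),
                           Pre = mean(chapter_9_table_7$Pre))
predict(Beck_ANCOVA.Model, newdata = Beck.Predict)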
Here we show the code that was the basis for the book’s cover. The data below use Rosenthal’s Pygmalion data, like those used for Ch. 9, Exercise 15, but include the mean of two Post measures.
library(plotrix)
Data <- rbind(
c(0,61.00,86.00,25.00),
c(0,79.00,82.00,3.00),
c(0,86.00,103.00,17.00),
c(0,91.00,90.00,-1.00),
c(0,92.00,105.00,13.00),
c(0,100.00,120.00,20.00),
c(0,111.00,105.00,-6.00),
c(1,60.00,97.00,37.00),
c(1,61.00,106.00,45.00),
c(1,84.00,107.00,23.00),
c(1,88.00,128.00,40.00),
c(1,95.00,113.00,18.00),
c(1,100.00,108.00,8.00),
c(1,116.00,137.00,21.00))
colnames(Data) <- c("Treatment", "IQ.Pre", "IQ.8", "IQ.Gain")
Data <- as.data.frame(Data)
# To make adjustments to figure region
par(mar= c(5, 4, 4, 2) + 0.1, omi=c(.5, .5, .5, 1))
plot(IQ.8 ~ IQ.Pre, data=Data, type="n", ylab="Posttest", xlab="Pretest", xlim=c(60, 120), ylim=c(80, 140))
axis.break(axis=1, style="zigzag")
axis.break(axis=2, style="zigzag")
points(Data[Data$Treatment==0,"IQ.8"] ~ Data[Data$Treatment==0,"IQ.Pre"], pch=1)
points(Data[Data$Treatment==1,"IQ.8"] ~ Data[Data$Treatment==1,"IQ.Pre"], pch=16)
Result <- summary(lm(IQ.8 ~ as.factor(Treatment) + IQ.Pre, data=Data))
b.0 <- coef(Result)[1,1]
b.Treatment <- coef(Result)[2,1]
b.W <- coef(Result)[3,1]
y.hat <- b.0 + b.Treatment*Data["Treatment"] + b.W*Data["IQ.Pre"]
colnames(y.hat) <- NULL
Data.Fitted <- cbind(Data, y.hat=y.hat)
Data.Fitted[Data.Fitted$Treatment==0,"y.hat"]
## [1] 83.76818 93.52574 97.32035 100.03078 100.57287 104.90956 110.87252
Data.Fitted[Data.Fitted$Treatment==1,"y.hat"]
## [1] 99.46515 100.00723 112.47523 114.64358 118.43819 121.14862 129.82201
abline(a=b.0+b.Treatment, b=b.W)
abline(a=b.0, b=b.W)
Result.R <- summary(lm(IQ.8 ~ IQ.Pre, data=Data))
abline(Result.R, lty=2, lwd=2)
## Warning in abline(Result.R, lty = 2, lwd = 2): only using the first two of
## 8 regression coefficients
# mtext(side=4, line=.5, adj=1, expression(paste(widehat(italic(Y))[1]==50, sep="")))
text(125.3, b.0 + b.Treatment*1 + b.W*123, expression(paste(widehat(italic(Y))["T"], sep="")), xpd=NA)
text(125.3, b.0 + b.Treatment*0 + b.W*123, expression(paste(widehat(italic(Y))["C"], sep="")), xpd=NA)
text(126.3, 61.6504 + 0.5097*123, expression(paste(widehat(italic(Y))["Restricted"], sep="")), xpd=NA)
legend(60, 135, legend=c("Treatment, T", "Control, C"), pch=c(16, 1), bty="n")
text(125.3, mean(Data[Data$Treatment==0,"IQ.8"]), expression(paste(italic(bar(Y))["C"], sep="")), xpd=NA)
text(125.3, mean(Data[Data$Treatment==1,"IQ.8"]), expression(paste(italic(bar(Y))["T"], sep="")), xpd=NA)
text(123.8, mean(Data$IQ.8), expression(paste(bar(italic(Y)), sep="")), xpd=NA)
abline(h=mean(Data[Data$Treatment==0,"IQ.8"]))
abline(h=mean(Data[Data$Treatment==1,"IQ.8"]))
abline(h=mean(Data$IQ.8))
The plot above has details that may not be necessary. Here we show just the ANCOVA portion of the plot (i.e., this version omits the horizontal lines for the simpler ANOVA model, which ignores the pretest). Note too that the “Restricted” line could be dropped if desired.
# To make adjustments to figure region
par(mar= c(5, 4, 4, 2) + 0.1, omi=c(.5, .5, .5, 1))
plot(IQ.8 ~ IQ.Pre, data=Data, type="n", ylab="Posttest", xlab="Pretest", xlim=c(60, 120), ylim=c(80, 140), main="ANCOVA Model Visually")
axis.break(axis=1, style="zigzag")
axis.break(axis=2, style="zigzag")
points(Data[Data$Treatment==0,"IQ.8"] ~ Data[Data$Treatment==0,"IQ.Pre"], pch=1)
points(Data[Data$Treatment==1,"IQ.8"] ~ Data[Data$Treatment==1,"IQ.Pre"], pch=16)
Result <- summary(lm(IQ.8 ~ as.factor(Treatment) + IQ.Pre, data=Data))
b.0 <- coef(Result)[1,1]
b.Treatment <- coef(Result)[2,1]
b.W <- coef(Result)[3,1]
y.hat <- b.0 + b.Treatment*Data["Treatment"] + b.W*Data["IQ.Pre"]
colnames(y.hat) <- NULL
Data.Fitted <- cbind(Data, y.hat=y.hat)
Data.Fitted[Data.Fitted$Treatment==0,"y.hat"]
## [1] 83.76818 93.52574 97.32035 100.03078 100.57287 104.90956 110.87252
Data.Fitted[Data.Fitted$Treatment==1,"y.hat"]
## [1] 99.46515 100.00723 112.47523 114.64358 118.43819 121.14862 129.82201
abline(a=b.0+b.Treatment, b=b.W)
abline(a=b.0, b=b.W)
Result.R <- summary(lm(IQ.8 ~ IQ.Pre, data=Data))
abline(Result.R, lty=2, lwd=2)
## Warning in abline(Result.R, lty = 2, lwd = 2): only using the first two of
## 8 regression coefficients
# mtext(side=4, line=.5, adj=1, expression(paste(widehat(italic(Y))[1]==50, sep="")))
text(125.3, b.0 + b.Treatment*1 + b.W*123, expression(paste(widehat(italic(Y))["T"], sep="")), xpd=NA)
text(125.3, b.0 + b.Treatment*0 + b.W*123, expression(paste(widehat(italic(Y))["C"], sep="")), xpd=NA)
text(126.3, 61.6504 + 0.5097*123, expression(paste(widehat(italic(Y))["Restricted"], sep="")), xpd=NA)
legend(60, 135, legend=c("Treatment, T", "Control, C"), pch=c(16, 1), bty="n")
Here we show how to perform ANCOVA with heterogeneous slopes. Note that we use Type III sums of squares (via Anova() from the car package).
data(chapter_9_extension_table_1)
# To make the syntax a bit easier, we specify that Group is a grouping variable (i.e., a factor)
chapter_9_extension_table_1$Group <- as.factor(chapter_9_extension_table_1$Group)
ANOVA.model <- lm(Y ~ Group, data=chapter_9_extension_table_1)
Anova(ANOVA.model, type="III")
## Anova Table (Type III tests)
##
## Response: Y
## Sum Sq Df F value Pr(>F)
## (Intercept) 147 1 18.375 0.01278 *
## Group 54 1 6.750 0.06017 .
## Residuals 32 4
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(ANOVA.model)
##
## Call:
## lm(formula = Y ~ Group, data = chapter_9_extension_table_1)
##
## Residuals:
## 1 2 3 4 5 6
## -2.000e+00 2.000e+00 -1.887e-15 -2.000e+00 -2.000e+00 4.000e+00
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 7.000 1.633 4.287 0.0128 *
## Group2 6.000 2.309 2.598 0.0602 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.828 on 4 degrees of freedom
## Multiple R-squared: 0.6279, Adjusted R-squared: 0.5349
## F-statistic: 6.75 on 1 and 4 DF, p-value: 0.06017
ANCOVA.model <- lm(Y ~ Group + X, data=chapter_9_extension_table_1)
Anova(ANCOVA.model, type="III")
## Anova Table (Type III tests)
##
## Response: Y
## Sum Sq Df F value Pr(>F)
## (Intercept) 6.75 1 1.2656 0.3425
## Group 2.40 1 0.4500 0.5504
## X 16.00 1 3.0000 0.1817
## Residuals 16.00 3
summary(ANCOVA.model)
##
## Call:
## lm(formula = Y ~ Group + X, data = chapter_9_extension_table_1)
##
## Residuals:
## 1 2 3 4 5 6
## 1.221e-15 2.000e+00 -2.000e+00 -3.121e-16 -2.000e+00 2.000e+00
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.000 2.667 1.125 0.342
## Group2 2.000 2.981 0.671 0.550
## X 2.000 1.155 1.732 0.182
##
## Residual standard error: 2.309 on 3 degrees of freedom
## Multiple R-squared: 0.814, Adjusted R-squared: 0.6899
## F-statistic: 6.563 on 2 and 3 DF, p-value: 0.08025
ANCOVA.Heterogeneity.model <- lm(Y ~ Group*X, data=chapter_9_extension_table_1)
Anova(ANCOVA.Heterogeneity.model, type="III")
## Anova Table (Type III tests)
##
## Response: Y
## Sum Sq Df F value Pr(>F)
## (Intercept) 10.714 1 1.7857 0.3132
## Group 1.500 1 0.2500 0.6667
## X 2.000 1 0.3333 0.6220
## Group:X 4.000 1 0.6667 0.5000
## Residuals 12.000 2
summary(ANCOVA.Heterogeneity.model)
##
## Call:
## lm(formula = Y ~ Group * X, data = chapter_9_extension_table_1)
##
## Residuals:
## 1 2 3 4 5 6
## -1 2 -1 1 -2 1
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 5.000 3.742 1.336 0.313
## Group2 -4.000 8.000 -0.500 0.667
## X 1.000 1.732 0.577 0.622
## Group2:X 2.000 2.450 0.816 0.500
##
## Residual standard error: 2.449 on 2 degrees of freedom
## Multiple R-squared: 0.8605, Adjusted R-squared: 0.6512
## F-statistic: 4.111 on 3 and 2 DF, p-value: 0.2018
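As with the earlier example, the test of the Group:X interaction (i.e., homogeneity of regression) can also be obtained as a nested-model comparison with anova(); a short check:
# Compare the common-slope ANCOVA with the heterogeneous-slopes model
anova(ANCOVA.model, ANCOVA.Heterogeneity.model)
# The F for the added Group:X term should match the Type III value above (F = 0.6667).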
The code below fits the heterogeneous-slopes model underlying a figure like Figure 9E.4. Notice that we have used Type III sums of squares (via Anova()).
data(C9ExtFigs4and5)
C9ExtFigs4and5$Treatment <- as.factor(C9ExtFigs4and5$Treatment)
ANCOVA.Heterogeneity.IQ <- lm(AvPost ~ IQPre*Treatment, data=C9ExtFigs4and5)
Anova(ANCOVA.Heterogeneity.IQ, type="III")
## Anova Table (Type III tests)
##
## Response: AvPost
## Sum Sq Df F value Pr(>F)
## (Intercept) 8984 1 68.4661 4.026e-15 ***
## IQPre 43650 1 332.6688 < 2.2e-16 ***
## Treatment 133 1 1.0144 0.3146
## IQPre:Treatment 277 1 2.1128 0.1471
## Residuals 40151 306
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The code below provides a figure like Figure 9E.4.
plot(AvPost ~ IQPre, data=C9ExtFigs4and5, type="n", xlim=c(30, 160), ylim=c(60, 180), ylab="IQ Average of Mid-year and End-of-year Tests", xlab="IQ at Start of School Year")
# Separate the groups
C9ExtFigs4and5.Control <- Controls <- C9ExtFigs4and5[C9ExtFigs4and5[,"Treatment"]==0, ]
C9ExtFigs4and5.Treatment <- Bloomers <- C9ExtFigs4and5[C9ExtFigs4and5[,"Treatment"]==1, ]
# plot points by groups
points(AvPost ~ IQPre, data=C9ExtFigs4and5.Control, pch=1)
points(AvPost ~ IQPre, data=C9ExtFigs4and5.Treatment, pch=5)
# Descriptives for DV
group_by(C9ExtFigs4and5, Treatment) %>%
summarise(mean.AvPost=mean(AvPost), var.AvPost=var(AvPost), sd.AvPost=sd(AvPost), SS=var(AvPost)*(n()-1))
## # A tibble: 2 × 5
## Treatment mean.AvPost var.AvPost sd.AvPost SS
## <fctr> <dbl> <dbl> <dbl> <dbl>
## 1 0 103.5102 299.6805 17.31128 73421.72
## 2 1 109.6953 438.1319 20.93160 27602.31
# Descriptives for covariate
group_by(C9ExtFigs4and5, Treatment) %>%
summarise(mean.IQPre=mean(IQPre), var.IQPre=var(IQPre), sd.IQPre=sd(IQPre), n=n())
## # A tibble: 2 × 5
## Treatment mean.IQPre var.IQPre sd.IQPre n
## <fctr> <dbl> <dbl> <dbl> <int>
## 1 0 97.7561 339.4750 18.42485 246
## 2 1 101.1875 381.6468 19.53578 64
IQPre.Hypothetical = seq(30, 160, .1)
To.Predict.for.Control <- data.frame(IQPre=IQPre.Hypothetical, Treatment=as.factor(rep(0, length(IQPre.Hypothetical))))
To.Predict.for.Treatment <- data.frame(IQPre=IQPre.Hypothetical, Treatment=as.factor(rep(1, length(IQPre.Hypothetical))))
lines(IQPre.Hypothetical, predict(ANCOVA.Heterogeneity.IQ, newdata=To.Predict.for.Control), lty=1)
lines(IQPre.Hypothetical, predict(ANCOVA.Heterogeneity.IQ, newdata=To.Predict.for.Treatment), lty=2)
legend(40, 170, c("Control", "Bloomers"), pch=c(1, 5), lty=c(1, 2), bty="n")
This shows the prediction at the (grand) mean of the IQ pretest.
To.Predict <- data.frame(IQPre=mean(C9ExtFigs4and5$IQPre), Treatment=as.factor(c(0, 1)))
predict(ANCOVA.Heterogeneity.IQ, newdata=To.Predict)
## 1 2
## 104.0234 107.3907
Summary of the IV
Summary.IV <- group_by(C9ExtFigs4and5, Treatment) %>%
summarise(mean.IV=mean(IQPre), var.IV=var(IQPre), sd.IV=sd(IQPre), SS.IV=(n()-1)*var(IQPre), n=n())
Here we obtain some values of interest (e.g., for the variance of the vertical distance, Equation E.24).
# One way of obtaining summary statistics of interest.
IV.1 <- Summary.IV[Summary.IV$Treatment==0, ]
n.1 <- IV.1$n
SS.IV.1 <- IV.1$SS.IV
IV.2 <- Summary.IV[Summary.IV$Treatment==1, ]
n.2 <- IV.2$n
SS.IV.2 <- IV.2$SS.IV
SS.Residual <- last(Anova(ANCOVA.Heterogeneity.IQ, type="III")$Sum)
df.Residual <- last(Anova(ANCOVA.Heterogeneity.IQ, type="III")$Df)
s.squared <- SS.Residual/df.Residual
# Here is a function that returns the variance of the predicted difference.
s.squared.Y1p.minus.Y2p <- function(Xp, X1, X2, model)
{
n1 <- length(X1)
n2 <- length(X2)
Xbar.1 <- mean(X1)
Xbar.2 <- mean(X2)
SS1 <- sum((X1-Xbar.1)^2)
SS2 <- sum((X2-Xbar.2)^2)
Sq.Diff.Xp.Xbar.1 <- (Xp-Xbar.1)^2
Sq.Diff.Xp.Xbar.2 <- (Xp-Xbar.2)^2
SS.Residual <- last(Anova(model, type="III")$Sum)
df.Residual <- last(Anova(model, type="III")$Df)
s.squared <- SS.Residual/df.Residual
return(s.squared*(1/n1 + 1/n2 + Sq.Diff.Xp.Xbar.1/SS1 + Sq.Diff.Xp.Xbar.2/SS2))
}
# The IV of interest separated by group (for convenience)
IV.1 <- C9ExtFigs4and5[C9ExtFigs4and5$Treatment==0, "IQPre"]
IV.2 <- C9ExtFigs4and5[C9ExtFigs4and5$Treatment==1, "IQPre"]
# Application of the function for the variance of the predicted difference for a given value of Xp, where Xp=100
s.squared.Y1p.minus.Y2p(Xp=100, X1=IV.1, X2=IV.2, model=ANCOVA.Heterogeneity.IQ)
## [1] 2.599205
s.Y1p.minus.Y2p <- sqrt(s.squared.Y1p.minus.Y2p(Xp=100, X1=IV.1, X2=IV.2, model=ANCOVA.Heterogeneity.IQ))
s.Y1p.minus.Y2p
## [1] 1.612205
# Now the standard deviation at the mean of the IV.
s.Y1p.minus.Y2p <- sqrt(s.squared.Y1p.minus.Y2p(Xp=mean(C9ExtFigs4and5$IQPre), X1=IV.1, X2=IV.2, model=ANCOVA.Heterogeneity.IQ))
s.Y1p.minus.Y2p
## [1] 1.62013
Center of Accuracy
C.a <- (sum((IV.1 - mean(IV.1))^2)*mean(IV.2) + sum((IV.2 - mean(IV.2))^2)*mean(IV.1))/(sum((IV.1 - mean(IV.1))^2) + sum((IV.2 - mean(IV.2))^2))
C.a
## [1] 100.418
# Standard error of the Center of Accuracy (Eq. Chapter 9, Extension 28)
sqrt(s.squared*sqrt(1/n.1 + 1/n.2 + ((mean(IV.1)-mean(IV.2))^2)/(sum((IV.1 - mean(IV.1))^2) + sum((IV.2 - mean(IV.2))^2))))
## [1] 4.296868
s.Y1p.minus.Y2p <- sqrt(s.squared.Y1p.minus.Y2p(Xp=C.a, X1=IV.1, X2=IV.2, model=ANCOVA.Heterogeneity.IQ))
s.Y1p.minus.Y2p
## [1] 1.611824
# The test of C.a
C.a/s.Y1p.minus.Y2p
## [1] 62.30084
A function for the vertical distance, and its application.
D.X <- function(ancova.model, Xp)
{
Reg.Coefs <- coef(ancova.model)
# Result <- Reg.Coefs["Treatment1"] + (Reg.Coefs["Treatment1:IQPre"])*Xp # If input in reverse order
Result <- Reg.Coefs["Treatment1"] + (Reg.Coefs["IQPre:Treatment1"])*Xp
return(Result)
}
# At Xp=100
D.X(ancova.model=ANCOVA.Heterogeneity.IQ, Xp=100)
## Treatment1
## 3.554513
# or at the mean of the covariate:
D.X(ancova.model=ANCOVA.Heterogeneity.IQ, Xp=mean(C9ExtFigs4and5$IQPre))
## Treatment1
## 3.367316
Recall the way in which we used the model to predict an outcome (e.g., for the plot above). Here we do the same but at the mean of the covariate.
To.Predict.at.Mean <- data.frame(IQPre=mean(C9ExtFigs4and5$IQPre), Treatment=as.factor(c(0, 1)))
predict(ANCOVA.Heterogeneity.IQ, newdata=To.Predict.at.Mean)
## 1 2
## 104.0234 107.3907
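As a check, the vertical distance returned by D.X() at the grand mean should equal the difference between these two predictions:
diff(predict(ANCOVA.Heterogeneity.IQ, newdata=To.Predict.at.Mean))
# Should reproduce the 3.367316 obtained from D.X() above.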
Obtain \(W^2\), the squared critical value used for simultaneous inference here: \(W^2 = 2F_{.95;\,2,\,N-4}\).
N <- dim(C9ExtFigs4and5)[1]
2*qf(.95, 2, N-4)
## [1] 6.050506
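Under the assumption that the simultaneous confidence band for the vertical distance takes the usual form \(D(X_p) \pm W\, s_{D(X_p)}\), with \(W^2\) as computed above, a sketch of the band at \(X_p = 100\) using the functions defined earlier (W, D.at.100, and s.at.100 are just illustrative names):
# Simultaneous band for the vertical distance at Xp = 100 (a sketch)
W <- sqrt(2*qf(.95, 2, N-4))
D.at.100 <- unname(D.X(ancova.model=ANCOVA.Heterogeneity.IQ, Xp=100))
s.at.100 <- sqrt(s.squared.Y1p.minus.Y2p(Xp=100, X1=IV.1, X2=IV.2, model=ANCOVA.Heterogeneity.IQ))
c(lower = D.at.100 - W*s.at.100, upper = D.at.100 + W*s.at.100)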