This is an example of ANCOVA using data from a trigonometry test given before and after some instruction. The analysis isn't mine.
The data contain the observation number, the class type, the pretest score, the posttest score and the IQ. The students were assigned to three different classes. You can alter these variables to suit the particular circumstances, and we probably won't have access to IQ scores. But something else, such as number of years of education, would be interesting: something to control for the effect. Think of it as: given that this person had this many years of education, perhaps this increase in score isn't that surprising.
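As a rough illustration of what "controlling for" means, here is a minimal sketch on simulated data (not the trigonometry data). The names `group`, `years_edu` and `post`, and all the numbers, are made up for the example. It fits the group effect once ignoring the covariate and once given the covariate, so the two sets of coefficients can be compared.

```r
# Simulated data for illustration only; nothing here comes from the trigonometry study.
set.seed(1)
n <- 60
group <- factor(rep(c("A", "B", "C"), each = n / 3))
years_edu <- round(runif(n, 8, 16))   # hypothetical covariate
post <- 5 + 1.5 * years_edu + 3 * (group == "C") + rnorm(n, sd = 2)
sim <- data.frame(group, years_edu, post)

# Group differences ignoring the covariate
fit_unadjusted <- lm(post ~ group, data = sim)

# Group differences given years of education (the ANCOVA-style model)
fit_adjusted <- lm(post ~ group + years_edu, data = sim)

coef(fit_unadjusted)
coef(fit_adjusted)
```

In the adjusted fit, the `group` coefficients compare groups at the same value of `years_edu`, which is the "given that this person had this many years of education" reading.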
trig <- read.table("C:/Users/Stephen/Desktop/my.datafile.txt", quote="\"")
colnames(trig) <- c("OBS", "CLASSTYPE", "PRE", "POST", "IQ")
attach(trig)
head(trig)
##   OBS CLASSTYPE PRE POST  IQ
## 1   1         1   3   10 122
## 2   2         2  24   34 129
## 3   3         3  10   21 114
## 4   4         1   5   10 121
## 5   5         2  18   27 114
## 6   6         3   3   18 114
Here you can see the column headings and the first six rows of the dataset.
There are three class types, which we can plot separately:
# Empty plot frame; the points for each class type are added below
plot(c(0, 25), c(0, 35), type = "n", ylab = "Post-class score", xlab = "Pre-class score")
points(PRE[CLASSTYPE == 1], POST[CLASSTYPE == 1], col = "blue", pch = 1, cex = 1.2)
points(PRE[CLASSTYPE == 2], POST[CLASSTYPE == 2], col = "red", pch = 20, cex = 1.2)
points(PRE[CLASSTYPE == 3], POST[CLASSTYPE == 3], col = "green", pch = 8, cex = 1.2)
legend("topleft", c("CLASSTYPE 1", "CLASSTYPE 2", "CLASSTYPE 3"),
       pch = c(1, 20, 8), col = c("blue", "red", "green"), cex = 1)
Next we use linear regression with post score as the dependent variable and class type and pre score as the independent variables. There are three class types, and R has picked class type 1 as the reference level. We could of course change that.
CLASSTYPE <- factor(CLASSTYPE)
trig.fit <- lm(POST ~ CLASSTYPE + PRE)
summary(trig.fit)
##
## Call:
## lm(formula = POST ~ CLASSTYPE + PRE)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -10.785  -3.549  -0.247   2.100  17.038
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)   10.372      1.810    5.73  5.1e-07 ***
## CLASSTYPE2    -0.957      1.582   -0.60    0.548
## CLASSTYPE3     4.058      1.632    2.49    0.016 *
## PRE            0.773      0.170    4.54  3.4e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.9 on 52 degrees of freedom
## Multiple R-squared: 0.328, Adjusted R-squared: 0.289
## F-statistic: 8.46 on 3 and 52 DF, p-value: 0.000112
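On changing the reference level: one way is `relevel()`. The sketch below uses a small simulated stand-in for the data (the values and the `demo` data frame are made up; only the column names mirror the real ones), and assumes the class types are coded 1, 2 and 3 as above.

```r
# Simulated stand-in for the trig data; all values are made up for illustration.
set.seed(2)
demo <- data.frame(
  CLASSTYPE = factor(rep(1:3, times = 10)),
  PRE = round(runif(30, 0, 25))
)
demo$POST <- 10 + 0.8 * demo$PRE + 4 * (demo$CLASSTYPE == "3") + rnorm(30, sd = 3)

# Default: the first factor level ("1") is the reference
fit_ref1 <- lm(POST ~ CLASSTYPE + PRE, data = demo)

# Same model with class type 3 as the reference level
demo3 <- transform(demo, CLASSTYPE = relevel(CLASSTYPE, ref = "3"))
fit_ref3 <- lm(POST ~ CLASSTYPE + PRE, data = demo3)

names(coef(fit_ref1))  # CLASSTYPE2 and CLASSTYPE3 are contrasts against class 1
names(coef(fit_ref3))  # CLASSTYPE1 and CLASSTYPE2 are contrasts against class 3

# The fitted values are unchanged; only the parameterisation differs
all.equal(fitted(fit_ref1), fitted(fit_ref3))
```

Changing the reference level does not change the model, only which comparisons appear directly in the coefficient table. For instance, the `CLASSTYPE1` coefficient in the second fit is the negative of the `CLASSTYPE3` coefficient in the first.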
Next we can get the ANOVA table:
anova(trig.fit)
## Analysis of Variance Table
##
## Response: POST
##           Df Sum Sq Mean Sq F value  Pr(>F)
## CLASSTYPE  2    116      58    2.41     0.1 .
## PRE        1    493     493   20.57 3.4e-05 ***
## Residuals 52   1247      24
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
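One thing to keep in mind when reading this table: `anova()` on a single `lm` fit reports sequential sums of squares, so the CLASSTYPE row is computed before PRE enters the model. If what you want is the class type effect adjusted for the pre score, two options are to put PRE first in the formula or to use `drop1()`. The sketch below shows both on simulated data (the `demo` data frame and its values are made up, not the trigonometry data).

```r
# Simulated stand-in for the trig data; all values are made up for illustration.
set.seed(3)
demo <- data.frame(
  CLASSTYPE = factor(rep(1:3, times = 10)),
  PRE = round(runif(30, 0, 25))
)
demo$POST <- 10 + 0.8 * demo$PRE + 4 * (demo$CLASSTYPE == "3") + rnorm(30, sd = 3)

fit_class_first <- lm(POST ~ CLASSTYPE + PRE, data = demo)
fit_pre_first   <- lm(POST ~ PRE + CLASSTYPE, data = demo)

# Sequential tables: each row is adjusted only for the terms listed above it
anova(fit_class_first)  # CLASSTYPE row ignores PRE
anova(fit_pre_first)    # CLASSTYPE row is adjusted for PRE

# drop1() tests each term adjusted for all the others, whatever the formula order
adjusted <- drop1(fit_class_first, test = "F")
adjusted
```

The CLASSTYPE row from `drop1()` matches the CLASSTYPE row of the PRE-first table, since in both cases class type is tested after the pre score has been accounted for.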