ANCOVA example

This is an example of ANCOVA using data from a trigonometry test taken before and after some instruction; the analysis isn't mine. The data contain the observation number, the class type, the pretest score, the posttest score and the IQ. The students were assigned to three different classes. Obviously you can alter these to suit the particular circumstances, and we probably won't have access to IQ scores. But something else, such as number of years of education, would be interesting: something to control for the effect. Think of it as: given that this person had this many years of education, perhaps this increase in score isn't that surprising.

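# Read the whitespace-delimited data file (no header row) and name the columns
trig <- read.table("C:/Users/Stephen/Desktop/my.datafile.txt", quote="\"")
colnames(trig) <- c("OBS", "CLASSTYPE", "PRE", "POST", "IQ")
# attach() lets the code below refer to PRE, POST, etc. directly
attach(trig)
head(trig)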
##   OBS CLASSTYPE PRE POST  IQ
## 1   1         1   3   10 122
## 2   2         2  24   34 129
## 3   3         3  10   21 114
## 4   4         1   5   10 121
## 5   5         2  18   27 114
## 6   6         3   3   18 114

The output shows the column headings and the first six rows of the dataset.
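As an aside, the same first look at the data can be done without attach(). The sketch below is only illustrative: it types in the six rows printed above by hand so that it runs without the original file, and the name trig6 is made up for this purpose. It stores the class type as a factor inside the data frame and refers to columns through the data frame.

```r
# The six rows printed by head(trig), entered by hand (trig6 is a hypothetical name)
trig6 <- data.frame(
  OBS       = 1:6,
  CLASSTYPE = c(1, 2, 3, 1, 2, 3),
  PRE       = c(3, 24, 10, 5, 18, 3),
  POST      = c(10, 34, 21, 10, 27, 18),
  IQ        = c(122, 129, 114, 121, 114, 114)
)

# Store the class type as a factor inside the data frame itself
trig6$CLASSTYPE <- factor(trig6$CLASSTYPE)

str(trig6)              # column types
table(trig6$CLASSTYPE)  # number of students per class type in these rows

# Refer to columns via the data frame instead of attach():
# mean gain (POST - PRE) per class type
gains <- with(trig6, tapply(POST - PRE, CLASSTYPE, mean))
gains
```

Six rows are far too few to say anything about the classes; the point is only the pattern of keeping everything inside one data frame.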

There are three class types, which we can plot with a separate symbol and colour for each:

# Empty plotting region, then one set of points per class type
plot(c(0, 25), c(0, 35), type = "n", ylab = "Post-class score", xlab = "Pre-class score")
points(PRE[CLASSTYPE == 1], POST[CLASSTYPE == 1], col = "blue", pch = 1, 
    cex = 1.2)
points(PRE[CLASSTYPE == 2], POST[CLASSTYPE == 2], col = "red", pch = 20, 
    cex = 1.2)
points(PRE[CLASSTYPE == 3], POST[CLASSTYPE == 3], col = "green", 
    pch = 8, cex = 1.2)
legend("topleft", c("CLASSTYPE 1", "CLASSTYPE 2", "CLASSTYPE 3"), pch = c(1, 
    20, 8), col = c("blue", "red", "green"), cex = 1)

Figure: post-class score against pre-class score, with a separate symbol and colour for each class type.

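Next we use linear regression with the post score as the dependent variable and class type and pre score as the independent variables. There are three class types, and once CLASSTYPE is converted to a factor R uses its first level, class type 1, as the reference level. The CLASSTYPE2 and CLASSTYPE3 coefficients are therefore differences from class type 1 at a given pre score. We could of course change the reference level.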

CLASSTYPE <- factor(CLASSTYPE)
trig.fit <- lm(POST ~ CLASSTYPE + PRE)
summary(trig.fit)
## 
## Call:
## lm(formula = POST ~ CLASSTYPE + PRE)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -10.785  -3.549  -0.247   2.100  17.038 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   10.372      1.810    5.73  5.1e-07 ***
## CLASSTYPE2    -0.957      1.582   -0.60    0.548    
## CLASSTYPE3     4.058      1.632    2.49    0.016 *  
## PRE            0.773      0.170    4.54  3.4e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
## 
## Residual standard error: 4.9 on 52 degrees of freedom
## Multiple R-squared: 0.328,   Adjusted R-squared: 0.289 
## F-statistic: 8.46 on 3 and 52 DF,  p-value: 0.000112
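
Changing the reference level can be done with relevel(). The sketch below is a minimal illustration, not part of the original analysis: it reuses only the six rows printed by head(trig), typed in by hand under the made-up name trig6, so the fitted numbers mean nothing. What it shows is how the coefficient names change while the fitted model stays the same.

```r
# Six rows from head(trig), entered by hand (trig6 is a hypothetical name)
trig6 <- data.frame(
  CLASSTYPE = factor(c(1, 2, 3, 1, 2, 3)),
  PRE       = c(3, 24, 10, 5, 18, 3),
  POST      = c(10, 34, 21, 10, 27, 18)
)

# Default: the first level ("1") is the reference
fit_ref1 <- lm(POST ~ CLASSTYPE + PRE, data = trig6)
names(coef(fit_ref1))
# "(Intercept)" "CLASSTYPE2" "CLASSTYPE3" "PRE"

# Make class type 3 the reference instead
trig6$CLASSTYPE <- relevel(trig6$CLASSTYPE, ref = "3")
fit_ref3 <- lm(POST ~ CLASSTYPE + PRE, data = trig6)
names(coef(fit_ref3))
# "(Intercept)" "CLASSTYPE1" "CLASSTYPE2" "PRE"

# Same model, different parameterisation: the fitted values agree
all.equal(fitted(fit_ref1), fitted(fit_ref3))
```

With class type 3 as the reference, the class coefficients become differences from class type 3, which may be the more natural comparison if that class is the one of interest.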

Next we can get the ANOVA table:

anova(trig.fit)
## Analysis of Variance Table
## 
## Response: POST
##           Df Sum Sq Mean Sq F value  Pr(>F)    
## CLASSTYPE  2    116      58    2.41     0.1 .  
## PRE        1    493     493   20.57 3.4e-05 ***
## Residuals 52   1247      24                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
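
One thing to keep in mind when reading this table: anova() on a single lm fit uses sequential sums of squares, so each term is adjusted only for the terms listed before it. With the formula POST ~ CLASSTYPE + PRE, the CLASSTYPE row is not adjusted for PRE, whereas the usual ANCOVA question is whether the classes differ after allowing for the pretest. A minimal sketch of two ways to get the adjusted test follows; it again uses only the six rows shown by head(trig), under the made-up name trig6, so the numbers themselves are not meaningful.

```r
# Six rows from head(trig), entered by hand (trig6 is a hypothetical name)
trig6 <- data.frame(
  CLASSTYPE = factor(c(1, 2, 3, 1, 2, 3)),
  PRE       = c(3, 24, 10, 5, 18, 3),
  POST      = c(10, 34, 21, 10, 27, 18)
)

fit_a <- lm(POST ~ CLASSTYPE + PRE, data = trig6)  # term order used above
fit_b <- lm(POST ~ PRE + CLASSTYPE, data = trig6)  # covariate first

tab_a <- anova(fit_a)  # CLASSTYPE row ignores PRE
tab_b <- anova(fit_b)  # CLASSTYPE row is adjusted for PRE
tab_a
tab_b

# drop1() tests each term given all the others, whatever the formula order
d1 <- drop1(fit_a, test = "F")
d1
```

The CLASSTYPE line of drop1() matches the CLASSTYPE line of the covariate-first table, and the PRE line matches the PRE line of the original ordering.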