The purpose of this recipe is to repeat an ANOVA using resampling methods and compare the results. The dataset used for this experiment is “Star” from the “Ecdat” package in R, which is used to explore the effect of small class sizes on learning. In this study, we focus on the effect of class type on students’ math scores, and the null and alternative hypotheses are stated as follows:
H0: The variation in the total math scaled score is due to sample randomization only (i.e., the type of class has no effect on the students’ total math scaled scores).
HA: The variation in the total math scaled score is due to something other than sample randomization (i.e., the type of class may affect the students’ total math scaled scores).
To test the hypothesis, we conduct a single-factor experiment (with 3 levels) to analyze the effect of class type on the variability in the math scores. We first conduct an exploratory data analysis, followed by an ANOVA with model adequacy checking, and then use resampling methods to repeat the ANOVA and draw conclusions.
#Read in the data (assumes the Ecdat package is installed in the default library)
library(Ecdat)
## Loading required package: Ecfun
##
## Attaching package: 'Ecdat'
##
## The following object is masked from 'package:datasets':
##
## Orange
data1<-Star
attach(data1)
Factor: The original data have four factors: “classk” with three levels (regular class, small class, regular class with aide) indicates the type of class; “sex” with two levels (girl, boy); “freelunk” with two levels (qualified for free lunch, not qualified for free lunch); and “race” with three levels (white, black, other). In this study, we focus on the effect of “classk” (i.e., the class type).
Continuous variable and Response Variable: The original data have three continuous variables: “tmathssk” represents the total math scaled score, “treadssk” represents the total reading scaled score, and “totexpk” represents the years of total teaching experience. In this study, we treat “tmathssk” as the response variable.
Organization: The original data were obtained from Project Star, a cross-sectional study conducted from 1985 to 1989 in the state of Tennessee. It was intended to “test whether students attending small classes in grades K-3 had higher academic achievement than their peers in larger classes”. The data set has 5748 observations with eight variables.
Randomization: The data were randomly selected among the K-3 students in Tennessee, and the students were randomly assigned to the different types of classes. However, there is no randomized execution order in the experiment.
In this sample recipe, the effect of class type on the students’ total math scaled scores is studied. An ANOVA is performed to verify whether the variation in math scores is due to sample randomization alone or whether class type contributes to it. The analysis is followed by model adequacy checking, and resampling methods are then used to repeat the ANOVA and conclude the findings.
An ANOVA is used because it checks whether the mean of the response variable is the same among several groups. In this study we are testing whether the mean math score is the same among the different class type groups, so ANOVA is an appropriate test. The resampling technique is used because the data may not follow the normal distribution that ANOVA assumes.
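To make the comparison of group means concrete, the sketch below computes the one-way F statistic by hand (between-group mean square divided by within-group mean square) and checks it against `aov`. It uses small simulated data, not the Star data; the group labels, sizes and means are assumptions chosen only for illustration.

```r
set.seed(1)  # hypothetical simulated data, not the Star data
g <- factor(rep(c("a", "b", "c"), times = c(30, 40, 50)))
y <- rnorm(length(g), mean = c(a = 50, b = 52, c = 55)[as.character(g)], sd = 5)

grand_mean  <- mean(y)
group_means <- tapply(y, g, mean)
group_sizes <- tapply(y, g, length)

# Between-group and within-group sums of squares
ss_between <- sum(group_sizes * (group_means - grand_mean)^2)
ss_within  <- sum((y - group_means[as.character(g)])^2)

df_between <- nlevels(g) - 1
df_within  <- length(y) - nlevels(g)

# F = mean square between / mean square within
f_by_hand <- (ss_between / df_between) / (ss_within / df_within)

# The same statistic as reported by aov()
f_aov <- summary(aov(y ~ g))[[1]][["F value"]][1]

print(c(by_hand = f_by_hand, aov = f_aov))
```

A large F means the group means differ by more than the within-group scatter would suggest, which is the evidence against H0 that the recipe looks for.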
As discussed in the previous section, the data are randomly selected and assigned; however, the execution order is not randomized. No replication or blocking is used in this experiment.
The boxplot shows that the math scores do vary among the different class types. The median math score in the small class is higher than the median math score in the regular class and in the regular class with aide, and there seems to be no obvious difference between the regular class group and the regular with aide group. Therefore, it is possible that the variation in math scores can be explained by the variation in class type.
The histogram of the response variable shows that the data are close to a normal distribution. Further model adequacy checking is needed to test the underlying population distribution.
#boxplots
boxplot(tmathssk~classk, ylab="Total math scaled score", main = "Boxplot of Math Scores among Different Class Types")
#histograms
hist(tmathssk, xlab="total math scaled score",main = "Histogram of the Math Score")
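The visual comparison can be backed by per-class numbers. The following is a minimal sketch, assuming the Ecdat package is installed; `group_summary` is a name introduced here for illustration and is not part of the original analysis.

```r
library(Ecdat)  # assumes the Ecdat package is installed

# Per-class sample size, mean, median and standard deviation of the math score
group_summary <- aggregate(
  tmathssk ~ classk, data = Star,
  FUN = function(x) c(n = length(x), mean = mean(x), median = median(x), sd = sd(x))
)
print(group_summary)
```

Comparing the medians and standard deviations across the three rows gives a numeric counterpart to what the boxplot shows.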
According to the ANOVA result, we reject the null hypothesis that the variation in math scores is due to sample randomization only; the effect of class type is shown to be statistically significant.
model1 = aov (tmathssk~classk, data = data1)
summary(model1)
## Df Sum Sq Mean Sq F value Pr(>F)
## classk 2 84166 42083 18.6 9.3e-09 ***
## Residuals 5745 13031173 2268
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
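The entries of the ANOVA table can be cross-checked with a few lines of arithmetic. The sketch below copies the sums of squares and degrees of freedom from the table above and recomputes the mean squares, the F value and the p-value; the variable names are introduced here for illustration, and the last quantity (the share of the total sum of squares attributed to class type, often called eta squared) is an addition that the original analysis does not report.

```r
# Values copied from the ANOVA table above
ss_class <- 84166
df_class <- 2
ss_resid <- 13031173
df_resid <- 5745

# Mean squares, F value and upper-tail p-value
ms_class <- ss_class / df_class
ms_resid <- ss_resid / df_resid
f_value  <- ms_class / ms_resid
p_value  <- pf(f_value, df_class, df_resid, lower.tail = FALSE)

# Share of the total sum of squares attributed to class type (eta squared)
eta_sq <- ss_class / (ss_class + ss_resid)

print(signif(c(F = f_value, p = p_value, eta_sq = eta_sq), 3))
```

With a sample this large, a very small p-value can coexist with a modest effect: here class type accounts for well under 1% of the total sum of squares.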
The Q-Q plot shows that the residuals from our ANOVA generally follow a normal distribution, although they fluctuate a little around the tails. The plot of fitted values vs. residuals does not show any trend in the residuals. Further checking of the normality assumption will be conducted in the following section.
#qqplot of the residuals
qqnorm(residuals(model1),ylab="Residuals")
qqline(residuals(model1))
#Fitted Y vs. Residuals
plot(fitted(model1), residuals(model1))
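The two plots can be complemented with formal tests. The sketch below is one possible way to do this, run on simulated data standing in for the Star scores (the group sizes, mean and standard deviation are assumptions). It applies the Shapiro-Wilk test to the residuals and Bartlett's test to the group variances. Because `shapiro.test` accepts at most 5000 values, residual vectors longer than that are subsampled first, which would be needed for the Star model.

```r
set.seed(2)  # hypothetical simulated data standing in for the Star scores
g <- factor(rep(c("regular", "small.class", "regular.with.aide"), each = 200))
y <- rnorm(length(g), mean = 480, sd = 45)
fit <- aov(y ~ g)

res <- residuals(fit)

# shapiro.test() accepts at most 5000 values, so subsample longer residual vectors
res_sub <- if (length(res) > 5000) sample(res, 5000) else res
normality <- shapiro.test(res_sub)

# Bartlett's test of equal variances across the groups
equal_var <- bartlett.test(y ~ g)

print(c(shapiro_p = normality$p.value, bartlett_p = equal_var$p.value))
```

With thousands of observations these tests can flag deviations that are too small to matter in practice, so they are best read alongside the plots rather than instead of them.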
We then use the bootstrap to resample the data. Each group's scores are centered on the group mean so that the null hypothesis holds in the resampled data, the centered scores are sampled with replacement within each group, and the F statistic is recomputed; this is repeated 10,000 times. Comparing the density of the analytical F distribution (red dots) with the histogram of the bootstrapped F statistics, the two are consistent with each other, which suggests that the normality assumption behind the analytical test is reasonable here. The critical values at alpha = 0.05 support this conclusion: the analytical value (under the normality assumption) is 3.00, close to the empirical value of 3.02 (obtained by random resampling without a distributional assumption). None of the 10,000 bootstrapped F statistics is as large as the F statistic computed from the observed data (18.55), so the estimated p-value is below 1/10,000, indicating that it is highly unlikely that the variation in the response variable is due to sample randomization alone.
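#Bootstrap version (with 10,000 iterations)
# Center each group on its own mean so that the null hypothesis holds in the resampled data
meanstar = with(data1,tapply(tmathssk,classk,mean))
grpA = tmathssk[classk=="regular"] - meanstar["regular"]
grpB = tmathssk[classk=="small.class"] - meanstar["small.class"]
grpC = tmathssk[classk=="regular.with.aide"] - meanstar["regular.with.aide"]
# Class labels in the same order as the concatenated resampled scores below
simclassk = factor(rep(c("regular","small.class","regular.with.aide"), times=c(length(grpA),length(grpB),length(grpC))))
R = 10000
Fstar = numeric(R)
for (i in 1:R)
{
  # Resample each centered group with replacement, keeping the original group sizes (2000, 1733, 2015)
  groupA = sample(grpA, size=length(grpA), replace=TRUE)
  groupB = sample(grpB, size=length(grpB), replace=TRUE)
  groupC = sample(grpC, size=length(grpC), replace=TRUE)
  simscore = c(groupA,groupB,groupC)
  simdata = data.frame(simscore,simclassk)
  # F statistic of the resampled data under the null hypothesis
  Fstar[i] = oneway.test(simscore~simclassk, var.equal=TRUE, data=simdata)$statistic
}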
# Now generate a similar plot, comparing bootstrapped distribution to the known F distribution
hist(Fstar,ylim = c(0,0.8),xlim = c(0,8),prob=TRUE,main="F-distribution of the empirical and analytical results")
x=seq(.25,8,.25)
points(x,y=df(x,2,5745), type = "b", col = "red")
# Observed F statistic, followed by the bootstrap p-value (proportion of bootstrapped F* at least as large)
print(realFstar<-oneway.test(tmathssk~classk, var.equal=TRUE, data=data1)$statistic)
## F
## 18.55
mean(Fstar>=realFstar)
## [1] 0
# estimate alpha = 0.05 value of the test statistic, given F* from empirical distribution above
# generate quantiles of the analytic F distribution
qf(.95,2,5745)
## [1] 2.997
# alpha = 0.05 value of bootstrapped F* is output here
quantile(Fstar,.95)
## 95%
## 3.022
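The bootstrap procedure can also be wrapped in a reusable function. The sketch below is a hedged illustration: `boot_f_null` is a hypothetical helper written for this example, and it is run on simulated data rather than the Star data. It follows the same steps as the loop above (center each group, resample within groups with replacement, recompute F) and then compares the bootstrap critical value with the analytical one. The p-value uses the add-one convention `(count + 1) / (R + 1)`, a variant of the plain proportion used above that avoids reporting exactly zero.

```r
# Hypothetical helper: bootstrap the one-way F statistic under the null hypothesis
boot_f_null <- function(y, g, R = 2000) {
  g <- factor(g)
  centered <- split(y - ave(y, g), g)  # remove each group's mean
  sizes <- lengths(centered)
  # Labels in the same order as the concatenated resampled scores
  labels <- factor(rep(names(centered), times = sizes), levels = levels(g))
  vapply(seq_len(R), function(i) {
    sim <- unlist(
      lapply(centered, function(r) sample(r, length(r), replace = TRUE)),
      use.names = FALSE
    )
    unname(oneway.test(sim ~ labels, var.equal = TRUE)$statistic)
  }, numeric(1))
}

set.seed(3)  # simulated data with no true group effect (an assumption for this demo)
g <- rep(c("a", "b", "c"), times = c(40, 50, 60))
y <- rnorm(length(g), mean = 100, sd = 10)

Fstar_demo <- boot_f_null(y, g, R = 2000)
f_obs <- unname(oneway.test(y ~ factor(g), var.equal = TRUE)$statistic)

# Add-one bootstrap p-value and the two alpha = 0.05 critical values
p_boot    <- (sum(Fstar_demo >= f_obs) + 1) / (length(Fstar_demo) + 1)
crit_boot <- unname(quantile(Fstar_demo, 0.95))
crit_f    <- qf(0.95, df1 = 2, df2 = length(y) - 3)

print(c(p_boot = p_boot, crit_boot = crit_boot, crit_f = crit_f))
```

When the two critical values are close, as in the Star analysis above, the analytical F test and the bootstrap lead to the same decision.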